Using R to Create a Winner-Takes-All Raster from Point Data

 

I am to the point in my dissertation where I need to summarize a fair amount of data.  The full data structure will be in the final manuscript, but for now suffice it to say that I am storing data as latitude/longitude pairs in a flat-file database (and using Panorama Sheets to query and modify it as needed).  Point data can be exported from this database as a CSV (actually tab-separated) file and converted to a shapefile in QGIS.

For this project, the globe is divided into 1×1 degree cells.  Each cell in my study area contains at least one vector point in my locality shapefile (exported from te database, above).  Many cells contain multiple points, each of which may represent a locality interpreted as representing a different paleoenvironment than the others.  I am using R to summarize these data points into a raster for each time slice I wish to study.

The method below emplys a winner-takes-all method, which is to say that during conversion from point data to raster, the points within each cell are evaluated for the values in the field ‘env_id’ (environment ID number, since we can’t use text in a raster), and the most frequent value of ‘env_id’ “wins” the cell and becomes the value burned into the output raster.

### Converting a point shapefile (with environment in numeric
### field env_id) into a raster
### Winner-takes-all.

# Load libraries for raster, maptools and mode
library(raster);
library(maptools);
library(modeest);

# Read in the locality point shapefile
locs<-readShapePoints("shapefile.shp");

# Make the base raster.
basegrid<-raster(ncols=360, nrows=180);

# Create a subset for the age you are looking at
locs_subset<-subset(locs,locs$time_slice==10);

# Convert the locality shapefile to raster.  Use the most frequent
# value of numeric field 'env_id'.
r<-rasterize(locs_subset, basegrid, field='env_id', fun=function(x, ...){mfv(x)[1]});

# View the plot (centered on SA)
plot(r,xlim=c(-100,-30),ylim=c(-70,52));

# View the plot (all)
# plot(r);

# Write the raster to file
writeRaster(r,"10 Ma.gtiff","GTiff");

 

Output from R looks something like the top image, and can be manipulated as any other raster in QGIS etc.  Scale bar shows the numeric values used to represent different paleoenvironments in the raster; it’s not actually a continuous scale but I didn’t have a need to change the default R output.

Additional notes:

  • In rasterize(), to use a function other than the ones enumerated in the docs, it needs to be defined within the call to rasterize() is used above and here.
  • One example of reading in shapefile data to R is here.
  • I learned about package modeest here.

GPlates Sample Data

If you are looking for the official Sample Data for GPlates, included automatically with the Linux installation but not for OS X or Windows, it is available from the Sourceforge site.

GPlates 1.2 download folder – gplates-1.2-sample-data.zip

I am including a list of files here (below the break) in case others like me were wondering where they got some data.  I am not responsible for these data, I just needed to re-download them and couldn’t find the source.

SampleData

CptFiles
DataBundleForNovices
FeatureCollections
LICENSE.txt
Rasters
contents.txt

./CptFiles:
feature_age.cpt
isochrons_and_ridges.cpt
plate_id_categorical.cpt
plate_id_regular.cpt

./DataBundleForNovices:
Global_EarthByte_GPlates_Coastlines_20111013.gpml
Global_EarthByte_GPlates_Coastlines_GridMarks_20110624.gpml
Global_EarthByte_GPlates_PresentDay_COBs_20110610.gpml
Global_EarthByte_GPlates_PresentDay_Isochrons_20100927.gpml
Global_EarthByte_GPlates_PresentDay_Ridges_20100927.gpml
Global_EarthByte_GPlates_Rotation_20100927.rot
README

./FeatureCollections:
COBs
Coastlines
Isochrons
Palaeomagnetism
Rotations
SpreadingRidges
StaticPolygons

./FeatureCollections/COBs:
Global_EarthByte_GPlates_PresentDay_COBs_20110610.dat
Global_EarthByte_GPlates_PresentDay_COBs_20110610.gpml
Global_EarthByte_GPlates_PresentDay_COBs_20110610.xy
README
Shapefile

./FeatureCollections/COBs/Shapefile:
Global_EarthByte_GPlates_PresentDay_COBs_20110610.dbf
Global_EarthByte_GPlates_PresentDay_COBs_20110610.prj
Global_EarthByte_GPlates_PresentDay_COBs_20110610.shp
Global_EarthByte_GPlates_PresentDay_COBs_20110610.shp.gplates.xml
Global_EarthByte_GPlates_PresentDay_COBs_20110610.shx

./FeatureCollections/Coastlines:
Global_EarthByte_GPlates_Coastlines_20111013.dat
Global_EarthByte_GPlates_Coastlines_20111013.gmt
Global_EarthByte_GPlates_Coastlines_20111013.gpml
Global_EarthByte_GPlates_Coastlines_20111013.xy
Global_EarthByte_GPlates_Coastlines_GridMarks_20110624.dat
Global_EarthByte_GPlates_Coastlines_GridMarks_20110624.gpml
Global_EarthByte_GPlates_Coastlines_GridMarks_20110624.xy
README
Shapefile

./FeatureCollections/Coastlines/Shapefile:
Global_EarthByte_GPlates_Coastlines_20111013.dbf
Global_EarthByte_GPlates_Coastlines_20111013.prj
Global_EarthByte_GPlates_Coastlines_20111013.sbn
Global_EarthByte_GPlates_Coastlines_20111013.sbx
Global_EarthByte_GPlates_Coastlines_20111013.shp
Global_EarthByte_GPlates_Coastlines_20111013.shp.gplates.xml
Global_EarthByte_GPlates_Coastlines_20111013.shp.xml
Global_EarthByte_GPlates_Coastlines_20111013.shx

./FeatureCollections/Isochrons:
Global_EarthByte_GPlates_PresentDay_Isochrons_20100927.dat
Global_EarthByte_GPlates_PresentDay_Isochrons_20100927.gpml
Global_EarthByte_GPlates_PresentDay_Isochrons_20100927.xy
README
Shapefile

./FeatureCollections/Isochrons/Shapefile:
Global_EarthByte_GPlates_PresentDay_Isochrons_20100927_polyline.dbf
Global_EarthByte_GPlates_PresentDay_Isochrons_20100927_polyline.prj
Global_EarthByte_GPlates_PresentDay_Isochrons_20100927_polyline.shp
Global_EarthByte_GPlates_PresentDay_Isochrons_20100927_polyline.shp.gplates.xml
Global_EarthByte_GPlates_PresentDay_Isochrons_20100927_polyline.shx

./FeatureCollections/Palaeomagnetism:
LICENSE.txt
README
gpml
vgp

./FeatureCollections/Palaeomagnetism/gpml:
Europe2004_RM_Npoles.vgp.gpml
Gondwana2010_RM_NPoles.vgp.gpml
Laurussia2010_RM_NPoles.vgp.gpml
NorthAmerica2004_RM_Npoles.vgp.gpml

./FeatureCollections/Palaeomagnetism/vgp:
Europe2004_RM_Npoles.vgp
Gondwana2010_RM_NPoles.vgp
Laurussia2010_RM_NPoles.vgp
NorthAmerica2004_RM_Npoles.vgp

./FeatureCollections/Rotations:
Global_EarthByte_GPlates_Rotation_20100927.rot
Global_EarthByte_PlateIDs_20071218.pdf

./FeatureCollections/SpreadingRidges:
Global_EarthByte_GPlates_PresentDay_Ridges_20100927.dat
Global_EarthByte_GPlates_PresentDay_Ridges_20100927.gpml
Global_EarthByte_GPlates_PresentDay_Ridges_20100927.xy
README
Shapefile

./FeatureCollections/SpreadingRidges/Shapefile:
Global_EarthByte_GPlates_PresentDay_Ridges_20100927.dbf
Global_EarthByte_GPlates_PresentDay_Ridges_20100927.prj
Global_EarthByte_GPlates_PresentDay_Ridges_20100927.shp
Global_EarthByte_GPlates_PresentDay_Ridges_20100927.shx

./FeatureCollections/StaticPolygons:
Global_EarthByte_GPlates_PresentDay_StaticPlatePolygons_20111012.gmt
Global_EarthByte_GPlates_PresentDay_StaticPlatePolygons_20111012.gpml
Global_EarthByte_GPlates_PresentDay_StaticPlatePolygons_20111012.xy
Shapefile

./FeatureCollections/StaticPolygons/Shapefile:
Global_EarthByte_GPlates_PresentDay_StaticPlatePolygons_20111012.dbf
Global_EarthByte_GPlates_PresentDay_StaticPlatePolygons_20111012.prj
Global_EarthByte_GPlates_PresentDay_StaticPlatePolygons_20111012.sbn
Global_EarthByte_GPlates_PresentDay_StaticPlatePolygons_20111012.sbx
Global_EarthByte_GPlates_PresentDay_StaticPlatePolygons_20111012.shp
Global_EarthByte_GPlates_PresentDay_StaticPlatePolygons_20111012.shp.gplates.xml
Global_EarthByte_GPlates_PresentDay_StaticPlatePolygons_20111012.shp.xml
Global_EarthByte_GPlates_PresentDay_StaticPlatePolygons_20111012.shx

./Rasters:
DNSC08GRA_6m.gpml
DNSC08GRA_6m.jpg
Time-dependent raster sequences
agegrid_6m.gpml
agegrid_6m.nc
agegrid_6m.nc.aux.xml
color_etopo1_ice_low.gpml
color_etopo1_ice_low.jpg

./Rasters/Time-dependent raster sequences:
dynamic topography

./Rasters/Time-dependent raster sequences/dynamic topography:
LICENSE.txt
README.txt
jpg

./Rasters/Time-dependent raster sequences/dynamic topography/jpg:
credits.txt
dynto-0.jpg
dynto-1.jpg
dynto-10.jpg
dynto-100.jpg
dynto-11.jpg
dynto-12.jpg
dynto-13.jpg
dynto-14.jpg
dynto-15.jpg
dynto-16.jpg
dynto-17.jpg
dynto-18.jpg
dynto-19.jpg
dynto-2.jpg
dynto-20.jpg
dynto-21.jpg
dynto-22.jpg
dynto-23.jpg
dynto-24.jpg
dynto-25.jpg
dynto-26.jpg
dynto-27.jpg
dynto-28.jpg
dynto-29.jpg
dynto-3.jpg
dynto-30.jpg
dynto-31.jpg
dynto-32.jpg
dynto-33.jpg
dynto-34.jpg
dynto-35.jpg
dynto-36.jpg
dynto-37.jpg
dynto-38.jpg
dynto-39.jpg
dynto-4.jpg
dynto-40.jpg
dynto-41.jpg
dynto-42.jpg
dynto-43.jpg
dynto-44.jpg
dynto-45.jpg
dynto-46.jpg
dynto-47.jpg
dynto-48.jpg
dynto-49.jpg
dynto-5.jpg
dynto-50.jpg
dynto-51.jpg
dynto-52.jpg
dynto-53.jpg
dynto-54.jpg
dynto-55.jpg
dynto-56.jpg
dynto-57.jpg
dynto-58.jpg
dynto-59.jpg
dynto-6.jpg
dynto-60.jpg
dynto-61.jpg
dynto-62.jpg
dynto-63.jpg
dynto-64.jpg
dynto-65.jpg
dynto-66.jpg
dynto-67.jpg
dynto-68.jpg
dynto-69.jpg
dynto-7.jpg
dynto-70.jpg
dynto-71.jpg
dynto-72.jpg
dynto-73.jpg
dynto-74.jpg
dynto-75.jpg
dynto-76.jpg
dynto-77.jpg
dynto-78.jpg
dynto-79.jpg
dynto-8.jpg
dynto-80.jpg
dynto-81.jpg
dynto-82.jpg
dynto-83.jpg
dynto-84.jpg
dynto-85.jpg
dynto-86.jpg
dynto-87.jpg
dynto-88.jpg
dynto-89.jpg
dynto-9.jpg
dynto-90.jpg
dynto-91.jpg
dynto-92.jpg
dynto-93.jpg
dynto-94.jpg
dynto-95.jpg
dynto-96.jpg
dynto-97.jpg
dynto-98.jpg
dynto-99.jpg
dynto.gpml
 

“Improving” GPlates output with Automator and Applescript

When you export reconstructed shapefiles from GPlates, you get two options: either dump all the shapefiles into one (but then not include an attribute field that can be used to separate the input files, or dump each shapefile into a folder named for its input file.  Since I’ve been working with multiple input shapefiles, this results in multiple folders of output:

Folder_Top
--my_first_shapefile
----reconstruction.00Ma.dbf
----reconstruction.00Ma.prj
----reconstruction.00Ma.shp
----reconstruction.00Ma.shx
--my_second_shapefile
----reconstruction.00Ma.dbf
----reconstruction.00Ma.prj
----reconstruction.00Ma.shp
----reconstruction.00Ma.s
etc.

I would rather have this:

Folder_Top
–my_first_shapefile.dbf
–my_first_shapefile.prj
–my_first_shapefile.shp
–my_first_shapefile.shx
–my_second_shapefile.dbf
–my_second_shapefile.prj
–my_second_shapefile.shp
–my_second_shapefile.shx
etc.

The attached [email me if you need it, I need a reason to dig it up and share it again. 2014-02-18] Automator script (Mac only) takes whatever folders you input (e.g., my_first_shapefile and my_second_shapefile), renames the contents of each to match the folder name, and then moves all of these files to the parent of the input folders (e.g., Folder_Top).  It has a bit of applescript based on this post and this one.

Tested on: Mac OS X 10.6.8 (Snow Leopard)
Requires: Dispense Items Incrementally Automator action
Input: One or more folders, dragged onto script icon (can be modified in Automator to pop up a dialog box)
Released under: CRAPL
Download: rename files to folder name automator.zip
Installation: Unzip file.  I think you need ro run Automator, open the unzipped file, and then “Save As” to hook everything up to your system correctly.

 

 

Quickly access journal PDFs for which UND has a subscription

If you're using Google or another engine to search online for journal articles, seven times out of ten you'll end up at the site where you can get the PDF (via institutional subscription) but you won't be recognized as the institution.  This also will happen if you are off-campus.  One way to get access is to head back to the UND Libraries site, find the eJournal search, type the name, and then navigate to the right issue.  There is a better way.

Take this URL for example.  I wound up here after Googling for one of the authors and looking for his email address:

http://jgs.geoscienceworld.org/content/163/4/707.abstract

In order to get that PDF (or see if we even have access), just add the UND proxy to the middle of the URL (bolded for convenience):

http://jgs.geoscienceworld.org.ezproxy.library.und.edu/content/163/4/707.abstract

You'll get bumped to the UND Libraries off-campus login page, and then back to the article page when you've logged in.  Now if you click on the PDF link, you'll find out whether we have subscription access (the PDF will open) or not (you'll hit a paywall).  This has worked well for me for the past year or so.

Mosh Pit

In case you’ve missed them, here are some stories that have been in the Grand Forks news lately:

Additionally, we’ve had a wonderful collection of drug lords, prostitution rings, child pornography arrests, and needless automobile deaths in the past few weeks.  Growing city?  Certainly.  In need of another WalMart?  Probably not.

Comps prep: visualizing tectonic plate interactions

Since I don’t have a blank map handy, I was playing around with a different way to visualize tectonic plate interactions so I can remember them more easily.  This sketch shows which plates touch other plates, but doesn’t go so far as to describe the direction of movement or the type of each boundary.  

 

Under this model (which is all sixteen of the plates shown in the textbook I’m using), there are 34 specific plate-plate interactions.  Luckily I don’t have to explicitly remember them as a list: just knowing what a few plates are doing makes it pretty simple to hypothesize the movement of everything else (in theory, at least).

Comps prep: Unionoida evolutionary relationships

Introduction

Note: This post is a work in progress that can and will change over the next few days.  I'm making it available for critique, so comment away!  MBK 2012-08-07 1629

With an oral exam coming up in a week, it's nose to the books time!  One of my committee members has posed a certain question relating to soon-to-be-published research on the phylogeny of Order Unionoida (Class Bivalvia1).  This post is an effort to figure out just what has been going on regarding the relative positions of the families in this order.  Were it a simple factual question, I might have more luck, but it happened to be one of those darn "describe the implications of…" inquiries.

According to Bogan and Roe (2008), below is the general arrangement of superfamilies and families within Unionoida.  There are other arrangements that have been made, however, notably placing Family Hyriidae in Superfamily Unionoidea (e.g., Roe and Hoeh, 2003; Walker et al., 2006) according to Simpson's (1900) original classification (Figure 1).

(Figure 1): Historical trees presented by Roe and Hoeh (2003).  Note that the Family Etheriidae is missing from these evolutionary hypotheses: where does it fit in?

  • Order Unionoida
    • Superfamily Unionoidea
      • Family Margaritifera
      • Family Unionidae
    • Superfamily Etherioidea
      • Family Etheriidae
      • Family Iridinidae
      • Family Mycetopodidae
      • Family Hyriidae 

Although the order and the families are relatively solid (according to the work so far), the contents of each superfamily (indeed, the number of superfamilies) is highly debatable.  The families themselves are not all entirely monophyletic, which leads to additional questions: what relationship do the ragged edges (those genera that belong to a different branch of the tree than the rest of the family) actually have to the families they are assigned?  Is it a poor selection of characters, or bad phylogenetic analysis, or wrong identification, or what?  

I'm digressing here, but I think it's important to point out again that the way we group taxa in order to keep things organized may not reflect the actual evolutionary relationships among those taxa.  For example, using the name "Unionidae" for a certain branch of the tree minus genus Psuedomulleria as in Figure 3 of Hoeh et al. (2009) might make sense from a classification standpoint but not from an evolutionary standpoint (Figure 2)2.  Ideally (and I can't think of a counterargument to this), classification systems would follow evolutionary relationships.


(Figure 2): One of three trees from Hoeh et al. (2009).  So what's up with Pseudomulleria?  Turns out Coelatura works better in another position (see Whelan et al., 2011).

Previous Discussion

Returning to the point of this exercise, what are the evolutionary relationships among the six families in Order Unionoida?  (Ordered as above for consistency.)  Interestingly enough as a paleontologist, the characters used for phylogenetic analyses at the family level are all soft-part anatomy and life history (some of which are listed below) rather than hard-part shell morphology.  Although shell morphology in Order Unionida is suspect due to convergence at both the evolutionary and ontogenetic levels, I'm not sure how much thought has been given to the rate at which this morphology evolves in comparison to soft-part anatomy (I spoke too soon).

Superfamily Unionoidea
Lack of "mantle fusion that results in incurrent and excurrent apertures" (Bogan and Roe, 2008).

Family Margaritiferidae
Considered to be monophyletic.  Soft-part morphology 'primitive' or basal to the rest of Order Unionoida; molecular analysis places as sister to Family Unionidae to form Superfamily Unionoidea (Graf and Cummings, 2006).  Skawina and Dzik (2011) support the basal ("least derived") hypothesis based on morphological similarity to Neotrigonia.  
Glochidial larvae.  
Brood in all four demibranchs.
Northern hemisphere (widespread) (Bogan and Roe, 2008).

Family Unionidae
Considered by different studies to be both monophyletic and paraphyletic.  Most recently, Whelan et al., 2011 claim a monophyletic Unionidae very similar to the Linnean classification produced by Bieler et al. (2010) but do not examine relationships with other families.  
Glochidial larvae.  
Brood in all four demibranchs or just outer pair.
Northern hemisphere (widespread) (Bogan and Roe, 2008).

Superfamily Etherioidea
"Some degree of fusion that results in, at least, a completely fused excurrent siphon and, often, a completely fused incurrent siphon" (Bogan and Roe, 2008).

Family Hyriidae
Monophyletic according to all studies so far.  Soft-part morphology places this family as sister to "anything with a lasidium" (Superfamily Etherioidea as listed at the top of this post), but mtDNA analysis breaks the Hyriidae out entirely and places it as sister of and basal to all other families in Order Unionoida (Graf and Cummings, 2006; Bogan and Roe, 2008).  This disagreement comes down to methods, which means it matters with whom you're speaking at the moment.  
Glochidial larvae.  

Brood in inner demibranchs.
Southern hemisphere (South America and Australasia) (Bogan and Roe, 2008).

Family Mycetopodidae
Lasidial larvae.  
Brood in inner demibranchs.
Southern hemisphere (South America) (Bogan and Roe, 2008).

Family Iridinidae
Lasidial larvae.  
Brood in inner demibranchs.
Southern hemisphere (Africa) (Bogan and Roe, 2008).

Family Etheriidae
Lasidial larvae.  
Brood in inner demibranchs.
Southern hemisphere (Africa) (Bogan and Roe, 2008).

Walker et al., 2006 summarizes the possible familial relationships as follows (although there may be others):

  • basal Margaritiferidae, paraphyletic Unionidae, derived Hyriidae+Mycetopodidae+Iridinidae.  "Analyses of morphological characters presented by Graf (2000) and Hoeh et al. (2001) both return the Margaritiferidae as the basal unionoid lineage, a paraphyletic Unionidae, and the Hyriidae + Mycetopodidae + Iridinidae as a derived lineage." 
  • basal Hyriidae, paraphyletic Unionidae. "In contrast, both the molecular (i.e., cox1 DNA sequences) and the combined analysis of morphological and molecular data presented by Hoeh et al. (1998a, 2001) (Fig. 1C) place the Hyriidae as the basal unionoid lineage, with the Unionidae returned as paraphyletic." 
  • basal Margaritiferidae, monophyletic Unionidae, sister Hyriidae+Iridinidae+Mycetopodidae. "Yet another combined analysis of morphological and molecular (i.e., cox1 DNA sequences) characters (Roe & Hoeh, 2003; Fig. 1D), this time using binary coding of the morphological data and a posteriori character weighting, returns the Margaritiferidae as basal and a monophyletic Unionidae as sister to a Hyriidae + Iridinidae + Mycetopodidae clade."  It is pointed out by Graf and Cummings, 2006 that the binary coding of characters resulted in the unforseen weighting of certain character pairs, i.e., coding "1" for one character necessitated coding "0" for another.

If we consider the methods used in each case to be equally plausible, each of these three scenarios supports the close relationship between Iridinidae and Mycetopodidae (Etheriidae, being represented by a single(?) genus, seems to be left out, but I believe is generally considered to be closely related to these two families as well).  This leaves Unionidae (paraphyletic or not), Hyriidae, and Margaritiferidae (the latter two of which are considered by different character sets as both basal and derived).  While you think on this, check out Figure 3.

(Figure 3):  Graf's interpretation of some relationships within Order Unionoida.

As a final thought on past work, click here to see what Graf and others have come up with for solutions, and a summary figure for the last several papers on this subject.  Confusing, isn't it?

 

New Discussion

I think I will wait a month or so before I finish this section up.  Some new information should be coming down the pipeline.

Works Cited

Many of the publications cited have been linked to in the text, often to a page on an academic social network.  By following these links, you are letting the authors know you are interested in their work.  Publications that have no official online presence are listed here.

Roe, K. J. and Hoeh, W. R. (2003). Systematics of freshwater mussels (bivalvia: Unionoida). In Lydeard, C. and Lindberg, D. R., editors, Molecular systematics and phylogeography of mollusks, chapter 4, pages p. 91–122. Smithsonian Books.

 

 

 

 

 


1 Why Bivalvia and not Pelecypoda?  Bivalvia has nomenclatural priority, unfortunately, so as cool as studying "hatchet-footed" animals is, we don't get to call them that in polite company.
2 This has supposedly been resolved by Whelan et al. (2011) by improving the position of Coelatura.