My dissertation depends on the assumption that people make mistakes when identifying and naming fossil and modern organisms. In particular, I propose that certain freshwater mussel genera in the family Hyriidae have supposed taxon ranges that are far longer than they should be. To be taken at least somewhat seriously, however, I need to show that this could be the case and select candidates for further investigation.
The first figure is a simple taxon range diagram for several genera that are generally agreed to belong to the Hyriidae. The mean indicators can be ignored. Three genera in particular stand out as being long-lived. This could be for a number of reasons: the genera may actually have survived for such long periods of time, certain specimens may have been misidentified, or distinct taxa may have been inappropriately lumped under a single name.
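As a minimal sketch of how a taxon range can be read off an occurrence table, the R snippet below takes the oldest start age and the youngest end age for each genus. The genus names and ages are invented placeholders, not values from my database, and the column names are assumptions chosen for illustration.

```r
# Hypothetical occurrences: placeholder genera, invented ages in Ma.
occ <- data.frame(
  genus        = c("GenusA", "GenusA", "GenusA", "GenusB", "GenusB"),
  age_start_ma = c(100.5, 66.0, 2.6, 23.0, 5.3),
  age_end_ma   = c(66.0, 56.0, 0.0, 5.3, 2.6)
)

# First appearance: the oldest (largest) start age per genus.
first <- tapply(occ$age_start_ma, occ$genus, max)
# Last appearance: the youngest (smallest) end age per genus.
last <- tapply(occ$age_end_ma, occ$genus, min)

ranges_ma <- data.frame(
  genus        = names(first),
  first_ma     = as.vector(first),
  last_ma      = as.vector(last),
  duration_myr = as.vector(first - last)
)
print(ranges_ma)
```

A genus whose duration is much longer than that of its relatives is a candidate for a closer look at the underlying identifications.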
The second figure includes more information. The width of each bean is proportional to the number of occurrences of each taxon through time. Note that the full range of each taxon is not displayed on the second figure because it was plotted from mean age estimates, which in most cases were derived from the surrounding stage boundaries.
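The following sketch shows why plotting one mean age per occurrence shortens the displayed range. It assumes each occurrence has a start and an end age taken from stage boundaries; the ages are invented for illustration.

```r
# Hypothetical occurrences of one genus, invented ages in Ma.
occ <- data.frame(
  age_start_ma = c(100.5, 66.0, 23.0),
  age_end_ma   = c(66.0, 56.0, 5.3)
)

# Midpoint of the bounding ages for each occurrence.
occ$mid_ma <- (occ$age_start_ma + occ$age_end_ma) / 2

# Full range: oldest start age to youngest end age.
full_range <- c(oldest = max(occ$age_start_ma), youngest = min(occ$age_end_ma))
# Range spanned by the midpoints only.
mid_range <- c(oldest = max(occ$mid_ma), youngest = min(occ$mid_ma))

print(full_range)
print(mid_range)
```

The midpoints always fall inside the bounding ages, so a plot built from them cannot reach the first and last appearance dates.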
I will leave it to the reader to determine whether I am on the right track.
The plots were produced in R using the function below, which is being released under the CRAPL. The data are from my personal locality occurrence database, which will become available upon completion of my dissertation.
ranges<-function(locfile,genera=c("Alathyria","Velesunio"),type="box",columns=c("D_no_dissertation_id","genus_bogan","species_source","ref1","age_start_ma","age_end_ma")) {
# Make sure beanplot is available.
library("beanplot");
# Read in the data from a CSV file.
localities<-read.csv(locfile);
# List the genera you want.
# genera<-c("Alathyria","Velesunio");
# Create a place to store the selections.
selection<-list();
start<-list();
end<-list();
select<-data.frame();
## For bean plots
# Grab the whole selection
select<-subset(localities,localities$genus_bogan %in% genera & !is.na(localities$age_start_ma) & !is.na(localities$age_end_ma),select=columns);
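# Add the midpoint of each occurrence's start and end ages to the data frame.
# (ave() would return group means of age_start_ma grouped by age_end_ma, not the midpoint.)
select$mean<-(select$age_start_ma+select$age_end_ma)/2;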
# Get rid of the unneeded genus names in the subset.
select$genus_bogan<-factor(select$genus_bogan);
if(type=="box") {
## For box plots
# Loop through the genera
for (i in seq_along(genera)) {
# Grab the columns you want.
selection[[i]]<-subset(localities,localities$genus_bogan==genera[i] & !is.na(localities$age_start_ma) & !is.na(localities$age_end_ma),select=columns);
# Sort by column age_start_ma (not needed at the moment).
# selection[[i]]<-sort(selection[[i]],by=~"age_start_ma")
# Find the start and end dates.
start[[i]]<-max(selection[[i]]["age_start_ma"]);
end[[i]]<-min(selection[[i]]["age_end_ma"]);
}
# Make the start and end lists into a matrix...the long way around.
df<-data.frame(start=unlist(start),end=unlist(end));
# Transpose the matrix.
toplot<-t(as.matrix(df));
# Make a box plot. Don't need to worry about whiskers because there are only two values. The y-axis is reversed.
boxplot(toplot,names=genera,ylim=rev(range(toplot)),ylab=c("Ma"));
# Show the data being plotted.
print(toplot);
} else if(type=="bean") {
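# Make a bean plot of the mean ages per genus. This is more complicated. The y-axis is reversed.
# what=c(0,1,0,0) draws only the beans (no overall line, bean averages, or data lines),
# bw=20 fixes the kernel bandwidth, and cut=0 stops each bean at the extremes of its data.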
beanplot(select$mean~select$genus_bogan, ylim=rev(range(select$mean)), cut=0, log="", names=levels(select$genus_bogan), what=c(0,1,0,0),bw=20,col = c("#CAB2D6", "#33A02C", "#B2DF8A"), border = "#CAB2D6",ylab=c("Ma"));
}
}
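As a hedged usage sketch, the snippet below builds a tiny CSV with the column names the function expects by default and then calls ranges() on it. Every value in the table is an invented placeholder (the genera, species, references, and ages are not from my database), and the call is guarded so that it only runs if ranges() has already been defined and the beanplot package is installed.

```r
# Hypothetical input table: all values are placeholders for illustration.
toy <- data.frame(
  D_no_dissertation_id = 1:4,
  genus_bogan          = c("GenusA", "GenusA", "GenusB", "GenusB"),
  species_source       = c("sp. 1", "sp. 2", "sp. 3", "sp. 4"),
  ref1                 = c("Ref A", "Ref B", "Ref C", "Ref D"),
  age_start_ma         = c(100.5, 2.6, 23.0, 5.3),
  age_end_ma           = c(66.0, 0.0, 5.3, 2.6)
)

# Write the table to a temporary CSV file.
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)

# Call the function only if it and its dependency are available.
if (exists("ranges") && requireNamespace("beanplot", quietly = TRUE)) {
  ranges(csv_path, genera = c("GenusA", "GenusB"), type = "box")
}
```

Passing type="bean" instead draws the bean plot from the same file.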