New Preprint: Comparing size of morphospace occupation among extant and Cretaceous fossil freshwater mussels using Elliptical Fourier Analysis

A new preprint of some of the work from my Master’s thesis is now available at PeerJ, authored by myself and my MS and PhD advisor, Joseph Hartman.  We’re looking for honest, science-y feedback in order to improve the paper before publication, so please check it out!

Burton-Kelly M, Hartman JH. (2014) Comparing size of morphospace occupation among extant and Cretaceous fossil freshwater mussels using Elliptical Fourier Analysis. PeerJ PrePrints 2:e626v1 http://dx.doi.org/10.7287/peerj.preprints.626v1

If something can be improved, why not improve it?

I had someone else read something I wrote recently. I criticized a system of closed data, and then suggested a crowdsourced solution to get around the data guardians (not illegally, just through data mining from published works). This person’s response? To say that closed data should be closed because someone paid for it, and to ignore my solution. In another section, I lamented various issues with data digitization projects, because I wanted to add another (citable) voice that could be used for digitization project funding. Response: these issues aren’t new, and you don’t need to waste your time with them.

Who is it going to hurt to try to improve things?  It’s no skin off anyone else’s nose.

Searching the PDF Library

[EDIT: The PDF search described below no longer exists, but the mention of a preprint server for other sciences is becoming a reality with PeerJ Preprints.]

For those who are looking for paleontology or geology papers in PDF format, you might be able to find them with the full-text search I’ve installed here. There are 40 GB or so of files to access. If you find a file there that you can’t access any other way, drop me an email and I can send it.

This is the easiest way I can think of to share my PDF library at the moment. In the past I’ve experimented with Alliance, OneSwarm, and even torrenting, but the first two applications require a critical mass of users to make viable (something I’ve never been able to get) and the last is difficult to update.

While a preprint server such as arxiv.org (but for other sciences [than physics, 2014-02-04]) would be useful for the future, it wouldn’t help to distribute the vast knowledge contained in works that are out of print. For this purpose we, as scientists, need to form our own distribution network. I will keep this directory up for myself and those who need it, but for complete sharing of published works I still think we need a P2P network devoted to that purpose.

Trying to Focus on Specimen Databasing

This post will explore some fairly specific topics, but I hope the thought process will be instructional (or inspiring) to others. Additionally I think it’s worthwhile to talk about the concepts of specimen/biological collection database management with reference to funding, not schemas and platforms.

At the UND Department of Geology and Geological Engineering, a small number of us have been pursuing an overall upgrade of the paleontological specimen and lithotype collection, consisting of improved facilities (compactor cabinets) and a comprehensive online database. We’ve applied for funding from NSF and been denied twice, and the project would be dead in the water except for the quarter-time assistantship I’m receiving from the Dean’s office at the School of Engineering and Mines. Development has been slow, mostly because of the conversion between the existing databases (stored as flat text files) and the online system. (I will not name the new system, because events today have made me question, again, the cost/benefit ratio of using it.) So far I’ve been importing locality data so we can use the new system to analyze locality distribution, among other things.

The question today is how to proceed. As useful as locality data are to paleontological and geological researchers, locality information is, at its core, supplementary to the specimens themselves. (I’ll avoid an argument right here: I believe that locality data are essential to proper context, and I’m not advocating the dissociation of these data from specimens.) Specimens are the core of the paleontological sciences, and it is from specimens and their assigned taxonomic identities that researchers work toward understanding past life. Rather than browsing locality lists and then looking at specimens, given a database most researchers will search by taxon or in special cases by specimen number, and then they will look at the associated locality data. In my opinion, we’ve been doing it wrong.

The above point regards usability, and I promised to talk about funding issues, so here we go: in order for such an online database (and more importantly, the effort to digitize specimen data and provide specimen imagery) to keep getting funding, it needs to be usable so it will be used! That’s the whole point. If the Dean (or any other UND administrator) wants to put us on the map for having a world-class collection, we need to get the data out there that people want, we need to tell them about it, and we need to encourage them to use it. From the administration’s perspective, numbers are going to determine how successful we are: number of unique visitors the online database gets every year, number of publications that reference specimens held in our collections, and number of researchers who visit or request material loans.

What can I do today that will improve our chances? In my opinion, we need to improve usability by others before we can improve usability by ourselves. This means a focus on specimen-data entry, the postponement of certain analytical capabilities we (as UND researchers) would like, and beginning with those specimens referenced in peer-reviewed articles, dissertations, and theses. These specimens have already gotten the most attention and they are likely to get more attention in the future because of their “published” status. The associated material can come next, and then we can start adding data systematically. At this point, to show that this is possible and that it shows our research collections in a good light, we need to get the bare bones online first and follow with everything else later.

That’s what I think, and what I will discuss with others here later today. Has anyone else come across such a crux of funding issues? How about with specimen collections that are even less sexy than ours (which are primarily freshwater mollusks, and are pretty darn sexy in my opinion)? Am I on the right track, or should we back this train up again?

[publication] A new occurrence of /Protichnites/ Owen, 1852, in the Late Cambrian Potsdam Sandstone of the St. Lawrence Lowlands

BURTON-KELLY, M.E. and J.M. ERICKSON. 2010. A new occurrence of Protichnites Owen, 1852, in the Late Cambrian Potsdam Sandstone of the St. Lawrence Lowlands. The Open Paleontology Journal 3:1-13.

You can download a PDF here (1 MB). You can follow this publication on Academia.edu or ResearchGate.

Buying PDFs: Commentary

This post was originally a comment on Andy’s post “Buying PDFs: Truth and Consequences” at The Open Paleontologist blog. The text grew too long, so I’m devoting a full post to it, even though it’s a bit rough. The topic is how much we pay for PDFs of published articles, and why this is so disproportionate to physical copies.

People who know me already know what my suggested “solution” is: share as many PDFs with as many people as possible in order to push the publishers to reevaluate their prices. However, legality prevents me from supporting such action. The idea is modeled after the philosophy of Downhill Battle: in order to get radio stations to play music beyond the mainstream (paid for by the record companies), we need to bankrupt the record companies, essentially by no longer buying music, or at least music produced by the largest companies, which pay the biggest bucks toward keeping their music on the air.

I’m not sure whether Andy has a citation for his observation that publishers like Elsevier continue “to post profits in the midst of the recession.” It would be interesting to have someone play with those numbers a bit.

This ends up being like gas prices. I get that as a business you get to set your prices at whatever the market will bear, but the strategy of moving more merchandise rather than more expensive merchandise should always be something to consider. How much research do these publishers do as far as sub-fields go? As you say, hospitals can pay top dollar for a single article, but more paleontologists will buy an article if it’s cheaper (especially if they are unaffiliated), will be able to do the research they want, and will be looking for a place to publish.

On that note, I hope people continue to vote with their feet when it comes to open-access vs. closed-access, or even when some journals simply have slightly lower per-PDF fees. I’ve had the discussion recently about what “high impact” means anymore: nothing. It used to mean that the physical journal was available in more libraries and hence better-read and better-cited, but since everything goes to PDF now, everything (new) is equally available to someone who can do a halfway decent job of searching. This gives us all the freedom to publish in journals with whose practices we agree, rather than in those with a wider physical distribution.

Filling

One of my interests is building a PDF library for myself and fellow graduate and undergraduate students, which means that it’s very hard to pass up PDFs when I come across them on the web. So right now I’m downloading anything I even look at during my research, to keep and to pass along.

This may well fill up my hard drive this semester.

Open-Access Journals

The current annoyance on the VRTPALEO list is the academic publishing industry, which will publish your work in exchange for owning the copyright (meaning that you, as an author, cannot distribute your own work without permission). A simplified but good analogy is made by Scott Aaronson here:

I have an ingenious idea for a company. My company will be in the business of selling computer games. But, unlike other computer game companies, mine will never have to hire a single programmer, game designer, or graphic artist. Instead I’ll simply find people who know how to make games, and ask them to donate their games to me. Naturally, anyone generous enough to donate a game will immediately relinquish all further rights to it. From then on, I alone will be the copyright-holder, distributor, and collector of royalties. This is not to say, however, that I’ll provide no “value-added.” My company will be the one that packages the games in 25-cent cardboard boxes, then resells the boxes for up to $300 apiece.

But why would developers donate their games to me? Because they’ll need my seal of approval. I’ll convince developers that, if a game isn’t distributed by my company, then the game doesn’t “count” — indeed, barely even exists — and all their labor on it has been in vain.

Admittedly, for the scheme to work, my seal of approval will have to mean something. So before putting it on a game, I’ll first send the game out to a team of experts who will test it, debug it, and recommend changes. But will I pay the experts for that service? Not at all: as the final cherry atop my chutzpah sundae, I’ll tell the experts that it’s their professional duty to evaluate, test, and debug my games for free!

We need to figure out a way to exchange information without making people pay exorbitant fees for it, but in the current situation we could be sued for distributing our own work in PDF format. I’m no opponent of paper copies of journals, but if all you want is a PDF of a work that is peer-reviewed, there’s no reason you should have to pay for it.

EDIT: This person has something to say about it too, with an analogy to the QWERTY keyboard.

a brief answer to a comment

Anonymous says:

In order to expect data sharing, you have to be open to collaboration, yes? Just wanted to point that out. Science is ruthless in its own way, or at least, the scientists and publishers make it that way.

I can be completely opposed to collaboration and still expect data sharing. This does not necessarily mean that I will get it. It also doesn’t mean that I won’t. Now to address what you think you said: it depends on what data you want to share, and how open everyone is with it, depending on what the value of the information is to each person. If I found some interesting new metamorphic structure somewhere, there is no way that I would be able to publish on it or even collaborate, simply because I don’t have the background to deal with the technical side of things past an elementary level. But someone else could. There is no need to sit on something you find out of jealousy if NO ONE is ever going to get a paper out of it because of you.