GIS: Raster cell centroids as vector points with GDAL and R

Short intro: I’m using raster data for part of my dissertation, but I’m storing the paleoenvironment data in a database as the latitude and longitude of the center of each grid cell. This lets me query my environment localities, add new ones, and make specific datasets.

Sometimes I take existing maps of other people’s environmental interpretations and digitize them, leaving me with vector polygon data. I need to convert the polygons into the same kind of data I’ve been storing. This is surprisingly hard in both QGIS and GDAL alone, so I finally figured out how to do it in R today courtesy of Simbamangu over at StackExchange.

See the attached images for how things look at each stage.

Below are the general terminal commands, first in GDAL and then R. I will try to add more rationalization and explanation later. My operating system is Mac OS X but these should be modifiable.

GDAL in your terminal

gdal_rasterize -at -where 'freshwater = 1' -a freshwater -ts 360 180 -te -180 -90 180 90 -l "source_layer_name_in_shapefile" "shapefile.shp" "output_raster.tif"
Produces a geoTiff of your input polygons where the attribute ‘freshwater’ is equal to ‘1’. For more detail I invite you to read the GDAL documentation, it’s quite good.

R

To run through a single file:
# load necessary libraries
library(maptools);
library(raster);
library(rgdal);

# Load the raster from a file
r <- raster("rastername.tif") # Convert to spatial points p <- as(r, "SpatialPointsDataFrame") # Set name of column of raster values for later use names(p)[1]<-"value" # Get only the values we want (values above zero) in that column psubset<-subset(p,p$value>0);

# Save as a shapefile (extension appended automatically)
writeSpatialShape(psubset, "outputfile");

You can also loop through a directory:
# load necessary libraries
library(maptools);
library(raster);
library(rgdal);

# Set working directory
setwd("/my/working/directory/");

# list all .tif files in working directory
for (file in dir(pattern="*.tif")) {
r <- raster(file); p <- as(r, "SpatialPointsDataFrame"); # Set name of column of raster values names(p)[1]<-"value"; # Get only the values we want (values above zero) psubset<-subset(p,p$value>0);

# Make a new filename
oldname<-substr(file, start = 1, stop = nchar(file) -4); newname<-paste(oldname,"pts",sep=" "); # Make a shapefile writeSpatialShape(psubset, newname) ; # Make a CSV file # Convert to data frame so you can drop extra columns psubsetdf<-as.data.frame(psubset) write.table(psubsetdf[,c(2,3)],file=paste(newname,".csv",sep=""),sep=",", row.names=FALSE); }

Ephemeral ladybug trace fossil

I just came across this image of a ladybug making tracks in condensation on a window. As far as I can tell, this is an original photo by user 77dlsd at reddit. Fascinating! This might be a decent way of capturing different kinds of arthropod trackways, although you may have to be something of a Snowflake Bentley to frost a glass plate, drop a critter, catch a critter, and photograph the plate without losing anything.

[notes] Converting MRSID to GeoTIFF using GDAL in Mac OS 10.6 (Snow Leopard)

Just found out how to convert MrSID files (from NZ geology) to other formats using GDAL.

  • Followed instructions here: http://www.photo-mark.com/notes/2008/nov/12/converting-mrsid-gis-files/
  • Needed most recent version of GDAL with frameworks (1.8 Complete): http://www.kyngchaos.com/software/frameworks
  • Had to change command to “/Library/Frameworks/GDAL.framework/Versions/Current/Programs/gdal_translate -of PNG /Users/Matt/Desktop/whangarei/inputmap.sid /Users/Matt/Desktop/whangarei/outputmap.png”
  • That makes an image file, but I think I’ll convert to geotiff for GIS:
    • /Library/Frameworks/GDAL.framework/Versions/Current/Programs/gdal_translate /Users/Matt/Desktop/whangarei/inputmap.sid /Users/Matt/Desktop/whangarei/outputmap.tif
  • Map source: http://www.gns.cri.nz/Home/Our-Science/Energy-Resources/Geological-Mapping/Geological-Maps/1-250-000-QMAP/Map-downloads
  • Map coordinate system: New Zealand Map Grid (Geodetic Datum 1949)
  • gdal_translate man: http://www.gdal.org/gdal_translate.html

Comment on “How Bicycling Connects Us to a Healthier Community and Stronger Economy”

I submitted this comment this morning, but it hasn’t been approved yet. Hopefully it will be.

In response to How Bicycling Connects Us to a Healthier Community and Stronger Economy by Tyler Pell, I had these thoughts on how to improve these arguments:

“I think these are good points, but in the interest of helping to win over people who are stuck in a car monoculture, I’d suggest two things.

First, you should point out exactly why “keeping cars off the road” is good for community. It removes congestion, reduces noise, allows people to stop and chat while they are commuting, all of which strengthen the ties between people who live in the community.

Second, you should move all the environmental issues to a separate section. There are more than enough reasons for people to ride bicycles without shoving “the environment” or “climate change” down people’s throats. Note that I agree completely that it’s good for the environment to commute by bicycle, but I know it’s a sensitive subject with some people; if you give these people other reasons that they can personally get behind as human beings or consumers, you have a better chance at getting them into cycling.

Thanks for all the citations, this is a good resource overall.

Matt”

I think my suggestions are meaningful because when you’re trying to debate someone, it works best to start from a common ground and work toward the point your trying to make, all the while explaining why your points make sense. “The environment” is a nonstarter for some people. “Community” is a great thing to aim for, and I think Pell did a good job expressing that. Economics is a good place to start for some people, as long as you don’t get into too much theory.

Even though I frame this as a debate, remember that anyone you’re trying to convince to commute by bicycle is a future comrade-in-arms. Even if they don’t agree to do it themselves now, they may consider it in the future after your conversation, and they may be just a bit more understanding, which is good for community as well.

Thematic focus and other considerations in science blogging

Protichnoctem (in both blogspot and current form) is supposed to be a science blog. It’s supposed to raise the big questions and answer the little ones. It’s supposed to bring people together to solve the mysteries of the universe. It’s supposed to make other people as excited about research as I get when I read SV-POW!, The Open Source Paleontologist, and a host of others.

It’s obviously not doing these things very well. I’ve come up with some possible reasons why:

New posts don’t occur frequently enough.

Even though I added this to the list, I’m not sure it is as important as some of the other ideas. If a post is valid, the idea is that someone will find it eventually and it will get some views, especially if it contains some new ideas on an old subject. “New ideas” of course refers to originality of thought, not originality of opinion. Posting at the “just right” frequency would encourage people to come back once they know about the blog, however.

Posts are too short.

In many cases, I end up posting snippets of information. This is great for those things that make sense off the bat (if you’re searching for a particular code snippet, you don’t want to read a novel), but bad for involving readers. Ideally I would post original (not rehashed or updated) tutorials for the software or methodology I’m using. Off the top of my head this could include a better explanation of exactly how my M.S. thesis methods work, but with the advances in software like PAST (regarding EFA) in the last three years, it seems like redoing things for publication (as I am now) will be quicker and easier than ever.

In the end, though, I think it comes down to engagement. If the post is engaging enough, people who read it will hopefully comment and/or write their own posts in reply. In order to make things understandable to a wide audience as well as bring up original ideas, the posts need to be of a certain length. One tenet of science blogging not mentioned above is “if the public can’t understand it, it’s not worth writing.”

Posts are not interesting to anyone else.

I think this is a real problem in science blogging. Not for all science blogs, but for many people who are working in fields that are slightly to awesomely esoteric. To be clear: I’m not laying this on the subject matter. Invertebrate paleontology is interesting to many people, worldwide. Even molluscan paleontology is interesting to modern malacologists, and by extrapolation I assume that there are others (especially graduate students) who want to know more about fossil freshwater mussels. These people are the easy ones to reel in if you can have a) original posts that are b) understandable and c) engaging. No, the issue with science blogging is that scientists (in the United States, at least) always need a “hook” to get the people who are interested in science in general, and an especially large hook to get the people who don’t care about science at all.
But* do I talk about fossil freshwater mussels? Not really. And do I have hooks on my posts to grab non-scientists? Not really. So whose fault is that? My own.

Nobody knows about the blog.

This is one of the key things about blogging. If you build it, maybe one or two will come . . . eventually. If you aren’t periodically plugging your blog to your personal and professional contacts, the only way people are going to find it is through a search engine. To get a coveted blogroll position on certain influential blogs requires hard work and more than a single decent post to your name. To get retweeted takes a hook that’s less than 140 characters. To get RSS subscribers seems to take even more engaging posts. So how does one get out of this catch-22? Promote, promote, promote, but not before you have good content and a solid publishing schedule you intend to keep.

Posts are not aligned with a single theme.

This is the specific problem here at Protichnoctem. I’ve been blogging for a long, long time, but on a variety of different subjects. Put that all together in one place and you won’t have an audience because the content is aimed at yourself: only you (and maybe your best friend, or your spouse if you guilt him or her into reading) is going to want to read everything you post, because you are the only person in the world interested in that collection of things. If you are relatively prolific, you probably have material for several blogs. In my case, I have posts relating to my personal life, my hobbies, general photography or video projects I wanted to show off, code snippets, things I thought would be helpful to other people someday, and, oh yeah, a little bit of original science stuff. Don’t do this–I’m in the process of splitting it all up.

Your theme doesn’t have to be so specific that you should feel bad about putting in things that make sense. SV-POW!, for example, doesn’t necessarily show a sauropod vertebra every post, but all the posts relate to the academic research areas of the authors. I’ll even admit that I read the blog primarily for the non-sauropod posts but I get to learn more about dinosaurs (and how dinosaur paleontologists think) with every back-on-topic post they do. All I’m saying is that your themes should be amenable to each other. Sure, Andy Farke could have written a blog called “The Open Source Paleontologist Who Brews His Own Beer,” but that would have alienated half the audience** with every new post. The more themes you have, the more fragmentation of audience you’ll get, which translates into less frequent interesting updates for every individual reader.

A note on personal blogs: personal blogs are great, but I’m forced to give the disclaimer that if you want to use your blogging as an example of outreach, you want to feel confident about the link you’re sending to the hiring or scholarship committee. If you don’t feel right exposing a potential funding source to photos of your kids interspersed with pictures of fossils, maybe you should split your themes. That being said, it’s okay (and encouraged) to be yourself, because a blog is not a research paper.


These are the ideas I’ve come up with, and I’ll be implementing some changes once I find some additional blogging time. In summary, these are the points that I think make up the good, worthwhile science blog I hope this site can be someday.

  • Posts with original ideas
  • Posts that are understandable to the public
  • Posts that are engaging to the audience
  • Commitment by the author
  • Promotion until traffic goals are met
  • A single theme


*I’ve struggled with the question of using things like “But,” “And,” and “So” to begin a sentence for years now, but I see more and more “real” writers doing it, so I’m going to use it for emphasis. So there.
**Although there may not be paleontologists who don’t brew or drink beer, I’m pretty sure there are brewers who aren’t paleontologists.
This blog post took 1:20 to write and edit and post.

Some translation help for the Brazil 1:1,000,000 geology map from CPRM

This is not complete, but hopefully helps out other English-but-not-Portuguese speakers like myself.

Here are translations (as best as I can figure them out) of the headers in the attribute table for the Brazil 1:1,000,000 geology map.
SIGLA_UNID = unit code
COD_UNI_ES = ?
SIGLAS_ANT = acronym something
NOME_UNIDA = unit name
HIERARQUIA = unit type (formation, lithofacies, unit, etc.)
IDADE_MAX = maximum age (years)
ERRO_MAX = assuming the plus/minus on IDADE_MAX
EON_IDADE_M = eon (Proterozoic, Phanerozoic, etc.)
ERA_MAXIMA = maximum possible era
PERIODO_MA = maximum possible period
EPOCA_MAX = maximum possible epoch
SISTEMA_GE = maybe what defines the type of unit it is? things like paleontology, structure, isotope, etc.
METODO_GEO = ?
QLDE_INFER = something about direct versus inferred measurements
IDADE_MIN = minimum age (years)
ERRO_MIN = assuming the plus/minus on IDADE_MAX (years)
EON_IDADE_1 = eon (not sure how this differs from EON_IDADE_M)
ERA_MINIMA = minimum possible era
PERIODO_MI = minimum possible period
EPOCA_MIN = minimum possible epoch
SISTEMA_1 = see SISTEMA_GE
METODO_G_1 = see METODO_GEO
QLDE_INF_1 = see QLDE_INFER
AMBSEDIMEN = first-order environment, so “continental,” “marine,” “transitional,” etc. values (“ambiente” is environment)
SISTSEDIME = second-order environment, so “deep,” “siliciclastic platform,” “fluvial,” etc. values
TIPO_DEPOS = third-order environment, so “pelagic,” “delta plain,” etc. values
ASSOC_MAGM = maybe magma source type?
NIVEL_CRUS = crustal level?
TEXTURA_IG = igneous texture
FONTE_MAGM = magma source
MORFOLOGIA = igneous morphology (like “batholith”)
AMBIENTE_T = tectonic environment
METAMORFIS = metamorphic type
METODO_G_2 = see METODO_G
TEMP_PICO = peak temperature
ERRO_TEMP_ = plus/minus on TEMP_PICO
PRESSAO_PL = pressure something
ERROR_PRESS = plus/minus on PRESSAO_PL
TIPO_BARIC = how something was determined
TRAJETORIA = how something was determined
AMBIENTE_1 = ?
LITOTIPO1 = lithotype
LITOTIPO2 = secondary lithotype
CLASSE_ROC = igneous, sedimentary, or metamorphic
CLASSE_R_1 = see CLASSE_ROC, with less values filled in
BB_SUBCLAS = subclass of CLASSE_ROC
BB_SUBCL_1 = subclass of CLASSE_R_1, I think
OBJECTID = the object ID
IDADE_MA_1 = see IDADE_MAX (all NULL)
ERRO_MAX_1 = see ERRO_MAX (all NULL)
EON_ID_MAX = maximum possible eon (EON_IDADE_M?)
PERIOD_MAX = see PERIODO_MA?
MET_ID_MAX = looks like method of determining maximum age
MET_DAT_MA = looks like the details of MET_ID_MAX, like what type of dating was used
QLD_ID_MAX = (quality?) whether the maximum age was determined with direct or indirect methods
IDADE_MI_1 = see IDADE_MIN
ERRO_MIN_1 = see ERRO_MIN
EON_ID_MIN = maximum possible eon (EON_IDADE_M?)
MET_ID_MIN = looks like method of determining minimum age
MET_DAT_MI = looks like details of MET_ID_MIN, like what type of dating was used
QLD_ID_MIN = (quality?) whether the minimum age was determined with direct or indirect methods
AMBSED = see AMBSEDIMEN
SISTSED = see SISTSEDIME
TIPO_DEP = see TIPO_DEPOS
TEXT_IGNEA = see TEXTURA_IG
AMB_TECTO = see AMBIENTE_T
METAMORF = see METAMORFIS
TRAJET_PT = see TRAJETORIA
CLASSE_RX1 = see CLASSE_ROC
CLASSE_RX2 = see CLASSE_RX1
SUBCLA_RX1 = see BB_SUBCLAS (of CLASSE_RX1)
SUBCLA_RX2 = see BB_SUBCL_1 (of CLASSE_RX2)
Shape_Leng = length of polygon (circumference?)
Shape_Area = area of polygon

Changes to First Avenue North

This one should be quick. I noticed on my way into school this morning that the signs that have, for so long, dedicated Thursdays to a street free of cars (“No Parking Thursday 8 AM – 4 PM”) seem to be gone along 1st Ave N. Additionally, on the EERC end of the street (between 22nd and 23rd street at least) there are new “No Parking This Side of Street” signs on the south (eastbound) side of the street. 

I was considering that maybe this move was a way to save money by not cleaning/plowing this street, but it doesn’t leave the city a time in which to sweep or plow if it is necessary. I’m thinking now that the removal of parking on one side of the street (if in fact it goes all the way to South Washington) might be a move to encourage traffic flow along 1st Ave in addition to 2nd Ave (which is the high-traffic and quite narrow next street over).

Anyway, what do I know? Hopefully someone can find an answer. If more traffic is going down 1st Avenue, we need stop signs on all the cross streets so people don’t die.



View Larger Map

yweather help

In case you want more out of your yweather script when you use it for Geektool, here are some other switches you can use:

I’ve figured out some of it, so I guess my only tip is to not try to use more than one argument at a time (except as shown below with the units example) and therefore to use a bunch of geeklets if you want to show a lot of information.

Also, I believe that ‘weather.png’ in the images folder is the current weather, so you should be able to show it using an image geeklet.

Yahoo Weather Information

Usage:
./yweather.pl -ct Displays current temperature

In case you want more out of your yweather script when you use it for Geektool, here are some other switches you can use:

I’ve figured out some of it, so I guess my only tip is to not try to use more than one argument at a time (except as shown below with the units example) and therefore to use a bunch of geeklets if you want to show a lot of information.

Also, I believe that ‘weather.png’ in the images folder is the current weather, so you should be able to show it using an image geeklet.

Yahoo Weather Information

Usage:
	./yweather.pl -ct	Displays current temperature

Arguments:
	-lc 			City
	-lr			Region
	-lt			Country
	-cc			Weather Code (used for images)
	-ct			Current Temperature
	-cw			Current Weather Description
	-cd			Current Date
	-ah			Current Humidity
	-av			Current Visibilty
	-ap			Current Barometric Pressure
	-ar			Change in Barometric Pressure
	-sr			Time of Sunrise
	-ss			Time of Sunset
	-wc			Current Wind Chill
	-wd			Current Wind Direction
	-ws			Current Wind Speed
	-ut			Temperature Unit
	-ud			Distance Unit
	-up			Pressure Unit
	-us			Speed Unit
	-fd1			Today's Day
	-fg1			Today's Date
	-fl1			Today's Low Temp
	-fh1			Today's High Temp
	-ft1			Today's Description
	-fc1			Today's Weather Code
	-fd2			Tomorrow's Day
	-fg2			Tomorrow's Date
	-fl2			Tomorrow's Low Temp
	-fh2			Tomorrow's High Temp
	-ft2			Tomorrow's Description
	-fc2			Tomorrow's Weather Code
	--copyimage		Copy Appropriate Image to Current Image (deprecated)
	--update		Update xml source file

All data is returned without units. To get data with units,
use a combination of commands.

Example: (Displays Current temperature with unit)
	./yweather.pl -ct && ./yweather.pl -ut

http://macenstein.com/default/2009/11/how-to-add-a-bunch-of-useless-stuff-to-your-desktop-with-geektool-yahoo-widgets-and-more/
http://code.google.com/p/yweather/wiki/howtousage

East Grand Forks to add parking–on the Greenway

The city of East Grand Forks, MN is divided on whether to add parking. “Add parking,” you say, “why wouldn’t you want to add parking?” In this case, as reported by the Grand Forks Herald, the additional parking lot would add 32 stalls on the river side of the Boardwalk, a popular downtown collection of restaurants and bars. The problem is that the river side of the existing parking lot (as shown in the image below) is the Greenway, a park/wilderness area created as a buffer from the Red River after the devastating 1997 flood.
 



View Larger Map 

Map centered on proposed parking lot site.

Obviously, many business owners and city council members are excited at the prospect of additional parking because it is (at face value) expected to bring even more people to downtown EGF. (I’ve personally experienced the lack of space on a busy night.)

The mayor said he supports the plan to add parking. He said it is the best thing for the city and he would vote for it with his tiebreaking vote if the city council was deadlocked. Mayor Stauss said his support for the plan is not influenced by his brother and son owning the Boardwalk building.

Some owners, however, are seeing the long-term consequences. 

Dave Homstad, the co-owner and general manager of the Blue Moose Bar & Grill, says he spent an extra $80,000 to $100,000 to build his exterior deck to a height high enough to provide a view of the river and the Greenway, and the new parking lot would block that view for his customers. “If they extend that parking lot out there it’s going to take away my view of the river,” he said

Additional parking is available across DeMers Avenue in the Cabela’s parking lot, in other lots and streets near downtown, and across the Red River in downtown Grand Forks, ND. Why another lot, and why here?

Wanted (and some found): Geologic maps (formation contacts) of South America countries, for use in GIS

Here’s a good ol’ plea for help from the scientific community. As my questions to “the scientific community” via Academia.edu have gone unnoticed, I’m posting this out here to see if anyone else searching for the same thing has had any luck. I’m building a GIS (geographic information system) model to determine the possible biogeographic districbution of a genus through time. What I need for this, since the fossils are from South America, is a good geologic map, either of most of the continent or of the countries of Brazil, Argentina, Ecuador, and Peru. Hunting around online hasn’t shown me anything that I want to shell out a bunch of money for, site unseen, and I’m honestly trying to avoid having to scan paper maps and register them (additionally, I haven’t been able to access our worldwide collection of paper geologic maps recently). 1:500,000 or 1:100,000 would be great, but I’d even take 1:1,000,000 at this point. I need something with formational contacts so I can plot possible distributions. So that’s the challenge of the day: have any geologists in South America discovered a good source for this type of material? Would you be willing to share or trade? Drop me an email at one of the addresses on the right sidebar if you can help.

Brazil

UPDATE 2011-08-15: I found (finally) an online version of the Geological Map of Brazil at 1:1,000,000 scale. Unfortunately it is only available in a viewer, broken up into small sections. Anyone know where I can download the full dataset, or at least that data for those sections? UPDATE 2011-08-15 1431: Thanks to Sidney Goveia over at Geosaber, I tracked down the Brazil data to Geobank, so here goes nothing! If you are using a Mac, make sure you extract the contents of the ZIP files using StuffIt Expander (if you have it) rather than Archive Utility, otherwise you don’t end up with a folder.

Peru

UPDATE 2011-08-16: You can download a formation-level geological map of Peru at the 1:1,000,000 scale from INGEMMET – the Instituto Geologico Minero Y Metalurgico. There are a few steps through the online viewer (note: this is the new viewer, so the following steps may not work the same), which may appear in Spanish at first but for some reason decided to reload partway through in English. If you want, you can try to figure out how to turn on the layer under the Map–>Geodatabase menu, but really when that menu comes up you want to click the Download/Descargas folder icon, then select the Geologia layer (SHP icon next to it). It also looks like you can download the data as a KML file for Google Earth. A structural layer (Dominios Estructurales) also looks available, and I checked out the radiometric date layer (Dataciones Radiometricas) as well, which could be useful. You might need to check the projection on the geologic map once you download it. I imported it to QGIS at first and it came up as WGS 84 (probably because of the existing project), but I dug around and figured out that it works as PSAD_1956_UTM_Zone_18S. Need a map key for formation symbols? You can download scans of the paper version of this map from here, one of which has the legend. If the correct map doesn’t come up at that link, click on “Ministerio de Energia y Minas. Instituto de Geologia y Mineria” so see the others. The symbology is different. An alternative large-scale (1:100,000) but low-quality set of maps is available here but will not easily go into GIS.