I’m consistently amazed at the capabilities of R: if it can be done, it can be done in R. And so is the case with mapping. Recently, I had the need to do some complicated geospatial analysis, and I wanted to do it in R for the obvious reasons: it’s free, it’s open-source, and there is a great support community. As it turns out, R has much of the functionality of ArcGIS, albeit with a lot less flash and a lot more hair-pulling. But once you’ve done the legwork to get the data in and formatted, and the functions set, it’s a snap to run through all sorts of data.
Better to start simple, though. Lately, I’ve been asked to create a lot of maps showing the location of different experimental sites, partners, etc. It turns out that R has a lot of great maps built-in, if you download the “maps” package. If your coordinates are in decimals (as opposed to degrees/minutes/seconds), it’s actually quite simple to plot them on various levels (counties, states, or countries).
The following script uses LTER sites as an example, plotting those that fall within the contiguous US and shading the states that contain one or more sites. This script helps with a common problem I have, which is determining whether a point falls within a region. Like all code I provide, it’s completely self-contained!
library(maps) #Calls: map_data library(maptools) #Calls: map2SpatialPolygons library(ggplot2) #Calls: ggplot library(plyr) #Calls: ddply library(sp) #Calls: over # The goal is to map the LTER sites in the US, and shade the states in which they # occur # Read in coordinates of LTER sites (From: http://www.lternet.edu/sites/coordinates) sites=read.delim(textConnection(" 0 0 Andrews Forest LTER (AND) 44.21 -122.26 Arctic LTER (ARC) 66.63 -149.60 Baltimore Ecosystem Study (BES) 39.10 -76.30 Bonanza Creek LTER (BNZ) 64.86 -147.85 California Current Ecosystem LTER (CCE) 32.87 -120.28 Cedar Creek LTER (CDR) 45.40 -93.20 Central Arizona - Phoenix LTER (CAP) 33.43 -111.93 Coweeta LTER (CWT) 35.00 -83.50 Florida Coastal Everglades LTER (FCE) 25.47 -80.85 Georgia Coastal Ecosystems LTER (GCE) 31.43 -81.37 Harvard Forest LTER (HFR) 42.53 -72.19 Hubbard Brook LTER (HBR) 43.94 -71.75 Jornada Basin LTER (JRN) 32.62 -106.74 Kellogg Biological Station LTER (KBS) 42.40 -85.40 Konza Prairie LTER (KNZ) 39.09 -96.58 LTER Network Office (LNO) 35.08 -106.62 Luquillo LTER (LUQ) 18.30 -65.80 McMurdo Dry Valleys LTER (MCM) -77.00 162.52 Moorea Coral Reef LTER (MCR) -17.49 -149.83 Niwot Ridge LTER (NWT) 39.99 -105.38 North Temperate Lakes LTER (NTL) 46.01 -89.67 Palmer Antarctica LTER (PAL) -64.77 -64.05 Plum Island Ecosystems LTER (PIE) 42.76 -70.89 Santa Barbara Coastal LTER (SBC) 34.41 -119.84 Sevilleta LTER (SEV) 34.35 -106.88 Shortgrass Steppe (SGS) 40.83 -104.72 Virginia Coast Reserve LTER (VCR) 37.28 -75.91 "),header=F,strip.white=T) sites=droplevels(sites[-1,]); names(sites)=c("sites","lat","long") # Load US map states=map("state",fill=T,plot=F) # In order to understand it a site falls within a state, we need to convert both # to spatial objects # Convert US map to SpatialPolygon states_sp=map2SpatialPolygons(states, IDs=sapply(strsplit(states$names, ":"),function(x) x[1]), proj4string=CRS("+proj=longlat +datum=wgs84")) # Convert sites to SpatialPoints object sites_sp=SpatialPoints(sites[,c(3,2)],proj4string=CRS("+proj=longlat +datum=wgs84")) # Use function `over` to obtain a list of SpatialPolygons that contain the sites index=over(sites_sp,states_sp) # Return the state names of the SpatialPolygons containing each site states_names=sapply(states_sp@polygons, function(x) x@ID)[index] # Note that not all sites occur in the continental US (e.g., Arctic, Alaska, etc.) # so they occur as NAs # Create a dataframe for plotting from the SpatialPoints object using `fortify` states.fort=fortify(states_sp) # And bind a column "fill" denoting whether the site is present states.fort$fill=ifelse(states.fort$id %in% states_names,"true","false") # Use ggplot to plot the dataframe as a polygon ggplot()+ # Fill the state polygons where fill=="true" and set state borders to "black" geom_polygon(data=states.fort,aes(x=long,y=lat,group=group,fill=fill),col="black")+ # Set the fill colors scale_fill_manual(values=c("white","cornflowerblue"))+ # Plot the individual sites as red dots geom_point(data=sites,aes(x=long,y=lat),col="red",size=4,alpha=0.9)+ # Set labels labs(x="",y="",title="US LTER Sites")+ # Zoom in on the contiguous US, otherwise points are plotted elsewhere coord_cartesian(ylim=c(24,50),xlim=c(-65,-130))+ # Set base theme and remove legend, and fill grid with grey background theme_bw(base_size=18)+theme(legend.position="none", panel.background=element_rect(fill="grey90"))
Hi Jon, just curious, what happened to California? It does not appear to be shaded. Thanks for providing the code!
Ah, good catch Wes! In this case, both CA sites are coastal LTERS (California Current and Santa Barbara), which means they don’t occur within the spatial polygon. In this case you would have to manually add “california” to the `states_names` object. R does what you tell it to do, which is not always what you want it to do!