Mapping in R, Part I

I’m consistently amazed at the capabilities of R: if it can be done, it can be done in R. And so is the case with mapping. Recently, I had the need to do some complicated geospatial analysis, and I wanted to do it in R for the obvious reasons: it’s free, it’s open-source, and there is a great support community. As it turns out, R has much of the functionality of ArcGIS, albeit with a lot less flash and a lot more hair-pulling. But once you’ve done the legwork to get the data in and formatted, and the functions set, it’s a snap to run through all sorts of data.

Better to start simple, though. Lately, I’ve been asked to create a lot of maps showing the location of different experimental sites, partners, etc. It turns out that R has a lot of great maps built-in, if you download the “maps” package. If your coordinates are in decimals (as opposed to degrees/minutes/seconds), it’s actually quite simple to plot them on various levels (counties, states, or countries).

The following script uses LTER sites as an example, plotting those that fall within the contiguous US and shading the states that contain one or more sites. This script helps with a common problem I have, which is determining whether a point falls within a region. Like all code I provide, it’s completely self-contained!

library(maps) #Calls: map_data
library(maptools) #Calls: map2SpatialPolygons
library(ggplot2) #Calls: ggplot
library(plyr) #Calls: ddply
library(sp) #Calls: over

# The goal is to map the LTER sites in the US, and shade the states in which they 
# occur

# Read in coordinates of LTER sites (From: http://www.lternet.edu/sites/coordinates)
sites=read.delim(textConnection(" 	0 	0
  Andrews Forest LTER (AND) 	44.21 	-122.26
  Arctic LTER (ARC) 	66.63 	-149.60
  Baltimore Ecosystem Study (BES) 	39.10 	-76.30
  Bonanza Creek LTER (BNZ) 	64.86 	-147.85
  California Current Ecosystem LTER (CCE) 	32.87 	-120.28
  Cedar Creek LTER (CDR) 	45.40 	-93.20
  Central Arizona - Phoenix LTER (CAP) 	33.43 	-111.93
  Coweeta LTER (CWT) 	35.00 	-83.50
  Florida Coastal Everglades LTER (FCE) 	25.47 	-80.85
  Georgia Coastal Ecosystems LTER (GCE) 	31.43 	-81.37
  Harvard Forest LTER (HFR) 	42.53 	-72.19
  Hubbard Brook LTER (HBR) 	43.94 	-71.75
  Jornada Basin LTER (JRN) 	32.62 	-106.74
  Kellogg Biological Station LTER (KBS) 	42.40 	-85.40
  Konza Prairie LTER (KNZ) 	39.09 	-96.58
  LTER Network Office (LNO) 	35.08 	-106.62
  Luquillo LTER (LUQ) 	18.30 	-65.80
  McMurdo Dry Valleys LTER (MCM) 	-77.00 	162.52
  Moorea Coral Reef LTER (MCR) 	-17.49 	-149.83
  Niwot Ridge LTER (NWT) 	39.99 	-105.38
  North Temperate Lakes LTER (NTL) 	46.01 	-89.67
  Palmer Antarctica LTER (PAL) 	-64.77 	-64.05
  Plum Island Ecosystems LTER (PIE) 	42.76 	-70.89
  Santa Barbara Coastal LTER (SBC) 	34.41 	-119.84
  Sevilleta LTER (SEV) 	34.35 	-106.88
  Shortgrass Steppe (SGS) 	40.83 	-104.72
  Virginia Coast Reserve LTER (VCR) 	37.28 	-75.91 "),header=F,strip.white=T) 
sites=droplevels(sites[-1,]); names(sites)=c("sites","lat","long")

# Load US map
states=map("state",fill=T,plot=F)

# In order to understand it a site falls within a state, we need to convert both 
# to spatial objects

# Convert US map to SpatialPolygon 
states_sp=map2SpatialPolygons(states,
   IDs=sapply(strsplit(states$names, ":"),function(x) x[1]),
         proj4string=CRS("+proj=longlat +datum=wgs84"))
# Convert sites to SpatialPoints object
sites_sp=SpatialPoints(sites[,c(3,2)],proj4string=CRS("+proj=longlat +datum=wgs84"))
# Use function `over` to obtain a list of SpatialPolygons that contain the sites
index=over(sites_sp,states_sp)
# Return the state names of the SpatialPolygons containing each site
states_names=sapply(states_sp@polygons, function(x) x@ID)[index]
# Note that not all sites occur in the continental US (e.g., Arctic, Alaska, etc.)
# so they occur as NAs

# Create a dataframe for plotting from the SpatialPoints object using `fortify`
states.fort=fortify(states_sp)
# And bind a column "fill" denoting whether the site is present 
states.fort$fill=ifelse(states.fort$id %in% states_names,"true","false")

# Use ggplot to plot the dataframe as a polygon
ggplot()+
# Fill the state polygons where fill=="true" and set state borders to "black"
  geom_polygon(data=states.fort,aes(x=long,y=lat,group=group,fill=fill),col="black")+
# Set the fill colors
  scale_fill_manual(values=c("white","cornflowerblue"))+
# Plot the individual sites as red dots
  geom_point(data=sites,aes(x=long,y=lat),col="red",size=4,alpha=0.9)+
# Set labels
  labs(x="",y="",title="US LTER Sites")+
# Zoom in on the contiguous US, otherwise points are plotted elsewhere
  coord_cartesian(ylim=c(24,50),xlim=c(-65,-130))+
# Set base theme and remove legend, and fill grid with grey background
  theme_bw(base_size=18)+theme(legend.position="none",
  panel.background=element_rect(fill="grey90"))

Created by Pretty R at inside-R.org

3 thoughts on “Mapping in R, Part I”

hwhudson3 says:
February 13, 2013 at 1:54 PM


Hi Jon, just curious, what happened to California? It does not appear to be shaded. Thanks for providing the code!
- jslefche says:
  February 13, 2013 at 2:12 PM
  
  
  Ah, good catch Wes! In this case, both CA sites are coastal LTERS (California Current and Santa Barbara), which means they don’t occur within the spatial polygon. In this case you would have to manually add “california” to the `states_names` object. R does what you tell it to do, which is not always what you want it to do!
Pingback: Mapping in R, Part II | sample(ECOLOGY)

sample(ECOLOGY)

Random thoughts on ecology, biodiversity, and science in general

Mapping in R, Part I

3 thoughts on “Mapping in R, Part I”

Leave a comment Cancel reply

Share this:

Related

3 thoughts on “Mapping in R, Part I”

Leave a comment Cancel reply