Compose true color Landsat 7 images using R

If you are familiar with Landsat 7 (1999–2013), you know that the Scan Line Corrector instrument failed early in the mission. The end result is that a sizable portion (~22%) of each “scene” is missing. People interested in creating true color images without these data gaps might find themselves at the USGS page describing how to fill gaps for display.

Figure: Landsat 7 scene showing SLC-off data gaps

However, after reading the page you might think that the only gap-filling solutions involve either very expensive proprietary software (ERDAS, Photoshop) or paying for corrected imagery from a Landsat business partner. Not having access to these programs, I unsuccessfully attempted the Dust-and-Scratches technique with open-source imaging programs. After quite a bit of struggling I discovered an easy alternative, which I will now share.

First we need to acquire some Landsat data. I found this guide by Robert Simmon to be an excellent introduction to the topic. (On a side note, the landsat-util program looks excellent; it is too bad it only provides access to Landsat 8 data.)

In this case we have downloaded our compressed archive to the /home folder. Landsat 7 data archives have the following file/folder structure, and for the purposes of creating true color images we are only concerned with bands 1-3. One important thing to note is the presence of a gap_mask folder. The TIF files within the gap_mask archives contain the missing data from the top-level band files.

    LE70150432010.tar.gz
    │   LE70150432010_B1.TIF
    │   LE70150432010_B2.TIF
    │   LE70150432010_B3.TIF
    │
    └───gap_mask
        ├───LE70150432010_GM_B1.TIF.gz
        │   │   LE70150432010_GM_B1.TIF
        ├───LE70150432010_GM_B2.TIF.gz
        │   │   LE70150432010_GM_B2.TIF
        ├───LE70150432010_GM_B3.TIF.gz
        │   │   LE70150432010_GM_B3.TIF

The following R code chunk loops through each band/mask combination and uses the gdal_fillnodata.py script (which ships with GDAL's Python utilities) to replace missing data in each band with the corresponding data in the mask file. Next, the resulting rasters are cropped to a custom-defined extent. Finally, the bands are combined and plotted using the plotRGB function.

Before running this workflow on your own Landsat 7 data, ensure that you have GDAL installed and available to the base::system command. You can check this by running gdal-config --version from the command line. Also make sure that you have the rgdal and raster packages installed.

    untar("~/LE70260412007180EDC00.tar.gz", exdir = "landsat")
    
    tifs <- list.files("landsat", pattern = "\\.TIF$",
      include.dirs = TRUE, full.names = TRUE)[1:3]
    ctifs <- sub("\\.TIF$", "_f.TIF", tifs)
    masks <- list.files("landsat/gap_mask", include.dirs = TRUE, full.names = TRUE)[1:3]
    
    library(raster)
    library(rgdal)
            
    # fill gaps in each band with data from the corresponding gap mask
    for (i in seq_along(tifs)) {
      system(paste("gdal_fillnodata.py -mask", paste0("/vsigzip/", masks[i]),
        "-of GTiff", tifs[i], ctifs[i]))
    }
            
    rstack <- raster::stack(ctifs)
            
    #raster::drawExtent()
    extent <- raster::extent(625508, 673853, 3072252, 3096110)
    rstack <- raster::crop(rstack, extent)
              
    raster::plotRGB(rstack, r = 3, g = 2, b = 1)
    rect(extent[1], extent[3], extent[2], extent[4], lwd = 1.5)
    
    # file.remove(list.files("landsat", include.dirs = TRUE, full.names = TRUE,
    #   recursive = TRUE))

Figure: Gap-filled true color composite

Convert electrical conductivity to salinity

I recently authored my first R Shiny app. What follows is a reproduction of the intro tab of the app, which can be accessed “live” at: https://jsta.shinyapps.io/cond2sal_shiny. The source for this app is on GitHub, where you can report issues and post feature requests.

What is Salinity?

Salinity is a numeric measure of water saltiness. Standard oceanic seawater has a salinity of about 35. The salinity of natural waters can vary from 0 in freshwater lakes, rivers, and streams to upwards of 50 in hypersaline brine pools.

Calculation from Conductivity

Modern salinity measurements are generally calculated from direct measurement of electrical conductivity. These direct conductivity measurements are sometimes supplemented by other measurements, such as temperature and pressure, when high precision is desired. High precision salinity measurements are important in many oceanographic applications, such as density, mixing, and flow modelling.

This app calculates salinity from conductivity data using the PSS-78 practical salinity equation. In its current form, temperature and hydrostatic pressure are assumed to be 25 °C and 0 dbar, respectively.
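As a minimal sketch of the conversion, the wq::ec2pss function cited in the references below can be called directly. Conductivity is assumed here to be in mS/cm, and the readings are hypothetical:

```r
# Sketch: PSS-78 salinity from conductivity via wq::ec2pss.
# Conductivity is assumed to be in mS/cm; temperature in deg C,
# gauge pressure in dbar. The readings below are hypothetical.
library(wq)

cond <- c(10, 30, 53)              # hypothetical conductivity readings (mS/cm)
sal <- ec2pss(cond, t = 25, p = 0) # fixed 25 deg C and 0 dbar, as in the app
round(sal, 2)
```

Since standard seawater (salinity 35) has a conductivity of roughly 53 mS/cm at 25 °C, the last value should come out close to 35.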

Figure: Direct measurement of conductivity using a handheld meter
Image credit: http://ian.umces.edu/imagelibrary/displayimage-1996.html

Use the navigation links at https://jsta.shinyapps.io/cond2sal_shiny in order to convert your data.

References

IOC, SCOR and IAPSO, 2010: The international thermodynamic equation of seawater – 2010: Calculation and use of thermodynamic properties. Intergovernmental Oceanographic Commission, Manuals and Guides No. 56, UNESCO (English), 196 pp

Alan D. Jassby and James E. Cloern (2015). wq: Some tools for exploring water quality monitoring data. R package version 0.4.4. See the wq::ec2pss function.

GIS "in the cloud" with R

After running across this tutorial by Dean Attali, I attempted to set up my own R GIS platform “in the cloud”. Dean’s tutorial was excellent for the most part. I was able to spin up my own server with RStudio installed. The two difficulties that I ran into dealt with gdal and choosing a server size.

It was very difficult to track down the Ubuntu system-level dependencies for rgdal. The solution ultimately was to install the gdal-bin and libgdal-dev packages using apt-get.
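For reference, the install that eventually worked amounts to the following (on Ubuntu, with sudo access assumed):

```shell
# System-level GDAL libraries required to build the rgdal R package
sudo apt-get update
sudo apt-get install -y gdal-bin libgdal-dev

# With the GDAL headers in place, rgdal should then compile from CRAN
Rscript -e 'install.packages("rgdal")'
```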

I had to go through a drawn-out trial-and-error process to find the correct amount of RAM required for my processing. Luckily, it was not too difficult because Digital Ocean offers flexible server resizing, which let me expand the RAM essentially “on the fly”. Larger servers cost more, so flexible resizing is also nice because you can drop down to a smaller (cheaper) server when not running intensive computations.

Once your system level dependencies are set, you can use rsync to transfer datasets and R packages. A command like the following can be used to transfer the contents of your Data folder to your remote server.

    rsync -av /home/<user_name>/Data/ <user_name>@<ip_address>:~/Data

I encourage people to try this solution if you find yourself running long computations on your local machine. If you sign up at Digital Ocean using my referral link, you will receive $10 in starter money. This is essentially a free trial!

Update

I learned several more details that really made my workflow possible:

  1. Using the Rscript program rather than RStudio directly was necessary in order to prevent memory issues.
    • Combining Rscript with Makefiles allowed for processing compartmentalization.
    • Adding a - character before a Makefile command will enable the Makefile to proceed even if errors occur.
  2. The screen program will allow you to log out of your server while keeping your processing running.
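To illustrate these points, here is a minimal sketch of such a Makefile; the script and file names are hypothetical:

```makefile
# Each stage runs in its own Rscript process, so memory is
# released between stages. Script/file names are hypothetical.
all: model.rds

clean.csv: 01_clean.R raw.csv
	Rscript 01_clean.R

# The leading "-" tells make to continue even if this recipe fails.
model.rds: 02_model.R clean.csv
	-Rscript 02_model.R
```

Starting the job inside screen (e.g. screen -S analysis, then make; detach with Ctrl-a d and reattach later with screen -r analysis) keeps it running after you log out.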

Landscape resistance from spatial randomization

I was intrigued by the idea of running simulations to define landscape resistance as presented in Cushman et al. (2010). Their basic strategy is to use measured track data as a proxy for landscape resistance. They randomly shift and rotate their measured tracks in order to generate support for their parameterization of landscape resistance by comparing original and transformed tracks. The R code below takes an approximation of their example track and puts it through the same random shifting/rotating procedure.

Figure: Original track (black) with randomly shifted and rotated tracks

    library(raster)
    library(sp)
    library(maptools)
    library(grDevices)
    library(XML)
    
    # fetch an approximation of the example track from pastebin
    path.mat <- eval(parse(text = gsub("\r\n", "\n",
      xpathApply(htmlTreeParse("http://pastebin.com/HN1u8fpW", useInternal = TRUE),
        "//textarea", xmlValue))))
    names(path.mat) <- c("x", "y")
    path.mat <- path.mat / 10000
    
    path <- Lines(list(Line(path.mat)), ID = "p")
    pathSL <- SpatialLines(list(path))
    
    rot.list <- list()
    for (i in 1:199) {
      # randomly shift nodes between 0 and 30 km
      shiftx <- sample(seq(-3, 3, 0.01), 1)
      shifty <- sample(seq(-3, 3, 0.01), 1)
      npath.mat <- cbind(path.mat[, 1] + shiftx, path.mat[, 2] + shifty)
      npath <- Lines(list(Line(npath.mat)), ID = "np")
      npathSL <- SpatialLines(list(npath))
      
      # randomly rotate path about the center of the original track
      npath.rot <- elide(npathSL, rotate = sample(seq(0, 360, 0.01), 1),
        center = apply(bbox(pathSL), 1, mean))
      npath.rot.mat <- coordinates(npath.rot)[[1]][[1]]
      rot.list[[i]] <- Lines(list(Line(npath.rot.mat)), ID = paste("npr", i, sep = ""))
    }
    rot.list$original <- path
    all.lines <- SpatialLines(rot.list)
    
    # generate simulated resistance surface
    textent <- bbox(all.lines)
    ncols <- round(textent[1, 2] - textent[1, 1]) # columns span the x-range
    nrows <- round(textent[2, 2] - textent[2, 1]) # rows span the y-range
    rsurf <- raster(matrix(NA, nrow = nrows, ncol = ncols))
    extent(rsurf) <- textent
    rsurf[] <- runif(nrows * ncols)
    rsurf <- rsurf > 0.8
    
    plot(rsurf)
    plot(all.lines, col = c(rainbow(199), "black"))

Reference

Cushman, S. A., Chase, M., & Griffin, C. (2010). Mapping landscape resistance to identify corridors and barriers for elephant movement in southern Africa. In Spatial complexity, informatics, and wildlife conservation (pp. 349-367). Springer Japan.

Timescales for detecting SLR acceleration

Recent interest in a paper on sea level rise acceleration (Haigh et al. 2014) has led me to attempt to reproduce their analysis.

Below is R code that I wrote to examine one of the stations from the paper (Key West). In future posts I will expand on this code to check the “sliding” window portion of the paper.

Figure: Key West annual mean sea level with simulated projections to 2100

    ## obtain annual mean sea level for Key West from:
    # http://www.psmsl.org/data/obtaining/rlr.annual.data/188.rlrdata
    
    dt <- read.table("http://www.psmsl.org/data/obtaining/rlr.annual.data/188.rlrdata",
      sep = ";", header = FALSE, na.strings = "-99999")
    names(dt) <- c("date", "level")
    dt <- dt[3:97, ] # restrict date range to same as reference (1915-2009)
    dt[39, 2] <- (dt[38, 2] + dt[40, 2]) / 2 # fill missing value at 1953 by linear interpolation
    dt[, 2] <- dt[, 2] - min(dt[, 2]) # center series
    
    linearfit <- lm(dt$level ~ dt$date)
    dt$detrend <- dt$level - (linearfit$coefficients[2] * dt$date)
    ar1fit <- arima(dt$detrend, order = c(1, 0, 0)) # fit ar1
    lm(dt$level ~ poly(dt$date, 2, raw = TRUE))$coefficients[3] * 2 # check acceleration estimate from table 2
    
    # create AR5 mid projection of 0.5 m (500 mm) increase from 1990 to 2100
    x <- seq(from = 0, to = 109, by = 1)
    y <- x^2 * 8.25 + x * 169.25 + dt[95, 2] # smooth quadratic, not from IPCC
    a <- (500 - 76) / y[length(y)] # scaling factor
    y2 <- y[20:110] * a + 76
    level2 <- append(dt$level - dt[76, 2], y2) # down-shift series to match Haigh (level is 0 at 1990)
    level2 <- cbind(data.frame(seq(from = 1915, to = 2100, by = 1)), data.frame(level2))
    names(level2) <- c("year", "level")
    plot(level2$year, level2$level, type = "l", main = "Key West",
      ylab = "Level (mm)", xlab = "Year", lwd = 2)
    
    # simulate future oscillations from ar1fit
    for (i in 1:1000) {
      ar.sim <- arima.sim(list(ar = c(as.numeric(ar1fit$coef[1]))), n = length(y2),
        sd = 23) # sd value from Haigh et al.; unable to reproduce
      lines(seq(from = 2010, to = 2100, by = 1), y2 + ar.sim, col = "grey")
    }
    lines(seq(from = 2010, to = 2100, by = 1), y2 + ar.sim, lwd = 2)

Reference

I. D. Haigh, T. Wahl, E. J. Rohling, R. M. Price, C. B. Pattiaratchi, F. M. Calafat, S. Dangendorf, “Timescales for detecting a significant acceleration in sea level rise”, Nature Communications 5, Article number: 3635, http://dx.doi.org/10.1038/ncomms4635, published 14 April 2014.