17 Mar 2014

Download all Documents from Google Drive with R

A commentator on my blog recently asked if it is possible to retrieve all direct links to your Google Documents. And indeed it can be very easily done with R, just like so:

# you'll need RGoogleDocs (with RCurl dependency..)
install.packages("RGoogleDocs", repos = "http://www.omegahat.org/R", type="source")

gpasswd = "mysecretpassword"
auth = getGoogleAuth("kay.cichini@gmail.com", gpasswd)
con = getGoogleDocsConnection(auth)

CAINFO = paste(system.file(package="RCurl"), "/CurlSSL/ca-bundle.crt", sep = "")
docs <- getDocs(con, cainfo = CAINFO)

# get file references
hrefs <- lapply(docs, function(x) return(x@access["href"]))
keys <- sub(".*/full/.*%3A(.*)", "\\1", hrefs)
types <- sub(".*/full/(.*)%3A.*", "\\1", hrefs)

# make urls (for url-scheme see: http://techathlon.com/download-shared-files-google-drive/)
# put format parameter for other output formats!
pdf_urls <- paste0("https://docs.google.com/uc?export=download&id=", keys)
doc_urls <- paste0("https://docs.google.com/document/d/", keys, "/export?format=", "txt")

# download documents with your browser
gdoc_ids <- grep("document", types)
lapply(gdoc_ids, function(x) shell.exec(doc_urls[x]))

pdf_ids <- grep("pdf", types, ignore.case = T)
lapply(pdf_ids, function(x) shell.exec(pdf_urls[x]))

3 Mar 2014

Use Case: Make Contour Lines for Google Earth with Spatial R

Here's comes a script I wrote for creating contour lines in KML-format to be used with Google Earth https://github.com/gimoya/theBioBucket-Archives/blob/master/R/r_contours_for_google_earth.R

If you want to check or just use the datasets I created for the Alps region, you can download it here: http://terrain-overlays.blogspot.co.at/index.html

1 Mar 2014

Use GDAL from R Console to Split Raster into Tiles

When working with raster datasets I often encounter performance issues caused by the large filesizes. I thus wrote up a little R function that invokes gdal_translate which would split the raster into parts which makes subsequent processing more CPU friendly. I didn't use built-in R functions simply because performance is much better when using gdal from the command line..

The screenshot to the left shows a raster in QGIS that was split into four parts with the below script.

## get filesnames (assuming the datasets were downloaded already. 
## please see http://thebiobucket.blogspot.co.at/2013/06/use-r-to-bulk-download-digital.html 
## on how to download high-resolution DEMs)
files <- dir(pattern = ".hgt")

## function for single file processing mind to replace the PATH to gdalinfo.exe!
## s = division applied to each side of raster, i.e. s = 2 gives 4 tiles, 3 gives 9, etc.
split_raster <- function(file, s = 2) {
    filename <- gsub(".hgt", "", file)
    gdalinfo_str <- paste0("\"C:/OSGeo4W64/bin/gdalinfo.exe\" ", file)
    # pick size of each side
    x <- as.numeric(gsub("[^0-9]", "", unlist(strsplit(system(gdalinfo_str, intern = T)[3], ", "))))[1]
    y <- as.numeric(gsub("[^0-9]", "", unlist(strsplit(system(gdalinfo_str, intern = T)[3], ", "))))[2]
    # t is nr. of iterations per side
    t <- s - 1
    for (i in 0:t) {
        for (j in 0:t) {
            # [-srcwin xoff yoff xsize ysize] src_dataset dst_dataset
            srcwin_str <- paste("-srcwin ", i * x/s, j * y/s, x/s, y/s)
            gdal_str <- paste0("\"C:/OSGeo4W64/bin/gdal_translate.exe\" ", srcwin_str, " ", "\"", file, "\" ", "\"", filename, "_", i, "_", j, ".tif\"")

## process all files and save to same directory
mapply(split_raster, files, 2)