12 Apr 2013

Download File from Google Drive/Docs Programmatically with R

Following up on my latest post on how to download files from the cloud with R, here is a function for Google Docs documents:

dl_from_GoogleD <- function(output, key, format) {

  ## Arguments:
  ## output = output file name
  ## key    = Google document key
  ## format = output format (pdf, rtf, doc, txt, ...)
  ## Note: File must be shareable!

  library(RCurl)
  ## Build the export URL and fetch the document as raw bytes.
  ## ssl.verifypeer = FALSE skips SSL certificate verification.
  bin <- getBinaryURL(paste0("https://docs.google.com/document/d/", key,
                             "/export?format=", format),
                      ssl.verifypeer = FALSE)
  ## Write the bytes to disk in binary mode
  con <- file(output, open = "wb")
  writeBin(bin, con)
  close(con)
  message(output, " saved to ", getwd())
}


# Example:
dl_from_GoogleD(output = "dl_test.pdf", 
                key = "1DdauvkcVm5XtRBkQIv1na8PeLAwpCBdW8pALCFpRWeM",
                format = "pdf")
shell.exec("dl_test.pdf")  # opens the file with its default application (Windows only)
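
The key is the long identifier inside the document's sharing URL. As a minimal sketch, assuming the key is always the path segment that follows "/d/", a small helper can pull it out of a pasted URL. The name extract_gdoc_key and the "/edit" suffix on the example URL are assumptions for illustration, not part of the original function.

```r
# Hypothetical helper (not part of the original post): pull the document key
# out of a Google Docs/Drive sharing URL.
extract_gdoc_key <- function(url) {
  # Assumption: the key is the path segment that follows "/d/"
  m <- regmatches(url, regexec("/d/([A-Za-z0-9_-]+)", url))[[1]]
  if (length(m) < 2) stop("No document key found in: ", url)
  m[2]
}

key <- extract_gdoc_key(
  "https://docs.google.com/document/d/1DdauvkcVm5XtRBkQIv1na8PeLAwpCBdW8pALCFpRWeM/edit"
)

# The extracted key can then be passed on (not run here, needs network access):
# dl_from_GoogleD(output = "dl_test.pdf", key = key, format = "pdf")
```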
EDIT: Here's how it can be done for spreadsheet-like data, such as this example file, which is a comma-separated file with a .txt extension saved to Google Drive. See also this post.
library(RCurl)
setwd(tempdir())
destfile <- "test_google_docs.csv"
# Follow redirects to the actual file and fetch it as raw bytes
x <- getBinaryURL("https://docs.google.com/uc?export=download&id=0B2wAunwURQNsR0I0a0NlQUlJdzA",
                  followlocation = TRUE, ssl.verifypeer = FALSE)
writeBin(x, destfile)
shell.exec(file.path(tempdir(), destfile))  # Windows only
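
If the goal is to work with the data in R rather than open the file, the downloaded bytes can probably be parsed directly. This is only a sketch: the raw vector below is a made-up stand-in for what getBinaryURL() returns, since the content of the shared file is not shown here, and it assumes plain comma-separated text with a header row.

```r
# Stand-in for the raw vector returned by getBinaryURL(); this data is made up.
x <- charToRaw("id,value\n1,a\n2,b\n")

# Convert the bytes to text and parse them as comma-separated data
df <- read.csv(text = rawToChar(x), stringsAsFactors = FALSE)
str(df)
```

This skips the intermediate file on disk, which may be handy for reproducible examples.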

6 comments:

  1. Kay, thanks a lot for this code -- very useful for posting reproducible examples on SO! I've essentially copied and pasted what I think should be replaced in your code, but I'm getting two errors. I wonder if you could comment on this:

    Error in file(output, open = "wb") : cannot open the connection
    In addition: Warning message:
    In file(output, open = "wb") :
    cannot open file 'VAR RENAME TEST.xlsx': Permission denied

    Here's the full code I had:

    ## My test file is here: https://docs.google.com/file/d/0B1bNxpx4XS8dTG01Z3pvRjVYZHc/edit?usp=sharing
    dl_from_GoogleD <- function(output, key, format) {
      require(RCurl)
      bin <- getBinaryURL(paste0("https://docs.google.com/file/d/", key, "/edit?usp=", format),
                          ssl.verifypeer = FALSE)
      con <- file(output, open = "wb")
      writeBin(bin, con)
      close(con)
      message(noquote(paste(output, "read into", getwd())))
    }

    test <- dl_from_GoogleD(output = "VAR RENAME TEST.xlsx",
                            key = "0B1bNxpx4XS8dTG01Z3pvRjVYZHc",
                            format = ".xlsx")

    Sorry for the long question, and thanks in advance for any help you may have.

    Replies
    1. Generally, I think it is advisable to use simple file formats like CSV, which will keep you out of trouble with multiple worksheets, etc.

      See my EDIT above for how it could be done.
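
      As a rough sketch of applying the EDIT's approach to a file like the one in the question: the "/edit?usp=" address is the browser page rather than a download link, so the code below uses the "uc?export=download&id=" pattern instead. The "Permission denied" message is often a sign that the target file is open in another program or that the directory is not writable, so the sketch checks the directory first. The function names are hypothetical, and whether Google serves a given file this way is an assumption.

```r
# Hypothetical helper: build the direct-download URL pattern used in the EDIT
gdrive_download_url <- function(id) {
  paste0("https://docs.google.com/uc?export=download&id=", id)
}

# Hypothetical wrapper: download a shared Drive file to `output`
dl_from_GoogleDrive <- function(output, id) {
  # Fail early with a clearer message if the target directory is not writable
  if (file.access(dirname(output), mode = 2) != 0) {
    stop("Cannot write to directory: ", dirname(output))
  }
  bin <- RCurl::getBinaryURL(gdrive_download_url(id),
                             followlocation = TRUE, ssl.verifypeer = FALSE)
  writeBin(bin, output)
  invisible(output)
}

# Not run here (needs network access and a publicly shared file):
# dl_from_GoogleDrive(file.path(tempdir(), "VAR_RENAME_TEST.xlsx"),
#                     id = "0B1bNxpx4XS8dTG01Z3pvRjVYZHc")
```

      Writing to tempdir() sidesteps most permission problems in the working directory.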

  2. This comment has been removed by the author.

  3. How do we handle authentication when the file is in our own Gmail/Docs account and we want to download it as an XLSX?

    Three cookies seem to be needed: SID, HSID and SSID.
    However, I am not sure how to obtain these other than by pulling them out of my browser.
    It would be nice to be able to get them without using a browser, for a tool I am working on.

    Replies
    1. I don't know exactly how to do it, but I have seen it done somewhere. Please check first whether RGoogleDocs does what you want, and otherwise ask for help on Stack Overflow!
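
      This does not solve the problem of obtaining the cookies without a browser, but once the values are known they can be handed to RCurl through curl's cookie option. A minimal sketch, assuming the three cookie names from the question are sufficient (not verified here); make_cookie_string is a hypothetical helper and the values are placeholders.

```r
# Hypothetical helper: turn a named character vector of cookies into the
# "name=value; name=value" string that curl's `cookie` option expects.
make_cookie_string <- function(cookies) {
  paste(names(cookies), cookies, sep = "=", collapse = "; ")
}

# Placeholder values only; real ones would have to come from a logged-in session
cookies <- c(SID = "xxx", HSID = "yyy", SSID = "zzz")
cookie_string <- make_cookie_string(cookies)

# Not run here (needs network access, a real URL and valid cookies):
# bin <- RCurl::getBinaryURL(url, cookie = cookie_string, followlocation = TRUE)
```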
