library(XML)

path_to_files <- "D:/GIS_DataBase/CorineLC/Seamless"
# create the target directory (including parent directories) if it does not exist yet
dir.create(path_to_files, recursive = TRUE, showWarnings = FALSE)
setwd(path_to_files)

# parse the EEA download page and collect the links to all zip files
doc <- htmlParse("http://www.eea.europa.eu/data-and-maps/data/clc-2006-vector-data-version-2")
urls <- xpathSApply(doc, '//*/a[contains(@href,".zip/at_download/file")]/@href')

# function to get zip file names: the URL path component that ends in ".zip"
get_zip_name <- function(x) {
  parts <- unlist(strsplit(x, "/"))
  parts[grep("\\.zip$", parts)]
}

# function to plug into sapply; try() keeps the loop going if one download fails
dl_urls <- function(x) try(download.file(x, get_zip_name(x), mode = "wb"))

# download all zip files
sapply(urls, dl_urls)

# function for unzipping
try_unzip <- function(x) try(unzip(x))

# unzip all files in dir; uncomment the last line to delete the archives afterwards
sapply(list.files(pattern = "\\.zip$"), try_unzip)
# unlink(list.files(pattern = "\\.zip$"))
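With 43 archives, a dropped connection halfway through means starting over. Below is a rough sketch of a download loop that skips files already on disk and reports which ones failed. The function `download_missing` and its arguments are hypothetical names of my own, not part of the script above, and the `downloader` argument only exists so the loop can be tried out without network access.

```r
# Sketch only: `download_missing`, `dest_names` and `downloader` are
# illustrative names, not taken from the original script.
download_missing <- function(urls, dest_names,
                             downloader = utils::download.file) {
  stopifnot(length(urls) == length(dest_names))
  status <- character(length(urls))
  for (i in seq_along(urls)) {
    # skip archives that were already fetched in an earlier run
    if (file.exists(dest_names[i])) {
      status[i] <- "skipped"
      next
    }
    res <- try(downloader(urls[i], dest_names[i], mode = "wb"), silent = TRUE)
    # download.file() returns 0 on success and non-zero on failure
    ok <- !inherits(res, "try-error") && identical(as.integer(res), 0L)
    # remove partial files so that the next run retries them
    if (!ok && file.exists(dest_names[i])) unlink(dest_names[i])
    status[i] <- if (ok) "ok" else "failed"
  }
  names(status) <- dest_names
  status
}
```

Assuming `urls` and `get_zip_name` from the script are available, it could be called as `status <- download_missing(urls, sapply(urls, get_zip_name))`, and `status[status == "failed"]` would then list the archives worth retrying.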
21 Apr 2013
Programmatically Download CORINE Land Cover Seamless Vector Data with R
Thanks to a helpful Stack Overflow answer, I was able to download all CLC vector data (43 zip files) programmatically with the script above.
Thank you, this is really helpful!
Although this works fine, I have problems when I want to use the newest CLC version. I replaced the link with "http://www.eea.europa.eu/data-and-maps/data/clc-2006-vector-data-version-3", but now it does not find any URLs.
Karin
Hi there,
use:
doc <- htmlParse("http://www.eea.europa.eu/data-and-maps/data/clc-2006-vector-data-version-3")
urls <- xpathSApply(doc,'//*/a[contains(@href,".zip")]/@href')
Cheers,
Kay
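To illustrate the difference between the two XPath filters discussed in this thread, here is a small sketch on a made-up HTML fragment. The hrefs are invented for illustration and are not taken from the EEA page, so this only shows how `contains()` behaves. It does not confirm what the version-3 page actually looks like.

```r
library(XML)

# Made-up HTML for illustration only; the hrefs are hypothetical
html <- '<html><body>
  <a href="/data/a.zip/at_download/file">old-style link</a>
  <a href="/data/b.zip">link without the at_download suffix</a>
  <a href="/about">not a download</a>
</body></html>'
doc <- htmlParse(html, asText = TRUE)

# strict filter from the post: the href must contain ".zip/at_download/file"
strict <- xpathSApply(doc, '//a[contains(@href, ".zip/at_download/file")]/@href')

# broader filter from the reply: any href containing ".zip"
broad <- xpathSApply(doc, '//a[contains(@href, ".zip")]/@href')

length(strict)  # number of links matched by the strict filter
length(broad)   # number of links matched by the broad filter
```

If the strict filter returns nothing on a real page, printing `broad` is a quick way to see what the download links look like before adapting `get_zip_name`.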