13 Dec 2011

Some Fun with googleVis - Mapping Blog Visits on Google Map

See stand-alone code to produce this map below.


library(RJSONIO)
library(RCurl)
library(plyr)
library(googleVis)

# function to retrieve coords from ip-addresses
# (https://github.com/rtelmore/RDSTK/blob/master/src/RDSTK/R/functions.R)
ip2coordinates <- function(ip, session=getCurlHandle()) {
  api <- "http://www.datasciencetoolkit.org/ip2coordinates/"
  get.ips <- getURL(paste(api, URLencode(ip), sep=""), curl=session)
  result <- ldply(fromJSON(get.ips), data.frame)
  names(result)[1] <- "ip.address"
  return(result)
}

# read log-file:
setwd(tempdir())
download.file("http://docs.google.com/uc?export=download&id=0B2wAunwURQNsZjY4Njc1OTYtYWE3My00ZjNjLTg1YzEtNDNlNmYyYmNiYmI0",
              destfile = "google_docs.csv", mode = "wb")
log <- read.csv(file.path(tempdir(), "google_docs.csv"),
                header = TRUE, stringsAsFactors = FALSE)

# create dataframe to collect coords:
nr <- nrow(log)

Lon <- as.numeric(rep(NA, nr))
Lat <- Lon
Coords <- data.frame(Lon, Lat)

# some will not be found (I will dismiss these rows later):
for (i in seq_len(nr)) {
  try(
    Coords[i, 1:2] <- ip2coordinates(log$IP.Address[i])[c("longitude", "latitude")]
  )
}

# append to log-file:
log <- data.frame(log, Lat = Coords$Lat, Long = Coords$Lon,
                  LatLong = paste(round(Coords$Lat, 1), 
                                  round(Coords$Lon, 1),
                                  sep = ":"))

log_gmap <- log[!is.na(log$Lat), ]
gmap <- gvisMap(log_gmap, "LatLong",
                options = list(showTip = TRUE, enableScrollWheel = TRUE,
                               mapType = 'hybrid', useMapTypeControl = TRUE,
                               width = 550, height = 300))
plot(gmap)
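
The lookup loop above relies on try() to skip addresses that cannot be resolved. As a rough sketch of the same idea that runs offline, the example below swaps the web service for a made-up stand-in. The names fake_lookup, safe_coords and visits, as well as the addresses and coordinates, are assumptions for illustration and are not part of the original script. It shows failed lookups turning into NA rows, the "lat:long" string being built, and the NA rows being dropped before mapping.

```r
# Hypothetical stand-in for ip2coordinates(): returns coordinates for two
# made-up addresses and raises an error for anything else.
fake_lookup <- function(ip) {
  known <- data.frame(ip.address = c("192.0.2.1", "198.51.100.7"),
                      longitude  = c(11.4, -73.9),
                      latitude   = c(47.3, 40.7),
                      stringsAsFactors = FALSE)
  hit <- known[known$ip.address == ip, ]
  if (nrow(hit) == 0) stop("no coordinates found for ", ip)
  hit
}

# Wrap a single lookup so that any error yields NA coordinates.
safe_coords <- function(ip, lookup = fake_lookup) {
  tryCatch({
    res <- lookup(ip)
    c(Lon = res$longitude[1], Lat = res$latitude[1])
  }, error = function(e) c(Lon = NA_real_, Lat = NA_real_))
}

ips <- c("192.0.2.1", "not-an-ip", "198.51.100.7")
res <- lapply(ips, safe_coords)

# One row per address; failed lookups become NA rows.
visits <- data.frame(
  IP.Address = ips,
  Lon = vapply(res, function(x) x[["Lon"]], numeric(1)),
  Lat = vapply(res, function(x) x[["Lat"]], numeric(1)),
  stringsAsFactors = FALSE
)

# Location string in the "lat:long" format that the script passes to gvisMap().
visits$LatLong <- paste(round(visits$Lat, 1), round(visits$Lon, 1), sep = ":")

# Dismiss rows without coordinates before mapping.
visits_gmap <- visits[!is.na(visits$Lat), ]
print(visits_gmap)
```

Passing ip2coordinates as the lookup argument would presumably give the same behaviour against the real service, though that has not been checked here.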

5 comments:

  1. This is great. Do you have a script that converts raw access logs into the file format you have listed?

    Replies
    1. Regarding access to log data from R, see my post http://thebiobucket.blogspot.com/2011/12/blog-statistics-with-statcounter-and-r.html

    2. If so, then the answer for my service (StatCounter) is no, because the log data I use above is only available as a download.

      I haven't tried it, but I guess the process could be automated by retrieving the CSV download in R with something like:

      library(RCurl)
      x <- getBinaryURL("download_link_mylog.csv", ...)
      writeBin(x, ...)

      However, you will need a curl handle to access the site with your username:password.

      Cheers,
      Kay
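
The download idea in this reply could look roughly like the sketch below. It is untested against StatCounter and assumes that the site accepts a plain "username:password" pair on the curl handle, which may not be the case if login goes through a web form. The function name download_log, its arguments and the example URL are hypothetical; getCurlHandle() and getBinaryURL() are RCurl functions.

```r
# Sketch only: needs the RCurl package when called; the URL and the
# credentials are placeholders supplied by the caller.
download_log <- function(url, userpwd, destfile = "log.csv") {
  # Curl handle carrying the "username:password" pair (assumed auth scheme).
  handle <- RCurl::getCurlHandle(userpwd = userpwd, followlocation = TRUE)
  # Fetch the file as a raw vector and write it to disk unchanged.
  bytes <- RCurl::getBinaryURL(url, curl = handle)
  writeBin(bytes, destfile)
  invisible(destfile)
}

# Hypothetical usage (not run):
# log <- read.csv(download_log("https://example.com/mylog.csv", "user:password"),
#                 header = TRUE, stringsAsFactors = FALSE)
```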

  2. Excellent post!

    Is it possible to collect the ip address of the visitor in R?

    Replies
    1. You'll need a service such as Google Analytics or StatCounter to collect that data.
      In fact, I'm currently looking for a way to access the log files provided by these services via R, so it may be possible to automate the whole process. I'll report on that!
