Analysing Beat The Street data using R

Beat The Street is a game currently running in East Sussex with the aim of increasing the amount of walking and cycling people do. Players are given a free Beat The Street card that they tap on Beat Boxes to record their journeys. At least 2 different Beat Boxes must be tapped in the space of an hour for points to be recorded. Each 2nd box tapped within an hour of the last earns at least 10 points.
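To make the scoring rule concrete, here is a rough sketch of how points could be estimated from a list of taps. It reflects only my reading of the rule above, not the game's real logic: a tap scores when the previous tap was at a different box no more than an hour earlier. The function name estimate_points and the sample taps are made up for illustration.

```r
# Hypothetical sketch of the scoring rule (an assumption, not official logic):
# a tap scores if the previous tap was at a different box within the last hour.
estimate_points <- function(times, boxes, points_per_hit = 10) {
  stopifnot(length(times) == length(boxes))
  if (length(times) < 2) return(0)
  # Work through the taps in chronological order
  ord <- order(times)
  times <- times[ord]
  boxes <- boxes[ord]
  # Minutes between each tap and the one before it
  gap_mins <- as.numeric(difftime(times[-1], times[-length(times)], units = "mins"))
  # TRUE where a tap was at a different box from the previous tap
  different <- boxes[-1] != boxes[-length(boxes)]
  sum(gap_mins <= 60 & different) * points_per_hit
}

# Made-up taps: the second is 20 minutes after the first, the third 100 minutes later
taps <- as.POSIXct(c("2017-05-01 08:00:00", "2017-05-01 08:20:00", "2017-05-01 10:00:00"), tz = "UTC")
estimate_points(taps, c("Box A", "Box B", "Box C"))  # 10
```

Since each qualifying tap earns at least 10 points, the result is best treated as a lower bound.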

While there isn’t an officially documented API to the Beat The Street data, the website itself is built using JSON feeds we can use to extract data ourselves. There can be multiple schemes running at once. I’m playing in East Sussex, which is scheme 47; we’ll need this ID shortly.

First we’ll need to log in, as this sets a cookie that identifies us for subsequent calls.

library(httr)  # provides POST(), GET(), content() and verbose()
loginurl <- ""
loginbody <- list(Username = "Your Username", Password = "Your Password", RememberMe = "false")
r <- POST(loginurl, body = loginbody, encode = "form", verbose())

We can use R's httr library to handle the login for us. We POST our details to the form. I'm using the verbose() option for debugging, but that's optional.
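Before going further it can be worth checking that the login request actually worked. The sketch below is only a suggestion: login_succeeded is a hypothetical helper of my own, and it assumes a successful login sets at least one cookie. A site may well return a normal page even when the password is wrong, so treat it as a rough sanity check.

```r
# Hypothetical helper (not part of httr): rough check on the login response "r".
login_succeeded <- function(r) {
  # Raises an R error if the server answered with an HTTP 4xx or 5xx status
  httr::stop_for_status(r)
  # cookies() returns a data frame of the cookies attached to the response;
  # assumption: a successful login sets at least one
  nrow(httr::cookies(r)) > 0
}
```

After the POST we could call login_succeeded(r). httr keeps cookies between requests to the same host, which is why the later calls don't need to pass the cookie explicitly.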

Now we are logged in, we should have a cookie set identifying us, so we can request a record of our journeys so far.

swipesurl <- ""
r <- GET(swipesurl, accept_json())
cont <- content(r, as = "parsed", type="application/json")

The returned JSON data is an array of objects, which content() parses into a list of R lists, one per swipe.

The values are all strings apart from "Points" which is an integer. However, sometimes Beat The Street returns a null value, which we'll want to convert to an R NA. We can do this by applying a small utility function over the data to convert any nulls to NA.

nullToNA <- function(x) {
    # Replace any NULL elements of the record with NA
    x[sapply(x, is.null)] <- NA
    return(x)
}
cont <- lapply(cont, nullToNA)
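To see what the utility does, here it is applied to a single made-up record. The field names come from the text above, but the values are invented. A NULL element has length zero, so it can't fill a cell in a one-row data.frame, whereas NA can.

```r
# Same helper as above, repeated so this example runs on its own
nullToNA <- function(x) {
  x[sapply(x, is.null)] <- NA
  x
}

# Made-up record with a null "Points" value
record <- list(Name = "Example Box", Points = NULL, SwipeDatetime = "2017-05-01T08:15:30")
cleaned <- nullToNA(record)

length(record$Points)   # 0: the NULL element is empty
is.na(cleaned$Points)   # TRUE: it is now a proper missing value
```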

Now we can convert our data to a data.frame, convert "SwipeDatetime" to a date-time (POSIXct), and add an extra field called "Date" that's just the date without the time.

library(plyr)  # provides ldply()
swipes <- ldply(cont, data.frame)
swipes$SwipeDatetime <- as.POSIXct(swipes$SwipeDatetime, format = "%Y-%m-%dT%H:%M:%S")
swipes$Date <- as.Date(swipes$SwipeDatetime)
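If you want to try the conversion without logging in, the same steps can be sketched in base R on a couple of made-up records. The values are invented, and I'm using do.call(rbind, ...) instead of ldply() here simply so the example needs no extra packages. I've also fixed the time zone to UTC to keep the result predictable; the real feed's time zone is an assumption you'd need to check.

```r
# Made-up records using the field names mentioned in the text
cont <- list(
  list(Name = "Example Box A", Points = 10L, SwipeDatetime = "2017-05-01T08:15:30"),
  list(Name = "Example Box B", Points = NA,  SwipeDatetime = "2017-05-02T17:40:05")
)

# One single-row data.frame per record, stacked into one data.frame
swipes <- do.call(rbind, lapply(cont, function(rec) as.data.frame(rec, stringsAsFactors = FALSE)))

# Parse the ISO-style timestamp, then drop the time part
swipes$SwipeDatetime <- as.POSIXct(swipes$SwipeDatetime, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
swipes$Date <- as.Date(swipes$SwipeDatetime, tz = "UTC")

format(swipes$Date)  # "2017-05-01" "2017-05-02"
```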

Now we have our data in a usable format, we can extract some useful information.

I can plot a graph of how many Beat Boxes I've tapped each day.

library(ggplot2)  # provides ggplot(), geom_bar() and the other plotting functions
ggplot(swipes, aes(Date)) + geom_bar(stat="count") + ggtitle("Beat Boxes Visited Per Day") + ylab("Visits") + xlab("Date") + theme(axis.text.x = element_text(angle = 90, hjust = 1))

Number of Beat Boxes visited by day

Or I can plot which Beat Boxes I visit most.

swipesFrame <- data.frame(sort(table(swipes$Name)))
ggplot(data=swipesFrame, aes(x=Var1, y=Freq)) + geom_bar(stat="identity") + ggtitle("Most Visited Beat Boxes") + ylab("Visits") + xlab("Beat Box") + theme(axis.text.x = element_text(angle = 90, hjust = 1))

Most visited Beat Boxes
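The "Var1" and "Freq" columns used in the plot are the default names R gives when a table of counts is turned into a data.frame. Here is a small sketch on made-up box names showing the same counts as plain numbers, which is handy for checking the plot. It sorts with order() rather than sort(), which is just an alternative way of getting the same ranking.

```r
# Made-up swipes: only the "Name" column matters here
swipes <- data.frame(
  Name = c("Box A", "Box B", "Box A", "Box C", "Box A", "Box B"),
  stringsAsFactors = FALSE
)

# table() counts visits per box; as.data.frame() names the columns Var1 and Freq
counts <- as.data.frame(table(swipes$Name), stringsAsFactors = FALSE)

# Least visited first, most visited last
counts <- counts[order(counts$Freq), ]
counts$Var1  # "Box C" "Box B" "Box A"
```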

I can even plot this onto a map, assuming I already have the latitudes and longitudes of the Beat Boxes in a data.frame called "beatboxes", and the map in "map". The more often I've visited a Beat Box, the larger its point on the map.

library(ggmap)  # provides ggmap()
swipeFrameLatLng <- merge(swipesFrame, beatboxes, by.x = "Var1", by.y="id")
# Colour is set outside aes() so the points are drawn red; size is mapped to the visit count
ggmap(map) + geom_point(data = swipeFrameLatLng, aes(x = longitude, y = latitude, size = Freq), color = "red") + ylab("") + xlab("") + guides(size = "none")

Most visited Beat Boxes map
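One thing to watch with merge() is that by default it silently drops rows that have no match in the other data.frame. The sketch below uses made-up box names and coordinates (the column names match the ones assumed above) to show how to spot Beat Boxes that would go missing from the map.

```r
# Made-up visit counts and box locations; coordinates are invented
swipesFrame <- data.frame(
  Var1 = c("Box A", "Box B", "Box C"),
  Freq = c(3, 2, 1),
  stringsAsFactors = FALSE
)
beatboxes <- data.frame(
  id = c("Box A", "Box B", "Box D"),
  latitude = c(50.77, 50.80, 50.85),
  longitude = c(0.28, 0.26, 0.30),
  stringsAsFactors = FALSE
)

# Default merge keeps only boxes present in both data.frames
swipeFrameLatLng <- merge(swipesFrame, beatboxes, by.x = "Var1", by.y = "id")

# Visited boxes with no known location, which would not be plotted
unmatched <- setdiff(swipesFrame$Var1, beatboxes$id)
unmatched  # "Box C"
```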