Whose Tweets Do I Read… Magic R Code Says…

[Image: tweetingme]

So one glance at my user logs shows the truth: no one gives a rat’s rump that I just quit my job; you just love you some Twitter R code. And I’m nothing but an attention whore, so come get some!

So in my last ‘Twitter with R’ post I gave you some code I’d written (OK, ripped off) that allowed you to update your status from R. That’s kinda cool, but really just for annoying your friends, tweeting when your code is finished running or, as Eva pointed out in the comments, maybe Tweeting the outcome of a routine. But R is good for analyzing data, plotting graphics, and cool stuff like that. Seems like underkill to just Tweet from it.

So let’s make some pretty pictures and stuff. Or more specifically, let’s plot a bar chart showing how many of the last 200 Tweets in your timeline came from each person you follow. An example of said chart is above.

If you don’t already have the libraries XML, lattice, and RCurl, you will need them:
install.packages('RCurl')
install.packages('XML')
install.packages('lattice')
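
If you're not sure which of those you already have, here's a rough sketch that only reports what's missing. The variable names `needed` and `not_installed` are mine, made up for illustration, and the actual install call is left commented out so nothing happens behind your back.

```r
# The three packages this post relies on
needed <- c("RCurl", "XML", "lattice")

# installed.packages() returns a matrix whose row names are package names
not_installed <- setdiff(needed, rownames(installed.packages()))

if (length(not_installed) > 0) {
  message("Missing: ", paste(not_installed, collapse = ", "))
  # install.packages(not_installed)  # uncomment to actually install them
} else {
  message("All set, nothing to install")
}
```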

Then once you get those bad boys, you can run this code:

library("RCurl")
library("XML")
library("lattice")
#
#be sure and put your username and passy here
username<-"YourUserName"
passy<-"YourPass"
#
#This sets up the options for curl
#then makes the request from the Twitter API
#the count=200 option pulls the last 200 tweets from your friends
#the twitter api limits you to a max of 200.... yeah, well that's life
#
opts <- curlOptions(header = FALSE, userpwd = paste(username,":",passy,sep=""))
request <- "http://twitter.com/statuses/friends_timeline.xml?count=200"
timeline <- getURL(request,.opts = opts)
#
#Now let's beat up on the XML like it owes us money
doc <- xmlInternalTreeParse(timeline, useInternalNodes = TRUE)
#
#grab only the screen_names and make a list
xml_names_of_posters <- getNodeSet(doc, "/statuses/status/user/screen_name")
text_names_of_posters <- lapply(xml_names_of_posters,xmlValue)
#
#let's take it out of a list... just for kicks
Twitterbaters <- unlist(text_names_of_posters)
#
#and shove it into a data frame... seems like going around my ass
#to get to my elbow, but I want to put it in a table eventually
#table is kinda like a cross tab. It calcs the frequency for me
posters_list_df<-as.data.frame(Twitterbaters)
Tweets = table(posters_list_df)
#
#lets graph this monkey business with lattice
#
NiftyChart <- barchart(~sort(Tweets), main = list("Whose Tweets Am I Getting?", cex = 1), xlab = list("Number of Tweets", cex = 1))
NiftyChart
update(NiftyChart, col="brown")
#

EDIT: Look in the comments for a great base graphics solution from Paolo. He makes the same graph without Lattice.

Now be sure and change the username and password then run that mofo. So now you have a pretty picture like the one I made above. Pretty slick, no?

Special credit goes to @gappy3000, who tipped me off to making this with Lattice instead of ggplot because of the difficulty sorting with ggplot. Thanks also to @HarlanH for helping me see that my struggles with ggplot were not of my own making but were systemic.
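
For what it's worth, the sorting in the code above happens before the plotting function ever sees the data: `table()` counts, `sort()` orders, and the chart just draws what it's handed. Here's a minimal sketch of that step in plain base R. The screen names are made up, standing in for the `Twitterbaters` vector.

```r
# Made-up screen names standing in for the Twitterbaters vector
posters <- c("alice", "bob", "alice", "carol", "alice", "bob")

# table() counts how many times each name appears
counts <- table(posters)

# sort() on a one-dimensional table orders it by count, ascending,
# and keeps the names attached
sorted_counts <- sort(counts)
print(sorted_counts)
```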

The Twitter syntax I hacked together is from the Twitter API documentation. Have fun! And come back later for more attention whoring (er, blogging) from yours truly, @CMastication.

BTW, the reason I didn’t structure this as a function is that you should be stepping through this one line at a time to figure out how it works. That’s just harder with a function. So I did this for your own good. One day you’ll thank me.
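
If you'd like to step through the XML part without hitting Twitter at all, here's a rough sketch on a tiny hand-made document. The sample XML is invented; I've only assumed it has the same /statuses/status/user/screen_name shape that the XPath in the code above expects. It uses `xmlParse()` with `asText = TRUE` from the XML package to parse a string directly.

```r
library("XML")

# Invented sample, shaped like the path the XPath query expects
sample_xml <- paste0(
  "<statuses>",
  "<status><text>hi</text><user><screen_name>alice</screen_name></user></status>",
  "<status><text>yo</text><user><screen_name>bob</screen_name></user></status>",
  "<status><text>hey</text><user><screen_name>alice</screen_name></user></status>",
  "</statuses>"
)

# asText = TRUE tells the parser the argument is XML content, not a file name
doc <- xmlParse(sample_xml, asText = TRUE)

# Same XPath as the real code: one node per tweet's screen_name
nodes <- getNodeSet(doc, "/statuses/status/user/screen_name")

# xmlValue pulls the text out of each node; sapply gives a character vector
posters <- sapply(nodes, xmlValue)
print(posters)
```

Run it a line at a time and poke at `doc`, `nodes`, and `posters` as you go.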

7 Comments

  1. Paolo says:

    Nice code. Why not use base graphics?

    par(mfrow = c(1, 1), oma = c(0, 4, 0, 2), las=1)
    barplot(sort(Tweets), horiz=TRUE, main="Whose Tweets Am I getting?", col="cadetblue3", xlab="Number of Tweets")

  2. J says:

    Paolo… this is EXACTLY why I like blogging this type of stuff! You are right. Base graphics works great. But I was having trouble getting the bars right and getting it sorted properly. Thank you for your code example!

    JD

  3. jebyrnes says:

    Hrm. Now, if only one’s webserver was running R, you could turn this into a little webservice via a cgi script. Sadly, there is no R on my webhost.

  4. J says:

    jebyrnes – @drewconway has some great code that does the same basic thing with Python and Google charts. I busted his balls a little about using a pie chart, but the code is good.

    http://www.drewconway.com/zia/?p=302

    -J

  5. ScottS-M says:

    I’ve been using RApache with good results recently, but that’s not much help when it’s not your own server.

  6. Arne S says:

    You could also consider using the twitteR package. It has all these features built in…

  7. JD Long says:

    Very good point about using twitteR! When I wrote this blog post Jeff Gentry had not yet released the twitteR package. But now that he has, this blog post is a bit outdated! Here’s a link for others: http://cran.r-project.org/web/packages/twitteR/index.html
