Kicking Ass with plyr

Tonight (October 29, 2009) at 5:30 PM is the Chicago R meetup at Jaks Tap. Here’s more info. I’ll be making a presentation based on my earlier blog post about plyr. The presentation will only be 8 minutes long, so I’ve had to pick and choose my info carefully. OK, who am I kidding? I had a couple of Schlitz (in a bottle!) for lunch over at Boni Vinos and slammed some slides together rather haphazardly. At any rate, here’s the presentation. I owe special thanks to all the folks on Twitter who reviewed these slides this week. A special shout-out to @kenahoo, who caught my one code typo! And also to @hadleywickham (author of plyr), who made some good suggestions, some of which I heeded. As a professor, he should consider 15% application of his information to be a phenomenally high rate.

Click the graphic to download the slides as a PDF:

[Image: slide deck thumbnail, kickingasswithplry]

If you’re wondering what my favorite beer is, I’ll give you a secret. My favorite beer is #3. That’s the one that makes me a persuasive and articulate public speaker. #4 makes me dance well.

I hope to see you tonight.

3 Comments

  1. Farrel Buchinsky says:

    How does plyr differ from casting using the reshape package, also from Hadley Wickham?

  2. J says:

    Farrel, I suspect there is a bit of code base sharing between the reshape and the plyr projects. In many situations (my simple mean example) the same result could be obtained with either package.

    One difference is that plyr is built to take input in any one of 3 formats and output in any one of 4 ways. Another difference is that plyr does not have the functionality to do the equivalent of the ‘melt’ command from reshape.
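
    To make the input/output idea concrete, here is a minimal sketch. The first letter of a plyr function name gives the input type and the second gives the output type (a = array, d = data frame, l = list, _ = discard the output). The built-in mtcars data set and the column name mean_mpg are stand-ins chosen for illustration; they are not taken from the slides.

    ```r
    library(plyr)

    # Data frame in, data frame out: mean mpg per cylinder count.
    # (mtcars and mean_mpg are illustrative choices, not from the slides.)
    means_df <- ddply(mtcars, .(cyl), summarise, mean_mpg = mean(mpg))

    # Data frame in, list out: one list element per cylinder count.
    means_list <- dlply(mtcars, .(cyl), function(d) mean(d$mpg))

    # Data frame in, array out: a one-dimensional array of group means.
    means_arr <- daply(mtcars, .(cyl), function(d) mean(d$mpg))
    ```

    The split variable and the function applied are the same in all three calls; only the shape of the result changes.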

    Here’s my heuristic on when to use which package:

    1) Normalizing / de-normalizing a data frame is the main purpose: reshape
    2) Summarizing a data frame: either package
    3) Taking data from one type of object, doing a split, apply, combine operation and outputting in another object: plyr

    I also have a number of summary type things that could be done in either. I tend to do those with plyr so I don’t have to remember both syntax sets. I generally only use reshape for #1 above.

    I hope that helps.

    The reshape package: http://had.co.nz/reshape/

  3. Jay says:

    Yes, reshape and plyr share some possibilities, but what plyr also makes easy is within-group work (where you are not summarizing to a single row per group), such as adding group aggregates, etc…

    There are also many other options, such as building models for each group, easily traversing those objects, and producing easy-to-use output, etc… Performing operations by group id(s) is powerful and easy to use.
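
    As a rough sketch of that kind of within-group work, assuming the built-in mtcars data set (the new column names are made up for illustration), ddply with transform keeps every row and attaches the group aggregate to each one:

    ```r
    library(plyr)

    # Keep all rows of mtcars and add each car's group mean alongside it.
    # cyl_mean_mpg and mpg_vs_group are hypothetical column names.
    with_group_mean <- ddply(
      mtcars, .(cyl), transform,
      cyl_mean_mpg = mean(mpg),       # aggregate repeated on every row of the group
      mpg_vs_group = mpg - mean(mpg)  # each car's deviation from its group mean
    )
    ```

    The result has one row per car rather than one row per group, which is the difference from a plain summary.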
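
    One way to sketch the model-per-group idea, again assuming the built-in mtcars data set and an arbitrary formula picked only for illustration: dlply fits one model per group and returns them as a list, and ldply walks that list and collects the coefficients into a data frame.

    ```r
    library(plyr)

    # Fit a separate linear model for each cylinder count.
    # The formula mpg ~ wt is an arbitrary example, not from the post.
    models <- dlply(mtcars, .(cyl), function(d) lm(mpg ~ wt, data = d))

    # Traverse the list of models and collect one row of coefficients per group.
    coefs <- ldply(models, coef)
    ```

    The fitted model objects stay available in models, so other summaries (residuals, fit statistics) can be pulled out of the same list later.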

    This is a great package!
