This example was presented at our kickoff Meetup. Here we use a few relatively large datasets from the nycflights13 package (all the flights departing from any NYC airport during 2013, along with related datasets about the weather, airplanes, and airports) to try to better understand why flights are delayed.
This example demonstrates basic through intermediate/advanced usage of the dplyr package, and how to chain commands together using the %>% (pipe) operator. I'm assuming some basic experience with data manipulation in base R, as well as familiarity with ggplot2 for data visualization. We'll also use some basic functionality from the lubridate package for date/time manipulation.
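As a quick preview of the chaining style, here's a minimal sketch (not necessarily the exact code from the Meetup) that pipes the flights table through a few dplyr verbs to get the mean departure delay for each origin airport. The names delay_by_origin, mean_dep_delay, and n_flights are just illustrative choices.

```r
library(dplyr)
library(nycflights13)

# Each %>% passes the result on its left as the first argument
# of the function on its right, so the steps read top to bottom.
delay_by_origin <- flights %>%
  filter(!is.na(dep_delay)) %>%             # drop flights with no recorded departure delay
  group_by(origin) %>%                      # one group per NYC origin airport
  summarise(
    mean_dep_delay = mean(dep_delay),       # average departure delay, in minutes
    n_flights = n()                         # number of flights in each group
  ) %>%
  arrange(desc(mean_dep_delay))             # largest average delay first

delay_by_origin
```

Without the pipe, the same steps would have to be written as nested function calls or saved in a series of intermediate variables.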
After this demonstration, hopefully Meetup participants will have an appreciation for the power that dplyr provides for advanced data manipulation and analysis.