08 Jul data wrangling and visualization with R tidyverse
1:00 pm - 5:00 pm
statistics seminar | level: beginner
affiliation: Ghent University
The tidyverse is a collection of R packages for data wrangling and visualization that share a common design philosophy. The goal of this workshop is to get you up to speed with the most important tidyverse tools for data exploration. After attending this workshop, you’ll have the tools to tackle a wide variety of data wrangling and visualization challenges, using the best parts of the R tidyverse.
What you will learn:
• Tidying your data using tidyr: storing it in a consistent form in which the way the dataset is stored matches its semantics (each variable in a column, each observation in a row).
• Transforming your data using dplyr: narrowing in on observations of interest, creating new variables that are functions of existing variables, and calculating a set of summary statistics (like counts or means).
• Merging and comparing two datasets based on various matching or filtering criteria.
• Visualizing your data using ggplot2: creating more informative graphs (e.g., scatter plot, bar plot, histogram, smoother/regression line, …) in an elegant and efficient way. Arranging multiple plots on a grid.
• Other useful tools for R programming
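To give a flavour of how these steps fit together, here is a minimal sketch in R. It is not taken from the workshop materials: the plant-height data, the table names (`heights_wide`, `treatments`) and the column names are all invented for illustration, and the workshop itself may use different functions or datasets.

```r
# Illustrative sketch only: the data below are invented for this example
# and are not part of the workshop materials.
library(tidyr)
library(dplyr)
library(ggplot2)

# A small "wide" table: one row per plant, one column per measurement week.
heights_wide <- tibble(
  plant  = c("p1", "p2", "p3", "p4"),
  week_1 = c(2.1, 1.8, 2.5, 2.0),
  week_2 = c(3.4, 2.9, 4.1, 3.0),
  week_3 = c(5.0, 4.2, 6.3, 4.4)
)

# A second (hypothetical) table with the treatment each plant received.
treatments <- tibble(
  plant     = c("p1", "p2", "p3", "p4"),
  treatment = c("control", "control", "fertilized", "fertilized")
)

# Tidy (tidyr): reshape to one row per plant-week observation.
heights_long <- heights_wide %>%
  pivot_longer(
    cols = starts_with("week_"),
    names_to = "week",
    names_prefix = "week_",
    values_to = "height_cm"
  ) %>%
  mutate(week = as.integer(week))

# Merge (dplyr): attach the treatment to every observation.
heights <- left_join(heights_long, treatments, by = "plant")

# Transform and summarise (dplyr): keep weeks 2 and 3, then compute
# the mean height and the number of observations per treatment and week.
height_summary <- heights %>%
  filter(week > 1) %>%
  group_by(treatment, week) %>%
  summarise(mean_height = mean(height_cm), n = n(), .groups = "drop")

# Visualise (ggplot2): one line per plant, coloured by treatment.
p <- ggplot(heights, aes(x = week, y = height_cm,
                         group = plant, colour = treatment)) +
  geom_line() +
  geom_point() +
  labs(x = "Week", y = "Height (cm)", colour = "Treatment")
```

Calling `print(p)` draws the plot, and printing `height_summary` shows one row per treatment and week.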
What you won’t learn:
• A systematic introduction to the basics of R. If you have never used R or RStudio before, try to pick up some basic knowledge of both before attending the course.
• Big data. This workshop focuses on small, in-memory datasets, as you can’t tackle big data easily unless you have experience with small data.
• Statistics. Although you will see many basic statistics in this workshop, the main focus is on R and the tidyverse tools rather than on explaining the statistical concepts.
Workshop structure
The content of this workshop roughly follows the order of a general data analysis procedure: starting with data import and tidying (tidyr), then data wrangling by transforming and summarizing (dplyr), followed by visualization and presentation (ggplot2). Other programming tools from other tidyverse packages will also be covered. At the end, we will give further instructions and tips on how to get help and keep learning.
This workshop blends lectures with hands-on exercises, so you can try out the tools you’ve seen in class with guidance.
The course materials (lecture slides, data, R scripts, exercises and solutions) are bundled into an RStudio project, so it is recommended to install R and RStudio beforehand:
• R: https://cran.r-project.org
• RStudio: https://rstudio.com/products/rstudio/download/
Fee
PhDs or postdocs of a Flemish university: free of charge
Other academics: 30 €
Non-profit/Public sector: 50 €
Private sector: 100 €
The course is open to all interested persons with some basic experience using R or other programming languages.
Open resources for your own learning and discovery of the tidyverse:
Dr Limin Liu