31 Aug Data Wrangling in Python
- Day 1
9:00 am - 6:00 pm
- Day 2
9:00 am - 6:00 pm
data science course | level: intermediate
for questions related to this event, contact firstname.lastname@example.org
affiliation: Vrije Universiteit Brussel
Handling data is a recurring task for most scientists. Reading in experimental data,
checking its properties, and creating visualisations can become tedious, so
making this process more efficient benefits many scientists. Spreadsheet-based
software does not support this process well because it lacks automation
and repeatability. A high-level scripting language such as Python is well suited to
this kind of work.
This course trains students to use Python effectively to do these tasks. The course focuses
on data manipulation and cleaning, exploratory analysis and visualisation using
key packages such as Pandas, NumPy and Matplotlib.
The course runs over two days. The first day covers setting up the programming
environment with the required packages using the conda package manager and
introduces the Jupyter notebook environment. Next, the data analysis
package Pandas and the plotting package Matplotlib are introduced. On the second day, more advanced usage of
Pandas for different data cleaning and manipulation tasks is taught. The acquired skills are
put into practice immediately on real-world data sets. Applications include
time series handling, categorical data, merging data, ...
The course does not cover statistics, data mining, machine learning, or predictive
modelling. It aims to give researchers the means to tackle common
data handling tasks effectively and so make their research more efficient.
Depending on how the pandemic situation and government regulations develop,
the course may be held remotely.
This course is intended for researchers who have at least basic programming skills. A basic
(scientific) programming course that is part of the regular curriculum should suffice. For
those who have experience in another programming language (e.g. Matlab, R, ...), following
a Python tutorial prior to the course is advised.
It is intended for researchers who want to enhance their general data manipulation and
analysis skills in Python. The course is NOT intended to be a course on statistics or machine learning.
A good introduction to Python is the ‘Python language introduction’ section of the Scipy lecture notes (https://scipy-lectures.org/intro/language/python_language.html).
PhDs and postdocs of a Flemish university: 0 €
Other academics: 120 €
Non-profit/Social sector: 200 €
Private sector: 400 €
Stijn Van Hoey and Joris Van den Bossche