(Statistical) Data Science for the Business User – ONLINE

  • 16/11/2020 - 17/11/2020
    10:00 am - 4:00 pm

Course details

statistics course | level: beginner
for questions related to this event, contact uantwerpen@flames-statistics.com
affiliation: University of Antwerp


Abstract

This course teaches the basic concepts of statistical data science with a focus on business applications. It starts with an overview of applications and the data science process model, then discusses each step of that process in more detail. Data preprocessing is covered extensively because of its impact on subsequent analytical model development. We then cover descriptive and predictive analytics, discussing various techniques together with how their performance is measured from a business perspective. Examples and small case studies are used throughout to clarify the concepts introduced. Illustrations are given in the programming language R.

After completing this course, participants will understand the key concepts, business applications, potential, and challenges of data science.


Prerequisites

There are no formal prerequisites, but prior knowledge of basic statistical principles (such as maximum likelihood, hypothesis testing, and linear regression) is recommended.

We will use the free software R and RStudio, which can be downloaded for different operating systems from https://cran.r-project.org and https://rstudio.com/products/rstudio/download/.


Background readings

Extra references will be given in the slides.


Fee

Normal rates apply.


Venue

This course is online. Details will be announced.


Instructor

Tim Verdonck


Introduction to Data Science

  • What is data science/machine learning/artificial intelligence?
  • Machine learning examples and opportunities
  • Data science process model

Data Preprocessing

  • Types of data and variables
  • Summary statistics and visual data exploration
  • Dealing with missing values
  • Outlier or anomaly detection
  • Categorization
  • Transformations
  • Principal component analysis
  • Dealing with imbalanced data
  • Feature engineering

Clustering

  • Hierarchical clustering
  • K-means clustering

Predictive analytics

  • Target definition
  • Linear regression
  • Logistic regression
  • Decision trees
  • Ensemble methods
  • Neural networks and link to deep learning

Measuring the performance of predictive analytics models

  • Split sample method and cross-validation
  • Confusion matrix
  • Performance measures for classification
  • ROC curve and AUC
  • Performance measures for regression