Classification Methods - Episode 8: Handling Imbalanced Data

  • 27/10/2021
    1:00 pm - 2:30 pm

Course details

data science seminar | level: intermediate
for questions related to this event, contact ugent@flames-statistics.com
affiliation: Ghent University


Abstract

Imbalanced response classes are a common problem in classification: the observations are distributed disproportionately across the response classes. Class imbalance arises in many areas, including medical diagnosis, spam filtering, and fraud detection. The main problem with class-imbalanced data is that they can significantly compromise the performance of most standard learning algorithms. For example, many classifiers minimise global quantities such as the error rate without taking the class distribution into account, so they tend to favour the majority class.
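To make the error-rate point concrete, here is a minimal sketch in plain Python (not the R code used in the seminar). The toy labels, the 990-to-10 class split, and the helper names `accuracy` and `recall` are all assumptions made for illustration. It shows that a rule which always predicts the majority class can reach a very low error rate while detecting no minority cases at all.

```python
# Hypothetical toy data: 990 majority-class (0) and 10 minority-class (1) labels.
y_true = [0] * 990 + [1] * 10

# A trivial "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of observations classified correctly (1 - error rate)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true `positive` observations that were predicted as such."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy = {acc:.2f}, minority recall = {rec:.2f}")
# prints: accuracy = 0.99, minority recall = 0.00
```

This is why class-sensitive measures such as recall on the minority class are usually reported alongside the overall error rate when the classes are imbalanced.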

Class imbalance can be tackled from different angles:

• at the algorithm level,
• at the data level,
• using ensemble-based learning.
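As an illustration of the data-level angle only, the sketch below shows random oversampling, one simple resampling technique: minority-class rows are duplicated at random until every class is as large as the largest one. It is written in plain Python, not R, and does not use or reproduce the IRIC API. The function name `random_oversample`, the toy data, and the fixed seed are assumptions made for this example.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until all
    classes have as many rows as the largest class.

    Hypothetical helper for illustration; not part of any library.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        # Indices of the original rows belonging to this class.
        idx = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - n):
            i = rng.choice(idx)  # sample with replacement
            X_out.append(X[i])
            y_out.append(label)
    return X_out, y_out


# Toy data (assumed): 95 majority rows and 5 minority rows, one feature each.
X = [[float(i)] for i in range(100)]
y = [0] * 95 + [1] * 5

X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)
print(balanced_counts)
```

Resampling of this kind should be applied to the training data only, so that the evaluation data keep the original class distribution.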

In this seminar, we will discuss methods for handling the two-class imbalanced learning problem using the IRIC package in R.


Prerequisites


Background readings

Bing Zhu, Zihan Gao, Junkai Zhao, Seppe K. L. M. vanden Broucke: IRIC: An R library for binary imbalanced classification. SoftwareX 10: 100341 (2019)

Teh K, Armitage P, Tesfaye S, Selvarajah D, Wilkinson ID (2020) Imbalanced learning: Improving classification of diabetic neuropathy from magnetic resonance imaging. PLoS ONE 15(12): e0243907. https://doi.org/10.1371/journal.pone.0243907

H. He and E. A. Garcia, “Learning from Imbalanced Data,” IEEE Trans. Knowledge




Fee

Free


Venue

Online


Instructor

Dr Emmanuel Abatih

