02 Sep Classification Methods - Episode 8: Handling Imbalanced Data
27/10/2021
1:00 pm - 2:30 pm
Course details
data science seminar | level: intermediate
for questions related to this event, contact ugent@flames-statistics.com
affiliation: Ghent University
Abstract
Class imbalance, where the response classes contain very different numbers of observations, is a common problem in classification. It arises in many areas, including medical diagnosis, spam filtering, and fraud detection. The main problem with imbalanced data is that it can significantly compromise the performance of most standard learning algorithms. For example, many classifiers aim to minimize global quantities such as the overall error rate without taking the class distribution into account, so they tend to favor the majority class.
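To make the point about the error rate concrete, here is a minimal sketch in plain Python. The data are hypothetical (990 negatives, 10 positives) and the "classifier" is a deliberately trivial one that always predicts the majority class. It is only an illustration, not material from the seminar.

```python
# Hypothetical toy data: 990 negatives (0) and 10 positives (1).
y_true = [0] * 990 + [1] * 10

# A trivial "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)


def error_rate(y_true, y_pred):
    """Fraction of observations that are misclassified."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that are predicted as positive."""
    actual_pos = sum(t == positive for t in y_true)
    true_pos = sum(t == positive and p == positive
                   for t, p in zip(y_true, y_pred))
    return true_pos / actual_pos


overall_error = error_rate(y_true, y_pred)
minority_recall = recall(y_true, y_pred)
print(overall_error, minority_recall)
```

The overall error rate is low even though no minority-class observation is ever detected, which is why class-sensitive measures such as recall are usually reported alongside it.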
Class imbalance can be tackled from different angles:
• the algorithm level,
• the data level,
• the ensemble level, using ensemble-based learning.
In this seminar, we will discuss methods for handling the two-class imbalanced learning problem using the IRIC package in R.
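As an illustration of the data-level angle mentioned above, the following is a minimal sketch of random oversampling in plain Python. It is not the IRIC API (the seminar itself uses R). The function name `random_oversample`, its parameters, and the toy data are all hypothetical, and the sketch assumes at least one minority observation is present.

```python
import random


def random_oversample(rows, labels, minority_label, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size.

    Hypothetical helper for illustration; assumes two classes and a
    non-empty minority class.
    """
    rng = random.Random(seed)
    minority = [(r, l) for r, l in zip(rows, labels) if l == minority_label]
    majority = [(r, l) for r, l in zip(rows, labels) if l != minority_label]

    # Number of extra minority copies needed to match the majority size.
    n_extra = max(len(majority) - len(minority), 0)
    extra = [rng.choice(minority) for _ in range(n_extra)]

    balanced = majority + minority + extra
    rng.shuffle(balanced)
    new_rows = [r for r, _ in balanced]
    new_labels = [l for _, l in balanced]
    return new_rows, new_labels


# Hypothetical toy data: 5 positives (1) and 95 negatives (0).
rows = [[float(i)] for i in range(100)]
labels = [1 if i < 5 else 0 for i in range(100)]

new_rows, new_labels = random_oversample(rows, labels, minority_label=1)
print(new_labels.count(0), new_labels.count(1))
```

In practice, resampling of this kind is applied to the training data only, so that the evaluation data keep the original class distribution.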
Prerequisites
Background readings
Bing Zhu, Zihan Gao, Junkai Zhao, Seppe K. L. M. vanden Broucke: IRIC: An R library for binary imbalanced classification. SoftwareX 10: 100341 (2019)
Teh K, Armitage P, Tesfaye S, Selvarajah D, Wilkinson ID (2020) Imbalanced learning: Improving classification of diabetic neuropathy from magnetic resonance imaging. PLoS ONE 15(12): e0243907. https://doi.org/10.1371/journal.pone.0243907
H. He and E. A. Garcia, “Learning from Imbalanced Data,” IEEE Trans. Knowledge
Fee
Free
Venue
Online
Instructor
Dr Emmanuel Abatih