02 Sep Classification Methods - Episode 8: Handling Imbalanced Data
1:00 pm - 2:30 pm
data science seminar | level: intermediate
for questions related to this event, contact firstname.lastname@example.org
affiliation: Ghent University
Class imbalance is a common problem in classification: the response classes contain very different numbers of observations. It arises in many areas, including medical diagnosis, spam filtering, and fraud detection. The main difficulty is that imbalanced data can significantly compromise the performance of most standard learning algorithms. For example, many classifiers minimize global quantities such as the overall error rate without taking the class distribution into account, so they can reach a low error rate while misclassifying most minority-class observations.
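To make the error-rate issue concrete, here is a minimal sketch in plain Python (chosen for illustration only; the seminar itself uses R). The 95/5 class split and the always-majority "classifier" are hypothetical, not taken from the seminar material.

```python
# Hypothetical toy labels: 95 majority-class (0) and 5 minority-class (1) cases.
y_true = [0] * 95 + [1] * 5

# A trivial "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)

# Overall accuracy (1 - error rate) looks good...
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# ...but recall on the minority class shows that no minority case is detected.
true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_pos / sum(t == 1 for t in y_true)

print(f"accuracy = {accuracy:.2f}")
print(f"minority recall = {recall:.2f}")
```

The low error rate here says nothing about the minority class, which is why class-specific measures such as recall are usually reported alongside accuracy for imbalanced problems.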
Class imbalance can be tackled from different angles:
• at the algorithm level,
• at the data level,
• using ensemble-based learning.
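As an illustration of the data-level angle, the sketch below implements simple random oversampling in plain Python. It is only one of many possible data-level techniques and is not the API of the R package used in the seminar; the function name `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        # Draw (with replacement) the number of extra rows this class needs.
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: 8 majority rows (label 0), 2 minority rows (label 1).
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)
print(balanced_counts[0], balanced_counts[1])
```

Resampling of this kind should be applied to the training data only, so that the test set keeps the original class distribution.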
In this seminar, we will discuss methods for handling the two-class imbalanced learning problem using the IRIC package in R.
Bing Zhu, Zihan Gao, Junkai Zhao, Seppe K. L. M. vanden Broucke: IRIC: An R library for binary imbalanced classification. SoftwareX 10: 100341 (2019)
Teh K, Armitage P, Tesfaye S, Selvarajah D, Wilkinson ID (2020) Imbalanced learning: Improving classification of diabetic neuropathy from magnetic resonance imaging. PLoS ONE 15(12): e0243907. https://doi.org/10.1371/journal.pone.0243907
H. He and E. A. Garcia, “Learning from Imbalanced Data,” IEEE Trans. Knowledge
Dr Emmanuel Abatih