Multi-class Pattern Classification in Imbalanced Data
Date
2010
Remarks
Copyright © 2010 IEEE This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
Abstract
The majority of multi-class pattern classification techniques are proposed for learning from balanced datasets. However, in several real-world domains, the datasets have an imbalanced data distribution, where some classes may have few training examples compared with other classes. In this paper we present our research in learning from imbalanced multi-class data and propose a new approach, named Multi-IM, to deal with this problem. Multi-IM derives its fundamentals from the probabilistic relational technique (PRMs-IM), designed for learning from imbalanced relational data for the two-class problem. Multi-IM extends PRMs-IM to a generalized framework for multi-class imbalanced learning for both relational and non-relational domains.
Related items
Showing items related by title, author, creator and subject.
- Ghanem, Amal Saleh (2009) Most data mining and pattern recognition techniques are designed for learning from flat data files with the assumption of equal populations per class. However, most real-world data are stored as rich relational databases ...
- Ghanem, Amal; Venkatesh, Svetha; West, Geoff (2008) Traditional learning techniques learn from flat data files with the assumption that each class has a similar number of examples. However, the majority of real-world data are stored as relational systems with imbalanced ...
- Ghanem, Amal; Venkatesh, Svetha; West, Geoffrey (2009) Real-world data are often stored as relational database systems with different numbers of significant attributes. Unfortunately, most classification techniques are proposed for learning from balanced nonrelational data ...