Learning in imbalanced relational data
MetadataShow full item record
Traditional learning techniques learn from flat data files with the assumption that each class has a similar number of examples. However, the majority of real-world data are stored as relational systems with imbalanced data distribution, where one class of data is over-represented as compared with other classes. We propose to extend a relational learning technique called Probabilistic Relational Models (PRMs) to deal with the imbalanced class problem. We address learning from imbalanced relational data using an ensemble of PRMs and propose a new model: the PRMs-IM. We show the performance of PRMs-IM on a real university relational database to identify students at risk.
Showing items related by title, author, creator and subject.
Ghanem, Amal Saleh (2009)Most data mining and pattern recognition techniques are designed for learning from at data files with the assumption of equal populations per class. However, most real-world data are stored as rich relational databases ...
Ghanem, Amal; Venkatesh, Svetha; West, Geoffrey (2010)The majority of multi-class pattern classification techniques are proposed for learning from balanced datasets. However, in several real-world domains, the datasets have imbalanced data distribution, where some classes ...
Ghanem, Amal; Venkatesh, Svetha; West, Geoffrey (2009)Real-world data are often stored as relational database systems with different numbers of significant attributes. Unfortunately, most classification techniques are proposed for learning from balanced nonrelational data ...