An evaluation study on text categorization using automatically generated labeled dataset
Metadata field | Value
---|---
dc.contributor.author | Zhu, Dengya
dc.contributor.author | Wong, K.
dc.date.accessioned | 2017-06-23T03:00:34Z
dc.date.available | 2017-06-23T03:00:34Z
dc.date.created | 2017-06-19T03:39:41Z
dc.date.issued | 2017
dc.identifier.citation | Zhu, D. and Wong, K. 2017. An evaluation study on text categorization using automatically generated labeled dataset. Neurocomputing. 249: pp. 321-336.
dc.identifier.uri | http://hdl.handle.net/20.500.11937/53578
dc.identifier.doi | 10.1016/j.neucom.2016.04.072
dc.description.abstract | Naïve Bayes, k-nearest neighbors, AdaBoost, support vector machines, and neural networks are five of the most commonly used text classifiers. Evaluating these classifiers involves several factors, including the benchmark corpus used, feature selection, algorithm parameter settings, and the measurement criteria employed. Researchers have demonstrated that some algorithms outperform others on particular corpora; however, the inconsistency of human labeling and the high dimensionality of feature spaces remain two issues to be addressed in text categorization. This paper evaluates the five commonly used text classifiers on an automatically generated document collection, labeled by a group of experts to alleviate the subjectivity of human category assignment, and at the same time examines the influence of the number of features on the performance of the algorithms.
dc.publisher | Elsevier BV
dc.title | An evaluation study on text categorization using automatically generated labeled dataset
dc.type | Journal Article
dcterms.source.volume | 249
dcterms.source.startPage | 321
dcterms.source.endPage | 336
dcterms.source.issn | 0925-2312
dcterms.source.title | Neurocomputing
curtin.department | School of Information Systems
curtin.accessStatus | Fulltext not available
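The abstract describes comparing five classifier families while varying the size of the feature space. As an illustrative point of reference only, the sketch below shows what such an experiment can look like in scikit-learn; the corpus (20 Newsgroups), chi-squared feature selection, macro-F1 metric, and all parameter values are stand-in assumptions, not the dataset, feature selection method, or settings used in the paper.

```python
# Illustrative sketch only: trains the five classifier families named in the
# abstract at several feature-space sizes and reports macro-F1, using the
# 20 Newsgroups corpus and chi-squared selection as stand-ins for the
# paper's own dataset and methodology.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import LinearSVC
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import f1_score

train = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes"))
test = fetch_20newsgroups(subset="test", remove=("headers", "footers", "quotes"))

# TF-IDF term features; the full vocabulary is filtered per run below.
vectorizer = TfidfVectorizer(sublinear_tf=True, stop_words="english")
X_train_full = vectorizer.fit_transform(train.data)
X_test_full = vectorizer.transform(test.data)

classifiers = {
    "Naive Bayes": MultinomialNB(),
    "k-NN": KNeighborsClassifier(n_neighbors=15),
    "AdaBoost": AdaBoostClassifier(n_estimators=100),
    "Linear SVM": LinearSVC(),
    # max_iter kept small for speed; the model may not fully converge.
    "Neural network": MLPClassifier(hidden_layer_sizes=(100,), max_iter=100),
}

# Vary the number of retained features to observe its effect on each algorithm.
for k in (500, 2000, 10000):
    selector = SelectKBest(chi2, k=k)
    X_train = selector.fit_transform(X_train_full, train.target)
    X_test = selector.transform(X_test_full)
    for name, clf in classifiers.items():
        clf.fit(X_train, train.target)
        macro_f1 = f1_score(test.target, clf.predict(X_test), average="macro")
        print(f"{name:15s} k={k:6d} macro-F1={macro_f1:.3f}")
```

In this kind of comparison, curves of macro-F1 against k make the sensitivity of each algorithm to feature-space dimensionality directly visible, which is the effect the paper sets out to examine.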