
    espace - Curtin’s institutional repository


    Text Categorization Using an Automatically Generated Labelled Dataset: An Evaluation Study

    Access Status
    Fulltext not available
    Authors
    Zhu, Dengya
    Wong, K.
    Date
    2014
    Type
    Conference Paper
    
    Citation
    Zhu, D. and Wong, K. 2014. Text Categorization Using an Automatically Generated Labelled Dataset: An Evaluation Study, in Loo, C.K. and Yap, K.S. and Wong, K.W. and Teoh, A. and Huang, K. (ed), Proceedings of 21st International Conference on Neural Information Processing: The Next Renaissance of the Neural Information Processing (Part 1), Nov 3-6 2014, pp. 479-486. Sarawak, Malaysia: University of Malaya.
    Source Title
    Neural Information Processing
    Source Conference
    ICONIP 2014
    DOI
    10.1007/978-3-319-12637-1_60
    ISBN
    9783319126364
    School
    School of Information Systems
    URI
    http://hdl.handle.net/20.500.11937/26799
    Collection
    • Curtin Research Publications
    Abstract

    Naïve Bayes (NB), kNN and AdaBoost are three commonly used text classifiers. Evaluating these classifiers involves a variety of factors, including the benchmark used, feature selection, the parameter settings of the algorithms, and the measurement criteria employed. Researchers have demonstrated that some algorithms outperform others on particular corpora; however, labelling and corpus bias remain two concerns in text categorization. This paper evaluates the three classifiers on an automatically generated text document set that is labelled by a group of experts to alleviate the subjectiveness of labelling, and at the same time examines how the performance of the algorithms is influenced by the feature selection algorithm and the number of features selected.
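    The kind of experiment the abstract describes, varying the number of selected features and measuring classifier performance, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy corpus, the choice of chi-square as the feature selection method (the abstract does not name the methods used), and the helper names. It covers only a multinomial Naïve Bayes classifier, not kNN or AdaBoost, and is not the authors' implementation.

```python
import math
from collections import Counter

# Toy labelled corpus: hypothetical data, not the dataset used in the paper.
TRAIN = [
    ("cheap flights hotel booking deals", "travel"),
    ("hotel beach holiday flights", "travel"),
    ("football match score goal", "sport"),
    ("tennis match score player", "sport"),
]
TEST = [
    ("holiday hotel deals", "travel"),
    ("goal score player", "sport"),
]


def chi2_scores(docs):
    """Chi-square score per term (document-level presence), maximised over classes."""
    n = len(docs)
    tokenised = [(set(text.split()), label) for text, label in docs]
    classes = {label for _, label in tokenised}
    vocab = set().union(*(terms for terms, _ in tokenised))
    scores = {}
    for term in vocab:
        best = 0.0
        for cls in classes:
            # 2x2 contingency table: term present/absent vs. in class/not in class.
            n11 = sum(1 for terms, lab in tokenised if term in terms and lab == cls)
            n10 = sum(1 for terms, lab in tokenised if term in terms and lab != cls)
            n01 = sum(1 for terms, lab in tokenised if term not in terms and lab == cls)
            n00 = n - n11 - n10 - n01
            denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
            if denom:
                best = max(best, n * (n11 * n00 - n10 * n01) ** 2 / denom)
        scores[term] = best
    return scores


def select_features(docs, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    scores = chi2_scores(docs)
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return set(ranked[:k])


def train_nb(docs, features):
    """Count documents per class and selected-term occurrences per class."""
    class_docs = Counter(label for _, label in docs)
    term_counts = {cls: Counter() for cls in class_docs}
    for text, label in docs:
        term_counts[label].update(t for t in text.split() if t in features)
    return class_docs, term_counts, features


def predict_nb(model, text):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    class_docs, term_counts, features = model
    total_docs = sum(class_docs.values())
    best_label, best_lp = None, -math.inf
    for cls in sorted(class_docs):
        lp = math.log(class_docs[cls] / total_docs)
        denom = sum(term_counts[cls].values()) + len(features)
        for t in text.split():
            if t in features:
                lp += math.log((term_counts[cls][t] + 1) / denom)
        if lp > best_lp:
            best_label, best_lp = cls, lp
    return best_label


def accuracy(k):
    """Test-set accuracy when only the top-k chi-square features are used."""
    model = train_nb(TRAIN, select_features(TRAIN, k))
    hits = sum(predict_nb(model, text) == label for text, label in TEST)
    return hits / len(TEST)


if __name__ == "__main__":
    for k in (2, 4, 8):
        print(k, accuracy(k))
```

    With very few features, some test documents contain none of the selected terms, so the classifier falls back on class priors alone; this is one simple way the number of features can affect measured performance. A fuller study would repeat the loop for each classifier and each feature selection method.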

    Related items

    Showing items related by title, author, creator and subject.

    • Improving the relevance of web search results by combining web snippet categorization, clustering and personalization
      Zhu, Dengya (2010)
      Web search results are far from perfect due to the polysemous and synonymous characteristics of natural languages, information overload as a result of the information explosion on the Web, and the flat list, “one size fits ...
    • An evaluation study on text categorization using automatically generated labeled dataset
      Zhu, Dengya; Wong, K. (2017)
      Naïve Bayes, k-nearest neighbors, AdaBoost, support vector machines and neural networks are five commonly used text classifiers, among others. Evaluation of these classifiers involves a variety of factors to be considered ...
    • Machine learning and natural language processing to identify falls in electronic patient care records from ambulance attendances
      Tohira, Hideo ; Finn, Judith ; Ball, Stephen ; Brink, D.; Buzzacott, Peter (2021)
      We derived machine learning models utilizing features generated by natural language processing (NLP) of free-text data from an ambulance services provider to identify fall cases. The data comprised samples of electronic ...