
    espace - Curtin’s institutional repository


    Graph-induced restricted Boltzmann machines for document modeling

    Access Status
    Fulltext not available
    Authors
    Nguyen, T.
    Tran, The Truyen
    Phung, D.
    Venkatesh, S.
    Date
    2016
    Type
    Journal Article
    
    Citation
    Nguyen, T., Tran, T.T., Phung, D. and Venkatesh, S. 2016. Graph-induced restricted Boltzmann machines for document modeling. Information Sciences. 328: pp. 60-75.
    Source Title
    Information Sciences
    DOI
    10.1016/j.ins.2015.08.023
    ISSN
    0020-0255
    School
    Multi-Sensor Proc & Content Analysis Institute
    URI
    http://hdl.handle.net/20.500.11937/45888
    Collection
    • Curtin Research Publications
    Abstract

    © 2015 Elsevier Inc. All rights reserved. Discovering knowledge from unstructured texts is a central theme in data mining and machine learning. We focus on fast discovery of thematic structures from a corpus. Our approach is based on a versatile probabilistic formulation, the restricted Boltzmann machine (RBM), whose underlying graphical model is an undirected bipartite graph. Inference is efficient: a document representation can be computed with a single matrix projection, which makes RBMs suitable for the massive text corpora available today. Standard RBMs, however, operate under the bag-of-words assumption and ignore the inherent relational structures among words. This results in less coherent thematic word grouping. We introduce graph-based regularization schemes that exploit linguistic structures, which in turn can be constructed from either corpus statistics or domain knowledge. We demonstrate that the proposed technique improves group coherence, facilitates visualization, provides a means of estimating intrinsic dimensionality, reduces overfitting, and possibly leads to better classification accuracy.
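
The two technical ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the logistic hidden units, and the choice of an unnormalized graph Laplacian penalty tr(W^T L W) as the regularizer are all assumptions made for illustration, and the published model may use a different word-count likelihood or a different penalty. The sketch shows (1) a document representation computed as one matrix projection followed by an element-wise sigmoid, and (2) a graph penalty that is small when words linked in a word graph have similar weight vectors.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def hidden_representation(V, W, b):
    """Posterior activation of the hidden units for each document.

    V: (n_docs, n_words) document-word matrix (hypothetical input layout)
    W: (n_words, n_hidden) weight matrix
    b: (n_hidden,) hidden biases
    The whole corpus is projected with a single matrix product.
    """
    return sigmoid(V @ W + b)


def graph_laplacian(A):
    """Unnormalized Laplacian L = D - A of a symmetric word-graph adjacency A."""
    return np.diag(A.sum(axis=1)) - A


def laplacian_penalty(W, L):
    """Assumed regularizer tr(W^T L W) = 0.5 * sum_ij A_ij * ||W_i - W_j||^2."""
    return np.trace(W.T @ L @ W)


def penalty_gradient(W, L):
    """Gradient of tr(W^T L W) with respect to W, valid for symmetric L."""
    return 2.0 * L @ W


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy word graph over three words: word 1 is linked to words 0 and 2.
    A = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])
    L = graph_laplacian(A)
    W = rng.normal(size=(3, 2))
    V = np.array([[1.0, 0.0, 2.0],
                  [0.0, 3.0, 0.0]])
    print(hidden_representation(V, W, np.zeros(2)).shape)
    print(laplacian_penalty(W, L))
```

In a training loop, a term such as `lam * penalty_gradient(W, L)` (with `lam` a hypothetical regularization strength) would be subtracted from the usual likelihood gradient so that connected words are pulled toward similar weights.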

