    espace - Curtin’s institutional repository

    SOF: a semi-supervised ontology-learning-based focused crawler

    Access Status
    Fulltext not available
    Authors
    Dong, Hai
    Hussain, Farookh
    Date
    2013
    Type
    Journal Article
    
    Citation
    Dong, Hai and Hussain, Farookh Khadeer. 2013. SOF: a semi-supervised ontology-learning-based focused crawler. Concurrency and Computation: Practice and Experience. 25 (12): pp. 1755-1770.
    Source Title
    Concurrency and Computation: Practice and Experience
    DOI
    10.1002/cpe.2980
    ISSN
    1532-0626
    URI
    http://hdl.handle.net/20.500.11937/18523
    Collection
    • Curtin Research Publications
    Abstract

    The rapid increase in the volume of data available on the Internet makes it increasingly impractical for a crawler to index the whole Web. Instead, many intelligent crawlers, known as ontology-based semantic focused crawlers, have been designed to use Semantic Web technologies for topic-centered Web information crawling. Ontologies, however, have constraints of validity and time, which may influence the performance of the crawlers. Ontology-learning-based focused crawlers are therefore designed to evolve ontologies automatically by integrating ontology learning technologies. Nevertheless, surveys indicate that the existing ontology-learning-based focused crawlers cannot automatically enrich the content of ontologies, which makes these crawlers unreliable in the open and heterogeneous Web environment. Hence, in this paper, we propose a framework for a novel semi-supervised ontology-learning-based focused (SOF) crawler, which embodies a series of schemas for ontology generation and Web information formatting, a semi-supervised ontology learning framework, and a hybrid Web page classification approach aggregated by a group of support vector machine models. A series of tests is carried out to evaluate the technical feasibility of the proposed framework. The conclusion and future work are summarized in the final section.

    Related items

    Showing items related by title, author, creator and subject.

    • State of the art in semantic focused crawlers
      Dong, Hai; Hussain, Farookh Khadeer; Chang, Elizabeth (2009)
      Nowadays, the research of focused crawler approaches the field of semantic web, along with the appearance of increasing semantic web documents and the rapid development of ontology mark-up languages. Semantic focused ...
    • Self-adaptive semantic focused crawler for mining services information discovery
      Dong, Hai; Hussain, F. (2014)
      It is well recognized that the Internet has become the largest marketplace in the world, and online advertising is very popular with numerous industries, including the traditional mining service industry where mining ...
    • A survey in semantic web technologies-inspired focused crawlers
      Dong, Hai; Hussain, Farookh Khadeer; Chang, Elizabeth (2008)
      Crawlers are software which can traverse the internet and retrieve webpages by hyperlinks. In the face of the inundant spam websites, traditional web crawlers cannot function well to solve this problem. Semantic focused ...
