    espace - Curtin’s institutional repository


    A sentiment based approach to pattern discovery and classification in social media

    186192_Thesis2012.pdf (6.994Mb)
    Access Status
    Open access
    Authors
    Nguyen, Thin K.
    Date
    2012
    Supervisor
    Dr Dinh Phung
    Type
    Thesis
    Award
    PhD
    
    School
    Department of Computing
    URI
    http://hdl.handle.net/20.500.11937/1558
    Collection
    • Curtin Theses
    Abstract

    Social media allows people to participate, express opinions, mediate their own content and interact with other users. As such, sentiment information has become an integral part of social media. This thesis presents a sentiment-based approach to analyse content and social relationships in social media.

    First, this thesis aims to construct building blocks for sentiment analysis in social media, using sentiment in the form of mood. To that end, the problem of supervised mood classification is investigated. This line of work provides insights into which features of a generic document classification problem can be transferred to a mood classification problem in social media. As data in social media is normally large scale, novel scalable feature sets are introduced for this task. In particular, a novel set of psycholinguistic features is proposed and validated, which does not require a supervised feature selection phase and can therefore be applied to mood analysis at a large scale. Next, under an unsupervised setting, this thesis explores the new problem of pattern discovery in social media using sentiment information. The result is the discovery of intrinsic patterns of moods, each of which can be considered as a group of moods similar to a basic emotion studied in psychology. This provides valuable empirical, data-driven evidence about the structure of human emotion in the social media domain.

    The second major contribution of this thesis explores the use of sentiment information conveyed in on-line social diaries for detection of real-world events in a large-scale setting. In particular, this thesis introduces the novel concept of 'sentiment burst' and employs a stochastic model for detection, and subsequent extraction, of events in social media. The resultant model is a powerful burst detection algorithm suitable for on-line deployment on ever-growing datasets such as social media. An additional contribution in this line of work is an effective method for evaluating and ranking events using Google Timeline. This offers an objective measure by which to evaluate event detection, a topic that is largely underexplored in the current literature due to a general lack of human ground truth.

    Next, under an egocentric analysis, sentiment information is used to study the impact of the demographics and personalities of users on the messages they create. In particular, we examine how the age and social connectivity of on-line users correlate with the affective, topical and psycholinguistic features of the texts they author. Using a large, ground-truthed dataset of millions of users and on-line diaries, we investigate various important questions posed in social media analysis, psychology and sociology. For example, is there a difference with regard to topic, psycholinguistic features and mood in the messages written by older versus younger users? What features are predictive of a user's personality? Of extraversion and introversion? Are there features that are predictive of influence? The results obtained by our sentiment-based approach are encouraging. The approach does not require an expensive feature selection phase and thus offers a new and promising direction for egocentric analysis in the social media domain.

    Finally, the sentiment information conveyed in media content is investigated with respect to the networking and interaction aspects of a social media system. Sentiment information is studied in parallel with two other common aspects of social media content: topics and linguistic styles. Sentiment information is shown in this thesis to provide additional insights into the process of community formation. It is also shown to be a powerful predictor of community membership for a message or a user at a lighter computational cost.

    Related items

    Showing items related by title, author, creator and subject.

    • Twitter mining for ontology-based domain discovery incorporating machine learning
      Abu-Salih, B.; Wongthongtham, Pornpit; Kit, C. (2018)
      © 2018, Emerald Publishing Limited. Purpose: This paper aims to obtain the domain of the textual content generated by users of online social network (OSN) platforms. Understanding a user's domain(s) of interest is a ...
    • Hyper-community detection in the blogosphere
      Nguyen, Thin; Phung, Dinh; Adams, Brett; Tran, Truyen; Venkatesh, Svetha (2010)
      Most existing work on learning community structure in social network is graph-based whose links among the members are often represented as an adjacency matrix, encoding direct pairwise associations between members. In ...
