The theory, design, development and evaluation of the MarkIT automated essay grading system
Date
2011
Abstract
The research presented in this exegesis relates to the design, development and testing of a new Automated Essay Grading (AEG) system. AEG systems make use of Information Technology (IT) to grade essays. The major objective for AEG system developers is to build systems that grade as accurately as, or more accurately than, human graders.
This research discusses the main theories that currently underpin existing systems. It then discusses a new theoretical concept, the Normalised Word Vector (NWV), which has been developed and tested during this research. This exegesis also synthesises into a cohesive discourse seven of the author’s papers on the NWV and related issues published during the period 2002 to 2007. The papers can be grouped into three themes as follows: the theory of NWV and related matters, the development of the system, and the testing of the system.
Thirteen existing AEG systems have been identified in this research. Each system has its own set of unique features; some focus on grading for essay writing style, others on essay content, and others attempt to consider both aspects in assigning a score to an essay. The type and amount of feedback on an essay also vary amongst the systems; some provide feedback on essay mechanics and others provide feedback on missing content. The MarkIT system described in this exegesis primarily grades for essay content, with a secondary focus on style. It has the unique feature, which distinguishes it from the other systems, of providing interactive visual feedback on essay content. This enables the teacher and student to discuss how the essay can be improved to obtain a higher grade.
In brief, the theory of the NWV is as follows. The words in an essay are ‘normalised’ to their root concepts in a thesaurus. The numbers of times these concepts occur in the essay (the counts) are then used to build the coordinates of the vector in the vector space induced by all the concepts in the thesaurus. This adaptation of the theory used for many years in the document retrieval industry enables very fast comparison of essay content, and enables MarkIT to grade in real time.
In essence the system works by mathematically modelling, using multiple linear regression, the grading criteria used by human graders for a given essay. These criteria are extracted from a set of training essays, and include items such as the number of words, the number of nouns, the number of verbs, the number of adjectives, and the number of adverbs. The model is then used to grade essays not previously graded by humans. It does this by measuring the predictor factors in the ungraded essays, and then applying the multiple regression equation. The cosine of the angle between the NWV for a student essay and the NWV for a model answer is often one of the significant predictor variables.
The system has been tested with 390 Year 10 high school essays, each about 400 words in length, on the topic of ‘The School Leaving Age’. The correlation of grades amongst the human graders was 0.81, and the system’s scores matched this correlation.
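The NWV comparison described in the abstract can be illustrated with a minimal sketch. This is not MarkIT's implementation: the tiny thesaurus, the concept names, and the function names `normalised_word_vector` and `cosine` are all hypothetical, chosen only to show how words that differ on the surface can map to the same concept counts, and how the cosine between two such vectors is computed.

```python
import math
from collections import Counter

# Hypothetical toy thesaurus mapping words to root concepts.
# A real thesaurus would cover far more words and concepts.
TOY_THESAURUS = {
    "school": "education", "schools": "education", "education": "education",
    "pupil": "student", "pupils": "student",
    "student": "student", "students": "student",
    "leave": "departure", "leaving": "departure", "quit": "departure",
    "job": "work", "jobs": "work", "work": "work", "employment": "work",
}

# One coordinate per concept, in a fixed order.
CONCEPTS = sorted(set(TOY_THESAURUS.values()))


def normalised_word_vector(text):
    """Count how often each thesaurus concept occurs in the text."""
    words = (w.strip(".,;:!?'\"()").lower() for w in text.split())
    counts = Counter(TOY_THESAURUS[w] for w in words if w in TOY_THESAURUS)
    return [counts[c] for c in CONCEPTS]


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


model_answer = normalised_word_vector(
    "Students should not leave school before they can find work."
)
student_essay = normalised_word_vector(
    "Pupils leaving school early struggle to find jobs."
)

# Different words, same concept counts, so the two vectors coincide.
print(CONCEPTS)
print(model_answer, student_essay)
print(cosine(model_answer, student_essay))
```

In the system described above, a cosine value of this kind would be one predictor among several (word counts, noun counts and so on) fed into the multiple regression equation, rather than a grade in itself.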
Related items
Showing items related by title, author, creator and subject.
-
Williams, Robert; Dreher, Heinz (2005) In this paper we discuss a simple but comprehensive form of feedback to essay authors, based on a thesaurus and computer graphics, which enables the essay authors to see where essay content is inadequate in terms of the ...
-
Reiners, Torsten; Dreher, Heinz (2009) In modern learning environments, the lecturer or educational designer is often confronted with multi-national student cohorts, requiring special consideration regarding language, cultural norms and taboos, religion, and ...