International Journal of Applied Mathematics and Computer Science

Paper details

Number 2 - June 2005
Volume 15 - 2005

Correcting spelling errors by modelling their causes

Sebastian Deorowicz, Marcin G. Ciura

Abstract
This paper presents a new technique for correcting isolated words in typed texts. A language-dependent set of string substitutions reflects the surface form of errors that result from vocabulary incompetence, misspellings, or mistypings. Candidate corrections are formed by applying the substitutions to text words that are absent from the computer lexicon. A minimal acyclic deterministic finite automaton storing the lexicon allows quick rejection of nonsense corrections, while the costs associated with the substitutions serve to rank the remaining ones. A comparison of the correction lists generated by several spellcheckers for two corpora of English spelling errors shows that our technique suggests the right words more accurately than the others.
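
The pipeline outlined in the abstract (apply substitutions, reject candidates missing from the lexicon, rank the rest by cost) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the rules in `SUBSTITUTIONS`, their costs, the tiny `LEXICON`, and the function name `candidates` are hypothetical and not taken from the paper. The sketch also swaps in a plain Python set for the minimal acyclic deterministic finite automaton, and applies only one substitution per candidate.

```python
# Hypothetical substitution rules: (erroneous fragment, intended fragment, cost).
# The rules and costs are illustrative assumptions, not the paper's data.
SUBSTITUTIONS = [
    ("ie", "ei", 1.0),
    ("f", "ph", 2.0),
    ("e", "a", 3.0),
]

# Stand-in for the lexicon. The paper stores it in a minimal acyclic DFA;
# a set is used here only to keep the sketch short.
LEXICON = {"receive", "phone", "separate"}


def candidates(word, lexicon=LEXICON, rules=SUBSTITUTIONS):
    """Return (correction, cost) pairs for a word absent from the lexicon,
    cheapest first. Only single substitutions are tried."""
    if word in lexicon:
        return []  # the word is known, so there is nothing to correct
    best = {}
    for wrong, right, cost in rules:
        # Try the rule at every position where the erroneous fragment occurs.
        for i in range(len(word) - len(wrong) + 1):
            if word[i:i + len(wrong)] == wrong:
                cand = word[:i] + right + word[i + len(wrong):]
                # Reject nonsense candidates; keep the cheapest cost per word.
                if cand in lexicon and cost < best.get(cand, float("inf")):
                    best[cand] = cost
    # Rank the surviving candidates by cost, then alphabetically.
    return sorted(best.items(), key=lambda kv: (kv[1], kv[0]))
```

For example, `candidates("recieve")` yields `[("receive", 1.0)]` under these assumed rules, while a word already in the lexicon yields an empty list.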

Keywords
spelling correction, finite state automata, spelling errors