User:Jaseem/spellcheck: Difference between revisions

From SMC Wiki

Revision as of 14:15, 3 March 2014

Malayalam Spell-checker

Problem

English dictionaries "rely on complete lists of full word forms, a requirement that cannot be met for morphologically complex languages" like Malayalam. In theory, an unlimited number of words can be agglutinated into a single Malayalam word; in practice the number is generally below 10. Handling agglutination and inflection in a spell-checker is therefore challenging.

See http://thottingal.in/documents/MalayalamComputingChallenges.pdf
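To get a feel for why a full word-form list cannot work, the rough count below may help. It is only a back-of-the-envelope sketch: the function name `count_compounds` and the vocabulary size are hypothetical, and it assumes any word may follow any other, which overstates the real number of valid compounds.

```python
def count_compounds(vocab_size, max_parts):
    """Upper bound on distinct compounds built from 1..max_parts words.

    Assumes (unrealistically) that every ordering of words is valid,
    so this is a loose ceiling, not a measurement.
    """
    return sum(vocab_size ** k for k in range(1, max_parts + 1))


if __name__ == "__main__":
    # Hypothetical vocabulary of 1000 root words, up to 10 parts per compound.
    print(count_compounds(1000, 10))
```

Even with a small assumed vocabulary, the ceiling grows exponentially with the number of parts, so the checker has to analyse words rather than look them up in a list.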

Other Challenges

  • Homophonic root words can have different inflections
    മറക്കുക & മറയുക; പറയുക & പറക്കുക
  • The same word can inflect differently in the same context (uncommon)
    പോവുക, പോകുക
  • Sandhi rules are complex.
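
As an illustration of the sandhi point, the sketch below encodes a single, well-known pattern: a final anusvara (ം) becomes മ when the next word begins with a vowel, as in മരം + ഉണ്ട് = മരമുണ്ട്. The function names and the partial vowel table are assumptions made for this example; real Malayalam sandhi involves many more rules and exceptions.

```python
# Independent vowel -> dependent vowel sign (partial table, enough for the demo).
VOWEL_SIGNS = {
    "അ": "",   # inherent vowel, no sign needed
    "ആ": "ാ",
    "ഇ": "ി",
    "ഉ": "ു",
    "എ": "െ",
}
ANUSVARA = "ം"


def naive_join(left, right):
    """Plain concatenation, as a simple suffix automaton would assume."""
    return left + right


def sandhi_join(left, right):
    """Join two words, applying only the anusvara + vowel rule."""
    if left.endswith(ANUSVARA) and right[:1] in VOWEL_SIGNS:
        return left[:-1] + "മ" + VOWEL_SIGNS[right[0]] + right[1:]
    return left + right


if __name__ == "__main__":
    print(naive_join("മരം", "ഉണ്ട്"))   # not the written form
    print(sandhi_join("മരം", "ഉണ്ട്"))
```

Because the characters at the boundary change, a checker cannot recover the parts by cutting the surface string at a fixed position; it has to undo the sandhi rule first.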

Possible solutions

Hunspell

Hunspell handles agglutination through affix rules and compounding support. How to apply these to Malayalam still needs to be worked out.

Implementation in other languages

Spell Checking an Agglutinative Language: Quechua
http://www.zora.uzh.ch/52921/1/ltc-106-rios.pdf

Quechua does not seem to have the complexity of Malayalam sandhi, so the automaton presented in the paper does not seem to work for Malayalam.

  • kachichasqa = kachi + cha + sqa
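
The Quechua example splits cleanly because the morphemes are simply concatenated. A minimal sketch of that kind of segmentation follows; it is not the automaton from the paper, and the tiny `ROOTS` and `SUFFIXES` sets and the function names are hypothetical, holding only the pieces of the example above.

```python
ROOTS = {"kachi"}
SUFFIXES = {"cha", "sqa"}


def _split_suffixes(tail):
    """Split tail into known suffixes; return a list, or None on failure."""
    if tail == "":
        return []
    for suffix in SUFFIXES:
        if tail.startswith(suffix):
            rest = _split_suffixes(tail[len(suffix):])
            if rest is not None:
                return [suffix] + rest
    return None


def segment(word):
    """Return [root, suffix, ...] if word is a root plus known suffixes."""
    for i in range(1, len(word) + 1):
        if word[:i] in ROOTS:
            rest = _split_suffixes(word[i:])
            if rest is not None:
                return [word[:i]] + rest
    return None


if __name__ == "__main__":
    print(segment("kachichasqa"))
    print(segment("kachixyz"))
```

This approach only succeeds when the surface form is the exact concatenation of its parts, which is the assumption that Malayalam sandhi breaks.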

http://www.cmpe.boun.edu.tr/~akin/papers/spelling_checking_in_Turkish.pdf

http://arxiv.org/pdf/cmp-lg/9410004.pdf