From SMC Wiki

GSoC/2009

2,179 bytes added, 04:34, 21 February 2009
'''Mentor''' : Praveen A/Santhosh Thottingal
===Functional Optical Character Recognition System===
'''Brief Description:'''
Malayalam (or any Indian language) does not have a working Optical Character Recognition (OCR) system. Many researchers have worked in this field, but none of the efforts has been successful. Tesseract OCR seems promising, and work on Bengali support is in progress. Building on that work, we need to add Malayalam support to Tesseract OCR.
'''Expectation:'''
* Study the Tesseract OCR system
* Recognition of all characters
* Add support for Malayalam and optimize the accuracy
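
Optimizing accuracy presupposes a way to measure it. The sketch below shows one common option, character accuracy based on edit distance between a ground-truth transcription and the OCR output. The function names <code>edit_distance</code> and <code>character_accuracy</code> are illustrative assumptions and are not part of Tesseract; the metric counts Unicode code points, which is only one of several reasonable choices for Malayalam.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance between two strings, counted in Unicode code points."""
    # previous[j] holds the distance between the processed prefix of
    # `reference` and the first j characters of `hypothesis`.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - (edit distance / reference length), floored at 0."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    errors = edit_distance(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))
```

For a six code point reference where the OCR output drops one vowel sign, the edit distance is 1 and the accuracy is 5/6. Counting grapheme clusters or whole conjuncts instead of code points would give a different, possibly more meaningful, figure.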
 
More details : http://code.google.com/p/tesseract-ocr/ and http://code.google.com/p/ocropus/
'''Mentor:''' TBD
 
===New Family of Equal Height Fonts (EHF) for Malayalam Language===
'''Brief Description:'''
To design and create a new family of Equal Height Fonts for the traditional Malayalam script. Following Roman typography, serif and sans-serif font variations are available in Malayalam. Equal Width Fonts such as Courier, which exist in Roman typography, are impossible for Malayalam characters, and are also unnecessary. The proposed Equal Height Fonts are a new concept in the history of font making, meant to surmount the typographical challenge of vertically stacked conjuncts.
'''Knowledge Prerequisite'''
Understanding of OpenType/TrueType font design technologies and experience with tools like FontForge.
'''Mentor:''' Hussain K H
 
===Batch converter for documents (doc/odt) with ASCII Font encoded data to Unicode Documents===
'''Brief Description:'''
Many documents in India have content encoded in non-standard ASCII fonts. The aim of the project is to enhance our existing ASCII-to-Unicode converter [[Payyans]] so that it can read doc and odt documents and do the conversion using the existing APIs.
'''Expectation:'''
* Payyans should be able to convert .doc documents to Unicode encoded ODT documents
* Batch conversion as well as single-document conversion should be possible
* APIs should be provided for developers
* Should support almost all ASCII fonts. Supporting the maps present in Padma converter is recommended.
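
The expectations above can be illustrated with a minimal sketch of map-based conversion plus a batch wrapper. This is not the Payyans API: <code>TOY_FONT_MAP</code>, <code>convert_text</code>, and <code>convert_directory</code> are hypothetical names, the mapping is a made-up toy table, and plain text files stand in for doc/odt documents, whose parsing is the actual work of the project.

```python
from pathlib import Path

# Hypothetical toy mapping, for illustration only. Real font maps are much
# larger and differ from one ASCII font to another.
TOY_FONT_MAP = {
    "a": "മ",
    "e": "ല",
    "b": "യ",
    "m": "ാ",
    "fw": "ളം",
}


def convert_text(text: str, font_map: dict[str, str]) -> str:
    """Convert font-encoded text to Unicode by greedy longest-match lookup."""
    longest = max(map(len, font_map), default=1)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate key first, then shorter ones.
        for size in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in font_map:
                out.append(font_map[chunk])
                i += size
                break
        else:
            out.append(text[i])  # pass unmapped characters through unchanged
            i += 1
    return "".join(out)


def convert_directory(source: Path, target: Path, font_map: dict[str, str]) -> list[Path]:
    """Convert every .txt file in `source`, writing UTF-8 results into `target`."""
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(source.glob("*.txt")):
        # latin-1 maps each byte to exactly one code point, so no input byte is lost.
        converted = convert_text(path.read_text(encoding="latin-1"), font_map)
        destination = target / path.name
        destination.write_text(converted, encoding="utf-8")
        written.append(destination)
    return written
```

Single-document conversion is a call to <code>convert_text</code>; batch conversion is the same call applied to each file. The sketch deliberately omits steps a real converter needs, such as reordering vowel signs that ASCII fonts store in visual order before the consonant.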
'''Knowledge Prerequisite'''
Students should know Python.
'''Mentor:''' Rajeesh Nambiar/Nishan Naseer