GSoC/2013/Project ideas: Difference between revisions
Revision as of 06:40, 31 March 2013
Ideas for Google Summer of Code 2013
Indic rendering support in ConTeXt.
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. (We already have a rendering module in SILPA; it can be improved to support this idea.)
Expertise required:
Mentor : Rajeesh Nambiar
Automated Rendering Testing
Port the remaining modules to the new Flask-based Silpa
Project:
Expertise required:
Mentor : Rajeesh Nambiar
Provide a REST API for the new Flask-based Silpa, including converting the templates from JSON-RPC to this REST API.
Project:
Expertise required:
Mentor :
Separate the templates from SILPA and ship them inside the modules packaged for PyPI
Project:
this should give more idea on it
Expertise required:
Mentor :
Integrating the jquery.ime input method framework with internationalization using jquery.i18n
Project: (not complex and will be expanded by Santhosh)
Expertise required:
Mentor :
Converting the Indic processing modules currently in SILPA into a jQuery library
Project:
Expertise required:
Mentor :
Improving the cross-language transliteration system.
Project:
Currently only Kannada and Malayalam are handled well. All other languages are first converted to Malayalam and then to English, for lack of language-specific internals. Also, English-to-Indic transliteration currently uses CMUDict, so it is limited to words present in CMUDict. We could probably develop a better method for English-to-Indic transliteration.
Expertise required:
Mentor :
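The "convert to Malayalam first" step described above can be pictured with a small sketch. It is only an illustration, not SILPA's actual code: it assumes the pivot is done by shifting code points between Unicode blocks, which works for many characters because the major Indic blocks share the ISCII-derived layout. The names BLOCK_START and shift_script are made up for this example, and characters without a counterpart in the target script are not handled.

```python
# Start of the Unicode block for a few Indic scripts (each block is 0x80 wide).
BLOCK_START = {
    "devanagari": 0x0900,
    "bengali": 0x0980,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}


def shift_script(text, src, dst):
    """Move every character of the `src` block to the same offset in `dst`.

    Hypothetical helper: characters outside the source block are left alone,
    and no check is made that the target code point is actually assigned.
    """
    lo = BLOCK_START[src]
    delta = BLOCK_START[dst] - lo
    return "".join(
        chr(ord(ch) + delta) if lo <= ord(ch) < lo + 0x80 else ch
        for ch in text
    )


# Devanagari KA (U+0915) lands on Malayalam KA (U+0D15).
print(shift_script("\u0915", "devanagari", "malayalam"))
```

A per-language table, which is what the project asks for, would replace the blind offset with rules that know where the scripts really differ.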
A spell checker for Indic language that understands inflections
Project:
The SILPA project has a spellchecker written in Python with a fairly involved algorithm. Still, it cannot handle the inflection and agglutination found in Indian languages, especially South Indian languages. The dictionary we have for the Malayalam spellchecker has 150000 words. Of course we can expand the dictionary, but that has little value, since in Malayalam, Tamil, etc. words can be formed by joining multiple words. In addition, words get inflected based on grammatical forms (sandhi), plural, gender, etc. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word can be formed by more than 5 words joined together. We will need word-splitting logic or a table that takes care of all the patterns. The project is to attempt solving this inside Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.
Expertise required: Basic understanding of the grammar system of at least one Indian language
Mentor : Santhosh Thottingal
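To make the "word splitting logic or a table" idea in the spell checker proposal a little more concrete, here is a minimal sketch. It is not the SILPA or Hunspell algorithm: the stem and suffix tables hold made-up romanized placeholders, the function names are hypothetical, and sandhi (the sound changes where pieces join) is ignored entirely. A word is accepted if it can be cut into a known stem followed by any number of further stems or suffixes.

```python
from functools import lru_cache

# Hypothetical toy tables (romanized placeholders, not real dictionary data).
STEMS = {"mara", "kili", "veedu"}
SUFFIXES = {"kal", "ude", "il"}


def split_word(word, stems=STEMS, suffixes=SUFFIXES):
    """Return one split of `word` into stem + (stems | suffixes)*, or None."""

    @lru_cache(maxsize=None)
    def solve(pos, started):
        # `started` is True once at least one stem has been consumed;
        # a suffix is only allowed after that point.
        if pos == len(word):
            return () if started else None
        for end in range(len(word), pos, -1):  # try the longest piece first
            piece = word[pos:end]
            if piece in stems or (started and piece in suffixes):
                rest = solve(end, True)
                if rest is not None:
                    return (piece,) + rest
        return None

    result = solve(0, False)
    return list(result) if result is not None else None


def is_known(word):
    """A word passes the check if some split exists."""
    return split_word(word) is not None


print(split_word("marakalude"))  # stem followed by two stacked suffixes
print(split_word("kilimara"))    # two stems joined together
print(is_known("maraxyz"))       # no split exists
```

The memoised search keeps multi-level stripping cheap even for long compounds. A real implementation would also need rules that undo sandhi changes before looking a piece up.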
Improving the webfonts module in Silpa using jquery.webfonts and providing more Indic and complex fonts as part of it.
Project:
Expertise required:
Mentor :
Android App- Malayalam Calendar
Full featured Malayalam input tool like Google Hindi Input.
(?) jquery.ime from Wikimedia is getting ready as an input method for Android, with 150+ input methods
Add proper Indic / Malayalam rendering to Mapnik.
Mapnik is a free mapping toolkit, written in C++. One of its major users is OpenStreetMap. If you check OpenStreetMap, you can see that languages like Russian, Arabic, Persian, Chinese, etc. are rendered in it (not sure whether they are rendered properly or not). The lack of proper Indic support is the major reason for the absence of Malayalam.
Add Indic / Malayalam rendering to MapServer + OpenLayers stack.
Both are OSGeo projects and are used in most recent WebGIS applications. MapServer is an open source development environment for building spatially enabled internet applications. OpenLayers is an open source JavaScript library for displaying map data in web browsers. OpenLayers is used by OpenStreetMap for its "slippy map" interface.
Add proper Indic / Malayalam support and rendering to GRASS GIS.
GRASS GIS is used by a number of organizations for analysing GIS data, creating maps, etc. It is also an OSGeo project. Its old Tcl/Tk interface is being rewritten in wxPython.