# Santhosh Thottingal
# Baiju M
# Praveen A
# Rajeesh K Nambiar
# Vasudev Kammath
# Hussain K. H
# Jishnu Mohan
# Hrishikesh K.B
# Anivar Aravind
=Ideas for Google Summer of Code 2013=
== Internationalize SILPA project with Wikimedia jquery projects ==
'''Project''': The SILPA project has many Indic language applications, but as of now it has no built-in tool for typing in Indian languages. The application is also not internationalized. Both can be achieved by using the jquery.ime and jquery.i18n libraries from Wikimedia. A sample implementation is available on our [http://smc.org.in website]. The i18n should be integrated into the SILPA flask framework with a nice templating system. Similarly, the interface should use webfonts through the jquery.webfonts library.
* [https://github.com/wikimedia/jquery.i18n jquery.i18n]
* [https://github.com/wikimedia/jquery.ime jquery.ime]
* [https://github.com/wikimedia/jquery.webfonts jquery.webfonts]
'''Expertise required''': jquery, css, html5, python
'''Mentor''' : Hrishikesh
== A spell checker for Indic language that understands inflections ==
The SILPA project has a spellchecker written in Python with a not-so-simple algorithm. Still, it cannot handle the inflection and agglutination that occur in Indian languages, especially South Indian languages. The dictionary we have for the Malayalam spellchecker has about 150000 words. Of course we can expand the dictionary, but that does not add much value, since words in Malayalam, Tamil etc. can be formed by joining multiple words. In addition, words get inflected based on grammatical forms (sandhi), plural, gender etc. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words together. We will need word-splitting logic or a table that takes care of all patterns. The project is to attempt to solve this with hunspell. If that is not feasible (hunspell upstream is not active), develop an algorithm and implement it.
'''Expertise required''': Basic understanding of grammar system of at least one Indian language
'''Mentor''' : Santhosh Thottingal
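The multi-level suffix stripping mentioned in this idea could be sketched roughly as follows. This is only an illustration, not the existing SILPA or hunspell algorithm: the function names <code>split_word</code> and <code>is_correct</code> are hypothetical, and the English stems and suffixes are stand-ins for a real Malayalam dictionary and suffix table. It also ignores sandhi, that is, the spelling changes that happen where parts join, which a real solution would have to handle.

```python
def split_word(word, stems, suffixes, max_depth=5):
    """Return [stem, suffix, ...] if word is a known stem followed by
    at most max_depth known suffixes, otherwise None."""
    if word in stems:
        return [word]
    if max_depth == 0:
        return None
    for suffix in suffixes:
        # Strip one suffix, then try to explain the remainder recursively.
        if word.endswith(suffix) and len(word) > len(suffix):
            head = split_word(word[:-len(suffix)], stems, suffixes, max_depth - 1)
            if head is not None:
                return head + [suffix]
    return None


def is_correct(word, stems, suffixes):
    """A word is accepted if it can be split into a stem plus suffixes."""
    return split_word(word, stems, suffixes) is not None


# Toy data (assumption): English stand-ins, not real Malayalam entries.
STEMS = {"care", "tree"}
SUFFIXES = ["less", "ness", "ly"]

if __name__ == "__main__":
    print(split_word("carelessly", STEMS, SUFFIXES))
    print(is_correct("carefuly", STEMS, SUFFIXES))
```

The recursion depth is capped because the text notes that a word may be built from more than 5 parts. A table-driven approach could replace the plain suffix list with rules that also undo the spelling change at each join.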
==Indic rendering support in ConTeXt==
ConTeXt is another TeX macro system similar to LaTeX, but much more suitable for design. For more information about ConTeXt, see the wiki at http://wiki.contextgarden.net/Main_Page. ConTeXt MKII is supposed to have Indic language rendering support using XeTeX, but in practice we have found it lacking. MKII is deprecated anyway, and the new MKIV backend doesn't support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses Harfbuzz to do correct Indic rendering.
'''Expertise required''': Understanding of the TeX system, experience in either LaTeX or ConTeXt, and a basic understanding of Indic language rendering. MKIV uses Lua, so familiarity with Lua, the OpenType specification or Harfbuzz will be an added advantage.
'''Mentor''' : Rajeesh K Nambiar
==Automated Rendering Testing==
'''Project''': Automated Rendering Testing system for Indic languages. Currently there are 3 main rendering engines in the computing world: Uniscribe from Microsoft, CoreText (Apple Advanced Typography, AAT) from Apple, and Harfbuzz for *nix systems. The OpenType font specification is maintained by Microsoft and implemented in Uniscribe, which is used as the baseline for Harfbuzz. At present, there is no automated mechanism to determine whether Harfbuzz renders complex Indic text correctly; an expert in the relevant language has to manually inspect the output from hb-view. The aim of the project is to identify and implement an automated method to test the rendering. One method might be to check the order of glyphs or glyph indices output by the rendering engine, though this depends on the font too. A related project is UTRRS: https://fedorahosted.org/utrrs/, http://tdil-dc.in/utrrs/home/about
'''Expertise required''': Knowledge of Indic language rendering and the OpenType specification.
'''Mentor''' : Rajeesh K Nambiar
== Create Bold and Italic variants for Meera and Rachana ==
'''Project''': The Meera font has only a regular version now. Synthetic bold and italic are not perfect and not suitable for Malayalam. Create bold and italic variants of the Meera and Rachana fonts.
'''Expertise required''': Digital typography, good understanding of the Malayalam writing system, fontforge, understanding of rendering engines like Harfbuzz.
'''Possible Mentor''' : Hussain K H
'''[https://savannah.nongnu.org/task/index.php?12553 Savannah Task]'''
==Flask Based SILPA==
===Port remaining modules to the new flask based Silpa===
'''Project''': Silpa is being rewritten using the flask framework. The core part is almost complete, but most of the sub modules written under the old framework still need to be ported to the [http://flasksilpa-indic.rhcloud.com/ new framework].
'''Mentor''' : Rajeesh Nambiar/ Jishnu
===Provide REST API for new flask based Silpa, including conversion of templates to this REST API from JSON RPC===
'''Mentor''' : Vasudev/Jishnu
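To make the JSON RPC to REST conversion more concrete, here is a minimal sketch of how one JSON-RPC call might be mapped to a REST-style request. Everything in it is an assumption for illustration: the helper name <code>jsonrpc_to_rest</code>, the <code>module.action</code> method naming, and the <code>/api/...</code> URL layout are hypothetical and are not taken from the Silpa code base. It uses only the Python standard library, so the actual flask routing is left out.

```python
import json


def jsonrpc_to_rest(payload):
    """Translate a JSON-RPC request string into a REST-style description.

    Assumes (hypothetically) that methods are named 'module.action'.
    """
    request = json.loads(payload)
    module, _, action = request["method"].partition(".")
    if not module or not action:
        raise ValueError("expected a method of the form 'module.action'")
    return {
        "verb": "POST",
        # Hypothetical URL layout: one resource per module and action.
        "path": "/api/{}/{}".format(module, action),
        "body": {"args": request.get("params", [])},
    }


if __name__ == "__main__":
    call = '{"method": "spellcheck.check", "params": ["word"], "id": 1}'
    print(jsonrpc_to_rest(call))
```

A mapping like this would let the existing templates be converted one call at a time, since each old JSON-RPC method corresponds to exactly one new URL.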
== Improving the cross-language transliteration system ==
Currently only Kannada and Malayalam work perfectly; all other languages are first converted to Malayalam and then to English, due to a lack of language internals. Also, for English to Indic we currently use CMUDict, so the transliteration capability is limited to words in CMUDict. We could probably develop a better method for English to Indic transliteration.
'''Mentor''' : Vasudev/Jishnu
== Improving the webfonts module in Silpa using jquery.webfonts and providing more Indic and complex fonts as part of it ==
'''Project''':
* Currently Silpa provides 36 webfonts. Add more fonts to this collection.
* Rewrite the webfonts module to use the features of jquery.webfonts
* Provide font preview and download options
'''Expertise required''': jQuery, Python, technical understanding of fonts
'''Mentor''' : Vasudev/ Jishnu
==Adding Indic Script Rendering Support to GIS applications ==
===Add proper Indic / Malayalam rendering to Mapnik===
Mapnik is a free mapping toolkit written in C++. One of its major users is OpenStreetMap. If you check OpenStreetMap, you can see that languages like Russian, Arabic, Persian, Chinese etc. are rendered in it (we are not sure whether they are rendered properly). The lack of proper Indic support is the major reason for the absence of Malayalam.
* http://mapnik.org/
* http://www.openstreetmap.org/
==Building a system and APIs for accessing and updating Malayalagrandham Bibliography Data==
[http://www.malayalagrandham.com/about/ Malayala Grantha Vivaram] is a project intended to make available reliable bibliographic information on all Malayalam books published in Kerala and elsewhere. This open data set contains complete bibliography data from the first impression to 1995. This project aims to add the following features to the Malayalagrandham DB and build it into a bibliography web service:
* Facility for adding copyright-expired books to Malayala Grantha Vivaram
* Adding ISBN
* Building an interface for publishers through which they can contribute their publication bibliography.
* A similar module for libraries. That will be added to the "found in library" section of each book
* A module for building a QR code of the bibliography with a malayalagrandham link. It can be used by publishers and libraries
* A crowdsourced way for input and an evaluation interface for submissions.
* MARC21 and MARCXML support
'''Mentor''': Baiju M / Anivar
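As a rough illustration of the MARCXML support listed above, the sketch below turns one bibliography entry into a MARCXML fragment using only the Python standard library. The input dictionary keys (<code>isbn</code>, <code>author</code>, <code>title</code>, <code>publisher</code>, <code>year</code>) and the function name <code>to_marcxml</code> are assumptions, since the actual Malayalagrandham schema is not described here, and the sample values are placeholders. The field tags follow MARC 21 (020 for ISBN, 100 for main author, 245 for title, 260 for publication details). The sketch leaves indicators blank and omits the leader and control fields, so the output is not a complete record.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)

# (input key, MARC tag, subfield code); the input keys are hypothetical.
FIELD_MAP = [
    ("isbn", "020", "a"),
    ("author", "100", "a"),
    ("title", "245", "a"),
    ("publisher", "260", "b"),
    ("year", "260", "c"),
]


def to_marcxml(book):
    """Serialize a dict describing one book as a MARCXML <record> string."""
    record = ET.Element("{%s}record" % MARC_NS)
    datafields = {}
    for key, tag, code in FIELD_MAP:
        value = book.get(key)
        if not value:
            continue  # skip fields the entry does not have
        if tag not in datafields:
            # Blank indicators are a simplification in this sketch.
            datafields[tag] = ET.SubElement(
                record, "{%s}datafield" % MARC_NS,
                {"tag": tag, "ind1": " ", "ind2": " "})
        subfield = ET.SubElement(
            datafields[tag], "{%s}subfield" % MARC_NS, {"code": code})
        subfield.text = str(value)
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    # Placeholder entry, not real catalogue data.
    print(to_marcxml({"title": "Sample Title", "author": "Sample Author",
                      "publisher": "Sample Publisher", "year": 1990}))
```

The same mapping table could drive an import in the other direction, which would help when libraries contribute records that are already in MARC format.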