GSoC/2013/Project ideas: Difference between revisions

From SMC Wiki
Revision as of 08:31, 31 March 2013

Ideas for Google Summer of Code 2013

Mentors

  1. Santhosh Thottingal
  2. Baiju M
  3. Praveen A
  4. Rajeesh K Nambiar
  5. Vasudev Kammath
  6. Hussain K.H
  7. Jishnu Mohan
  8. Hrishikesh K.B
  9. Anivar Aravind


Internationalize SILPA project with Wikimedia jquery projects

Project:

The SILPA project has many Indic language applications, but as of now it has no built-in tool for anybody who wants to type in Indian languages. The application is also not internationalized. Both can be achieved by using the jquery.ime and jquery.i18n libraries from Wikimedia. A sample implementation is available on our website (http://smc.org.in). The i18n should be done in the SILPA flask framework with a nice templating system. The interface should also have webfonts, using the jquery.webfonts library.

  * jquery.i18n: https://github.com/wikimedia/jquery.i18n
  * jquery.ime: https://github.com/wikimedia/jquery.ime
  * jquery.webfonts: https://github.com/wikimedia/jquery.webfonts

Expertise required: jquery, css, html5, python

Mentor : Hrishikesh

A spell checker for Indic language that understands inflections

Project:

The SILPA project has a spell checker written in Python with a not-so-simple algorithm. It still cannot handle the inflection and agglutination found in Indian languages, especially South Indian languages. The dictionary for the Malayalam spell checker has 150000 words. We can of course expand the dictionary, but that has little value, since words in Malayalam, Tamil, etc. can be formed by joining multiple words. In addition, words are inflected for grammatical form (sandhi), plural, gender, etc. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words. We will need word-splitting logic or a table covering all patterns. The project is to attempt to solve this inside Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.
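The word-splitting idea can be sketched as a recursive segmentation over a stem list and a suffix list. This is only an illustration under stated assumptions, not SILPA's or Hunspell's algorithm: the names `make_segmenter` and `analyse` are hypothetical, the toy data is English because it is easier to check by eye, and sandhi (sound changes at the joins) is not modelled at all.

```python
from functools import lru_cache


def make_segmenter(stems, suffixes, max_parts=6):
    """Build a function that splits a word into known stems and suffixes.

    Hypothetical helper for illustration: a word is accepted when it can be
    covered by at most `max_parts` pieces, where the first piece is a stem
    and later pieces are stems or suffixes.
    """
    stems = frozenset(stems)
    suffixes = frozenset(suffixes)

    @lru_cache(maxsize=None)
    def segment(rest, parts_left, seen_stem):
        if not rest:
            # The whole word is consumed; valid only if a stem was found.
            return () if seen_stem else None
        if parts_left == 0:
            return None
        # Try the longest candidate piece first, then back off.
        for end in range(len(rest), 0, -1):
            piece = rest[:end]
            is_stem = piece in stems
            is_suffix = seen_stem and piece in suffixes
            if not (is_stem or is_suffix):
                continue
            tail = segment(rest[end:], parts_left - 1, seen_stem or is_stem)
            if tail is not None:
                return (piece,) + tail
        return None

    def analyse(word):
        """Return a tuple of pieces, or None if the word cannot be split."""
        return segment(word, max_parts, False)

    return analyse


# Toy data, assumed for the example only.
analyse = make_segmenter({"book", "shop", "keeper"}, {"s", "ing"})
print(analyse("bookshopkeepers"))  # ('book', 'shop', 'keeper', 's')
print(analyse("bookz"))            # None
```

A spell checker built on this would accept a word when `analyse` returns a result. A real solution for Malayalam would also have to undo the sandhi changes at each join before looking up the pieces, which is the hard part of the project.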

Expertise required: Basic understanding of the grammar system of at least one Indian language

Mentor : Santhosh Thottingal

Indic rendering support in ConTeXt.

ConTeXt is a TeX macro system similar to LaTeX but much better suited to design work. SILPA already has a rendering module, which can be improved to implement this idea.

Expertise required:

Mentor : Rajeesh Nambiar

Automated Rendering Testing

Automatic Rendering Testing system for Indian languages

Expertise required:

Mentor : Rajeesh Nambiar

Create Bold and Italic variants for Meera and Rachana

Project: The Meera font currently has only a regular version. Synthetic bold and italic are imperfect and unsuitable for Malayalam. Create bold and italic variants of the Meera and Rachana fonts.


Expertise required: Digital typography, a good understanding of the Malayalam writing system, FontForge, and an understanding of rendering engines like HarfBuzz.

Mentor : Hussain K H


Port remaining modules to the new flask based Silpa

Project:

Expertise required:

Mentor : Rajeesh Nambiar/Jishnu

Provide REST API for new flask based Silpa, including conversion of templates to this REST API from JSON RPC.

Project:

Expertise required:

Mentor : Vasudev/Jishnu

Separate templates from SILPA and have it inside modules packaged for pypi

Project:

this should give more idea on it

Expertise required:

Mentor : Vasudev/Jishnu


Converting Indic processing modules currently in SILPA into a jQuery library

Project: Port some of the SILPA algorithms to JavaScript

Expertise required: javascript, python

Mentor :Vasudev/Jishnu

Improving cross language transliteration system.

Project:

Currently only Kannada and Malayalam are handled well; all other languages are first converted to Malayalam and then to English, due to the lack of language internals. For English to Indic we currently use CMUDict, so transliteration is limited to the words in CMUDict. We could probably develop a better method for English to Indic transliteration.
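One way to avoid a pivot language for Indic-to-Indic conversion is to rely on the parallel layout of the Indic blocks in Unicode, where the same offset inside each 128-code-point block usually holds the corresponding letter. The sketch below shows only that idea; it is not how SILPA's transliteration module works, the function name `transliterate` is hypothetical, and it makes no attempt to handle letters that have no counterpart in the target script.

```python
# Start of each script's Unicode block; every block spans 0x80 code points.
BLOCK_START = {
    "devanagari": 0x0900,
    "bengali": 0x0980,
    "gurmukhi": 0x0A00,
    "gujarati": 0x0A80,
    "oriya": 0x0B00,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80


def transliterate(text, source, target):
    """Shift characters from the source script block to the target block.

    Characters outside the source block (spaces, digits, Latin text) are
    copied unchanged. Assumption: a shifted code point may be unassigned
    in the target script, and this sketch does not detect that case.
    """
    src = BLOCK_START[source]
    dst = BLOCK_START[target]
    out = []
    for ch in text:
        cp = ord(ch)
        if src <= cp < src + BLOCK_SIZE:
            out.append(chr(cp - src + dst))
        else:
            out.append(ch)
    return "".join(out)


# Kannada KA + vowel sign AA (U+0C95 U+0CBE) becomes
# Malayalam KA + vowel sign AA (U+0D15 U+0D3E).
kannada_kaa = "\u0c95\u0cbe"
assert transliterate(kannada_kaa, "kannada", "malayalam") == "\u0d15\u0d3e"
```

A table of per-language exceptions would still be needed on top of this, since scripts such as Tamil lack many of the consonants that other blocks define.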

Expertise required:

Mentor : Vasudev/Jishnu


Improving the webfonts module in Silpa using jquery.webfonts and providing more Indic and complex fonts as part of it.

Project:

Expertise required:

Mentor : Vasudev/Jishnu


Adding Indic Script Rendering Support to GIS applications

Add proper Indic / Malayalam rendering to Mapnik

Mapnik is a free mapping toolkit written in C++. One of its major users is OpenStreetMap. If you check OpenStreetMap, you can see that languages like Russian, Arabic, Persian, Chinese, etc. are rendered in it (we are not sure whether they are rendered properly). The lack of proper Indic support is the major reason for the absence of Malayalam.

  * http://mapnik.org/
  * http://www.openstreetmap.org/

Add Indic / Malayalam rendering to MapServer + OpenLayers stack.

Both are OSGeo projects and are used in most recent WebGIS applications. MapServer is an open source development environment for building spatially enabled internet applications. OpenLayers is an open source JavaScript library for displaying map data in web browsers. OpenStreetMap uses OpenLayers for its "slippy map" interface.

  * http://www.mapserver.org/
  * http://www.openlayers.org/
  * http://www.osgeo.org/

Add proper Indic / Malayalam support and rendering to GRASS GIS.

GRASS is used by a number of organizations for analysing GIS data, creating maps, etc. It is also an OSGeo project. Its old Tcl/Tk interface is being rewritten in wxPython.

  * http://grass.osgeo.org/

Mentor: Praveen A

Building a system and APIs for accessing and updating Malayalagrandham bibliography data

Malayala Grantha Vivaram (http://www.malayalagrandham.com/about/) is a project intended to make available reliable bibliographic information on all Malayalam books published in Kerala and elsewhere. This open data set contains complete bibliographic data from the first impression to 1995. This project aims to add the following features to the Malayalagrandham DB and build it into a bibliography web service:

  1. A facility for adding copyright-expired books to Malayala Grantha Vivaram.
  2. Adding ISBNs.
  3. An interface through which publishers can contribute the bibliography of their publications.
  4. A similar module for libraries, whose entries will be added to the "found in library" section of each book.
  5. A module for generating a QR code of a bibliography entry with a Malayalagrandham link, for use by publishers and libraries.
  6. A crowdsourced input method and an evaluation interface for submissions.
  7. MARC21 and MARCXML support.
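As a rough illustration of the MARCXML item, the sketch below serialises one book as a MARCXML record using only the Python standard library. The input dictionary layout and the function name `book_to_marcxml` are assumptions made for the example, since the Malayalagrandham schema is not described here. The record is also deliberately incomplete: a full MARC record carries a leader, control fields and meaningful indicators, all of which are left out.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML documents.
MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)


def _datafield(record, tag, subfields):
    """Append a datafield with blank indicators and the given subfields."""
    field = ET.SubElement(
        record,
        f"{{{MARC_NS}}}datafield",
        {"tag": tag, "ind1": " ", "ind2": " "},
    )
    for code, value in subfields:
        sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        sub.text = value
    return field


def book_to_marcxml(book):
    """Serialise a book dictionary (hypothetical layout) as MARCXML text.

    Fields used: 020 $a ISBN, 100 $a main author, 245 $a title,
    260 $b publisher and $c year of publication.
    """
    record = ET.Element(f"{{{MARC_NS}}}record")
    if book.get("isbn"):
        _datafield(record, "020", [("a", book["isbn"])])
    if book.get("author"):
        _datafield(record, "100", [("a", book["author"])])
    _datafield(record, "245", [("a", book["title"])])
    imprint = [
        (code, book[key])
        for code, key in (("b", "publisher"), ("c", "year"))
        if book.get(key)
    ]
    if imprint:
        _datafield(record, "260", imprint)
    return ET.tostring(record, encoding="unicode")


# Placeholder values, not taken from the real data set.
print(book_to_marcxml({"title": "Sample Title", "author": "Sample Author",
                       "year": "1990"}))
```

The same dictionary could feed the other items in the list, for example a JSON response for the web service or the text encoded in a QR code.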

Expertise required:

Mentor: Baiju M /Anivar