<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.smc.org.in/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Navaneethkn</id>
	<title>SMC Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.smc.org.in/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Navaneethkn"/>
	<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/Special:Contributions/Navaneethkn"/>
	<updated>2026-05-05T01:06:56Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.40.1</generator>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2016/Project_ideas&amp;diff=10728</id>
		<title>GSoC/2016/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2016/Project_ideas&amp;diff=10728"/>
		<updated>2016-03-06T04:32:39Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Varnam Based */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt; &lt;br /&gt;
&amp;lt;font color=&amp;quot;green&amp;quot;&amp;gt; &amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas, you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Potential Mentors=&lt;br /&gt;
# Baiju M (&#039;&#039;&#039;baijum&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Santhosh Thottingal (&#039;&#039;&#039;santhosh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Sayamindu Das Gupta (&#039;&#039;&#039;unmadindu&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Rajeesh K Nambiar (&#039;&#039;&#039;rajeesh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Vasudev Kammath (&#039;&#039;&#039;copyninja&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jishnu Mohan (&#039;&#039;&#039;jishnu7&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Navaneeth (&#039;&#039;&#039;nkn__&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nandaja Varma (&#039;&#039;&#039;gem&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Praveen A (&#039;&#039;&#039;j4v4m4n&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Ershad K (&#039;&#039;&#039;ershad&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Samuel Thibault (&#039;&#039;&#039;youpi&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anivar Aravind (&#039;&#039;&#039;anivar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Hrishikesh K.B (&#039;&#039;&#039;stultus&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nishan Naseer (&#039;&#039;&#039;nishan&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2016=&lt;br /&gt;
* Please read the [http://wiki.smc.org.in/GSoC/2016#FAQ FAQ]&lt;br /&gt;
* If you want to propose an idea, please do it on the [http://lists.smc.org.in/listinfo.cgi/student-projects-smc.org.in student projects mailing list] of [http://smc.org.in Swathanthra Malayalam Computing] &lt;br /&gt;
&lt;br /&gt;
=Projects with confirmed mentors=&lt;br /&gt;
== Indic Keyboard ==&lt;br /&gt;
https://gitlab.com/smc/indic-keyboard/issues or https://github.com/smc/Indic-Keyboard/issues&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;ː&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - jishnu7 on #smc-project or #silpa on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java / Android&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== libindic - Android ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;ː&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - jishnu7 on #smc-project or #silpa on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java / Android&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== ibus-braille module modifications ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;ː&lt;br /&gt;
&lt;br /&gt;
This project aims to improve the [[GSoC/2014/Project_ideas#Adding_Braille_Keyboard_layouts_for_Indian_Languages_to_m17n_Library | project]] that was successfully completed by a student under SMC. The remaining tasks areː&lt;br /&gt;
#Integrate Ibus-Braille with Liblouis&lt;br /&gt;
#Create a table editor for Liblouis&lt;br /&gt;
#Create a web version and host it.&lt;br /&gt;
#Add more Indian languages to Liblouis&lt;br /&gt;
#Add a facility to write Braille Unicode characters directly&lt;br /&gt;
#Remove the espeak dependency and make it accessible via Orca itself.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Samuel Thibault&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - youpi on #smc-project on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== Varnam based ==&lt;br /&gt;
&lt;br /&gt;
Varnam is a cross-platform predictive transliterator for Indian languages. It works mostly like Google&#039;s transliterate, but differs in the way word tokenization is done. It has a built-in learning system that allows Varnam to make smart predictions. &lt;br /&gt;
&lt;br /&gt;
There are Varnam clients available as [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Firefox] &amp;amp; [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Chrome addon] and an [https://gitorious.org/varnamproject/libvarnam-ibus/source/d939adf50024013902c27310c03ef21a9210cdcb IBus engine].&lt;br /&gt;
&lt;br /&gt;
To try out Varnam, navigate to [http://varnamproject.com/editor varnamproject.com/editor].&lt;br /&gt;
&lt;br /&gt;
=== Add Varnam support into Indic Keyboard ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
As part of this project, students can add support for Varnam into IndicKeyboard. This involves roughly the following steps:&lt;br /&gt;
&lt;br /&gt;
# Compiling libvarnam for Android&lt;br /&gt;
# Writing JNI wrappers for the libvarnam library&lt;br /&gt;
# Hooking up varnamd on Android to do the word corpus synchronization&lt;br /&gt;
# Adding Varnam support to IndicKeyboard&lt;br /&gt;
&lt;br /&gt;
Before submitting proposals:&lt;br /&gt;
&lt;br /&gt;
# Ensure you can program in C, Java and Go (golang)&lt;br /&gt;
# Ensure you have the Varnam libraries compiled on your local machine&lt;br /&gt;
# Ensure you have IndicKeyboard set up on your local machine&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Navaneeth K. N.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C, Java, golang, Android&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;: How to write an input system for Android&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Reference:&lt;br /&gt;
&lt;br /&gt;
# [https://www.varnamproject.com/ Varnam]&lt;br /&gt;
# IndicKeyboard [https://play.google.com/store/apps/details?id=org.smc.inputmethod.indic Playstore] | [https://github.com/androidtweak/Indic-Keyboard Github]&lt;br /&gt;
&lt;br /&gt;
=Projects with unconfirmed mentors=&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The libindic project has a spellchecker written in Python with a not-so-simple algorithm, but it still cannot handle the inflection and agglutination that occur in Indian languages, especially South Indian languages. The dictionary we have for the Malayalam spellchecker has about 150000 words. Of course we can expand the dictionary, but that does not add much value, since words in Malayalam, Tamil and similar languages can be formed by joining multiple words. In addition, words get inflected based on grammatical forms (sandhi), plural, gender etc. Hunspell has a system to handle this, but so far nobody has succeeded in getting it working for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words together. We will need word-splitting logic or a table that takes care of all patterns. The project is to attempt solving this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently, the Tamil community attempted to develop a spellchecker using Hunspell with multi-level suffix stripping. You can see the result here: https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. It will probably require a lot of scripting to generate suffix patterns; we can ask for help from existing language communities too. If Hunspell has limitations with multi-level suffixes (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document them (bug report and documentation) and attempt a Python-based solution on top of the libindic framework.&lt;br /&gt;
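&lt;br /&gt;
As a rough illustration of what multi-level suffix stripping means, here is a minimal Python sketch. The function name, the toy English-like lexicon and the depth limit are assumptions made for this example only; it is not how Hunspell or libindic work, and it ignores the sandhi changes that occur at word joints.&lt;br /&gt;
&lt;br /&gt;
```python
# Hypothetical toy data: a real lexicon and suffix table would be language specific.
ROOTS = {"help", "walk"}
SUFFIXES = ("ness", "less", "ful", "ly", "s")


def strip_suffixes(word, roots, suffixes, max_depth=5):
    """Return [root, suffix1, ...] if the word can be analysed, else None."""
    if word in roots:
        return [word]
    if max_depth == 0:
        return None
    for suffix in suffixes:
        # Strip one suffix, then try to analyse what remains (one level deeper).
        if word.endswith(suffix) and len(word) != len(suffix):
            analysis = strip_suffixes(
                word[: -len(suffix)], roots, suffixes, max_depth - 1
            )
            if analysis is not None:
                return analysis + [suffix]
    return None


print(strip_suffixes("helplessness", ROOTS, SUFFIXES))
```
&lt;br /&gt;
A word is accepted when some chain of suffixes leads back to a known root. A practical solution would also have to undo the spelling changes at each joint, which is where most of the difficulty lies.&lt;br /&gt;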
&lt;br /&gt;
The project is not about coding an existing algorithm, but about developing and implementing a new one.&lt;br /&gt;
&lt;br /&gt;
Hunspell&#039;s limitations can be understood from [[User:%E0%B4%B8%E0%B4%A8%E0%B5%8D%E0%B4%A4%E0%B5%8B%E0%B4%B7%E0%B5%8D/HunspellConversation| this conversation]] we had with the author of Hunspell in 2008.&lt;br /&gt;
&lt;br /&gt;
Homework to do before submitting applications:&lt;br /&gt;
# Use Hunspell in any Indian language like Malayalam for spell correction in editors or word processors and understand the limitations&lt;br /&gt;
# Study the nature of inflection and agglutination in Indian languages, read existing documents on this (ask for documents too) and note down your observations&lt;br /&gt;
# Study Hunspell and other spellcheckers to see how this problem is addressed&lt;br /&gt;
# Understand how a spell checker works. How to write a spellchecker from scratch?&lt;br /&gt;
# Come up with a plan about addressing the issue.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Average-level understanding of the grammar system of at least one Indian language, and completion of the homework listed above.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. To find more information about ConTeXt, see the wiki at http://wiki.contextgarden.net/Main_Page. ConTeXt MKII has Indic language rendering support using XeTeX, but MKII is deprecated, and the new MKIV backend does not support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses HarfBuzz to do correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for the MKIV Lua code is available. ConTeXt MKII (deprecated) can work with the XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using command&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt, and a basic understanding of Indic language rendering. MKIV uses Lua, so familiarity with Lua, the OpenType specifications or HarfBuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools, which can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that particular language. Acoustic models characterize how sound changes over time; they capture the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty assigned when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims to develop an acoustic model and a language model for Malayalam.&lt;br /&gt;
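&lt;br /&gt;
To make the idea of a language model concrete, here is a minimal Python sketch of a bigram model estimated by simple counting. The function names and the tiny English corpus are assumptions for illustration; the CMU Sphinx tools build their own smoothed n-gram models, and this sketch does not reproduce their behavior or file formats.&lt;br /&gt;
&lt;br /&gt;
```python
from collections import Counter


def train_bigram_model(sentences):
    """Count how often each word is followed by each other word."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        context_counts.update(words[:-1])  # only words that have a successor
        bigram_counts.update(zip(words, words[1:]))
    return context_counts, bigram_counts


def bigram_probability(model, previous, word):
    """Unsmoothed estimate of P(word | previous); 0.0 for an unseen context."""
    context_counts, bigram_counts = model
    if context_counts[previous] == 0:
        return 0.0
    return bigram_counts[(previous, word)] / context_counts[previous]


# Hypothetical toy corpus, standing in for a large Malayalam text collection.
model = train_bigram_model(["the cat sat", "the cat ran", "the dog sat"])
print(bigram_probability(model, "the", "cat"))
```
&lt;br /&gt;
An unsmoothed model like this assigns zero probability to any word pair missing from the corpus, which is a serious problem for an agglutinative language, so a practical model would need smoothing and a large text corpus.&lt;br /&gt;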
&lt;br /&gt;
&#039;&#039;&#039;Background Reading&#039;&#039;&#039;&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET, Hyderabad, Prof. Dr. S. Rama Mohan Rao, A. Gopi &lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori, Hussein Hiyassat, Mostafa Harti, and Noureddine Chenfour&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==libindic Project Based==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===libindic Project Improvements===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
This is a set of ideas needed to improve the existing libindic infrastructure. We have decided on the following tasks as part of this project:&lt;br /&gt;
&lt;br /&gt;
# Provide REST API to libindic without disturbing existing JSONRPC API&lt;br /&gt;
# Improve the Transliteration module&lt;br /&gt;
# Integrate [https://github.com/Project-SILPA/flask-webfonts Flask Webfonts] extension with libindic to provide Webfonts support.&lt;br /&gt;
&lt;br /&gt;
==== Provide REST like API for libindic ====&lt;br /&gt;
&lt;br /&gt;
libindic currently provides a JSONRPC API, which is also used by the templates of the framework. JSONRPC is not well supported in all languages, which results in [https://en.wikipedia.org/wiki/Not_invented_here NIH code]. So we would like to provide REST-like HTTP-based APIs for libindic while leaving the current JSONRPC code untouched for backward compatibility.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Objectives&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
* Develop a module or use an existing module to provide REST-like APIs&lt;br /&gt;
* API should support GET and POST. [http://www.w3.org/2001/tag/doc/whenToUseGet.html When to use GET?].&lt;br /&gt;
&lt;br /&gt;
Many people are unsure what the API should look like. The Twitter API (https://dev.twitter.com/docs/api) can serve as an example. &lt;br /&gt;
Sample API calls:&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/ASCII2Unicode&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/Unicode2ASCII&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
Generic: &lt;br /&gt;
    GET/POST (http://api.silpa.org.in/module/function_name or http://silpa.org.in/api/module/function_name)&lt;br /&gt;
    Parameters: function parameters&lt;br /&gt;
    Response: JSON encoded return value from function&lt;br /&gt;
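&lt;br /&gt;
The generic pattern above can be sketched in a framework-agnostic way as follows. This is only an illustration: the registry, the demo module and the handle_request helper are hypothetical names, not part of libindic, and a real implementation would sit behind Flask routes.&lt;br /&gt;
&lt;br /&gt;
```python
import json


def _reverse(text):
    """Stand-in for a real module function such as a converter."""
    return text[::-1]


# Hypothetical registry mapping (module, function_name) to a Python callable.
REGISTRY = {("demo", "reverse"): _reverse}


def handle_request(method, path, params):
    """Dispatch /module/function_name and return (status code, JSON body)."""
    if method not in ("GET", "POST"):
        return 405, json.dumps({"error": "method not allowed"})
    parts = tuple(path.strip("/").split("/"))
    if len(parts) != 2 or parts not in REGISTRY:
        return 404, json.dumps({"error": "unknown module or function"})
    try:
        # Request parameters are passed as keyword arguments to the function.
        result = REGISTRY[parts](**params)
    except TypeError:
        # Raised for missing or unexpected parameters (and, in this sketch,
        # for any TypeError inside the function itself).
        return 400, json.dumps({"error": "bad parameters"})
    return 200, json.dumps({"result": result})


print(handle_request("POST", "/demo/reverse", {"text": "abc"}))
```
&lt;br /&gt;
Keeping the dispatch logic separate from the web framework makes it easy to test and lets the existing JSONRPC entry point stay untouched.&lt;br /&gt;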
&lt;br /&gt;
====  Improve Transliteration module ====&lt;br /&gt;
&lt;br /&gt;
We have a Transliteration module which supports transliteration from any Indic language to any other Indic language, as well as English to Indic and Indic to English transliteration. We also support the IPA and ISO 15919 transliteration systems. But the module is not in perfect shape and has a lot of bugs. With this idea we would like to improve the following parts:&lt;br /&gt;
&lt;br /&gt;
# Improve the cross-Indic language transliteration system. Currently only Malayalam and Kannada work without any external language support; all other Indian languages are first transliterated to Malayalam and then transliterated to the target Indic language. We want to remove this source -&amp;gt; Malayalam -&amp;gt; target cycle.&lt;br /&gt;
# English to IPA transliteration is currently broken and needs to be fixed. See [https://github.com/Project-SILPA/Transliteration/issues/3 IPA transliteration bug].&lt;br /&gt;
# Once the IPA transliteration issue above is fixed, improve the English to Indic transliteration system using IPA. Currently English to Indic transliteration uses the CMU Sphinx dictionary, which has a limited set of words, which in turn limits the output of the English to Indic transliteration system.&lt;br /&gt;
# Improve the ISO 15919 to Indic transliteration system; see [https://github.com/Project-SILPA/Transliteration/issues/4 ISO 15919 to Indic transliteration].&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. For an intermediate representation of the scripts, either IPA or the ISO 15919 standard can be used. All of these must be supplemented with exception rules and special-case handling to achieve more accurate results.&lt;br /&gt;
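&lt;br /&gt;
The intermediate-representation idea can be sketched in Python as below. The table names and the transliterate function are assumptions for this example, and the tables cover only three consonants with their inherent vowel; vowel signs, virama and conjuncts are ignored, so this is not the existing module&#039;s algorithm.&lt;br /&gt;
&lt;br /&gt;
```python
# Toy tables covering three consonants only; real tables would be far larger.
DEVANAGARI_TO_ISO = {"क": "ka", "म": "ma", "ल": "la"}
ISO_TO_MALAYALAM = {"ka": "ക", "ma": "മ", "la": "ല"}


def transliterate(text, source_to_pivot, pivot_to_target):
    """Map each character to a pivot token, then each token to the target."""
    # Characters missing from a table are passed through unchanged.
    pivot_tokens = [source_to_pivot.get(ch, ch) for ch in text]
    return "".join(pivot_to_target.get(token, token) for token in pivot_tokens)


print(transliterate("कमल", DEVANAGARI_TO_ISO, ISO_TO_MALAYALAM))
```
&lt;br /&gt;
With a script-neutral pivot, each language needs only one table to the pivot and one from it, instead of routing every pair through Malayalam.&lt;br /&gt;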
&lt;br /&gt;
==== Integrating flask-webfonts extension with libindic ====&lt;br /&gt;
&lt;br /&gt;
libindic used to have a Webfonts module for serving Indian language fonts as web fonts for browsers. During GSoC 2013 it was separated into an extension to the Flask framework which can be used with any Flask-powered app. The current code can be found at [https://github.com/Project-SILPA/flask-webfonts]. The module is not fine-tuned yet, so the objectives are listed below.&lt;br /&gt;
&lt;br /&gt;
# The module is not yet fine-tuned and using it makes other modules break. This needs to be fixed (this can be checked with the &#039;webfonts&#039; branch of the libindic code on GitHub).&lt;br /&gt;
# Write tests to check the functionalities.&lt;br /&gt;
# Adhere to Flask extension guidelines and submit the modules to Flask extensions directory.&lt;br /&gt;
# Write a tool which can take a directory containing fonts file or single font file and generate configuration file needed by the extension. (A possible such tool which is outdated can be found at [https://github.com/copyninja/fontinfo])&lt;br /&gt;
# Provide HTTP APIs through the Flask extension which expose the CSS for applications.&lt;br /&gt;
&lt;br /&gt;
For all tasks above we expect documentation and test cases from the students as deliverables. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Intermediate&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentors&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org (preferred)&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python, Flask, Jinja, HTML, JavaScript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
# Writing applications using Flask&lt;br /&gt;
# Knowledge of various transliteration systems&lt;br /&gt;
# Webfonts knowledge and writing extensions for Flask&lt;br /&gt;
# Test-driven development.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Converting indic processing modules currently in libindic into javascript modules library===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port some of the libindic algorithms to Node modules. Several modules and algorithms in the libindic project are currently implemented in Python, but porting them to JavaScript helps developers. For example, cross-language transliteration can be done in JavaScript too if we port the algorithm and transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in JavaScript.&lt;br /&gt;
&lt;br /&gt;
The proposed JavaScript module pattern is https://github.com/umdjs/umd&lt;br /&gt;
&lt;br /&gt;
Student proposals should include a list of the algorithms they plan to port, planned demo applications, planned documentation details, and publishing details (example: npm registry).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: JavaScript, Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Integrate Varnam into libindic===&lt;br /&gt;
&lt;br /&gt;
Create a libindic module which hosts [http://www.varnamproject.com Varnam]. This includes making a Python port of libvarnam and a libindic module which uses the Python port. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C, Python&lt;br /&gt;
&lt;br /&gt;
==Grandham ==&lt;br /&gt;
&lt;br /&gt;
=== Adding MARC21 import/export feature in Grandham ===&lt;br /&gt;
&lt;br /&gt;
We need a feature in Grandham to import and parse data from MARC21 documents. We should also be able to export existing data in MARC21.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : High&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; :&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in Ruby/Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://dev.grandham.org&lt;br /&gt;
* [2]: https://github.com/smc/grandham&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2016/Project_ideas&amp;diff=10727</id>
		<title>GSoC/2016/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2016/Project_ideas&amp;diff=10727"/>
		<updated>2016-03-06T04:31:24Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Programming language bindings &amp;amp; varnam-daemon */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt; &lt;br /&gt;
&amp;lt;font color=&amp;quot;green&amp;quot;&amp;gt; &amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas, you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Potential Mentors=&lt;br /&gt;
# Baiju M (&#039;&#039;&#039;baijum&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Santhosh Thottingal (&#039;&#039;&#039;santhosh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Sayamindu Das Gupta (&#039;&#039;&#039;unmadindu&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Rajeesh K Nambiar (&#039;&#039;&#039;rajeesh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Vasudev Kammath (&#039;&#039;&#039;copyninja&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jishnu Mohan (&#039;&#039;&#039;jishnu7&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Navaneeth (&#039;&#039;&#039;nkn__&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nandaja Varma (&#039;&#039;&#039;gem&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Praveen A (&#039;&#039;&#039;j4v4m4n&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Ershad K (&#039;&#039;&#039;ershad&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Samuel Thibault (&#039;&#039;&#039;youpi&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anivar Aravind (&#039;&#039;&#039;anivar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Hrishikesh K.B (&#039;&#039;&#039;stultus&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nishan Naseer (&#039;&#039;&#039;nishan&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2016=&lt;br /&gt;
* Please read the [http://wiki.smc.org.in/GSoC/2016#FAQ FAQ]&lt;br /&gt;
* If you want to propose an idea, please do it on the [http://lists.smc.org.in/listinfo.cgi/student-projects-smc.org.in student projects mailing list] of [http://smc.org.in Swathanthra Malayalam Computing] &lt;br /&gt;
&lt;br /&gt;
=Projects with confirmed mentors=&lt;br /&gt;
== Indic Keyboard ==&lt;br /&gt;
https://gitlab.com/smc/indic-keyboard/issues or https://github.com/smc/Indic-Keyboard/issues&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;ː&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - jishnu7 on #smc-project or #silpa on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java / Android, &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== libindic - Android ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;ː&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - jishnu7 on #smc-project or #silpa on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java / Android, &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== ibus-braille module modifications ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;ː&lt;br /&gt;
&lt;br /&gt;
This project aims to improve the [[GSoC/2014/Project_ideas#Adding_Braille_Keyboard_layouts_for_Indian_Languages_to_m17n_Library | project]] that was successfully completed by a student under SMC. The remaining tasks areː&lt;br /&gt;
#Integrate Ibus-Braille with Liblouis&lt;br /&gt;
#Create a table editor for Liblouis&lt;br /&gt;
#Create a web version and host it.&lt;br /&gt;
#Add more Indian languages to Liblouis&lt;br /&gt;
#Add a facility to write Braille Unicode characters directly&lt;br /&gt;
#Remove the espeak dependency and make it accessible via Orca itself.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Samuel Thibault&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - youpi on #smc-project on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== Varnam based ==&lt;br /&gt;
&lt;br /&gt;
Varnam is a cross-platform predictive transliterator for Indian languages. It works mostly like Google&#039;s transliterate, but differs in the way word tokenization is done. It has a built-in learning system that allows Varnam to make smart predictions. &lt;br /&gt;
&lt;br /&gt;
There are Varnam clients available as [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Firefox] &amp;amp; [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Chrome addon] and an [https://gitorious.org/varnamproject/libvarnam-ibus/source/d939adf50024013902c27310c03ef21a9210cdcb IBus engine].&lt;br /&gt;
&lt;br /&gt;
To try out Varnam, navigate to [http://varnamproject.com/editor http://varnamproject.com/editor].&lt;br /&gt;
&lt;br /&gt;
=== Add Varnam support into Indic Keyboard ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
As part of this project, students can add support for Varnam into IndicKeyboard. This involves roughly the following steps:&lt;br /&gt;
&lt;br /&gt;
# Compiling libvarnam for Android&lt;br /&gt;
# Writing JNI wrappers for the libvarnam library&lt;br /&gt;
# Hooking up varnamd on Android to do the word corpus synchronization&lt;br /&gt;
# Add varnam support to IndicKeyboard&lt;br /&gt;
&lt;br /&gt;
Before submitting proposals:&lt;br /&gt;
&lt;br /&gt;
# Ensure you can program using C, Java and golang&lt;br /&gt;
# Ensure you have varnam libraries compiled on your local machine&lt;br /&gt;
# Ensure you have IndicKeyboard set up on your local machine&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Navaneeth K. N.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C, Java, golang, Android&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;: How to write an input system for Android&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Reference:&lt;br /&gt;
&lt;br /&gt;
# [https://www.varnamproject.com/ Varnam]&lt;br /&gt;
# IndicKeyboard [https://play.google.com/store/apps/details?id=org.smc.inputmethod.indic Playstore] | [https://github.com/androidtweak/Indic-Keyboard Github]&lt;br /&gt;
&lt;br /&gt;
=Projects with unconfirmed mentors=&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The libindic project has a spellchecker written in Python that uses a fairly involved algorithm. Even so, it cannot handle the inflection and agglutination found in Indian languages, especially south Indian languages. The dictionary we have for the Malayalam spellchecker has about 150000 words. We can of course expand the dictionary, but that adds little value, since words in Malayalam, Tamil and similar languages can be formed by joining multiple words. In addition, words are inflected according to grammatical form (sandhi), number, gender and so on. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words. We will need word-splitting logic or a table that covers all the patterns. The project is to attempt to solve this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently the Tamil community attempted to develop a spellchecker using Hunspell with multi-level suffix stripping. You can see the result here https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. This will probably require a lot of scripting to generate suffix patterns, and we can ask existing language communities for help. If Hunspell turns out to be limited in handling multi-level suffixes (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document the limitation (bug report and documentation) and attempt a Python-based solution on top of the libindic framework.&lt;br /&gt;
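&lt;br /&gt;
The multi-level suffix stripping described above can be illustrated with a minimal Python sketch. It is neither the Hunspell algorithm nor existing libindic code: the function name `split_word` and the toy English roots and suffixes are assumptions chosen only for illustration. Real sandhi also changes characters at the join, which this sketch ignores.&lt;br /&gt;
&lt;br /&gt;
```python
def split_word(word, roots, suffixes, max_depth=6):
    """Return [root, suffix1, suffix2, ...] if the word reduces to a known root, else None."""
    if word in roots:
        return [word]
    if max_depth == 0:
        # Give up once the allowed number of stripping levels is used.
        return None
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) != len(suffix):
            # Strip one suffix and try to analyse the remaining stem.
            parts = split_word(word[: -len(suffix)], roots, suffixes, max_depth - 1)
            if parts is not None:
                return parts + [suffix]
    return None


# Toy data, hypothetical and for illustration only.
ROOTS = {"help", "hope"}
SUFFIXES = ["ness", "less", "ful", "ly"]

print(split_word("helplessness", ROOTS, SUFFIXES))
print(split_word("helpxyz", ROOTS, SUFFIXES))
```
&lt;br /&gt;
A word is accepted when the function returns a list and rejected when it returns None. The `max_depth` argument bounds the number of stripping levels, so the same routine can be tried with 5 or more levels.&lt;br /&gt;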
&lt;br /&gt;
The project is not about coding an existing algorithm, but to develop and implement an algorithm.&lt;br /&gt;
&lt;br /&gt;
Hunspell&#039;s limitations can be understood from [[User:%E0%B4%B8%E0%B4%A8%E0%B5%8D%E0%B4%A4%E0%B5%8B%E0%B4%B7%E0%B5%8D/HunspellConversation| this conversation]] we had with the author of Hunspell in 2008&lt;br /&gt;
&lt;br /&gt;
Homework to do before submitting applications:&lt;br /&gt;
# Use Hunspell in any Indian language like Malayalam for spell correction in editors or word processors and understand the limitations&lt;br /&gt;
# Study the nature of inflection and agglutination in Indian languages, read existing documents on this(ask for documents too) and note down your observations&lt;br /&gt;
# Study Hunspell and other spellcheckers to see how this problem is addressed&lt;br /&gt;
# Understand how a spell checker works. How to write a spellchecker from scratch?&lt;br /&gt;
# Come up with a plan about addressing the issue.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Average level understanding of grammar system of at least one Indian language and complete the homework as listed above.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system similar to LaTeX but much more suitable for design. To find more information about ConTeXt, see the wiki http://wiki.contextgarden.net/Main_Page. ConTeXt MKII has Indic language rendering support using XeTeX, but MKII is deprecated, and the new MKIV backend does not support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses Harfbuzz to do correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for MKIV lua code is available. ConTeXt mkii (deprecated) can work with XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using command&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt and basic understanding of Indic language rendering. MKIV uses Lua, so familiarity with Lua, the OpenType specifications or Harfbuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools, which can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that particular language. Acoustic models characterize how sound changes over time and capture the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty taken when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims to develop an acoustic model and a language model for Malayalam.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Background Reading&#039;&#039;&#039;&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET ,Hyderabad, Prof. Dr. S. Rama Mohan Rao, A.Gopi &lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori1, 2, Hussein Hiyassat3, Mostafa Harti1, 2, and Noureddine Chenfour&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==libindic Project Based==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===libindic Project Improvements===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
This is a set of ideas for improving the existing libindic infrastructure. We have decided on the following tasks as part of this project:&lt;br /&gt;
&lt;br /&gt;
# Provide REST API to libindic without disturbing existing JSONRPC API&lt;br /&gt;
# Improve the Transliteration module&lt;br /&gt;
# Integrate [https://github.com/Project-SILPA/flask-webfonts Flask Webfonts] extension with libindic to provide Webfonts support.&lt;br /&gt;
&lt;br /&gt;
==== Provide REST like API for libindic ====&lt;br /&gt;
&lt;br /&gt;
libindic currently provides a JSONRPC API, which is also used by the templates of the framework. JSONRPC is not well supported in all languages and results in [https://en.wikipedia.org/wiki/Not_invented_here NIH code]. So we would like to provide REST-like HTTP-based APIs for libindic while leaving the current JSONRPC code untouched for backward compatibility.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Objectives&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
* Develop a module, or use an existing one, to provide REST-like APIs&lt;br /&gt;
* API should support GET and POST. [http://www.w3.org/2001/tag/doc/whenToUseGet.html When to use GET?].&lt;br /&gt;
&lt;br /&gt;
Many people are unsure what the API should look like. The Twitter API (https://dev.twitter.com/docs/api) can serve as an example. &lt;br /&gt;
Sample API calls:&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/ASCII2Unicode&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/Unicode2ASCII&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
Generic: &lt;br /&gt;
    GET/POST (http://api.silpa.org.in/module/function_name or http://silpa.org.in/api/module/function_name)&lt;br /&gt;
    Parameters: function parameters&lt;br /&gt;
    Response: JSON encoded return value from function&lt;br /&gt;
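&lt;br /&gt;
The generic module/function_name mapping above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the libindic implementation: the `REGISTRY` table, the `expose` decorator, the `handle` function and the `demo/reverse` endpoint are all hypothetical names, and a real service would sit behind a web framework such as Flask.&lt;br /&gt;
&lt;br /&gt;
```python
import json

# Hypothetical registry mapping (module, function_name) to a Python callable.
REGISTRY = {}


def expose(module, name):
    """Register a function under the path /module/name."""
    def decorator(func):
        REGISTRY[(module, name)] = func
        return func
    return decorator


@expose("demo", "reverse")
def reverse(text):
    # Stand-in for a real libindic function.
    return text[::-1]


def handle(method, path, params):
    """Map a request to a registered function and return (status, JSON body)."""
    if method not in ("GET", "POST"):
        return 405, json.dumps({"error": "method not allowed"})
    parts = path.strip("/").split("/")
    func = REGISTRY.get(tuple(parts)) if len(parts) == 2 else None
    if func is None:
        return 404, json.dumps({"error": "unknown endpoint"})
    try:
        result = func(**params)
    except TypeError as exc:
        # Missing or unexpected parameters end up here.
        return 400, json.dumps({"error": str(exc)})
    return 200, json.dumps({"result": result})


print(handle("POST", "/demo/reverse", {"text": "abc"}))
```
&lt;br /&gt;
Because dispatch goes through a separate table, a layer like this could be added next to the JSONRPC code without changing it.&lt;br /&gt;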
&lt;br /&gt;
====  Improve Transliteration module ====&lt;br /&gt;
&lt;br /&gt;
We have a Transliteration module which supports transliteration from any Indic language to any other Indic language, as well as English to Indic and Indic to English transliteration. We also support the IPA and ISO 15919 transliteration systems. But the module is not in perfect shape and has a lot of bugs. With this idea we would like to improve the following parts:&lt;br /&gt;
&lt;br /&gt;
# Improve the cross-Indic-language transliteration system. Currently only Malayalam and Kannada work without any external language support; all other Indian languages are first transliterated to Malayalam and then transliterated to the target Indic language. We want to remove this source -&amp;gt; Malayalam -&amp;gt; target cycle.&lt;br /&gt;
# English to IPA transliteration is currently broken and needs to be fixed. See [https://github.com/Project-SILPA/Transliteration/issues/3 IPA transliteration bug].&lt;br /&gt;
# Once the IPA transliteration issue above is fixed, improve the English to Indic transliteration system using IPA. Currently English to Indic transliteration uses the CMU Sphinx dictionary, which has a limited set of words, which in turn limits the output of English to Indic transliteration.&lt;br /&gt;
# Improve the ISO 15919 to Indic transliteration system; see [https://github.com/Project-SILPA/Transliteration/issues/4 ISO 15919 to Indic transliteration].&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. Either IPA or the ISO 15919 standard can be used as an intermediate representation of the scripts. All of these must be supplemented with exception rules and special-case handling to achieve more accurate results.&lt;br /&gt;
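&lt;br /&gt;
One way to see what a direct source to target mapping without the Malayalam detour could look like is the sketch below. It relies on the fact that the Unicode blocks for the major Indic scripts follow the ISCII layout, so corresponding letters sit at the same offset in each block. This is an assumption-laden illustration, not the current libindic algorithm: the names `BLOCK_START` and `transliterate` are hypothetical, only three scripts are listed, and an offset that is unassigned in the target script would still need the exception rules mentioned above.&lt;br /&gt;
&lt;br /&gt;
```python
# First code point of each Unicode block; each block spans 128 code points.
BLOCK_START = {
    "devanagari": 0x0900,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80


def transliterate(text, source, target):
    """Shift every character of the source block to the same offset in the target block."""
    src = BLOCK_START[source]
    dst = BLOCK_START[target]
    out = []
    for ch in text:
        offset = ord(ch) - src
        if offset in range(BLOCK_SIZE):
            out.append(chr(dst + offset))
        else:
            # Characters outside the source block (spaces, digits, Latin) pass through.
            out.append(ch)
    return "".join(out)


# Devanagari KA followed by the vowel sign AA, written with escapes.
print(transliterate("\u0915\u093e", "devanagari", "malayalam"))
```
&lt;br /&gt;
The mapping is symmetric, so any pair of listed scripts can be converted in one step, which is the property the first task above asks for.&lt;br /&gt;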
&lt;br /&gt;
==== Integrating flask-webfonts extension with libindic ====&lt;br /&gt;
&lt;br /&gt;
libindic used to have a Webfonts module for serving Indian language fonts as webfonts to browsers. During GSoC 2013 it was separated out as an extension to the Flask framework, which can be used with any Flask-powered app. The current code can be found at [https://github.com/Project-SILPA/flask-webfonts]. The module is not fine-tuned yet, so the objectives are as follows.&lt;br /&gt;
&lt;br /&gt;
# The module is not yet fine-tuned, and using it breaks other modules. This needs to be fixed (it can be checked with the &#039;webfonts&#039; branch of the libindic code on GitHub).&lt;br /&gt;
# Write tests to check the functionality.&lt;br /&gt;
# Adhere to the Flask extension guidelines and submit the module to the Flask extensions directory.&lt;br /&gt;
# Write a tool which takes a directory containing font files, or a single font file, and generates the configuration file needed by the extension. (One such tool, now outdated, can be found at [https://github.com/copyninja/fontinfo].)&lt;br /&gt;
# Provide HTTP APIs through the Flask extension that expose the CSS to applications.&lt;br /&gt;
&lt;br /&gt;
For all the tasks above, we expect documentation and test cases from the students as deliverables. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Intermediate&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentors&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org &amp;lt;preferred&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python , Flask , Jinja , HTML, Javascript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
# Writing applications using Flask&lt;br /&gt;
# Knowledge of various transliteration systems&lt;br /&gt;
# Knowledge of webfonts and writing extensions for Flask&lt;br /&gt;
# Test-driven development&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Converting indic processing modules currently in libindic into javascript modules library===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port some of the libindic algorithms to Node modules. Several modules and algorithms in the libindic project are currently written in Python. Porting them to JavaScript helps developers. For example, cross-language transliteration can be done in JavaScript too if we port the algorithm and transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in JavaScript.&lt;br /&gt;
&lt;br /&gt;
The proposed JavaScript module pattern is https://github.com/umdjs/umd&lt;br /&gt;
&lt;br /&gt;
Student proposals should list the algorithms to be ported, the planned demo applications, the planned documentation, and publishing details (for example, the npm registry).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: javascript, python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Integrate Varnam into libindic===&lt;br /&gt;
&lt;br /&gt;
Create a libindic module which hosts [http://www.varnamproject.com varnam]. This includes making a python port for libvarnam and making a libindic module which uses the python port. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C, Python&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
&lt;br /&gt;
Varnam is a cross-platform predictive transliterator for Indian languages. It works mostly like Google&#039;s transliterate, but shows key differences in the way word tokenization is done. It has a learning system built in which allows Varnam to make smart predictions. &lt;br /&gt;
&lt;br /&gt;
There are Varnam clients available as [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Firefox] &amp;amp; [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Chrome addon] and an [https://gitorious.org/varnamproject/libvarnam-ibus/source/d939adf50024013902c27310c03ef21a9210cdcb IBus engine].&lt;br /&gt;
&lt;br /&gt;
To try out Varnam, navigate to [http://varnamproject.com/editor http://varnamproject.com/editor]. Currently it supports Hindi and Malayalam.&lt;br /&gt;
&lt;br /&gt;
* [http://www.varnamproject.com/docs/faq FAQ]&lt;br /&gt;
* [http://www.varnamproject.com/docs Documentation]&lt;br /&gt;
* [http://www.varnamproject.com/docs/contributing Contributors guide &amp;amp; ideas to work on]&lt;br /&gt;
&lt;br /&gt;
Apart from the following ideas, you can propose your own idea. &lt;br /&gt;
&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
This includes rewriting the current implementation in `golang` and adding WebSocket support to improve the input experience. It also includes writing scripts that make it easy to embed Varnam input on any webpage. All changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Create an Android IME===&lt;br /&gt;
&lt;br /&gt;
Varnam will be ported as a libindic module and it will be available on Android as part of the android SDK project which libindic has proposed. This idea is merged to the [http://wiki.smc.org.in/SoC/2014/Project_ideas#Android_SDK_for_Silpa libindic] project ideas.&lt;br /&gt;
&lt;br /&gt;
===Enable varnam&#039;s suggestions system to be used from Inscript or any other input system===&lt;br /&gt;
&lt;br /&gt;
Varnam knows a lot of words. This idea proposes a way to use these words to provide suggestions for other input systems. In Varnam, the API call will be something like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&amp;lt;pre&amp;gt;&lt;br /&gt;
varnam_get_suggestions (handle, &amp;quot;भारत&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will fetch all the suggestions that have the given prefix. &lt;br /&gt;
&lt;br /&gt;
`varnam_get_suggestions` needs to keep track of the previous words and use an [http://en.wikipedia.org/wiki/N-gram n-gram] based dataset to filter the results. It should also learn the words back into the word corpus that Varnam uses. Filtering suggestions will not be just a prefix search; it will use knowledge of how text can be written in the target language to provide smart filtering. Searching a large corpus while providing real-time suggestions makes this a challenging task. &lt;br /&gt;
&lt;br /&gt;
Once this is implemented in `libvarnam`, it can be used in the ibus-engine.&lt;br /&gt;
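&lt;br /&gt;
The combination of prefix search and n-gram context described above can be prototyped in a few lines of Python before committing to a C design. The sketch below is an assumption, not libvarnam code: the `Suggester` class and its methods are hypothetical, it uses bigrams only, and it scans the whole vocabulary linearly, which a real-time implementation over a large corpus would replace with an index such as a trie.&lt;br /&gt;
&lt;br /&gt;
```python
from collections import Counter, defaultdict


class Suggester:
    """Toy prefix suggester that ranks candidates by bigram context."""

    def __init__(self):
        self.unigrams = Counter()
        self.bigrams = defaultdict(Counter)

    def learn(self, sentence):
        """Add the words of a sentence to the corpus and count adjacent pairs."""
        words = sentence.split()
        self.unigrams.update(words)
        for prev, cur in zip(words, words[1:]):
            self.bigrams[prev][cur] += 1

    def suggest(self, prefix, previous=None, limit=5):
        """Return words starting with prefix, best match for the previous word first."""
        candidates = [w for w in self.unigrams if w.startswith(prefix)]
        following = self.bigrams.get(previous, {})
        # Sort by bigram count, then overall frequency, then alphabetically.
        candidates.sort(key=lambda w: (-following.get(w, 0), -self.unigrams[w], w))
        return candidates[:limit]


s = Suggester()
s.learn("the cat sat")
s.learn("a cat ran")
s.learn("red car stopped")
print(s.suggest("ca"))
print(s.suggest("ca", previous="red"))
```
&lt;br /&gt;
Calling `learn` on accepted input is the step that feeds words back into the corpus, so suggestions adapt as the user types.&lt;br /&gt;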
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  C, Unicode &amp;amp; encodings&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool which can upload and download the word corpus from offline IMEs like varnam-ibus[2]. This helps to build the online word corpus easily.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  Knowledge in C/golang&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://www.varnamproject.com&lt;br /&gt;
* [2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Grandham ==&lt;br /&gt;
&lt;br /&gt;
=== Adding MARC21 import/export feature in Grandham ===&lt;br /&gt;
&lt;br /&gt;
We need a feature in Grandham to import and parse data from MARC21 documents. We should also be able to export existing data in MARC21.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : High&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; :&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in Ruby/Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://dev.grandham.org&lt;br /&gt;
* [2]: https://github.com/smc/grandham&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2016/Project_ideas&amp;diff=10726</id>
		<title>GSoC/2016/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2016/Project_ideas&amp;diff=10726"/>
		<updated>2016-03-06T04:30:26Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Varnam based */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt; &lt;br /&gt;
&amp;lt;font color=&amp;quot;green&amp;quot;&amp;gt; &amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas , you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Potential Mentors=&lt;br /&gt;
# Baiju M (&#039;&#039;&#039;baijum&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Santhosh Thottingal (&#039;&#039;&#039;santhosh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Sayamindu Das Gupta (&#039;&#039;&#039;unmadindu&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Rajeesh K Nambiar (&#039;&#039;&#039;rajeesh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Vasudev Kammath (&#039;&#039;&#039;copyninja&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jishnu Mohan (&#039;&#039;&#039;jishnu7&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Navaneeth (&#039;&#039;&#039;nkn__&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nandaja Varma (&#039;&#039;&#039;gem&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Praveen A (&#039;&#039;&#039;j4v4m4n&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Ershad K (&#039;&#039;&#039;ershad&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Samuel Thibault (&#039;&#039;&#039;youpi&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anivar Aravind (&#039;&#039;&#039;anivar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Hrishikesh K.B (&#039;&#039;&#039;stultus&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nishan Naseer (&#039;&#039;&#039;nishan&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2016=&lt;br /&gt;
* Please Read the [http://wiki.smc.org.in/GSoC/2016#FAQ FAQ]&lt;br /&gt;
* If you want to propose an idea, please do it in the [http://lists.smc.org.in/listinfo.cgi/student-projects-smc.org.in student projects mailing list] of [http://smc.org.in Swathanthra Malayalam Computing] &lt;br /&gt;
&lt;br /&gt;
=Projects with confirmed mentors=&lt;br /&gt;
== Indic Keyboard ==&lt;br /&gt;
https://gitlab.com/smc/indic-keyboard/issues or https://github.com/smc/Indic-Keyboard/issues&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - jishnu7 on #smc-project or #silpa on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java / Android, &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== libindic - Android ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - jishnu7 on #smc-project or #silpa on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java / Android, &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== ibus-braille module modifications ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
This project builds on the [[GSoC/2014/Project_ideas#Adding_Braille_Keyboard_layouts_for_Indian_Languages_to_m17n_Library | project]] that a student successfully completed under SMC. The remaining tasks are:&lt;br /&gt;
#Integrate Ibus-Braille with Liblouis&lt;br /&gt;
#Create a table editor for Liblouis&lt;br /&gt;
#Create a web version and host it&lt;br /&gt;
#Add more Indian languages to Liblouis&lt;br /&gt;
#Add a facility to write Braille Unicode characters directly&lt;br /&gt;
#Remove the espeak dependency and make it accessible via Orca itself&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Samuel Thibault&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - youpi on #smc-project on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== Varnam based ==&lt;br /&gt;
&lt;br /&gt;
Varnam is a cross-platform predictive transliterator for Indian languages. It works mostly like Google&#039;s transliterate, but shows key differences in the way word tokenization is done. It has a learning system built in which allows Varnam to make smart predictions. &lt;br /&gt;
&lt;br /&gt;
There are Varnam clients available as [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Firefox] &amp;amp; [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Chrome addon] and an [https://gitorious.org/varnamproject/libvarnam-ibus/source/d939adf50024013902c27310c03ef21a9210cdcb IBus engine].&lt;br /&gt;
&lt;br /&gt;
To try out Varnam, navigate to [http://varnamproject.com/editor http://varnamproject.com/editor].&lt;br /&gt;
&lt;br /&gt;
=== Add Varnam support into Indic Keyboard ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
As part of this project, students can add support for Varnam into IndicKeyboard. This involves roughly the following steps:&lt;br /&gt;
&lt;br /&gt;
# Compiling libvarnam for Android&lt;br /&gt;
# Writing JNI wrappers for the libvarnam library&lt;br /&gt;
# Hooking up varnamd on Android to do the word corpus synchronization&lt;br /&gt;
# Add varnam support to IndicKeyboard&lt;br /&gt;
&lt;br /&gt;
Before submitting proposals:&lt;br /&gt;
&lt;br /&gt;
# Ensure you can program using C, Java and golang&lt;br /&gt;
# Ensure you have varnam libraries compiled on your local machine&lt;br /&gt;
# Ensure you have IndicKeyboard set up on your local machine&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Navaneeth K. N.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C, Java, golang, Android&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;: How to write an input system for Android&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Reference:&lt;br /&gt;
&lt;br /&gt;
# [https://www.varnamproject.com/ Varnam]&lt;br /&gt;
# IndicKeyboard [https://play.google.com/store/apps/details?id=org.smc.inputmethod.indic Playstore] | [https://github.com/androidtweak/Indic-Keyboard Github]&lt;br /&gt;
&lt;br /&gt;
=Projects with unconfirmed mentors=&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The libindic project has a spellchecker written in Python that uses a fairly involved algorithm. Even so, it cannot handle the inflection and agglutination that occur in Indian languages, especially South Indian languages. The dictionary we have for the Malayalam spellchecker has about 150000 words. We can of course expand the dictionary, but that adds little value, since words in Malayalam, Tamil and similar languages can be formed by joining multiple words. In addition, words are inflected according to grammatical form (sandhi), number, gender and so on. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words together. We will need word-splitting logic or a table that covers all the patterns. The project is to attempt to solve this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently the Tamil community attempted to develop a spellchecker using Hunspell with multi-level suffix stripping. You can see the result at https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. This will probably require a lot of scripting to generate suffix patterns; we can ask existing language communities for help too. If Hunspell turns out to be limited with multi-level suffixes (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document the limitation (bug report and documentation) and attempt a Python-based solution on top of the libindic framework.&lt;br /&gt;
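&lt;br /&gt;
The multi-level suffix stripping mentioned above can be sketched roughly as follows. This is a minimal illustration in Python, not the Hunspell algorithm and not the libindic implementation. The function name `strip_suffixes`, the toy stem set and the toy suffix list are assumptions made for the example, written in Latin script, and real sandhi rules would also need to rewrite characters at the joins.&lt;br /&gt;
&lt;br /&gt;
```python
def strip_suffixes(word, stems, suffixes, max_depth=6):
    """Return [stem, suffix1, ...] if word splits into a known stem
    followed by known suffixes, or None when no split is found.

    stems and suffixes are hypothetical toy collections, not real language data.
    """
    if word in stems:
        return [word]
    if max_depth == 0:
        return None
    for suffix in suffixes:
        # Only consider suffixes that leave a non-empty remainder.
        if word.endswith(suffix) and len(word) != len(suffix):
            rest = strip_suffixes(word[:-len(suffix)], stems, suffixes, max_depth - 1)
            if rest is not None:
                return rest + [suffix]
    return None


# Toy data, purely for illustration.
STEMS = {"mara"}
SUFFIXES = ["ngal", "il", "um"]

print(strip_suffixes("marangalilum", STEMS, SUFFIXES))
```
&lt;br /&gt;
The recursion depth plays the role of the number of suffix levels, so a limit of two levels can be simulated by passing `max_depth=2` and observing which words stop being recognized.&lt;br /&gt;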
&lt;br /&gt;
The project is not about coding an existing algorithm, but about developing an algorithm and implementing it.&lt;br /&gt;
&lt;br /&gt;
Hunspell&#039;s limitations can be understood from [[User:%E0%B4%B8%E0%B4%A8%E0%B5%8D%E0%B4%A4%E0%B5%8B%E0%B4%B7%E0%B5%8D/HunspellConversation| this conversation]] we had with the author of Hunspell in 2008.&lt;br /&gt;
&lt;br /&gt;
Homework to do before submitting applications:&lt;br /&gt;
# Use Hunspell in any Indian language like Malayalam for spell correction in editors or word processors and understand the limitations&lt;br /&gt;
# Study the nature of inflection and agglutination in Indian languages, read existing documents on this (ask for documents too) and note down your observations&lt;br /&gt;
# Study Hunspell and other spellcheckers to see how this problem is addressed&lt;br /&gt;
# Understand how a spell checker works and how to write one from scratch&lt;br /&gt;
# Come up with a plan for addressing the issue&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: An average-level understanding of the grammar of at least one Indian language, and completion of the homework listed above.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. For more information about ConTeXt, see the wiki at http://wiki.contextgarden.net/Main_Page. ConTeXt MKII has Indic language rendering support through XeTeX, but MKII is deprecated and the new MKIV backend does not support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses HarfBuzz to do correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for the MKIV Lua code is available. ConTeXt MKII (deprecated) can work with the XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using the command:&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt, and a basic understanding of Indic language rendering. MKIV uses Lua; familiarity with Lua, the OpenType specification or HarfBuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools, which can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that particular language. Acoustic models characterize how sound changes over time and capture the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty assigned when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims to develop an acoustic model and a language model for Malayalam.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Background Reading&#039;&#039;&#039;&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET, Hyderabad, Prof. Dr. S. Rama Mohan Rao, A. Gopi &lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori, Hussein Hiyassat, Mostafa Harti, and Noureddine Chenfour&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==libindic Project Based==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===libindic Project Improvements===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
This is a set of ideas for improving the existing libindic infrastructure. We have decided on the following tasks as part of this project:&lt;br /&gt;
&lt;br /&gt;
# Provide a REST API for libindic without disturbing the existing JSONRPC API&lt;br /&gt;
# Improve the Transliteration module&lt;br /&gt;
# Integrate [https://github.com/Project-SILPA/flask-webfonts Flask Webfonts] extension with libindic to provide Webfonts support.&lt;br /&gt;
&lt;br /&gt;
==== Provide REST like API for libindic ====&lt;br /&gt;
&lt;br /&gt;
libindic currently provides a JSONRPC API, which is also used by the templates of the framework. JSONRPC is not well supported in all languages, which results in [https://en.wikipedia.org/wiki/Not_invented_here NIH code]. So we would like to provide REST-like HTTP-based APIs for libindic while leaving the current JSONRPC code untouched for backward compatibility.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Objectives&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
* Develop a module or use an existing module to provide REST-like APIs&lt;br /&gt;
* API should support GET and POST. [http://www.w3.org/2001/tag/doc/whenToUseGet.html When to use GET?].&lt;br /&gt;
&lt;br /&gt;
Many people are unsure what the API should look like. The Twitter API (https://dev.twitter.com/docs/api) can serve as an example. &lt;br /&gt;
Sample API calls:&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/ASCII2Unicode&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/Unicode2ASCII&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
Generic: &lt;br /&gt;
    GET/POST (http://api.silpa.org.in/module/function_name or http://silpa.org.in/api/module/function_name)&lt;br /&gt;
    Parameters: function parameters&lt;br /&gt;
    Response: JSON encoded return value from function&lt;br /&gt;
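&lt;br /&gt;
The generic `module/function_name` pattern above could map onto a dispatcher such as the one sketched below. It uses only the Python standard library and leaves out the HTTP layer entirely. The registry `HANDLERS`, the helpers `register` and `handle_request`, the stand-in `ascii2unicode` function and the font name in the usage line are hypothetical names for illustration and are not part of libindic.&lt;br /&gt;
&lt;br /&gt;
```python
import json

# Hypothetical registry mapping "module/function_name" to a callable.
HANDLERS = {}


def register(path):
    """Decorator that exposes a function under the given API path."""
    def wrap(func):
        HANDLERS[path] = func
        return func
    return wrap


@register("payyans/ASCII2Unicode")
def ascii2unicode(text, font):
    # Stand-in for the real conversion; it only echoes its input.
    return {"text": text, "font": font}


def handle_request(path, params):
    """Call the function registered for path and return (status, JSON body)."""
    func = HANDLERS.get(path.strip("/"))
    if func is None:
        return 404, json.dumps({"error": "unknown function"})
    try:
        result = func(**params)
    except TypeError:
        # Typically raised when parameters are missing or unexpected.
        return 400, json.dumps({"error": "bad parameters"})
    return 200, json.dumps({"result": result})


status, body = handle_request("/payyans/ASCII2Unicode", {"text": "abc", "font": "karthika"})
print(status, body)
```
&lt;br /&gt;
Keeping the dispatch table separate from the transport would let the same functions stay reachable through the existing JSONRPC code while a REST-like front end is added beside it.&lt;br /&gt;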
&lt;br /&gt;
====  Improve Transliteration module ====&lt;br /&gt;
&lt;br /&gt;
We have a Transliteration module that supports transliteration from any Indic language to any other Indic language, as well as English to Indic and Indic to English transliteration. It also supports the IPA and ISO 15919 transliteration systems. But the module is not in perfect shape and has a lot of bugs. With this idea we would like to improve the following parts:&lt;br /&gt;
&lt;br /&gt;
# Improve the cross-Indic-language transliteration system. Currently only Malayalam and Kannada work without any external language support; all other Indian languages are first transliterated to Malayalam and then to the target Indic language. We want to remove this source -&amp;gt; Malayalam -&amp;gt; target cycle.&lt;br /&gt;
# English to IPA transliteration is currently broken and needs to be fixed. See [https://github.com/Project-SILPA/Transliteration/issues/3 IPA transliteration bug].&lt;br /&gt;
# Once the IPA transliteration issue above is fixed, improve the English to Indic transliteration system using IPA. Currently English to Indic transliteration uses the CMU Sphinx dictionary, which has a limited set of words, which in turn limits the output of English to Indic transliteration.&lt;br /&gt;
# Improve the ISO 15919 to Indic transliteration system; see [https://github.com/Project-SILPA/Transliteration/issues/4 ISO 15919 to Indic transliteration].&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. For an intermediate representation of the scripts, either IPA or the ISO 15919 standard can be used. All of these must be supplemented with exception rules and special-case handling to achieve better results.&lt;br /&gt;
&lt;br /&gt;
==== Integrating flask-webfonts extension with libindic ====&lt;br /&gt;
&lt;br /&gt;
libindic used to have a Webfonts module for serving Indian language fonts as webfonts for browsers. During GSoC 2013 it was separated out as an extension to the Flask framework, which can be used with any Flask-powered app. The current code can be found at [https://github.com/Project-SILPA/flask-webfonts]. The module is not fine-tuned yet, so the objectives are as follows.&lt;br /&gt;
&lt;br /&gt;
# The module is not yet fine-tuned and using it makes other modules break. This needs to be fixed (it can be checked with the &#039;webfonts&#039; branch of the libindic code on GitHub).&lt;br /&gt;
# Write tests to check the functionalities.&lt;br /&gt;
# Adhere to Flask extension guidelines and submit the modules to Flask extensions directory.&lt;br /&gt;
# Write a tool that takes a directory containing font files, or a single font file, and generates the configuration file needed by the extension. (One such tool, now outdated, can be found at [https://github.com/copyninja/fontinfo].)&lt;br /&gt;
# Provide HTTP APIs through the Flask extension that expose the CSS for applications.&lt;br /&gt;
&lt;br /&gt;
For all the tasks above, we expect documentation and test cases from the students as deliverables. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Intermediate&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentors&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org (preferred)&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python, Flask, Jinja, HTML, JavaScript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
# Writing applications using Flask&lt;br /&gt;
# Knowledge of various transliteration systems&lt;br /&gt;
# Knowledge of webfonts and of writing extensions for Flask&lt;br /&gt;
# Test-driven development&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Converting indic processing modules currently in libindic into javascript modules library===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port some of the libindic algorithms to Node modules. Several modules and algorithms in the libindic project are currently implemented in Python. Porting them to JavaScript helps developers. For example, cross-language transliteration can be done in JavaScript too if we port the algorithm and the transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in JavaScript.&lt;br /&gt;
&lt;br /&gt;
The proposed JavaScript module pattern is UMD: https://github.com/umdjs/umd&lt;br /&gt;
&lt;br /&gt;
Student proposals should include a list of the algorithms to be ported, planned demo applications, planned documentation details, and publishing details (for example, the npm registry).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: JavaScript, Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Integrate Varnam into libindic===&lt;br /&gt;
&lt;br /&gt;
Create a libindic module which hosts [http://www.varnamproject.com varnam]. This includes making a python port for libvarnam and making a libindic module which uses the python port. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C, Python&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
&lt;br /&gt;
Varnam is a cross-platform predictive transliterator for Indian languages. It works mostly like Google&#039;s transliterate, but shows key differences in the way word tokenization is done. It has a learning system built in which allows Varnam to make smart predictions. &lt;br /&gt;
&lt;br /&gt;
Varnam clients are available as [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Firefox] and [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Chrome] add-ons and as an [https://gitorious.org/varnamproject/libvarnam-ibus/source/d939adf50024013902c27310c03ef21a9210cdcb IBus engine].&lt;br /&gt;
&lt;br /&gt;
To try out Varnam, navigate to [http://varnamproject.com/editor varnamproject.com/editor]. Currently it supports Hindi and Malayalam.&lt;br /&gt;
&lt;br /&gt;
* [http://www.varnamproject.com/docs/faq FAQ]&lt;br /&gt;
* [http://www.varnamproject.com/docs Documentation]&lt;br /&gt;
* [http://www.varnamproject.com/docs/contributing Contributors guide &amp;amp; ideas to work on]&lt;br /&gt;
&lt;br /&gt;
Apart from the following ideas, you can propose your own idea. &lt;br /&gt;
&lt;br /&gt;
===Programming language bindings &amp;amp; varnam-daemon===&lt;br /&gt;
&lt;br /&gt;
Varnam is written in C, which makes interoperability with other languages easy. Language bindings are available for `NodeJs` and `Ruby`. Supporting Varnam in multiple languages allows projects to use Varnam easily to enable Indian language input.&lt;br /&gt;
&lt;br /&gt;
To make using Varnam from different languages easier, build a cross-platform standalone process that uses the `libvarnam` shared library and exposes an RPC API over the network. This allows any programming language with socket support to be used with libvarnam. It also makes language bindings fairly easy because they do not have to work with native interoperability support. The protocol can be a simple text-based protocol covering all the commands that `libvarnam` supports. &lt;br /&gt;
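&lt;br /&gt;
A simple text-based protocol of the kind described above might look like the following sketch. The wire format (one request per line, a command name followed by a space and its argument) and the command name `transliterate` are assumptions for illustration, and the stand-in function `fake_transliterate` does not call `libvarnam`.&lt;br /&gt;
&lt;br /&gt;
```python
def fake_transliterate(text):
    # Placeholder for a call into libvarnam; it only upper-cases the input.
    return [text.upper()]


# Hypothetical command table for the daemon.
COMMANDS = {"transliterate": fake_transliterate}


def handle_line(line):
    """Parse one request line and return one response line.

    Requests look like 'command argument'. Responses start with
    'OK' or 'ERR' so that clients in any language can parse them.
    """
    name, _, argument = line.strip().partition(" ")
    func = COMMANDS.get(name)
    if func is None:
        return "ERR unknown command: " + name
    if not argument:
        return "ERR missing argument"
    # Multiple results are separated by tab characters.
    return "OK " + "\t".join(func(argument))


print(handle_line("transliterate bharat"))
print(handle_line("learn"))
```
&lt;br /&gt;
Because each request and response is a single line of text, a client only needs a socket and string splitting, which is what keeps bindings in other languages small.&lt;br /&gt;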
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
This includes rewriting the current implementation in `golang` and adding support for WebSockets to improve the input experience. It also includes writing scripts that ease embedding input on any webpage. All the changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Create an Android IME===&lt;br /&gt;
&lt;br /&gt;
Varnam will be ported as a libindic module and it will be available on Android as part of the android SDK project which libindic has proposed. This idea is merged to the [http://wiki.smc.org.in/SoC/2014/Project_ideas#Android_SDK_for_Silpa libindic] project ideas.&lt;br /&gt;
&lt;br /&gt;
===Enable varnam&#039;s suggestions system to be used from Inscript or any other input system===&lt;br /&gt;
&lt;br /&gt;
Varnam has knowledge about a lot of words. This idea proposes a method to use these words to provide suggestions for other input systems. Basically, in Varnam, the API call will be something like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&amp;lt;pre&amp;gt;&lt;br /&gt;
varnam_get_suggestions (handle, &amp;quot;भारत&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will fetch all the suggestions that have the given prefix. &lt;br /&gt;
&lt;br /&gt;
`varnam_get_suggestions` needs to keep track of the previous words and use an [http://en.wikipedia.org/wiki/N-gram n-gram] based dataset to filter the results. It should also learn the words back into the word corpus that Varnam is using. Filtering suggestions will not be just a prefix search; it will use knowledge about how text can be written in the target language to provide smart filtering. Searching a large corpus and providing real-time suggestions makes this a challenging task. &lt;br /&gt;
&lt;br /&gt;
Once this is implemented in `libvarnam`, it can be used in the ibus-engine.&lt;br /&gt;
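&lt;br /&gt;
The combination of prefix search and n-gram filtering described above can be illustrated with a small bigram model. This sketch is written in Python for brevity and is not the `libvarnam` implementation. The class `BigramSuggester` and its methods are hypothetical, the sample sentences are toy data, and a real implementation would need an indexed on-disk corpus to stay responsive.&lt;br /&gt;
&lt;br /&gt;
```python
from collections import Counter, defaultdict


class BigramSuggester:
    """Toy suggester: prefix match first, then rank by bigram counts."""

    def __init__(self):
        self.unigrams = Counter()
        self.bigrams = defaultdict(Counter)

    def learn(self, sentence):
        """Add the words of a sentence to the unigram and bigram counts."""
        words = sentence.split()
        self.unigrams.update(words)
        for previous, current in zip(words, words[1:]):
            self.bigrams[previous][current] += 1

    def suggest(self, prefix, previous=None, limit=5):
        """Return words starting with prefix, best candidates first."""
        candidates = [w for w in self.unigrams if w.startswith(prefix)]
        following = self.bigrams.get(previous, Counter())
        # Sort by how often the word followed the previous word, then by
        # overall frequency, then alphabetically for a stable order.
        candidates.sort(key=lambda w: (-following[w], -self.unigrams[w], w))
        return candidates[:limit]


suggester = BigramSuggester()
suggester.learn("भारत एक देश है")
suggester.learn("भारत एक गणराज्य है")
suggester.learn("एक देश")
print(suggester.suggest("दे", previous="एक"))
```
&lt;br /&gt;
Calling `learn` on every committed sentence is one way to feed accepted words back into the corpus, as the idea above requires.&lt;br /&gt;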
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  C, Unicode &amp;amp; encodings&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool that can upload/download the word corpus from offline IMEs like varnam-ibus [2]. This helps to build the online word corpus easily.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge of C/golang&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://www.varnamproject.com&lt;br /&gt;
* [2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Grandham ==&lt;br /&gt;
&lt;br /&gt;
=== Adding MARC21 import/export feature in Grandham ===&lt;br /&gt;
&lt;br /&gt;
We need a feature in Grandham to import and parse data from MARC21 documents. We should also be able to export existing data in MARC21.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : High&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; :&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge of Ruby/Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://dev.grandham.org&lt;br /&gt;
* [2]: https://github.com/smc/grandham&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2016/Project_ideas&amp;diff=10725</id>
		<title>GSoC/2016/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2016/Project_ideas&amp;diff=10725"/>
		<updated>2016-03-06T04:28:05Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Varnam based */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt; &lt;br /&gt;
&amp;lt;font color=&amp;quot;green&amp;quot;&amp;gt; &amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas, you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Potential Mentors=&lt;br /&gt;
# Baiju M (&#039;&#039;&#039;baijum&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Santhosh Thottingal (&#039;&#039;&#039;santhosh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Sayamindu Das Gupta (&#039;&#039;&#039;unmadindu&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Rajeesh K Nambiar (&#039;&#039;&#039;rajeesh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Vasudev Kammath (&#039;&#039;&#039;copyninja&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jishnu Mohan (&#039;&#039;&#039;jishnu7&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Navaneeth (&#039;&#039;&#039;nkn__&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nandaja Varma (&#039;&#039;&#039;gem&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Praveen A (&#039;&#039;&#039;j4v4m4n&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Ershad K (&#039;&#039;&#039;ershad&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Samuel Thibault (&#039;&#039;&#039;youpi&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anivar Aravind (&#039;&#039;&#039;anivar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Hrishikesh K.B (&#039;&#039;&#039;stultus&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nishan Naseer (&#039;&#039;&#039;nishan&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2016=&lt;br /&gt;
* Please read the [http://wiki.smc.org.in/GSoC/2016#FAQ FAQ]&lt;br /&gt;
* If you want to propose an idea, please do it on the [http://lists.smc.org.in/listinfo.cgi/student-projects-smc.org.in student projects mailing list] of [http://smc.org.in Swathanthra Malayalam Computing] &lt;br /&gt;
&lt;br /&gt;
=Projects with confirmed mentors=&lt;br /&gt;
== Indic Keyboard ==&lt;br /&gt;
https://gitlab.com/smc/indic-keyboard/issues or https://github.com/smc/Indic-Keyboard/issues&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - jishnu7 on #smc-project or #silpa on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java / Android&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== libindic - Android ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - jishnu7 on #smc-project or #silpa on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java / Android&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== ibus-braille module modifications ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
This project is to improve on the [[GSoC/2014/Project_ideas#Adding_Braille_Keyboard_layouts_for_Indian_Languages_to_m17n_Library | project]] that was successfully completed by a student under SMC. The remaining tasks are:&lt;br /&gt;
#Integrate Ibus-Braille with Liblouis&lt;br /&gt;
#Create a table editor for Liblouis&lt;br /&gt;
#Create a web version and host it&lt;br /&gt;
#Add more Indian languages to Liblouis&lt;br /&gt;
#Add a facility to write Braille Unicode characters directly&lt;br /&gt;
#Remove the espeak dependency and make it accessible via Orca itself&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Samuel Thibault&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - youpi on #smc-project on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== Varnam based ==&lt;br /&gt;
&lt;br /&gt;
=== Add Varnam support into Indic Keyboard ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
“Varnam” is an open source, cross-platform, predictive transliterator for Indian languages. At its core, Varnam has a shared library called `libvarnam`, which does transliteration, learning, etc. Many clients are built using `libvarnam`, such as ibus-varnam and varnam-web. Today, Varnam does not have a presence on Android.&lt;br /&gt;
&lt;br /&gt;
As part of this project, students can add support for Varnam into IndicKeyboard. This involves roughly the following steps:&lt;br /&gt;
&lt;br /&gt;
# Compiling libvarnam for Android&lt;br /&gt;
# Writing JNI wrappers for the libvarnam library&lt;br /&gt;
# Hooking up varnamd on Android to do the word corpus synchronization&lt;br /&gt;
# Adding Varnam support to IndicKeyboard&lt;br /&gt;
&lt;br /&gt;
Before submitting proposals:&lt;br /&gt;
&lt;br /&gt;
# Ensure you can program using C, Java and golang&lt;br /&gt;
# Ensure you have varnam libraries compiled on your local machine&lt;br /&gt;
# Ensure you have IndicKeyboard set up on your local machine&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Navaneeth K. N.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C, Java, golang, Android&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;: How to write an input system for Android&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Reference:&lt;br /&gt;
&lt;br /&gt;
# [https://www.varnamproject.com/ Varnam]&lt;br /&gt;
# IndicKeyboard [https://play.google.com/store/apps/details?id=org.smc.inputmethod.indic Playstore] | [https://github.com/androidtweak/Indic-Keyboard Github]&lt;br /&gt;
&lt;br /&gt;
=Projects with unconfirmed mentors=&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The libindic project has a spellchecker written in Python with a fairly involved algorithm. Still, it cannot handle the inflection and agglutination that occur in Indian languages, especially South Indian languages. The dictionary we have for the Malayalam spellchecker has about 150000 words. Of course we can expand the dictionary, but that doesn&#039;t add much value, since words in Malayalam, Tamil, etc. can be formed by joining multiple words. In addition, words get inflected based on grammar forms (sandhi), plural, gender, etc. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word can be formed by joining more than 5 words. We will need word-splitting logic or a table that covers all patterns. The project is to attempt solving this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently the Tamil community attempted to develop a spellchecker using Hunspell with multi-level suffix stripping. You can see the result here: https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. It will probably require a lot of scripting to generate suffix patterns; we can ask for help from existing language communities too. If Hunspell has limitations with multi-level suffixes (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document them (bug report and documentation) and attempt a Python-based solution on top of the libindic framework.&lt;br /&gt;
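&lt;br /&gt;
To make the idea of multi-level suffix stripping concrete, here is a minimal sketch. It is only an illustration under assumptions: the function name `strip_suffixes`, the toy root set and the toy suffix list are hypothetical, and real sandhi rules also change the stem, which this sketch ignores.&lt;br /&gt;
&lt;br /&gt;
```python
def strip_suffixes(word, roots, suffixes, max_levels=5):
    """Return the suffixes peeled off to reach a known root, or None.

    Hypothetical helper: tries every suffix at each level and recurses
    on the remaining stem, up to max_levels levels deep.
    """
    if word in roots:
        return []
    if max_levels == 0:
        return None
    for suffix in suffixes:
        stem = word[: len(word) - len(suffix)]
        if stem and word.endswith(suffix):
            rest = strip_suffixes(stem, roots, suffixes, max_levels - 1)
            if rest is not None:
                # Suffixes are reported from innermost to outermost.
                return rest + [suffix]
    return None
```

A word is accepted when some chain of suffixes leads back to a dictionary root; a rejected word returns None.&lt;br /&gt;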
&lt;br /&gt;
The project is not about coding an existing algorithm, but about developing and implementing a new one.&lt;br /&gt;
&lt;br /&gt;
Hunspell&#039;s limitations can be understood from [[User:%E0%B4%B8%E0%B4%A8%E0%B5%8D%E0%B4%A4%E0%B5%8B%E0%B4%B7%E0%B5%8D/HunspellConversation| this conversation]] we had with the author of Hunspell in 2008&lt;br /&gt;
&lt;br /&gt;
Homework to do before submitting applications:&lt;br /&gt;
# Use Hunspell in any Indian language like Malayalam for spell correction in editors or word processors and understand the limitations&lt;br /&gt;
# Study the nature of inflection and agglutination in Indian languages, read existing documents on this (ask for documents too) and note down your observations&lt;br /&gt;
# Study Hunspell and other spellcheckers to see how this problem is addressed&lt;br /&gt;
# Understand how a spell checker works. How to write a spellchecker from scratch?&lt;br /&gt;
# Come up with a plan about addressing the issue.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: An average-level understanding of the grammar system of at least one Indian language, and completion of the homework listed above.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. For more information about ConTeXt, see the wiki http://wiki.contextgarden.net/Main_Page. ConTeXt MKII has Indic language rendering support using XeTeX, but MKII is deprecated, and the new MKIV backend doesn&#039;t support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses Harfbuzz to do correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for the MKIV Lua code is available. ConTeXt MKII (deprecated) can work with the XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using the command:&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt, and a basic understanding of Indic language rendering. MKIV uses Lua; familiarity with Lua, the OpenType specification or Harfbuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools, which can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that language. Acoustic models characterize how sound changes over time and capture the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty taken when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims to develop an acoustic model and a language model for Malayalam.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Background Reading&#039;&#039;&#039;&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET, Hyderabad, Prof. Dr. S. Rama Mohan Rao, A. Gopi &lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza, M.Sc. Thesis, FAST-National University of Computer and Emerging Sciences &lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori, Hussein Hiyassat, Mostafa Harti, and Noureddine Chenfour&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==libindic Project Based==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===libindic Project Improvements===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
This is a set of ideas for improving the existing libindic infrastructure. We have decided on the following tasks as part of this project:&lt;br /&gt;
&lt;br /&gt;
# Provide REST API to libindic without disturbing existing JSONRPC API&lt;br /&gt;
# Improve the Transliteration module&lt;br /&gt;
# Integrate [https://github.com/Project-SILPA/flask-webfonts Flask Webfonts] extension with libindic to provide Webfonts support.&lt;br /&gt;
&lt;br /&gt;
==== Provide REST like API for libindic ====&lt;br /&gt;
&lt;br /&gt;
libindic currently provides a JSONRPC API, which is also used by the templates of the framework. JSONRPC is not well supported in all languages and results in [https://en.wikipedia.org/wiki/Not_invented_here NIH code]. So we would like to provide REST-like, HTTP-based APIs for libindic while leaving the current JSONRPC code untouched for backward compatibility.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Objectives&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
* Develop a module or use an existing module to provide REST-like APIs&lt;br /&gt;
* API should support GET and POST. [http://www.w3.org/2001/tag/doc/whenToUseGet.html When to use GET?].&lt;br /&gt;
&lt;br /&gt;
Many people are unsure what the API should look like. The Twitter API (https://dev.twitter.com/docs/api) can serve as an example. &lt;br /&gt;
Sample API calls:&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/ASCII2Unicode&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/Unicode2ASCII&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
Generic: &lt;br /&gt;
    GET/POST (http://api.silpa.org.in/module/function_name or http://silpa.org.in/api/module/function_name)&lt;br /&gt;
    Parameters: function parameters&lt;br /&gt;
    Response: JSON encoded return value from function&lt;br /&gt;
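&lt;br /&gt;
As a rough illustration of the generic call above, the routing step could be sketched as follows. This is only a sketch under assumptions: the `REGISTRY` table, the `register` and `dispatch` helpers and the stand-in `reverse_text` function are hypothetical names, not part of libindic, and a real implementation would sit behind Flask or another HTTP layer.&lt;br /&gt;
&lt;br /&gt;
```python
import json

# Hypothetical registry mapping (module, function_name) to a Python callable.
REGISTRY = {}

def register(module, name):
    """Decorator that exposes a function under /module/name."""
    def wrap(func):
        REGISTRY[(module, name)] = func
        return func
    return wrap

@register("demo", "reverse_text")
def reverse_text(text):
    # Stand-in for a real libindic function.
    return text[::-1]

def dispatch(path, params):
    """Route a path like /module/function_name; return (status, JSON body)."""
    parts = [p for p in path.split("/") if p]
    if len(parts) != 2 or tuple(parts) not in REGISTRY:
        return 404, json.dumps({"error": "unknown endpoint"})
    try:
        result = REGISTRY[tuple(parts)](**params)
    except TypeError as exc:
        # Typically raised when parameters are missing or unexpected.
        return 400, json.dumps({"error": str(exc)})
    return 200, json.dumps({"result": result}, ensure_ascii=False)
```

The same `dispatch` function can serve both GET (query string parameters) and POST (form parameters), since it only sees the decoded parameter dictionary.&lt;br /&gt;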
&lt;br /&gt;
====  Improve Transliteration module ====&lt;br /&gt;
&lt;br /&gt;
We have a Transliteration module which supports transliteration from any Indic language to any other Indic language, as well as English to Indic and Indic to English transliteration. We also support the IPA and ISO 15919 transliteration systems. But the module isn&#039;t in perfect shape and has a lot of bugs. With this idea we would like to improve the following parts:&lt;br /&gt;
&lt;br /&gt;
# Improve the cross-Indic-language transliteration system. Currently only Malayalam and Kannada work without any intermediate language; all other Indian languages are first transliterated to Malayalam and then to the target Indic language. We want to remove this detour of source -&amp;gt; Malayalam -&amp;gt; target.&lt;br /&gt;
# English to IPA transliteration is currently broken and needs to be fixed. See [https://github.com/Project-SILPA/Transliteration/issues/3 IPA transliteration bug].&lt;br /&gt;
# Once the IPA transliteration issue above is fixed, improve the English to Indic transliteration system using IPA. Currently English to Indic transliteration uses the CMU Sphinx dictionary, which has a limited set of words, which in turn limits the output of English to Indic transliteration.&lt;br /&gt;
# Improve the ISO 15919 to Indic transliteration system; see [https://github.com/Project-SILPA/Transliteration/issues/4 ISO 15919 to Indic transliteration].&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. For an intermediate representation of the scripts, either IPA or the ISO 15919 standard can be used. All of these must be supplemented with exception rules and special-case handling to achieve better results.&lt;br /&gt;
&lt;br /&gt;
==== Integrating flask-webfonts extension with libindic ====&lt;br /&gt;
&lt;br /&gt;
libindic used to have a Webfonts module for serving Indian language fonts as Webfonts for browsers. During GSoC 2013 it was separated out as an extension to the Flask framework, which can be used with any Flask-powered app. The current code can be found at [https://github.com/Project-SILPA/flask-webfonts]. The module is not fine-tuned yet, so the objectives are listed below.&lt;br /&gt;
&lt;br /&gt;
# The module is not yet fine-tuned, and using it makes other modules break. This needs to be fixed (it can be checked with the &#039;webfonts&#039; branch of the libindic code on GitHub).&lt;br /&gt;
# Write tests to check the functionality.&lt;br /&gt;
# Adhere to the Flask extension guidelines and submit the module to the Flask extensions directory.&lt;br /&gt;
# Write a tool which can take a directory containing font files, or a single font file, and generate the configuration file needed by the extension. (One such tool, now outdated, can be found at [https://github.com/copyninja/fontinfo])&lt;br /&gt;
# Provide HTTP APIs through the Flask extension which can expose the CSS for applications.&lt;br /&gt;
&lt;br /&gt;
For all tasks above we expect documentation and test cases from the students as deliverables. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Intermediate&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentors&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org &amp;lt;preferred&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python , Flask , Jinja , HTML, Javascript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
# Writing applications using Flask&lt;br /&gt;
# Knowledge of various transliteration systems&lt;br /&gt;
# Webfonts knowledge and writing extensions for Flask&lt;br /&gt;
# Test-driven development.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Converting indic processing modules currently in libindic into javascript modules library===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port some of the libindic algorithms to node modules. Several modules and algorithms in the libindic project are currently written in Python. Porting them to JavaScript helps developers. For example, cross-language transliteration can be done in JavaScript too if we port the algorithm and transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in JavaScript.&lt;br /&gt;
&lt;br /&gt;
Proposed javascript module pattern is https://github.com/umdjs/umd&lt;br /&gt;
&lt;br /&gt;
Student proposals should include a list of the algorithms they plan to port, planned demo applications, planned documentation details, and publishing details (for example, the npm registry).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: javascript, python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Integrate Varnam into libindic===&lt;br /&gt;
&lt;br /&gt;
Create a libindic module which hosts [http://www.varnamproject.com varnam]. This includes making a python port for libvarnam and making a libindic module which uses the python port. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C, Python&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
&lt;br /&gt;
Varnam is a cross-platform predictive transliterator for Indian languages. It works mostly like Google&#039;s transliterate, but shows key differences in the way word tokenization is done. It has a learning system built in which allows Varnam to make smart predictions. &lt;br /&gt;
&lt;br /&gt;
There are Varnam clients available as [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Firefox] &amp;amp; [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Chrome addon] and an [https://gitorious.org/varnamproject/libvarnam-ibus/source/d939adf50024013902c27310c03ef21a9210cdcb IBus engine].&lt;br /&gt;
&lt;br /&gt;
To try out Varnam, navigate to http://varnamproject.com/editor. Currently it supports Hindi and Malayalam.&lt;br /&gt;
&lt;br /&gt;
* [http://www.varnamproject.com/docs/faq FAQ]&lt;br /&gt;
* [http://www.varnamproject.com/docs Documentation]&lt;br /&gt;
* [http://www.varnamproject.com/docs/contributing Contributors guide &amp;amp; ideas to work on]&lt;br /&gt;
&lt;br /&gt;
Apart from the following ideas, you can propose your own idea. &lt;br /&gt;
&lt;br /&gt;
===Programming language bindings &amp;amp; varnam-daemon===&lt;br /&gt;
&lt;br /&gt;
Varnam is written in C, which makes interoperability with other languages easy. There are language bindings available for `NodeJs` and `Ruby`. Supporting Varnam in multiple languages allows projects to use Varnam easily to enable Indian language input.&lt;br /&gt;
&lt;br /&gt;
To make using Varnam from different languages easier, make a cross-platform standalone process which uses the `libvarnam` shared library and exposes an RPC API over the network. This allows any programming language with socket support to be used with libvarnam. It also makes language bindings fairly easy, because they don&#039;t have to work with native interoperability support. The protocol can be a simple text-based protocol covering all the commands that `libvarnam` supports. &lt;br /&gt;
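&lt;br /&gt;
A simple text-based protocol of this kind could look like the sketch below: one request per line, with an `OK` or `ERR` reply. Everything here is an assumption for illustration. The command name `TRANSLITERATE`, the reply format and the `fake_transliterate` placeholder are hypothetical and do not describe an existing Varnam daemon; a real handler would call into `libvarnam`.&lt;br /&gt;
&lt;br /&gt;
```python
def fake_transliterate(text):
    # Placeholder: a real daemon would call into libvarnam here.
    return "translit:" + text

# Hypothetical command table, one entry per supported command.
HANDLERS = {"TRANSLITERATE": fake_transliterate}

def handle_line(line):
    """Parse one request line 'COMMAND argument' and build the reply line."""
    command, _, argument = line.strip().partition(" ")
    handler = HANDLERS.get(command.upper())
    if handler is None:
        return "ERR unknown command\n"
    if not argument:
        return "ERR missing argument\n"
    return "OK " + handler(argument) + "\n"
```

Because requests and replies are plain lines of text, a client in any language only needs a socket and string handling.&lt;br /&gt;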
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
This includes rewriting the current implementation in `golang` and adding support for WebSockets to improve the input experience. It also&lt;br /&gt;
includes writing scripts that would ease embedding input on any webpage. All the changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Create an Android IME===&lt;br /&gt;
&lt;br /&gt;
Varnam will be ported as a libindic module, and it will be available on Android as part of the Android SDK project which libindic has proposed. This idea has been merged into the [http://wiki.smc.org.in/SoC/2014/Project_ideas#Android_SDK_for_Silpa libindic] project ideas.&lt;br /&gt;
&lt;br /&gt;
===Enable varnam&#039;s suggestions system to be used from Inscript or any other input system===&lt;br /&gt;
&lt;br /&gt;
Varnam knows a lot of words. This idea proposes a method to use these words to provide suggestions for other input systems. In Varnam, the API call would be something like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&amp;lt;pre&amp;gt;&lt;br /&gt;
varnam_get_suggestions (handle, &amp;quot;भारत&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will fetch all the suggestions that have the given prefix. &lt;br /&gt;
&lt;br /&gt;
`varnam_get_suggestions` needs to keep track of the previous words and use an [http://en.wikipedia.org/wiki/N-gram n-gram] based dataset to filter the results. It should also learn the words back into the word corpus that Varnam is using. Filtering suggestions won&#039;t be just a prefix search; it will use knowledge about how text can be written in the target language to provide smart filtering. Searching in a large corpus and providing real-time suggestions makes this a challenging task. &lt;br /&gt;
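&lt;br /&gt;
The combination of prefix search and n-gram filtering can be sketched in a few lines. This is a minimal illustration using bigrams, not the `libvarnam` implementation: the `Suggester` class and its methods are hypothetical names, and a real corpus would need an on-disk index instead of in-memory lists.&lt;br /&gt;
&lt;br /&gt;
```python
from bisect import bisect_left
from collections import Counter

class Suggester:
    """Prefix lookup over a sorted word list, ranked by bigram counts."""

    def __init__(self, sentences):
        self.bigrams = Counter()
        vocabulary = set()
        for sentence in sentences:
            words = sentence.split()
            vocabulary.update(words)
            # Count each (previous word, word) pair seen in the corpus.
            self.bigrams.update(zip(words, words[1:]))
        self.words = sorted(vocabulary)

    def suggest(self, previous_word, prefix, limit=5):
        # Binary search finds the first word not smaller than the prefix.
        start = bisect_left(self.words, prefix)
        matches = []
        for word in self.words[start:]:
            if not word.startswith(prefix):
                break
            matches.append(word)
        # Words seen more often after previous_word come first;
        # the sort is stable, so ties stay in alphabetical order.
        matches.sort(key=lambda w: -self.bigrams[(previous_word, w)])
        return matches[:limit]
```

Learning a new sentence would amount to updating the bigram counts and the sorted word list, which is where most of the real-time cost would lie for a large corpus.&lt;br /&gt;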
&lt;br /&gt;
Once this is implemented in `libvarnam`, it can be used in the ibus-engine.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  C, Unicode &amp;amp; encodings&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool which can upload and download the word corpus from offline IMEs like varnam-ibus[2]. This helps to build the online word corpus easily.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  Knowledge in C/golang&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://www.varnamproject.com&lt;br /&gt;
* [2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Grandham ==&lt;br /&gt;
&lt;br /&gt;
=== Adding MARC21 import/export feature in Grandham ===&lt;br /&gt;
&lt;br /&gt;
We need a feature in Grandham to import and parse data from MARC21 documents. We should also be able to export existing data in MARC21.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : High&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; :&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in Ruby/Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://dev.grandham.org&lt;br /&gt;
* [2]: https://github.com/smc/grandham&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4869</id>
		<title>GSoC/2014/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4869"/>
		<updated>2014-03-19T11:33:19Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Create an Android IME */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt; &lt;br /&gt;
&amp;lt;font color=&amp;quot;red&amp;quot;&amp;gt; &amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas , you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Potential Mentors=&lt;br /&gt;
# Santhosh Thottingal (&#039;&#039;&#039;santhosh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Baiju M (&#039;&#039;&#039;baijum&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Praveen A (&#039;&#039;&#039;j4v4m4n&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Rajeesh K Nambiar (&#039;&#039;&#039;rajeeshknambiar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Vasudev Kammath (&#039;&#039;&#039;copyninja&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jishnu Mohan (&#039;&#039;&#039;jishnu7&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Hrishikesh K.B (&#039;&#039;&#039;stultus&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anivar Aravind (&#039;&#039;&#039;anivar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anilkumar K V (&#039;&#039;&#039;anilkumar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Sajjad Anwar (&#039;&#039;&#039;geohacker&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Deepa V Gopinath (&#039;&#039;&#039;deepagopinath&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jain Basil (&#039;&#039;&#039;jainbasil&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Ershad K (&#039;&#039;&#039;ershad&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Navaneeth (&#039;&#039;&#039;nkn__&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nishan Naseer (&#039;&#039;&#039;nishan&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nandaja Varma (&#039;&#039;&#039;gem&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2014=&lt;br /&gt;
* Please read the [http://wiki.smc.org.in/SoC/2014#FAQ FAQ]&lt;br /&gt;
* If you want to propose an idea, please do it in [http://lists.smc.org.in/listinfo.cgi/student-projects-smc.org.in student projects mailing list]&lt;br /&gt;
&lt;br /&gt;
=Projects with confirmed mentors=&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The SILPA project has a spellchecker written in Python with a fairly involved algorithm. Still, it cannot handle the inflection and agglutination that occur in Indian languages, especially South Indian languages. The dictionary we have for the Malayalam spellchecker has about 150000 words. Of course we can expand the dictionary, but that doesn&#039;t add much value, since words in Malayalam, Tamil, etc. can be formed by joining multiple words. In addition, words get inflected based on grammar forms (sandhi), plural, gender, etc. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word can be formed by joining more than 5 words. We will need word-splitting logic or a table that covers all patterns. The project is to attempt solving this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently the Tamil community attempted to develop a spellchecker using Hunspell with multi-level suffix stripping. You can see the result here: https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. It will probably require a lot of scripting to generate suffix patterns; we can ask for help from existing language communities too. If Hunspell has limitations with multi-level suffixes (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document them (bug report and documentation) and attempt a Python-based solution on top of the SILPA framework.&lt;br /&gt;
&lt;br /&gt;
The project is not about coding an existing algorithm, but about developing and implementing a new one.&lt;br /&gt;
&lt;br /&gt;
Hunspell&#039;s limitations can be understood from [[User:%E0%B4%B8%E0%B4%A8%E0%B5%8D%E0%B4%A4%E0%B5%8B%E0%B4%B7%E0%B5%8D/HunspellConversation| this conversation]] we had with the author of Hunspell in 2008&lt;br /&gt;
&lt;br /&gt;
Homework to do before submitting applications:&lt;br /&gt;
# Use Hunspell in any Indian language like Malayalam for spell correction in editors or word processors and understand the limitations&lt;br /&gt;
# Study the nature of inflection and agglutination in Indian languages, read existing documents on this (ask for documents too) and note down your observations&lt;br /&gt;
# Study Hunspell and other spellcheckers to see how this problem is addressed&lt;br /&gt;
# Understand how a spell checker works. How to write a spellchecker from scratch?&lt;br /&gt;
# Come up with a plan about addressing the issue.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Santhosh Thottingal&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - santhosh on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: An average-level understanding of the grammar system of at least one Indian language, and completion of the homework listed above.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. For more information about ConTeXt, see the wiki http://wiki.contextgarden.net/Main_Page. ConTeXt MKII has Indic language rendering support using XeTeX, but MKII is deprecated, and the new MKIV backend doesn&#039;t support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses Harfbuzz to do correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for the MKIV Lua code is available. ConTeXt MKII (deprecated) can work with the XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using the command:&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Rajeesh K Nambiar&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - rajeeshknambiar on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience with either LaTeX or ConTeXt, and a basic understanding of Indic language rendering. MKIV uses Lua, so familiarity with Lua, the OpenType specifications or Harfbuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools, which can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that particular language. Acoustic models characterize how sound changes over time and capture the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty assigned when a sequence or collection of words is seen. It attempts to convey the behavior of the language and to predict the word sequences that are likely to occur in it. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed with the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims to develop an acoustic model and a language model for Malayalam.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Background Reading&#039;&#039;&#039;&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET ,Hyderabad, Prof. Dr. S. Rama Mohan Rao, A.Gopi &lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori, Hussein Hiyassat, Mostafa Harti, and Noureddine Chenfour&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Deepa P Gopinath&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - deepagopinath on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==SILPA Project Based==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===SILPA Project Improvements===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
This is a set of ideas for improving the existing SILPA infrastructure. We have decided on the following tasks as part of this project:&lt;br /&gt;
&lt;br /&gt;
# Provide REST API to SILPA without disturbing existing JSONRPC API&lt;br /&gt;
# Improve the Transliteration module&lt;br /&gt;
# Integrate [https://github.com/Project-SILPA/flask-webfonts Flask Webfonts] extension with SILPA to provide Webfonts support.&lt;br /&gt;
&lt;br /&gt;
==== Provide REST like API for SILPA ====&lt;br /&gt;
&lt;br /&gt;
SILPA currently provides a JSONRPC API, which is also used by the templates of the framework. JSONRPC is not well supported in all languages, which results in [https://en.wikipedia.org/wiki/Not_invented_here NIH code]. So we would like to provide REST-like, HTTP-based APIs for SILPA while leaving the current JSONRPC code untouched for backward compatibility.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Objectives&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
* Develop a module, or use an existing one, to provide REST-like APIs&lt;br /&gt;
* API should support GET and POST. [http://www.w3.org/2001/tag/doc/whenToUseGet.html When to use GET?].&lt;br /&gt;
&lt;br /&gt;
Many people have doubts about what the API should look like. The Twitter API (https://dev.twitter.com/docs/api) can serve as an example.&lt;br /&gt;
Sample API calls:&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/ASCII2Unicode&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/Unicode2ASCII&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
Generic: &lt;br /&gt;
    GET/POST (http://api.silpa.org.in/module/function_name or http://silpa.org.in/api/module/function_name)&lt;br /&gt;
    Parameters: function parameters&lt;br /&gt;
    Response: JSON encoded return value from function&lt;br /&gt;
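&lt;br /&gt;
As a rough illustration of the generic pattern above, the following Python sketch maps a path of the form /module/function_name to a registered function and returns a JSON-encoded result. The registry, the `expose` decorator, the `dispatch` helper and the placeholder function body are all assumptions made for this example; they are not part of the existing SILPA code.&lt;br /&gt;
&lt;br /&gt;
```python
import json

# Hypothetical registry of exposed functions, keyed by (module, function_name).
REGISTRY = {}


def expose(module, name):
    """Register a callable under /module/name (illustrative helper)."""
    def decorator(func):
        REGISTRY[(module, name)] = func
        return func
    return decorator


@expose("payyans", "ASCII2Unicode")
def ascii_to_unicode(text, font):
    # Placeholder body: the real conversion would live in the payyans module.
    return {"text": text, "font": font}


def dispatch(path, params):
    """Map a path such as /payyans/ASCII2Unicode to a function call and
    return an (HTTP status, JSON body) pair."""
    parts = path.strip("/").split("/")
    func = REGISTRY.get(tuple(parts)) if len(parts) == 2 else None
    if func is None:
        return 404, json.dumps({"error": "unknown module or function"})
    try:
        result = func(**params)
    except TypeError as error:
        # Raised for missing or unexpected parameters (and, in this simple
        # sketch, for any TypeError inside the function itself).
        return 400, json.dumps({"error": str(error)})
    return 200, json.dumps({"result": result})
```
&lt;br /&gt;
In a Flask application, a single catch-all route could hand the request path and parameters to `dispatch`, which would leave the JSONRPC code untouched.&lt;br /&gt;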
&lt;br /&gt;
====  Improve Transliteration module ====&lt;br /&gt;
&lt;br /&gt;
We have a Transliteration module that supports transliteration from any Indic language to any other Indic language, as well as English to Indic and Indic to English transliteration. It also supports the IPA and ISO 15919 transliteration systems. But the module isn&#039;t in perfect shape and has a lot of bugs. With this idea we would like to improve the following parts:&lt;br /&gt;
&lt;br /&gt;
# Improve the cross-Indic language transliteration system. Currently only Malayalam and Kannada work without any external language support; all other Indian languages are first transliterated to Malayalam and then to the target Indic language. We want to remove this source -&amp;gt; Malayalam -&amp;gt; target cycle.&lt;br /&gt;
# English to IPA transliteration is currently broken and needs to be fixed. See [https://github.com/Project-SILPA/Transliteration/issues/3 IPA transliteration bug].&lt;br /&gt;
# Once the IPA transliteration issue above is fixed, improve the English to Indic transliteration system using IPA. Currently English to Indic transliteration uses the CMU Sphinx dictionary, which has a limited set of words, and this in turn limits the output of the English to Indic transliteration system.&lt;br /&gt;
# Improve the ISO 15919 to Indic transliteration system; see [https://github.com/Project-SILPA/Transliteration/issues/4 ISO 15919 to Indic transliteration].&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. Either IPA or the ISO 15919 standard can be used as an intermediate representation of the scripts. All of these must be supplemented with exception rules and special-case handling to achieve more accurate results.&lt;br /&gt;
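&lt;br /&gt;
One possible starting point for direct source-to-target conversion, without a Malayalam pivot, is sketched below in Python. It relies on the fact that the major Indic Unicode blocks share a parallel, ISCII-derived layout, so many letters sit at the same offset in every block. The function name and the language-code table are assumptions made for this example, and this is not how the current module works. The approach is deliberately naive: letters that exist in only one script, and shared punctuation such as the danda, would need the exception rules mentioned above.&lt;br /&gt;
&lt;br /&gt;
```python
# Start of each script block in Unicode. The blocks are laid out in
# parallel, so the same letter usually has the same offset in each.
BLOCK_START = {
    "hi": 0x0900,  # Devanagari
    "bn": 0x0980,  # Bengali
    "ta": 0x0B80,  # Tamil
    "te": 0x0C00,  # Telugu
    "kn": 0x0C80,  # Kannada
    "ml": 0x0D00,  # Malayalam
}
BLOCK_SIZE = 0x80


def transliterate(text, source, target):
    """Shift every character of the source block into the target block."""
    source_start = BLOCK_START[source]
    target_start = BLOCK_START[target]
    output = []
    for char in text:
        offset = ord(char) - source_start
        if offset in range(BLOCK_SIZE):
            output.append(chr(target_start + offset))
        else:
            # Spaces, digits and other scripts are passed through unchanged.
            output.append(char)
    return "".join(output)
```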
&lt;br /&gt;
==== Integrating flask-webfonts extension with SILPA ====&lt;br /&gt;
&lt;br /&gt;
SILPA used to have a Webfonts module for serving Indian language fonts as Webfonts for browsers. During GSoC 2013 it was separated out as an extension to the Flask framework, which can be used with any Flask-powered app. The current code can be found at [https://github.com/Project-SILPA/flask-webfonts]. The module is not fine-tuned yet, so the objectives are listed below.&lt;br /&gt;
&lt;br /&gt;
# The module is not yet fine-tuned, and using it makes other modules break. This needs to be fixed (it can be checked with the &#039;webfonts&#039; branch of the SILPA code on GitHub).&lt;br /&gt;
# Write tests to check the functionality.&lt;br /&gt;
# Adhere to the Flask extension guidelines and submit the module to the Flask extensions directory.&lt;br /&gt;
# Write a tool which can take a directory containing font files, or a single font file, and generate the configuration file needed by the extension. (A possible such tool, now outdated, can be found at [https://github.com/copyninja/fontinfo].)&lt;br /&gt;
# Provide HTTP APIs through the Flask extension which can expose the CSS for applications.&lt;br /&gt;
&lt;br /&gt;
For all the tasks above, we expect documentation and test cases from the students as deliverables.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Intermediate&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Vasudev Kamath, Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentors&#039;&#039;&#039;: IRC -&lt;br /&gt;
*Vasudev Kamath - copyninja on #smc-project and #silpa on Freenode&lt;br /&gt;
*Jishnu Mohan - jishnu7 on #smc-project and #silpa on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org &amp;lt;preferred&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python , Flask , Jinja , HTML, Javascript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
# Writing applications using Flask&lt;br /&gt;
# Knowledge of various transliteration systems&lt;br /&gt;
# Knowledge of Webfonts and writing extensions for Flask&lt;br /&gt;
# Test-driven development&lt;br /&gt;
&lt;br /&gt;
===Android SDK for Silpa===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port the Silpa modules, where possible, to Java and create an SDK so that other developers can use them in their apps. Modules like Indic Render, Transliteration and Payyans have really good potential on Android because of the fragmentation that exists in Android and its lack of proper Indic support. This SDK will help developers support their Indic apps on a wide range of Android devices.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Objectives&#039;&#039;&#039;:&lt;br /&gt;
&amp;lt;Please note this idea is for a SDK, not an app or just a java port&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*All modules need to be ported to Java so that they can be used inside an Android project.&lt;br /&gt;
*Other applications should be able to use this Silpa library (as an SDK) to easily integrate features from our modules. E.g.:&lt;br /&gt;
**Transliteration - The developer can specify that a text input inside the application needs transliteration, and our SDK should take care of the transliteration process whenever the user enters text in that field.&lt;br /&gt;
**Render module - Detect whether the necessary font is available on the system; if it is not, render the text as an image and replace the text with it.&lt;br /&gt;
**All modules can be explained like this.&lt;br /&gt;
*Investigate whether the image rendering part of the render module can be done on the device, inside the application itself. A few ways to implement that are:&lt;br /&gt;
**Compiling cairo/pango with the NDK&lt;br /&gt;
**Compiling Harfbuzz from the AOSP tree with the NDK&lt;br /&gt;
**Based on the result of the rendering module investigation, we can decide whether to use server-side rendering or not.&lt;br /&gt;
**Pack popular fonts with the SDK and use them to display text if the device doesn&#039;t have the required font (there are a few hacks to get better rendering in older versions of Android). The developer should be able to force rendering with a packaged font to get consistency across devices.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;Better to prepare SDK with helper than preparing application itself. SDK aka library&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Hrishikesh K. B, Jishnu Mohan, Aashik S&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC -&lt;br /&gt;
*Hrishikesh K B - stultus on #smc-project and #silpa on Freenode&lt;br /&gt;
*Jishnu Mohan - jishnu7 on #smc-project and #silpa on Freenode&lt;br /&gt;
*Aashik S - irumbumoideen on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java, Android, Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Converting indic processing modules currently in SILPA into javascript modules library===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port some of the Silpa algorithms to Node modules. Several modules and algorithms in the SILPA project are currently written in Python, but porting them to JavaScript helps developers. For example, cross-language transliteration can be done in JavaScript too if we port the algorithm and the transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in JavaScript.&lt;br /&gt;
&lt;br /&gt;
Proposed javascript module pattern is https://github.com/umdjs/umd&lt;br /&gt;
&lt;br /&gt;
Student proposals should include a list of the algorithms they plan to port, planned demo applications, planned documentation details, and publishing details (for example, the npm registry).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Santhosh Thottingal, Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - jishnu7 santhosh on #smc-project and #silpa on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: javascript, python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Integrate Varnam into Silpa===&lt;br /&gt;
&lt;br /&gt;
Create a Silpa module which hosts [http://www.varnamproject.com varnam]. This includes making a python port for libvarnam and making a Silpa module which uses the python port. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project and #silpa on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C, Python&lt;br /&gt;
&lt;br /&gt;
==Language filter for diaspora==&lt;br /&gt;
&lt;br /&gt;
Diaspora is a Free Software, federated social networking platform. Diaspora users post in many languages. When people use more than one language in their posts, it is inconvenient for people who don&#039;t understand a language. This task is to tag every post with languages used in the post, ideally detected automatically, but with an option to override it. Once each post has a language tag, people should be able to choose their preferred language and posts in other languages should be hidden by default. Also provide an option to translate posts and comments.&lt;br /&gt;
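&lt;br /&gt;
A first approximation of the automatic tagging step could detect the writing systems used in a post from the Unicode names of its letters. The Python sketch below is only an illustration of that idea under stated assumptions: the function name is made up, Diaspora itself is written in Ruby, and a script is not the same thing as a language (Devanagari, for instance, is shared by several languages), so a real implementation would need proper language detection on top of this.&lt;br /&gt;
&lt;br /&gt;
```python
import unicodedata
from collections import Counter


def detect_scripts(post):
    """Return the scripts used in a post, most frequent first."""
    counts = Counter()
    for char in post:
        if char.isalpha():
            # Unicode names start with the script, e.g. "MALAYALAM LETTER MA".
            name = unicodedata.name(char, "")
            if name:
                counts[name.split()[0]] += 1
    return [script for script, _ in counts.most_common()]
```
&lt;br /&gt;
The resulting list could be stored as the default language tag of a post, with the author free to override it.&lt;br /&gt;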
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Pirate Praveen, Ershad K&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentors&#039;&#039;&#039;: IRC  &lt;br /&gt;
*Pirate Praveen - j4v4m4n on #smc-project on Freenode&lt;br /&gt;
*Ershad K - ershad on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Upstream discussion&#039;&#039;&#039;: https://www.loomio.org/d/4vTqCj5X/language-filter-for-diaspora-as-a-gsoc-project&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
&lt;br /&gt;
Varnam is a cross-platform predictive transliterator for Indian languages. It works mostly like Google&#039;s transliterate, but shows key differences in the way word tokenization is done. It has a learning system built in which allows Varnam to make smart predictions. &lt;br /&gt;
&lt;br /&gt;
There are Varnam clients available as a [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Firefox add-on], a [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Chrome add-on] and an [https://gitorious.org/varnamproject/libvarnam-ibus/source/d939adf50024013902c27310c03ef21a9210cdcb IBus engine].&lt;br /&gt;
&lt;br /&gt;
To try out Varnam, navigate to [http://varnamproject.com/editor http://varnamproject.com/editor]. Currently it supports Hindi and Malayalam.&lt;br /&gt;
&lt;br /&gt;
* [http://www.varnamproject.com/docs/faq FAQ]&lt;br /&gt;
* [http://www.varnamproject.com/docs Documentation]&lt;br /&gt;
* [http://www.varnamproject.com/docs/contributing Contributors guide &amp;amp; ideas to work on]&lt;br /&gt;
&lt;br /&gt;
Apart from the following ideas, you can propose your own idea. &lt;br /&gt;
&lt;br /&gt;
===Programming language bindings &amp;amp; varnam-daemon===&lt;br /&gt;
&lt;br /&gt;
Varnam is written in C, which makes interoperability with other languages easy. There are language bindings available for `NodeJs` and `Ruby`. Supporting Varnam in multiple languages allows projects to use Varnam easily to enable Indian language input.&lt;br /&gt;
&lt;br /&gt;
To make using varnam from different languages easier, build a cross-platform standalone process which uses the `libvarnam` shared library and exposes an RPC API over the network. This allows any programming language with socket support to be used with libvarnam. It also makes language bindings fairly easy, because they don&#039;t have to work with the native interoperability support. The protocol can be a simple text-based protocol covering all the commands that `libvarnam` supports.&lt;br /&gt;
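&lt;br /&gt;
To make the idea of a simple text-based protocol more concrete, here is a hedged Python sketch of how one request line could be parsed and one reply line formatted. The command names, the argument layout (language code followed by text) and the OK/ERR reply prefixes are all hypothetical; the actual daemon would define its own commands to mirror `libvarnam`.&lt;br /&gt;
&lt;br /&gt;
```python
# Hypothetical command set: command name -> number of arguments.
COMMANDS = {"transliterate": 2, "learn": 2}


def parse_request(line):
    """Parse a line such as "transliterate ml bharat" into (command, args)."""
    parts = line.strip().split(" ", 2)
    command = parts[0].lower()
    args = parts[1:]
    if command not in COMMANDS:
        raise ValueError("unknown command: " + repr(command))
    if len(args) != COMMANDS[command]:
        raise ValueError(command + " expects " + str(COMMANDS[command]) + " arguments")
    return command, args


def format_response(ok, payload):
    """Encode a reply as a single line starting with OK or ERR."""
    return ("OK " if ok else "ERR ") + payload + "\n"
```
&lt;br /&gt;
A line-oriented format like this can be exercised by hand with a plain socket client, which is one reason a text protocol is attractive for writing bindings.&lt;br /&gt;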
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
This includes rewriting the current implementation in `golang` and adding support for WebSockets to improve the input experience. It also includes making scripts that would ease embedding input on any webpage. All the changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improve the learning system===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The main goal is to improve how varnam tokenizes when learning words. Today, when a word is learned, varnam takes all the possible prefixes into account and learns all of them to improve future suggestions. But sometimes this is not enough to produce good suggestions. The suggested improvement is to try to infer the base form of the word being learned.&lt;br /&gt;
&lt;br /&gt;
Varnam has a built-in learning system which can learn words, and it can also learn other possible ways to write a word. Consider the following example.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
learn(&amp;quot;भारत&amp;quot;) = [bharat, bhaarath, bharath]&lt;br /&gt;
transliterate(&amp;quot;bharat&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bhaarath&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bharath&amp;quot;) = भारत&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Varnam also learns a word&#039;s prefixes so that it can produce better predictions for any word which has the same prefix. So in this case, just by learning the word &amp;quot;भारत&amp;quot;, varnam can predict &amp;quot;bharateey&amp;quot; = &amp;quot;भारतीय&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The proposed idea is about making this learning better. One example is inferring the word &amp;quot;भारत&amp;quot; when learning भारतीय: something like a Porter stemmer implementation, but integrated into the varnam framework so that new language support can be added easily.&lt;br /&gt;
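&lt;br /&gt;
A very small Python sketch of the base-form inference described above follows. It is a plain suffix stripper, far simpler than a real Porter-style stemmer, and the suffix list (a single entry taken from the example above) and the function names are assumptions for illustration; in varnam this logic would be written in C and driven by per-language data.&lt;br /&gt;
&lt;br /&gt;
```python
# Hypothetical per-language suffix data. The only entry is the suffix
# that turns the example word into its longer form.
HINDI_SUFFIXES = ["\u0940\u092f"]


def infer_base_form(word, suffixes):
    """Strip the longest matching suffix, keeping a non-empty stem."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        stem = word[: len(word) - len(suffix)]
        if word.endswith(suffix) and stem:
            return stem
    return word


def words_to_learn(word, suffixes):
    """Return the word itself plus its inferred base form, if different."""
    base = infer_base_form(word, suffixes)
    return [word] if base == word else [word, base]
```
&lt;br /&gt;
Keeping the suffixes as data rather than code is what would let new languages be added without touching the learning routine.&lt;br /&gt;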
&lt;br /&gt;
This idea also includes improving concurrency support for learn. Currently, learn can&#039;t be called concurrently because of restrictions in SQLite, so every learn has to be done sequentially. This needs to be improved with a simple internal queue on which words are queued while learn is busy.&lt;br /&gt;
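&lt;br /&gt;
The internal queue could look roughly like the Python sketch below: callers on any thread enqueue words and return immediately, while a single worker thread performs the learns one at a time. The class and method names are invented for this example, and the `learn` callable stands in for the real, non-concurrent learn routine; the real implementation would live in C inside libvarnam.&lt;br /&gt;
&lt;br /&gt;
```python
import queue
import threading


class LearnQueue:
    """Serialize learn() calls through one worker thread."""

    def __init__(self, learn):
        self._learn = learn  # callable that must not run concurrently
        self._pending = queue.Queue()
        self.errors = []  # (word, exception) pairs from failed learns
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, word):
        """Queue a word and return at once; safe to call from any thread."""
        self._pending.put(word)

    def wait(self):
        """Block until every queued word has been processed."""
        self._pending.join()

    def _run(self):
        while True:
            word = self._pending.get()
            try:
                self._learn(word)
            except Exception as error:
                # Record the failure so one bad word does not stop the worker.
                self.errors.append((word, error))
            finally:
                self._pending.task_done()
```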
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  Knowledge in C, Ruby (basics)&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Create an Android IME===&lt;br /&gt;
&lt;br /&gt;
Varnam will be ported as a Silpa module and it will be available on Android as part of the android SDK project which Silpa has proposed. This idea is merged to the [http://wiki.smc.org.in/SoC/2014/Project_ideas#Android_SDK_for_Silpa Silpa] project ideas.&lt;br /&gt;
&lt;br /&gt;
===Enable varnam&#039;s suggestions system to be used from Inscript or any other input system===&lt;br /&gt;
&lt;br /&gt;
Varnam has knowledge about a lot of words. This idea proposes a method to use these words to provide suggestions for other input systems. Basically, in Varnam, the API call will be something like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&amp;lt;pre&amp;gt;&lt;br /&gt;
varnam_get_suggestions (handle, &amp;quot;भारत&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will fetch all the suggestions that have the given prefix.&lt;br /&gt;
&lt;br /&gt;
`varnam_get_suggestions` needs to keep track of the previous words and use an [http://en.wikipedia.org/wiki/N-gram n-gram] based dataset to filter the results. It should also learn the words back into the word corpus that varnam is using. Filtering suggestions won&#039;t be just a prefix search; it will use knowledge about how text can be written in the target language to provide smart filtering. Searching a large corpus and providing real-time suggestions makes this a challenging task.&lt;br /&gt;
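&lt;br /&gt;
The combination of prefix search and n-gram filtering can be illustrated with a bigram model, the simplest n-gram case. The Python sketch below is an assumption-laden toy, not the libvarnam design: the class name and methods are invented, the corpus is held in memory, and candidates are ranked first by how often they followed the previous word and then by overall frequency.&lt;br /&gt;
&lt;br /&gt;
```python
from collections import Counter, defaultdict


class BigramSuggester:
    """Rank prefix matches using the previously typed word as context."""

    def __init__(self):
        self._vocabulary = Counter()
        self._following = defaultdict(Counter)

    def learn(self, sentence):
        """Add the words of a sentence and their bigrams to the corpus."""
        words = sentence.split()
        self._vocabulary.update(words)
        for previous, current in zip(words, words[1:]):
            self._following[previous][current] += 1

    def suggest(self, prefix, previous=None, limit=5):
        """Return up to `limit` words starting with `prefix`."""
        candidates = [w for w in self._vocabulary if w.startswith(prefix)]
        context = self._following.get(previous, Counter())
        # Highest bigram count first, then highest overall count, then
        # alphabetical order to keep the result deterministic.
        candidates.sort(key=lambda w: (-context[w], -self._vocabulary[w], w))
        return candidates[:limit]
```
&lt;br /&gt;
A linear scan over the vocabulary will not meet real-time needs on a large corpus; an indexed store or a trie would be the natural next step.&lt;br /&gt;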
&lt;br /&gt;
Once this is implemented in `libvarnam`, it can be used in the ibus-engine.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  C, Unicode &amp;amp; encodings&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool which can upload/download the word corpus from offline IMEs like varnam-ibus[2]. This helps to build the online word corpus easily.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  Knowledge in C/golang&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://www.varnamproject.com&lt;br /&gt;
* [2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
==Adding Braille Keyboard layouts for Indian Languages to m17n Library==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The project is to build support for Bharati Braille keyboard layouts in GNU/Linux systems. The Bharati Braille standard is the official Braille standard in India. A regular QWERTY keyboard is used for data entry: the S, D, F and J, K, L keys are used for the six dots of Braille. This support needs to be built as m17n layouts. It will enable visually challenged people who have learned Braille layouts to use GNU/Linux systems easily with the help of audio feedback from TTS.&lt;br /&gt;
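&lt;br /&gt;
The six-key entry scheme can be illustrated with a short Python sketch that turns a chord of simultaneously pressed keys into a Unicode Braille cell. The key-to-dot assignment used here (F, D, S for dots 1 to 3 and J, K, L for dots 4 to 6) is the common six-key convention and is an assumption, since the text above only names the keys. Mapping Braille cells to Bharati Braille letters, and expressing all of this as an m17n layout, is the actual project work and is not shown.&lt;br /&gt;
&lt;br /&gt;
```python
# Assumed key-to-dot assignment; the project text only states that the
# S, D, F and J, K, L keys stand for the six dots.
KEY_TO_DOT = {"f": 1, "d": 2, "s": 3, "j": 4, "k": 5, "l": 6}
BRAILLE_BASE = 0x2800  # U+2800 BRAILLE PATTERN BLANK


def chord_to_braille(keys):
    """Convert simultaneously pressed keys, e.g. "fd", to a Braille cell."""
    mask = 0
    for key in set(keys.lower()):
        # In the Unicode Braille Patterns block, dot n is bit n-1.
        mask |= 2 ** (KEY_TO_DOT[key] - 1)
    return chr(BRAILLE_BASE + mask)
```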
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;&lt;br /&gt;
* http://www.acharya.gen.in:8080/disabilities/bh_brl.php&lt;br /&gt;
* http://en.wikipedia.org/wiki/Bharati_Braille&lt;br /&gt;
* http://www.nongnu.org/m17n/&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Anilkumar K V&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - anilkumar on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Grandham ==&lt;br /&gt;
&lt;br /&gt;
=== Adding MARC21 import/export feature in Grandham ===&lt;br /&gt;
&lt;br /&gt;
We need a feature in Grandham to import and parse data from MARC21 documents. We should also be able to export existing data in MARC21.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : High&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Ershad K&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - ershad on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in Ruby/Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://dev.grandham.org&lt;br /&gt;
* [2]: https://github.com/smc/grandham&lt;br /&gt;
&lt;br /&gt;
=Projects with unconfirmed mentors=&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4679</id>
		<title>GSoC/2014/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4679"/>
		<updated>2014-03-07T05:44:11Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* SILPA Project Based */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt; &lt;br /&gt;
&amp;lt;font color=&amp;quot;red&amp;quot;&amp;gt; &amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas , you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Potential Mentors=&lt;br /&gt;
# Santhosh Thottingal (&#039;&#039;&#039;santhosh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Baiju M (&#039;&#039;&#039;baijum&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Praveen A (&#039;&#039;&#039;j4v4m4n&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Rajeesh K Nambiar (&#039;&#039;&#039;rajeeshknambiar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Vasudev Kammath (&#039;&#039;&#039;copyninja&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jishnu Mohan (&#039;&#039;&#039;jishnu7&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Hrishikesh K.B (&#039;&#039;&#039;stultus&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anivar Aravind (&#039;&#039;&#039;anivar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anilkumar K V (&#039;&#039;&#039;anilkumar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Sajjad Anwar (&#039;&#039;&#039;geohacker&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Deepa V Gopinath (&#039;&#039;&#039;deepagopinath&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jain Basil (&#039;&#039;&#039;jainbasil&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Ershad K (&#039;&#039;&#039;ershad&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Navaneeth (&#039;&#039;&#039;nkn__&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nishan Naseer (&#039;&#039;&#039;nishan&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nandaja Varma (&#039;&#039;&#039;gem&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2014=&lt;br /&gt;
* Please Read the [http://wiki.smc.org.in/SoC/2014#FAQ FAQ]&lt;br /&gt;
* If you want to propose an idea, please do it in [http://lists.smc.org.in/listinfo.cgi/student-projects-smc.org.in student projects mailing list]&lt;br /&gt;
&lt;br /&gt;
=Projects with confirmed mentors=&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The SILPA project has a spellchecker written in Python with a not-so-simple algorithm. But it is still not capable of handling the inflection and agglutination that occur in Indian languages, especially south Indian languages. The dictionary we have for the Malayalam spellchecker has about 150000 words. Of course we can expand the dictionary, but that doesn&#039;t add much value, since words in Malayalam, Tamil, etc. can be formed by joining multiple words. In addition, words get inflected based on grammatical forms (sandhi), plural, gender, etc. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping required for Malayalam. Sometimes a Malayalam word can be formed by joining more than 5 words together. We will need word-splitting logic or a table that takes care of all the patterns. The project is to attempt to solve this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently the Tamil community attempted to develop a spellchecker using Hunspell with multi-level suffix stripping. You can see the result here https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. This will probably require a lot of scripting to generate suffix patterns, and we can ask existing language communities for help too. If Hunspell turns out to be limited in its handling of multi-level suffixes (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document the limitation (bug report and documentation) and attempt a Python-based solution on top of the SILPA framework.&lt;br /&gt;
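&lt;br /&gt;
The multi-level suffix stripping mentioned above can be sketched roughly as follows. This is a minimal Python illustration and not SILPA or Hunspell code: the romanized stems and suffixes are invented placeholders, and a real implementation would also have to undo the sandhi changes that occur at each join.&lt;br /&gt;

```python
# Hypothetical toy data: romanized placeholders, not a real Malayalam lexicon.
DICTIONARY = {"mara", "kutti"}
SUFFIXES = ["kal", "il", "ude", "um"]


def strip_suffixes(word, max_levels=5):
    """Return (stem, suffixes) if word reduces to a dictionary stem, else None."""
    if word in DICTIONARY:
        return word, []
    if max_levels == 0:
        return None
    for suffix in SUFFIXES:
        # Never strip a suffix that would leave an empty stem.
        if word.endswith(suffix) and len(word) != len(suffix):
            result = strip_suffixes(word[: -len(suffix)], max_levels - 1)
            if result is not None:
                stem, inner = result
                return stem, inner + [suffix]
    return None


def is_valid(word):
    """A word is accepted when some chain of suffixes leads to a known stem."""
    return strip_suffixes(word) is not None
```

With this toy data, the made-up form "marakalilum" is accepted after three levels of stripping, while the recursion depth is capped by max_levels.&lt;br /&gt;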
&lt;br /&gt;
The project is not about coding an existing algorithm, but about developing and implementing a new one.&lt;br /&gt;
&lt;br /&gt;
Homework to do before submitting applications:&lt;br /&gt;
# Use Hunspell with any Indian language, such as Malayalam, for spell correction in editors or word processors, and understand its limitations&lt;br /&gt;
# Study the nature of inflection and agglutination in Indian languages, read existing documents on this (ask for documents too) and note down your observations&lt;br /&gt;
# Study Hunspell and other spellcheckers to see how this problem is addressed&lt;br /&gt;
# Understand how a spell checker works and how to write one from scratch&lt;br /&gt;
# Come up with a plan for addressing the issue.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Santhosh Thottingal&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - santhosh on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Average-level understanding of the grammar of at least one Indian language, and completion of the homework listed above.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is a TeX macro system similar to LaTeX, but much more suitable for design. For more information about ConTeXt, see the wiki http://wiki.contextgarden.net/Main_Page. ConTeXt MKII has Indic language rendering support through XeTeX, but MKII is deprecated, and the new MKIV backend does not support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses HarfBuzz to render Indic scripts correctly.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for the MKIV Lua code is available. ConTeXt MKII (deprecated) can work with the XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using the command:&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Rajeesh K Nambiar&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - rajeeshknambiar on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt, and a basic understanding of Indic language rendering. MKIV uses Lua, so familiarity with Lua, the OpenType specifications or HarfBuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools which can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that language. Acoustic models characterize how sound changes over time and capture the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty assigned when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed with the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims to develop an acoustic model and a language model for Malayalam.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Background Reading&#039;&#039;&#039;&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET, Hyderabad, Prof. Dr. S. Rama Mohan Rao, A. Gopi &lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori, Hussein Hiyassat, Mostafa Harti, and Noureddine Chenfour&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Deepa P Gopinath&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - deepagopinath on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==SILPA Project Based==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===SILPA Project Improvements===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
This is a set of ideas for improving the existing SILPA infrastructure. We have decided on the following tasks as part of this project:&lt;br /&gt;
&lt;br /&gt;
# Provide a REST API for SILPA without disturbing the existing JSONRPC API&lt;br /&gt;
# Improve the Transliteration module&lt;br /&gt;
# Integrate the [https://github.com/Project-SILPA/flask-webfonts Flask Webfonts] extension with SILPA to provide Webfonts support.&lt;br /&gt;
&lt;br /&gt;
==== Provide REST like API for SILPA ====&lt;br /&gt;
&lt;br /&gt;
SILPA currently provides a JSONRPC API, which is also used by the templates of the framework. JSONRPC is not well supported in all languages, which results in [https://en.wikipedia.org/wiki/Not_invented_here NIH code]. So we would like to provide REST-like, HTTP-based APIs for SILPA, while leaving the current JSONRPC code untouched for backward compatibility.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Objectives&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
* Develop a module, or use an existing module, to provide REST-like APIs&lt;br /&gt;
* The API should support GET and POST. See [http://www.w3.org/2001/tag/doc/whenToUseGet.html When to use GET?].&lt;br /&gt;
&lt;br /&gt;
Many people have doubts about what the API should look like. The Twitter API (https://dev.twitter.com/docs/api) can serve as an example. &lt;br /&gt;
Sample API calls:&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/ASCII2Unicode&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/Unicode2ASCII&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
Generic: &lt;br /&gt;
    GET/POST (http://api.silpa.org.in/module/function_name or http://silpa.org.in/api/module/function_name)&lt;br /&gt;
    Parameters: function parameters&lt;br /&gt;
    Response: JSON encoded return value from function&lt;br /&gt;
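&lt;br /&gt;
The generic module/function_name pattern above could be served by a small dispatcher along these lines. This is only a sketch using the Python standard library: the registry, the decorator and the stand-in conversion function are hypothetical and are not part of the SILPA code base.&lt;br /&gt;

```python
import json

# Hypothetical registry: maps "module/function_name" to a Python callable.
REGISTRY = {}


def expose(module, name):
    """Decorator that registers a function under module/name."""
    def decorator(func):
        REGISTRY[module + "/" + name] = func
        return func
    return decorator


@expose("payyans", "ASCII2Unicode")
def ascii_to_unicode(text, font):
    # Stand-in body; the real conversion lives in the payyans module.
    return {"font": font, "text": text}


def handle_request(path, params):
    """Dispatch /module/function_name and return (status, JSON body)."""
    func = REGISTRY.get(path.strip("/"))
    if func is None:
        return 404, json.dumps({"error": "unknown endpoint"})
    try:
        result = func(**params)
    except TypeError as exc:
        # Missing or unexpected parameters end up here.
        return 400, json.dumps({"error": str(exc)})
    return 200, json.dumps({"result": result}, ensure_ascii=False)
```

A web framework route would only need to pass the request path and the decoded GET or POST parameters to handle_request, so the existing JSONRPC entry point can stay untouched.&lt;br /&gt;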
&lt;br /&gt;
====  Improve Transliteration module ====&lt;br /&gt;
&lt;br /&gt;
We have a Transliteration module that supports transliteration from any Indic language to any other Indic language, as well as English to Indic and Indic to English transliteration. It also supports the IPA and ISO 15919 transliteration systems. But the module is not in perfect shape and has a lot of bugs. With this idea we would like to improve the following parts:&lt;br /&gt;
&lt;br /&gt;
# Improve the cross-Indic-language transliteration system. Currently only Malayalam and Kannada work without going through another language; all other Indian languages are first transliterated to Malayalam and then to the target Indic language. We want to remove this source -&amp;gt; Malayalam -&amp;gt; target cycle.&lt;br /&gt;
# English to IPA transliteration is currently broken and needs to be fixed. See [https://github.com/Project-SILPA/Transliteration/issues/3 IPA transliteration bug].&lt;br /&gt;
# Once the IPA transliteration issue above is fixed, improve the English to Indic transliteration system using IPA. Currently English to Indic transliteration relies on the CMU Sphinx dictionary, which has a limited set of words, and this in turn limits the output.&lt;br /&gt;
# Improve the ISO 15919 to Indic transliteration system; see [https://github.com/Project-SILPA/Transliteration/issues/4 ISO 15919 to Indic transliteration].&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. Either IPA or the ISO 15919 standard can be used as an intermediate representation of the scripts. All of these must be supplemented with exception rules and special-case handling to achieve better results.&lt;br /&gt;
&lt;br /&gt;
==== Integrating flask-webfonts extension with SILPA ====&lt;br /&gt;
&lt;br /&gt;
SILPA used to have a Webfonts module for serving Indian language fonts as webfonts to browsers. During GSoC 2013 it was separated out as an extension to the Flask framework that can be used with any Flask-powered app. The current code can be found at [https://github.com/Project-SILPA/flask-webfonts]. The module is not fine-tuned yet; the objectives are listed below.&lt;br /&gt;
&lt;br /&gt;
# The module is not yet fine-tuned, and using it makes other modules break. This needs to be fixed. (It can be checked with the &#039;webfonts&#039; branch of the SILPA code on GitHub.)&lt;br /&gt;
# Write tests to check the functionality.&lt;br /&gt;
# Adhere to the Flask extension guidelines and submit the module to the Flask extensions directory.&lt;br /&gt;
# Write a tool that takes a directory of font files, or a single font file, and generates the configuration file needed by the extension. (An outdated tool of this kind can be found at [https://github.com/copyninja/fontinfo].)&lt;br /&gt;
# Provide HTTP APIs through the Flask extension that expose the CSS to applications.&lt;br /&gt;
&lt;br /&gt;
For all the tasks above, we expect documentation and test cases from the students as deliverables. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Intermediate&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Vasudev Kamath, Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentors&#039;&#039;&#039;: IRC -&lt;br /&gt;
*Vasudev Kamath - copyninja on #smc-project and #silpa on Freenode&lt;br /&gt;
*Jishnu Mohan - jishnu7 on #smc-project and #silpa on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org &amp;lt;preferred&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python , Flask , Jinja , HTML, Javascript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
# Writing applications using Flask&lt;br /&gt;
# Knowledge of various transliteration systems&lt;br /&gt;
# Knowledge of webfonts and of writing extensions for Flask&lt;br /&gt;
# Test-driven development.&lt;br /&gt;
&lt;br /&gt;
===Android SDK for Silpa===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port as many Silpa modules as possible to Java and create an SDK so that other developers can use them in their apps. Modules like Indic Render, Transliteration and Payyans have really good potential on Android because of the fragmentation that exists in Android and its lack of proper Indic support. This SDK will help developers support their Indic apps on a wide range of Android devices.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Objectives&#039;&#039;&#039;:&lt;br /&gt;
&amp;lt;Please note this idea is for an SDK, not an app or just a Java port&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*All modules need to be ported to Java so that they can be used inside an Android project.&lt;br /&gt;
*Other applications should be able to use this Silpa library (as an SDK) to easily integrate features from our modules. E.g.&lt;br /&gt;
**Transliteration - The developer can specify that a text input inside the application needs transliteration, and our SDK should take care of the transliteration whenever the user enters text in that field.&lt;br /&gt;
**Render module - Detect whether the necessary font is available on the system; if it is not, render the text as an image and replace the text with it.&lt;br /&gt;
**All modules can be explained like this.&lt;br /&gt;
*Investigate whether the image rendering part of the render module can be done on the device, inside the application itself. A few ways to implement that are:&lt;br /&gt;
**Compiling cairo/pango with the NDK&lt;br /&gt;
**Compiling HarfBuzz from the AOSP tree with the NDK&lt;br /&gt;
**Based on the result of the rendering module investigation, we can decide whether to use server-side rendering or not.&lt;br /&gt;
**Pack popular fonts with the SDK and use them to display text if the device does not have the required font (there are a few hacks to get better rendering in older versions of Android). The developer should be able to force rendering with a packaged font, to get consistency across devices.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;It is better to prepare an SDK with helpers than to prepare an application itself. SDK here means a library.&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Hrishikesh K. B, Jishnu Mohan, Aashik S&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC -&lt;br /&gt;
*Hrishikesh K B - stultus on #smc-project and #silpa on Freenode&lt;br /&gt;
*Jishnu Mohan - jishnu7 on #smc-project and #silpa on Freenode&lt;br /&gt;
*Aashik S - irumbumoideen on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java, Android, Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Converting indic processing modules currently in SILPA into javascript modules library===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port some of the Silpa algorithms to Node modules. Several modules and algorithms in the SILPA project are currently written in Python. Porting them to JavaScript would help developers. For example, cross-language transliteration can be done in JavaScript too if we port the algorithm and the transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages becomes possible if we have the algorithm in JavaScript.&lt;br /&gt;
&lt;br /&gt;
Proposed javascript module pattern is https://github.com/umdjs/umd&lt;br /&gt;
&lt;br /&gt;
Student proposals should list the algorithms they plan to port, the planned demo applications, the planned documentation, and publishing details (for example, the npm registry).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Santhosh Thottingal, Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - jishnu7 santhosh on #smc-project and #silpa on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: javascript, python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Integrate Varnam into Silpa===&lt;br /&gt;
&lt;br /&gt;
Create a Silpa module which hosts [http://www.varnamproject.com varnam]. This includes making a Python port of libvarnam and making a Silpa module which uses the Python port. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project and #silpa on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C, Python&lt;br /&gt;
&lt;br /&gt;
==Language filter for diaspora==&lt;br /&gt;
&lt;br /&gt;
Diaspora is a Free Software, federated social networking platform. Diaspora users post in many languages. When people use more than one language in their posts, it is inconvenient for people who don&#039;t understand a language. This task is to tag every post with languages used in the post, ideally detected automatically, but with an option to override it. Once each post has a language tag, people should be able to choose their preferred language and posts in other languages should be hidden by default. Also provide an option to translate posts and comments.&lt;br /&gt;
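&lt;br /&gt;
As a first approximation, automatic tagging could start from the Unicode script of each letter. The sketch below is in Python purely for illustration (Diaspora itself is a Ruby on Rails application), the function names and the threshold are assumptions, and a script is not the same thing as a language: Devanagari text, for example, may be Hindi or Marathi, so a real detector would need more than this.&lt;br /&gt;

```python
import unicodedata
from collections import Counter


def detect_scripts(text):
    """Count letters per script, using the first word of each Unicode name."""
    counts = Counter()
    for char in text:
        if char.isalpha():
            name = unicodedata.name(char, "")
            if name:
                counts[name.split()[0]] += 1
    return counts


def language_tags(text, min_share=0.2):
    """Return scripts that make up at least min_share of the letters."""
    counts = detect_scripts(text)
    total = sum(counts.values())
    if total == 0:
        return []
    return sorted(s for s, n in counts.items() if n / total >= min_share)
```

The user-supplied override mentioned above would simply replace the detected tags before the post is stored.&lt;br /&gt;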
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Pirate Praveen, Ershad K&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentors&#039;&#039;&#039;: IRC  &lt;br /&gt;
*Pirate Praveen - j4v4m4n on #smc-project on Freenode&lt;br /&gt;
*Ershad K - ershad on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
&lt;br /&gt;
Varnam is a cross-platform predictive transliterator for Indian languages. It works mostly like Google&#039;s transliterate, but shows key differences in the way word tokenization is done. It has a learning system built in which allows Varnam to make smart predictions. &lt;br /&gt;
&lt;br /&gt;
There are varnam clients available as [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Firefox] &amp;amp; [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Chrome addon] and an [https://gitorious.org/varnamproject/libvarnam-ibus/source/d939adf50024013902c27310c03ef21a9210cdcb IBus engine].&lt;br /&gt;
&lt;br /&gt;
To try out Varnam, navigate to [http://varnamproject.com/editor http://varnamproject.com/editor]. Currently it supports Hindi and Malayalam.&lt;br /&gt;
&lt;br /&gt;
* [http://www.varnamproject.com/docs/faq FAQ]&lt;br /&gt;
* [http://www.varnamproject.com/docs Documentation]&lt;br /&gt;
* [http://www.varnamproject.com/docs/contributing Contributors guide &amp;amp; ideas to work on]&lt;br /&gt;
&lt;br /&gt;
Apart from the following ideas, you can propose your own idea. &lt;br /&gt;
&lt;br /&gt;
===Programming language bindings &amp;amp; varnam-daemon===&lt;br /&gt;
&lt;br /&gt;
Varnam is written in C, which makes interoperability with other languages easy. Language bindings are available for `NodeJs` and `Ruby`. Supporting Varnam in multiple languages lets projects use varnam easily to enable Indian language input.&lt;br /&gt;
&lt;br /&gt;
To make using varnam from different languages easier, build a cross-platform standalone process which uses the `libvarnam` shared library and exposes an RPC API over the network. This allows any programming language with socket support to be used with libvarnam. It also makes language bindings fairly easy, because they do not have to deal with native interoperability. The protocol can be a simple text-based protocol covering all the commands that `libvarnam` supports. &lt;br /&gt;
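&lt;br /&gt;
One possible shape for such a text-based protocol is sketched below. The wire format (one tab-separated request per line, answered with an OK or ERR line), the command name and the handler are all hypothetical; the handler is a stand-in where a real daemon would call into `libvarnam`.&lt;br /&gt;

```python
# Hypothetical line protocol: "COMMAND\targ1\targ2\n" in, "OK\t..." or "ERR\t..." out.
# The handlers are stand-ins; a real daemon would call into libvarnam here.

def fake_transliterate(scheme, text):
    return text.upper()  # placeholder for the real transliteration result


# Maps a command name to (handler, number of expected arguments).
HANDLERS = {"TRANSLITERATE": (fake_transliterate, 2)}


def handle_line(line):
    """Parse one request line and return one response line."""
    parts = line.rstrip("\n").split("\t")
    command, args = parts[0].upper(), parts[1:]
    if command not in HANDLERS:
        return "ERR\tunknown command\n"
    func, arity = HANDLERS[command]
    if len(args) != arity:
        return "ERR\texpected %d arguments\n" % arity
    return "OK\t" + func(*args) + "\n"
```

Because each request and response is a single line of text, a client in any language only needs a socket and basic string handling.&lt;br /&gt;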
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
This includes a rewrite of the current implementation in `golang` and adding support for WebSockets to improve the input experience. It also includes writing scripts that make it easy to embed input on any webpage. All the changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improve the learning system===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The main goal is to improve how varnam tokenizes when learning words. Today, when a word is learned, varnam takes all its possible prefixes into account and learns all of them to improve future suggestions. But sometimes this is not enough to produce good suggestions. The suggested improvement is to try to infer the base form of the word being learned.&lt;br /&gt;
&lt;br /&gt;
Varnam has a built-in learning system which can learn words, and it can also learn other possible ways to write a word. Consider the following example. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
learn(&amp;quot;भारत&amp;quot;) = [bharat, bhaarath, bharath]&lt;br /&gt;
transliterate(&amp;quot;bharat&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bhaarath&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bharath&amp;quot;) = भारत&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Varnam also learns a word&#039;s prefixes so that it can produce better predictions for any word which has the same prefix. So in this case, with just learning the word &amp;quot;भारत&amp;quot;, varnam can predict &amp;quot;bharateey&amp;quot; = &amp;quot;भारतीय&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The proposed idea is to make this learning better. One example is inferring the word &amp;quot;भारत&amp;quot; when learning भारतीय. This would be something like a Porter stemmer implementation, but integrated into the varnam framework so that new language support can be added easily.&lt;br /&gt;
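&lt;br /&gt;
The idea of learning prefixes plus an inferred base form can be illustrated as follows. This is a Python sketch rather than libvarnam code: the single-entry suffix table is an assumption made for the example, and slicing by code point is a simplification that ignores how varnam actually tokenizes.&lt;br /&gt;

```python
# Hypothetical suffix table for illustration; real rules would be per language.
SUFFIXES = ["\u0940\u092f"]  # Devanagari vowel sign II followed by letter YA


def prefixes(word, min_len=2):
    """Proper prefixes of word with at least min_len code points."""
    return [word[:i] for i in range(min_len, len(word))]


def infer_stems(word):
    """Strip each known suffix once to guess possible base forms."""
    return [word[: -len(s)] for s in SUFFIXES
            if word.endswith(s) and len(word) != len(s)]


def learn(word, corpus):
    """Record the word, its prefixes and any inferred base forms once each."""
    for item in {word, *prefixes(word), *infer_stems(word)}:
        corpus[item] = corpus.get(item, 0) + 1
```

Keeping the suffix rules in a per-language table, instead of in code, is one way the stemming step could stay easy to extend to new languages.&lt;br /&gt;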
&lt;br /&gt;
This idea also includes improving concurrency support for learn. Currently, learn cannot be called concurrently because of restrictions in SQLite, so every learn has to be done sequentially. This should be improved with a simple internal queue in which words are queued while learn is busy. &lt;br /&gt;
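&lt;br /&gt;
The queueing pattern described above looks roughly like this. The sketch is in Python for brevity, whereas the real change would be made in C inside libvarnam; learn_word is a stand-in for the SQLite-backed learn call, and the class name is an assumption.&lt;br /&gt;

```python
import queue
import threading

# Sketch only: learn_word stands in for the real, SQLite-backed learn call.
learned = []


def learn_word(word):
    learned.append(word)


class LearnQueue:
    """Serialize learn requests from many threads through one worker."""

    def __init__(self, learn_func):
        self._queue = queue.Queue()
        self._learn = learn_func
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, word):
        self._queue.put(word)  # returns immediately, even if learn is busy

    def _run(self):
        while True:
            word = self._queue.get()
            try:
                self._learn(word)
            finally:
                self._queue.task_done()

    def wait(self):
        self._queue.join()  # block until every queued word is learned
```

Callers never block on the database: only the single worker thread touches it, so the writes remain sequential.&lt;br /&gt;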
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  Knowledge in C, Ruby (basics)&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Create an Android IME===&lt;br /&gt;
&lt;br /&gt;
Android has an extensible input method system. Use it to build an IME which uses varnam internally. This includes getting `libvarnam` compiled on Android first. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  C, Android, Java&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
===Enable varnam&#039;s suggestions system to be used from Inscript or any other input system===&lt;br /&gt;
&lt;br /&gt;
Varnam has knowledge of a lot of words. This idea proposes a way to use these words to provide suggestions for other input systems. In Varnam, the API call would be something like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&amp;lt;pre&amp;gt;&lt;br /&gt;
varnam_get_suggestions (handle, &amp;quot;भारत&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will fetch all the suggestions which have the given prefix. &lt;br /&gt;
&lt;br /&gt;
`varnam_get_suggestions` needs to keep track of the previous words and use an [http://en.wikipedia.org/wiki/N-gram n-gram] based dataset to filter the results. It should also learn the words back into the word corpus that varnam is using. Filtering suggestions will not be just a prefix search; it will use knowledge of how text is written in the target language to provide smart filtering. Searching a large corpus and providing real-time suggestions makes this a challenging task. &lt;br /&gt;
&lt;br /&gt;
Once this is implemented in `libvarnam`, it can be used in the ibus-engine.&lt;br /&gt;
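&lt;br /&gt;
A very small version of prefix search ranked by n-gram context is sketched below, using bigrams (n = 2). It is a Python illustration with made-up English training text, not the proposed `libvarnam` implementation, and the class and method names are assumptions.&lt;br /&gt;

```python
from collections import Counter, defaultdict

# Illustrative only: a bigram model over a tiny, made-up training text.


class BigramSuggester:
    def __init__(self):
        self.unigrams = Counter()
        self.bigrams = defaultdict(Counter)

    def learn(self, sentence):
        """Count every word and every adjacent word pair in the sentence."""
        words = sentence.split()
        self.unigrams.update(words)
        for prev, word in zip(words, words[1:]):
            self.bigrams[prev][word] += 1

    def suggest(self, prefix, previous=None, limit=5):
        """Words starting with prefix, ranked by bigram then unigram count."""
        following = self.bigrams.get(previous, Counter())
        matches = [w for w in self.unigrams if w.startswith(prefix)]
        matches.sort(key=lambda w: (-following[w], -self.unigrams[w], w))
        return matches[:limit]
```

The linear scan over all known words is fine for a toy corpus; for the large corpus mentioned above, an indexed prefix lookup would be needed to keep suggestions real-time.&lt;br /&gt;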
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  C, Unicode &amp;amp; encodings&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool which can upload/download the word corpus from offline IMEs like varnam-ibus[2]. This helps to build the online word corpus easily.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  Knowledge in C/golang&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://www.varnamproject.com&lt;br /&gt;
* [2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
==Adding Braille Keyboard layouts for Indian Languages to m17n Library==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The project is to build support for Bharati Braille keyboard layouts on GNU/Linux systems. The Bharati Braille standard is the official Braille standard in India. A regular QWERTY keyboard is used for data entry, with the SDF-JKL keys used for the six dots of Braille. This support needs to be built as m17n layouts. It will enable visually challenged people who have learned Braille layouts to use GNU/Linux systems easily, with the help of audio feedback from TTS.&lt;br /&gt;
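&lt;br /&gt;
The chorded input described above can be illustrated with a short Python sketch. The key-to-dot assignment (F, D, S for dots 1 to 3 and J, K, L for dots 4 to 6) is an assumption, since the text only names the SDF-JKL keys. The Unicode Braille Patterns block starts at U+2800 and encodes dot n as bit n-1. Mapping a Braille cell to an Indic letter according to the Bharati Braille tables is not shown, and the actual deliverable would be m17n layout files rather than Python.&lt;br /&gt;

```python
# Assumed key-to-dot assignment; the project text only names the SDF-JKL keys.
KEY_TO_DOT = {"f": 1, "d": 2, "s": 3, "j": 4, "k": 5, "l": 6}


def chord_to_cell(keys):
    """Convert a set of simultaneously pressed keys to a Unicode Braille cell."""
    mask = 0
    for key in keys:
        # Dot n corresponds to bit n-1 of the offset from U+2800.
        mask |= 2 ** (KEY_TO_DOT[key.lower()] - 1)
    return chr(0x2800 + mask)
```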
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;&lt;br /&gt;
* http://www.acharya.gen.in:8080/disabilities/bh_brl.php&lt;br /&gt;
* http://en.wikipedia.org/wiki/Bharati_Braille&lt;br /&gt;
* http://www.nongnu.org/m17n/&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Anivar Aravind&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - anivar on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Grandham ==&lt;br /&gt;
&lt;br /&gt;
=== Adding MARC21 import/export feature in Grandham ===&lt;br /&gt;
&lt;br /&gt;
We need a feature in Grandham to import and parse data from MARC21 documents. We should also be able to export existing data in MARC21.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : High&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Ershad K&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - ershad on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in Ruby/Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://dev.grandham.org&lt;br /&gt;
* [2]: https://github.com/smc/grandham&lt;br /&gt;
&lt;br /&gt;
=Projects with unconfirmed mentors=&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4678</id>
		<title>GSoC/2014/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4678"/>
		<updated>2014-03-07T05:19:40Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Improve the learning system */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt; &lt;br /&gt;
&amp;lt;font color=&amp;quot;red&amp;quot;&amp;gt; &amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas, you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Potential Mentors=&lt;br /&gt;
# Santhosh Thottingal (&#039;&#039;&#039;santhosh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Baiju M (&#039;&#039;&#039;baijum&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Praveen A (&#039;&#039;&#039;j4v4m4n&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Rajeesh K Nambiar (&#039;&#039;&#039;rajeeshknambiar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Vasudev Kammath (&#039;&#039;&#039;copyninja&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jishnu Mohan (&#039;&#039;&#039;jishnu7&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Hrishikesh K.B (&#039;&#039;&#039;stultus&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anivar Aravind (&#039;&#039;&#039;anivar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anilkumar K V (&#039;&#039;&#039;anilkumar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Sajjad Anwar (&#039;&#039;&#039;geohacker&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Deepa V Gopinath (&#039;&#039;&#039;deepagopinath&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jain Basil (&#039;&#039;&#039;jainbasil&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Ershad K (&#039;&#039;&#039;ershad&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Navaneeth (&#039;&#039;&#039;nkn__&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nishan Naseer (&#039;&#039;&#039;nishan&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nandaja Varma (&#039;&#039;&#039;gem&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2014=&lt;br /&gt;
* Please read the [http://wiki.smc.org.in/SoC/2014#FAQ FAQ]&lt;br /&gt;
* If you want to propose an idea, please do it on the [http://lists.smc.org.in/listinfo.cgi/student-projects-smc.org.in student projects mailing list]&lt;br /&gt;
&lt;br /&gt;
=Projects with confirmed mentors=&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The SILPA project has a spellchecker written in Python with a not-so-simple algorithm. Still, it cannot handle the inflection and agglutination that occur in Indian languages, especially South Indian languages. The dictionary we have for the Malayalam spellchecker has about 150,000 words. Of course we can expand the dictionary, but that doesn&#039;t add much value, since words in Malayalam, Tamil and similar languages can be formed by joining multiple words. In addition, words get inflected based on grammatical forms (sandhi), plural, gender and so on. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words. We will need word-splitting logic or a table that takes care of all the patterns. The project is to attempt to solve this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently the Tamil community attempted to develop a spellchecker using Hunspell with multi-level suffix stripping. You can see the result here: https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. It will probably require a lot of scripting to generate suffix patterns; we can ask existing language communities for help too. If Hunspell has limitations with multi-level suffixes (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document them (bug report and documentation) and attempt a Python-based solution on top of the SILPA framework.&lt;br /&gt;
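&lt;br /&gt;
To make the idea of multi-level suffix stripping concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the stem list and suffix list are tiny hypothetical English stand-ins, not SILPA or Hunspell data, and a real solution would generate its rules from the grammar of the target language.&lt;br /&gt;
&lt;br /&gt;
```python
# Hypothetical toy data: a real checker would load a stem dictionary and
# suffix rules generated for the target language.
STEMS = {"walk", "happy", "help"}
SUFFIXES = ["ing", "ness", "less", "ly", "s", "ed"]


def strip_suffixes(word, max_levels=5):
    """Return the suffixes removed to reach a known stem, or None."""
    if word in STEMS:
        return []
    if max_levels == 0:
        return None
    for suffix in SUFFIXES:
        # Never strip the whole word away.
        if word.endswith(suffix) and len(word) != len(suffix):
            rest = strip_suffixes(word[: -len(suffix)], max_levels - 1)
            if rest is not None:
                return rest + [suffix]
    return None


def is_valid(word):
    """A word is accepted if it reduces to a known stem."""
    return strip_suffixes(word) is not None
```
&lt;br /&gt;
With this toy data, strip_suffixes("helplessness") peels off two suffix levels before reaching the stem. The recursion depth limit mirrors the point above that some words need more than 5 levels.&lt;br /&gt;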
&lt;br /&gt;
The project is not about coding an existing algorithm, but about developing and implementing one.&lt;br /&gt;
&lt;br /&gt;
Homework to do before submitting applications:&lt;br /&gt;
# Use Hunspell in any Indian language like Malayalam for spell correction in editors or word processors and understand the limitations&lt;br /&gt;
# Study the nature of inflection and agglutination in Indian languages, read existing documents on this(ask for documents too) and note down your observations&lt;br /&gt;
# Study Hunspell and other spellcheckers to see how this problem is addressed&lt;br /&gt;
# Understand how a spell checker works. How to write a spellchecker from scratch?&lt;br /&gt;
# Come up with a plan about addressing the issue.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Santhosh Thottingal&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - santhosh on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: An average-level understanding of the grammar system of at least one Indian language, and completion of the homework listed above.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system similar to LaTeX but much more suitable for design. To find more information about ConTeXt, see the wiki http://wiki.contextgarden.net/Main_Page. ConTeXt MKII has Indic language rendering support using XeTeX, but MKII is deprecated, and the new MKIV backend doesn&#039;t support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses HarfBuzz to do correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for the MKIV Lua code is available. ConTeXt MKII (deprecated) can work with the XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using the command:&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Rajeesh K Nambiar&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - rajeeshknambiar on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt and a basic understanding of Indic language rendering. MKIV uses Lua, so familiarity with Lua, the OpenType specifications or HarfBuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools, which can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that particular language. Acoustic models characterize how sound changes over time and capture the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty taken when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims to develop an acoustic model and a language model for Malayalam.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Background Reading&#039;&#039;&#039;&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET ,Hyderabad, Prof. Dr. S. Rama Mohan Rao, A.Gopi &lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori, Hussein Hiyassat, Mostafa Harti, and Noureddine Chenfour&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Deepa P Gopinath&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - deepagopinath on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==SILPA Project Based==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===SILPA Project Improvements===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
This is a set of ideas to improve the existing SILPA infrastructure. We have decided on the following tasks as part of this project:&lt;br /&gt;
&lt;br /&gt;
# Provide REST API to SILPA without disturbing existing JSONRPC API&lt;br /&gt;
# Improve the Transliteration module&lt;br /&gt;
# Integrate [https://github.com/Project-SILPA/flask-webfonts Flask Webfonts] extension with SILPA to provide Webfonts support.&lt;br /&gt;
&lt;br /&gt;
==== Provide REST like API for SILPA ====&lt;br /&gt;
&lt;br /&gt;
SILPA currently provides a JSONRPC API, which is also used by the templates of the framework. JSONRPC is not well supported in all languages and results in [https://en.wikipedia.org/wiki/Not_invented_here NIH code]. So we would like to provide REST-like HTTP-based APIs for SILPA while leaving the current JSONRPC code untouched for backward compatibility.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Objectives&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
* Develop a module, or use an existing module, to provide REST-like APIs&lt;br /&gt;
* API should support GET and POST. [http://www.w3.org/2001/tag/doc/whenToUseGet.html When to use GET?].&lt;br /&gt;
&lt;br /&gt;
Many people are unsure what the API should look like. The Twitter API (https://dev.twitter.com/docs/api) can serve as an example. &lt;br /&gt;
Sample API calls:&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/ASCII2Unicode&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/Unicode2ASCII&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
Generic: &lt;br /&gt;
    GET/POST (http://api.silpa.org.in/module/function_name or http://silpa.org.in/api/module/function_name)&lt;br /&gt;
    Parameters: function parameters&lt;br /&gt;
    Response: JSON encoded return value from function&lt;br /&gt;
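&lt;br /&gt;
The generic call pattern above can be sketched in Python as a small dispatcher. This is a hedged illustration only: the registry, the dispatch helper and the placeholder ascii2unicode function are hypothetical names, not existing SILPA code, and a real implementation would sit behind a web framework route.&lt;br /&gt;
&lt;br /&gt;
```python
import json


def ascii2unicode(text, font):
    """Placeholder standing in for a real module function (hypothetical)."""
    return {"font": font, "text": text}


# Hypothetical registry mapping (module, function_name) to a callable.
REGISTRY = {("payyans", "ASCII2Unicode"): ascii2unicode}


def dispatch(path, params):
    """Map /module/function_name plus request parameters to (status, JSON)."""
    parts = tuple(p for p in path.split("/") if p)
    if len(parts) != 2 or parts not in REGISTRY:
        return 404, json.dumps({"error": "unknown endpoint"})
    try:
        # Request parameters become the function's keyword arguments.
        result = REGISTRY[parts](**params)
    except TypeError:
        return 400, json.dumps({"error": "bad parameters"})
    return 200, json.dumps(result, ensure_ascii=False)
```
&lt;br /&gt;
Keeping the dispatch table separate from the transport means the existing JSONRPC entry point and a new REST-like entry point could share the same module functions.&lt;br /&gt;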
&lt;br /&gt;
====  Improve Transliteration module ====&lt;br /&gt;
&lt;br /&gt;
We have a Transliteration module which supports transliteration from any Indic language to any other Indic language, as well as English to Indic and Indic to English transliteration. It also supports the IPA and ISO 15919 transliteration systems. But the module isn&#039;t in perfect shape and has a lot of bugs. With this idea we would like to improve the following parts:&lt;br /&gt;
&lt;br /&gt;
# Improve the cross-Indic transliteration system. Currently only Malayalam and Kannada work without an intermediate language; all other Indian languages are first transliterated to Malayalam and then to the target Indic language. We want to remove this source -&amp;gt; Malayalam -&amp;gt; target cycle.&lt;br /&gt;
# English to IPA transliteration is currently broken and this needs to be fixed. See [https://github.com/Project-SILPA/Transliteration/issues/3 IPA transliteration bug].&lt;br /&gt;
# Once the IPA transliteration issue above is fixed, improve the English to Indic transliteration system using IPA. Currently English to Indic transliteration uses the CMU Sphinx dictionary, which has a limited set of words, which in turn limits the output of English to Indic transliteration.&lt;br /&gt;
# Improve the ISO 15919 to Indic transliteration system; see [https://github.com/Project-SILPA/Transliteration/issues/4 ISO 15919 to Indic transliteration].&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. For an intermediate representation of the scripts, either IPA or the ISO 15919 standard can be used. All of these must be supplemented with exception rules and special-case handling to achieve more accurate results.&lt;br /&gt;
&lt;br /&gt;
==== Integrating flask-webfonts extension with SILPA ====&lt;br /&gt;
&lt;br /&gt;
SILPA used to have a Webfonts module for serving Indian language fonts as Webfonts for browsers. During GSoC 2013 it was separated out as an extension to the Flask framework which can be used with any Flask-powered app. The current code can be found at [https://github.com/Project-SILPA/flask-webfonts]. The module is not fine-tuned yet, so the objectives are listed below.&lt;br /&gt;
&lt;br /&gt;
# The module is not yet fine-tuned, and using it makes other modules break. This needs to be fixed (it can be checked with the &#039;webfonts&#039; branch of the SILPA code on GitHub).&lt;br /&gt;
# Write tests to check the functionalities.&lt;br /&gt;
# Adhere to Flask extension guidelines and submit the modules to Flask extensions directory.&lt;br /&gt;
# Write a tool which can take a directory containing font files, or a single font file, and generate the configuration file needed by the extension. (A possible tool, now outdated, can be found at [https://github.com/copyninja/fontinfo].)&lt;br /&gt;
# Provide HTTP APIs through the Flask extension that expose the CSS to applications.&lt;br /&gt;
&lt;br /&gt;
For all the tasks above we expect documentation and test cases from the students as deliverables. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Intermediate&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Vasudev Kamath, Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentors&#039;&#039;&#039;: IRC -&lt;br /&gt;
*Vasudev Kamath - copyninja on #smc-project and #silpa on Freenode&lt;br /&gt;
*Jishnu Mohan - jishnu7 on #smc-project and #silpa on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org &amp;lt;preferred&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python , Flask , Jinja , HTML, Javascript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
# Writing applications using Flask&lt;br /&gt;
# Knowledge of various transliteration systems&lt;br /&gt;
# Webfonts knowledge and writing extensions for Flask&lt;br /&gt;
# Test-driven development.&lt;br /&gt;
&lt;br /&gt;
===Android SDK for Silpa===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port the Silpa modules that can be ported to Java and create an SDK so that other developers can use them in their apps. Modules like Indic Render, Transliteration and Payyans have really good potential on Android because of the fragmentation that exists in Android and the lack of proper Indic support. This SDK will help developers support their Indic apps on a wide range of Android devices.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Objectives&#039;&#039;&#039;:&lt;br /&gt;
&amp;lt;Please note this idea is for a SDK, not an app or just a java port&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*All modules need to be ported to Java so that they can be used inside an Android project.&lt;br /&gt;
*Other applications should be able to use this Silpa library to easily integrate features from our modules (as an SDK). E.g.&lt;br /&gt;
**Transliteration - The developer can specify that a text input inside the application needs transliteration, and our SDK should take care of the transliteration process whenever the user enters text in that field.&lt;br /&gt;
**Render module - Detect whether the necessary font is available on the system; if it is not, render the text as an image and replace the text with it.&lt;br /&gt;
**All modules can be explained like this.&lt;br /&gt;
*Investigate whether the image rendering part of the render module can be done on the device, inside the application itself. A few ways to implement that are&lt;br /&gt;
**Compiling cairo/pango with the NDK&lt;br /&gt;
**Compiling HarfBuzz from the AOSP tree with the NDK&lt;br /&gt;
**Based on the result of the rendering module investigation, we can decide whether to use server-side rendering or not.&lt;br /&gt;
**Pack popular fonts with the SDK and use them to display text if the device doesn&#039;t have the required font. (There are a few hacks to get better rendering in older versions of Android.) The developer should be able to force rendering using a packaged font, to get consistency across devices.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;Better to prepare SDK with helper than preparing application itself. SDK aka library&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Hrishikesh K. B, Jishnu Mohan, Aashik S&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC -&lt;br /&gt;
*Hrishikesh K B - stultus on #smc-project and #silpa on Freenode&lt;br /&gt;
*Jishnu Mohan - jishnu7 on #smc-project and #silpa on Freenode&lt;br /&gt;
*Aashik S - irumbumoideen on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java, Android, Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Converting indic processing modules currently in SILPA into javascript modules library===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port some of the SILPA algorithms to node modules. Several modules and algorithms in the SILPA project are written in Python now, but porting them to JavaScript helps developers. For example, cross-language transliteration can be done in JavaScript too if we port the algorithm and transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in JavaScript.&lt;br /&gt;
&lt;br /&gt;
Proposed javascript module pattern is https://github.com/umdjs/umd&lt;br /&gt;
&lt;br /&gt;
Student proposals should include a list of the algorithms to be ported, planned demo applications, planned documentation details, and publishing details (for example, the npm registry).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Santhosh Thottingal, Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - jishnu7 santhosh on #smc-project and #silpa on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: javascript, python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Language filter for diaspora==&lt;br /&gt;
&lt;br /&gt;
Diaspora is a Free Software, federated social networking platform. Diaspora users post in many languages. When people use more than one language in their posts, it is inconvenient for people who don&#039;t understand a language. This task is to tag every post with languages used in the post, ideally detected automatically, but with an option to override it. Once each post has a language tag, people should be able to choose their preferred language and posts in other languages should be hidden by default. Also provide an option to translate posts and comments.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Pirate Praveen, Ershad K&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentors&#039;&#039;&#039;: IRC  &lt;br /&gt;
*Pirate Praveen - j4v4m4n on #smc-project on Freenode&lt;br /&gt;
*Ershad K - ershad on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
&lt;br /&gt;
Varnam is a cross-platform predictive transliterator for Indian languages. It works mostly like Google&#039;s transliterate, but shows key differences in the way word tokenization is done. It has a learning system built in which allows Varnam to make smart predictions. &lt;br /&gt;
&lt;br /&gt;
There are varnam clients available as [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Firefox] &amp;amp; [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Chrome addon] and an [https://gitorious.org/varnamproject/libvarnam-ibus/source/d939adf50024013902c27310c03ef21a9210cdcb IBus engine].&lt;br /&gt;
&lt;br /&gt;
To try out Varnam, navigate to [http://varnamproject.com/editor http://varnamproject.com/editor]. Currently it supports Hindi and Malayalam.&lt;br /&gt;
&lt;br /&gt;
* [http://www.varnamproject.com/docs/faq FAQ]&lt;br /&gt;
* [http://www.varnamproject.com/docs Documentation]&lt;br /&gt;
* [http://www.varnamproject.com/docs/contributing Contributors guide &amp;amp; ideas to work on]&lt;br /&gt;
&lt;br /&gt;
Apart from the following ideas, you can propose your own idea. &lt;br /&gt;
&lt;br /&gt;
===Programming language bindings &amp;amp; varnam-daemon===&lt;br /&gt;
&lt;br /&gt;
Varnam is written in C, which makes interoperability with other languages easy. There are language bindings available for `NodeJs` and `Ruby`. Supporting Varnam in multiple languages allows projects to use varnam easily to enable Indian language input.&lt;br /&gt;
&lt;br /&gt;
To make using varnam from different languages easier, build a cross-platform standalone process which uses the `libvarnam` shared library and exposes an RPC API over the network. This allows any programming language with socket support to be used with libvarnam. It also makes language bindings fairly easy, because they don&#039;t have to work with native interoperability support. The protocol can be a simple text-based protocol covering all the commands that `libvarnam` supports. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
This includes rewriting the current implementation in `golang` and adding support for WebSockets to improve the input experience. It also includes writing scripts that ease embedding input on any webpage. All the changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improve the learning system===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The main goal is to improve how varnam tokenizes when learning words. Today, when a word is learned, varnam takes all the possible prefixes into account and learns all of them to improve future suggestions. But sometimes this is not enough to produce good suggestions. The suggested improvement is to try to infer the base form of the word being learned.&lt;br /&gt;
&lt;br /&gt;
Varnam has a learning system built-in which can learn words and it can also learn possible other ways to write a word. Consider the following example. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
learn(&amp;quot;भारत&amp;quot;) = [bharat, bhaarath, bharath]&lt;br /&gt;
transliterate(&amp;quot;bharat&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bhaarath&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bharath&amp;quot;) = भारत&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Varnam also learns a word&#039;s prefixes so that it can produce better predictions for any word which has the same prefix. So in this case, with just learning the word &amp;quot;भारत&amp;quot;, varnam can predict &amp;quot;bharateey&amp;quot; = &amp;quot;भारतीय&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The proposed idea is about making this learning better. One example is to infer the word &amp;quot;भारत&amp;quot; when learning भारतीय. This would be something like a Porter stemmer implementation, but integrated into the varnam framework so that new language support can be added easily.&lt;br /&gt;
&lt;br /&gt;
This idea also includes improving concurrency support for learn. Currently, learn can&#039;t be called concurrently because of SQLite restrictions, so every learn has to be done sequentially. This needs to be improved with a simple internal queue in which words are queued while learn is busy. &lt;br /&gt;
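&lt;br /&gt;
The internal queue described above could look roughly like the following Python sketch. It is an assumption-laden illustration, not libvarnam code: LearnQueue and its methods are hypothetical names, and learn_fn stands in for whatever function performs the actual (non-concurrent) learn.&lt;br /&gt;
&lt;br /&gt;
```python
import queue
import threading


class LearnQueue:
    """Serialize learn calls through one worker thread (hypothetical helper)."""

    def __init__(self, learn_fn):
        self._learn = learn_fn
        self._pending = queue.Queue()
        self.errors = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, word):
        """Callable from any thread; returns immediately."""
        self._pending.put(word)

    def _run(self):
        # Only this thread ever calls learn, so calls never overlap.
        while True:
            word = self._pending.get()
            try:
                self._learn(word)
            except Exception as exc:
                self.errors.append((word, exc))
            finally:
                self._pending.task_done()

    def wait(self):
        """Block until every queued word has been processed."""
        self._pending.join()
```
&lt;br /&gt;
Callers only pay the cost of enqueueing, while the single worker keeps access to the word store sequential and in submission order.&lt;br /&gt;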
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  Knowledge in C, Ruby (basics)&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Create an Android IME===&lt;br /&gt;
&lt;br /&gt;
Android has an extensible input method system. Use that to make an IME which uses varnam internally. This includes getting `libvarnam` compiled on Android first. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  C, Android, Java&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
===Enable varnam&#039;s suggestions system to be used from Inscript or any other input system===&lt;br /&gt;
&lt;br /&gt;
Varnam has knowledge of a lot of words. This idea proposes a method to use these words to provide suggestions for other input systems. Basically, in Varnam, the API call will be something like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&amp;lt;pre&amp;gt;&lt;br /&gt;
varnam_get_suggestions (handle, &amp;quot;भारत&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will fetch all the suggestions that have the given prefix. &lt;br /&gt;
&lt;br /&gt;
`varnam_get_suggestions` needs to keep track of the previous words and use [http://en.wikipedia.org/wiki/N-gram n-gram] based dataset to filter the results. This should also learn the words back into the word corpus that varnam is using. Filtering suggestions won&#039;t be just a prefix search, but it will have knowledge about how text can be written in the target language and provide smart filtering. Searching in a large corpus and providing real-time suggestions makes this a challenging task. &lt;br /&gt;
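&lt;br /&gt;
As a rough illustration of combining a prefix search with n-gram context, here is a small Python sketch that uses bigrams. It is not the libvarnam API: BigramSuggester, learn and suggest are hypothetical names, and the in-memory counters stand in for the real word corpus.&lt;br /&gt;
&lt;br /&gt;
```python
from collections import Counter, defaultdict


class BigramSuggester:
    """Rank prefix matches by how often they followed the previous word."""

    def __init__(self):
        self.unigrams = Counter()
        self.bigrams = defaultdict(Counter)

    def learn(self, sentence):
        """Count single words and adjacent word pairs."""
        words = sentence.split()
        self.unigrams.update(words)
        for prev, cur in zip(words, words[1:]):
            self.bigrams[prev][cur] += 1

    def suggest(self, prefix, previous=None, limit=5):
        """Return known words starting with prefix, best candidates first."""
        candidates = [w for w in self.unigrams if w.startswith(prefix)]
        follow = self.bigrams.get(previous, Counter())
        # Sort by bigram count, then overall frequency, then alphabetically.
        candidates.sort(key=lambda w: (-follow[w], -self.unigrams[w], w))
        return candidates[:limit]
```
&lt;br /&gt;
A linear scan like this would be too slow for a large corpus; real-time suggestions would need an indexed prefix structure, which is part of what makes the task challenging.&lt;br /&gt;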
&lt;br /&gt;
Once this is implemented in `libvarnam`, it can be used in the ibus-engine.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  C, Unicode &amp;amp; encodings&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool which can upload/download the word corpus from offline IMEs like varnam-ibus[2]. This helps to build the online word corpus easily.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  Knowledge in C/golang&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://www.varnamproject.com&lt;br /&gt;
* [2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
==Adding Braille Keyboard layouts for Indian Languages to m17n Library==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The project is to build support for Bharati Braille keyboard layouts in GNU/Linux systems. The Bharati Braille standard is the official Braille standard in India. A regular QWERTY keyboard is used for data entry; the SDF-JKL keys are used for the six dots of Braille. This support needs to be built as m17n layouts. It will enable visually challenged people who have learned Braille layouts to use GNU/Linux systems easily with the help of audio feedback from TTS.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;&lt;br /&gt;
* http://www.acharya.gen.in:8080/disabilities/bh_brl.php&lt;br /&gt;
* http://en.wikipedia.org/wiki/Bharati_Braille&lt;br /&gt;
* http://www.nongnu.org/m17n/&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Anivar Aravind&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - anivar on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Grandham ==&lt;br /&gt;
&lt;br /&gt;
=== Adding MARC21 import/export feature in Grandham ===&lt;br /&gt;
&lt;br /&gt;
We need a feature in Grandham to import and parse data from MARC21 documents. We should also be able to export existing data in MARC21.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : High&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Ershad K&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - ershad on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in Ruby/Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://dev.grandham.org&lt;br /&gt;
* [2]: https://github.com/smc/grandham&lt;br /&gt;
&lt;br /&gt;
=Projects with unconfirmed mentors=&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4677</id>
		<title>GSoC/2014/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4677"/>
		<updated>2014-03-07T05:16:56Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Varnam Based */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt; &lt;br /&gt;
&amp;lt;font color=&amp;quot;red&amp;quot;&amp;gt; &amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas, you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Potential Mentors=&lt;br /&gt;
# Santhosh Thottingal (&#039;&#039;&#039;santhosh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Baiju M (&#039;&#039;&#039;baijum&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Praveen A (&#039;&#039;&#039;j4v4m4n&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Rajeesh K Nambiar (&#039;&#039;&#039;rajeeshknambiar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Vasudev Kammath (&#039;&#039;&#039;copyninja&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jishnu Mohan (&#039;&#039;&#039;jishnu7&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Hrishikesh K.B (&#039;&#039;&#039;stultus&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anivar Aravind (&#039;&#039;&#039;anivar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anilkumar K V (&#039;&#039;&#039;anilkumar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Sajjad Anwar (&#039;&#039;&#039;geohacker&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Deepa V Gopinath (&#039;&#039;&#039;deepagopinath&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jain Basil (&#039;&#039;&#039;jainbasil&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Ershad K (&#039;&#039;&#039;ershad&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Navaneeth (&#039;&#039;&#039;nkn__&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nishan Naseer (&#039;&#039;&#039;nishan&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nandaja Varma (&#039;&#039;&#039;gem&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2014=&lt;br /&gt;
* Please Read the [http://wiki.smc.org.in/SoC/2014#FAQ FAQ]&lt;br /&gt;
* If you want to propose an idea, please do it in [http://lists.smc.org.in/listinfo.cgi/student-projects-smc.org.in student projects mailing list]&lt;br /&gt;
&lt;br /&gt;
=Projects with confirmed mentors=&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The SILPA project has a spellchecker written in Python using a fairly involved algorithm. Even so, it cannot handle the inflection and agglutination that occur in Indian languages, especially South Indian languages. The dictionary for the Malayalam spellchecker has about 150000 words. We can expand the dictionary, but that adds little value, since words in Malayalam, Tamil and similar languages can be formed by joining multiple words. In addition, words get inflected based on grammar forms (sandhi), plural, gender etc. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words. We will need word-splitting logic or a table that takes care of all patterns. The project is to attempt to solve this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently there was an attempt to develop a Tamil spellchecker using Hunspell with multi-level suffix stripping. You can see the result here: https://github.com/thamizha/solthiruthi.&lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. It will probably require a lot of scripting to generate suffix patterns; we can ask existing language communities for help too. If Hunspell turns out to be limited with multi-level suffixes (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document it (bug report and documentation) and attempt a Python-based solution on top of the SILPA framework.&lt;br /&gt;
&lt;br /&gt;
The project is not about coding an existing algorithm, but about developing and implementing a new one.&lt;br /&gt;
&lt;br /&gt;
Homework to do before submitting applications:&lt;br /&gt;
# Use Hunspell in any Indian language like Malayalam for spell correction in editors or word processors and understand the limitations&lt;br /&gt;
# Study the nature of inflection and agglutination in Indian languages, read existing documents on this(ask for documents too) and note down your observations&lt;br /&gt;
# Study Hunspell and other spellcheckers to see how this problem is addressed&lt;br /&gt;
# Understand how a spell checker works. How to write a spellchecker from scratch?&lt;br /&gt;
# Come up with a plan about addressing the issue.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Santhosh Thottingal&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - santhosh on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Average level understanding of grammar system of at least one Indian language and complete the homework as listed above.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. For more information about ConTeXt, see the wiki http://wiki.contextgarden.net/Main_Page. ConTeXt MKII has Indic language rendering support using XeTeX, but MKII is deprecated, and the new MKIV backend doesn&#039;t support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses Harfbuzz to do correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for MKIV lua code is available. ConTeXt mkii (deprecated) can work with XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using command&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Rajeesh K Nambiar&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - rajeeshknambiar on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt and basic understanding of Indic language rendering. MKIV uses Lua; familiarity with Lua, the OpenType specification or Harfbuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools, which can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that particular language. The acoustic model characterizes how sound changes over time and captures the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty taken when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims to develop an acoustic model and a language model for Malayalam.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Background Reading&#039;&#039;&#039;&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET ,Hyderabad, Prof. Dr. S. Rama Mohan Rao, A.Gopi &lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori, Hussein Hiyassat, Mostafa Harti, and Noureddine Chenfour&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Deepa P Gopinath&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - deepagopinath on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==SILPA Project Based==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===SILPA Project Improvements===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
This is a set of ideas needed to improve the existing SILPA infrastructure. We have decided on the following tasks as part of this project:&lt;br /&gt;
&lt;br /&gt;
# Provide REST API to SILPA without disturbing existing JSONRPC API&lt;br /&gt;
# Improve the Transliteration module&lt;br /&gt;
# Integrate [https://github.com/Project-SILPA/flask-webfonts Flask Webfonts] extension with SILPA to provide Webfonts support.&lt;br /&gt;
&lt;br /&gt;
==== Provide REST like API for SILPA ====&lt;br /&gt;
&lt;br /&gt;
SILPA currently provides a JSONRPC API, which is also used by the templates of the framework. JSONRPC is not well supported in all languages and results in [https://en.wikipedia.org/wiki/Not_invented_here NIH code]. So we would like to provide REST-like HTTP-based APIs for SILPA and at the same time leave the current JSONRPC code untouched for backward compatibility reasons.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Objectives&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
* Develop module or use existing module to provide REST like API&#039;s&lt;br /&gt;
* API should support GET and POST. [http://www.w3.org/2001/tag/doc/whenToUseGet.html When to use GET?].&lt;br /&gt;
&lt;br /&gt;
Many people have doubts about what the API should look like. The Twitter API (https://dev.twitter.com/docs/api) can serve as an example.&lt;br /&gt;
Sample API calls:&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/ASCII2Unicode&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/Unicode2ASCII&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
Generic: &lt;br /&gt;
    GET/POST (http://api.silpa.org.in/module/function_name or http://silpa.org.in/api/module/function_name)&lt;br /&gt;
    Parameters: function parameters&lt;br /&gt;
    Response: JSON encoded return value from function&lt;br /&gt;
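&lt;br /&gt;
The generic call shape above can be sketched in Python using only the standard library. This is an assumption-laden illustration, not SILPA code: `REGISTRY`, `dispatch` and the `demo` module are hypothetical names, and a real service would sit behind an HTTP framework that extracts the path and parameters from the GET or POST request.&lt;br /&gt;
&lt;br /&gt;

```python
import json

# Hypothetical registry: module name, then function name, then callable.
REGISTRY = {
    "demo": {
        "reverse": lambda text: text[::-1],
    },
}


def dispatch(path, params):
    """Map /module/function_name plus parameters to (status, JSON body)."""
    parts = [part for part in path.split("/") if part]
    if len(parts) != 2:
        return 404, json.dumps({"error": "expected /module/function_name"})
    module, function_name = parts
    function = REGISTRY.get(module, {}).get(function_name)
    if function is None:
        return 404, json.dumps({"error": "unknown module or function"})
    try:
        # Request parameters become keyword arguments of the function.
        result = function(**params)
    except TypeError as error:
        return 400, json.dumps({"error": str(error)})
    # ensure_ascii=False keeps Indic text readable in the response body.
    return 200, json.dumps({"result": result}, ensure_ascii=False)
```

&lt;br /&gt;
Because the lookup is driven by the path, exposing a new function only needs a registry entry, and the existing JSONRPC code is not involved at all.&lt;br /&gt;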
&lt;br /&gt;
====  Improve Transliteration module ====&lt;br /&gt;
&lt;br /&gt;
We have a Transliteration module which supports transliteration from any Indic language to any other Indic language, as well as English to Indic and Indic to English transliteration. It also supports the IPA and ISO 15919 transliteration systems. But the module isn&#039;t in perfect shape and has a lot of bugs. With this idea we would like to improve the following parts:&lt;br /&gt;
&lt;br /&gt;
# Improve the cross-Indic-language transliteration system. Currently only Malayalam and Kannada work without any intermediate language; all other Indian languages are first transliterated to Malayalam and then to the target Indic language. We want to remove this source -&amp;gt; Malayalam -&amp;gt; target cycle.&lt;br /&gt;
# English to IPA transliteration is currently broken and this needs to be fixed. See [https://github.com/Project-SILPA/Transliteration/issues/3 IPA transliteration bug].&lt;br /&gt;
# Once the IPA transliteration issue above is fixed, improve the English to Indic transliteration system using IPA. Currently English to Indic transliteration uses the CMU Sphinx dictionary, which has a limited set of words, which in turn limits the output of English to Indic transliteration.&lt;br /&gt;
# Improve the ISO 15919 to Indic transliteration system; see [https://github.com/Project-SILPA/Transliteration/issues/4 ISO 15919 to Indic transliteration].&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. For an intermediate representation of the scripts, either IPA or the ISO 15919 standard can be used. All of these must be supplemented with exception rules and special-case handling to achieve more accurate results.&lt;br /&gt;
&lt;br /&gt;
==== Integrating flask-webfonts extension with SILPA ====&lt;br /&gt;
&lt;br /&gt;
SILPA used to have a Webfonts module for serving Indian language fonts as Webfonts for browsers. During GSoC 2013 it was separated out as an extension to the Flask framework which can be used with any Flask-powered app. The current code can be found at [https://github.com/Project-SILPA/flask-webfonts]. The objectives are listed below.&lt;br /&gt;
&lt;br /&gt;
# The module is not yet fine tuned and using it makes other modules break. This needs to be fixed (can be checked with the &#039;webfonts&#039; branch of the SILPA code on GitHub).&lt;br /&gt;
# Write tests to check the functionalities.&lt;br /&gt;
# Adhere to Flask extension guidelines and submit the modules to Flask extensions directory.&lt;br /&gt;
# Write a tool which can take a directory containing fonts file or single font file and generate configuration file needed by the extension. (A possible such tool which is outdated can be found at [https://github.com/copyninja/fontinfo])&lt;br /&gt;
# Provide HTTP APIs through the Flask extension which can expose the CSS for applications.&lt;br /&gt;
&lt;br /&gt;
For all the tasks above we expect documentation and test cases from the students as deliverables.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Intermediate&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Vasudev Kamath, Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentors&#039;&#039;&#039;: IRC -&lt;br /&gt;
*Vasudev Kamath - copyninja on #smc-project and #silpa on Freenode&lt;br /&gt;
*Jishnu Mohan - jishnu7 on #smc-project and #silpa on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org &amp;lt;preferred&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python , Flask , Jinja , HTML, Javascript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
# Writing applications using Flask&lt;br /&gt;
# Knowledge of various transliteration systems&lt;br /&gt;
# Webfonts knowledge and writing extensions for Flask&lt;br /&gt;
# Test-driven development.&lt;br /&gt;
&lt;br /&gt;
===Android SDK for Silpa===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port possible Silpa modules to Java and create an SDK so that other developers can use them in their apps. Modules like Indic Render, Transliteration and Payyans have really good potential on Android because of the fragmentation that exists in Android and the lack of proper Indic support. This SDK will help developers support their Indic apps on a wide range of Android devices.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Objectives&#039;&#039;&#039;:&lt;br /&gt;
&amp;lt;Please note this idea is for a SDK, not an app or just a java port&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*All modules need to be ported to java so that it can be used inside an Android Project.&lt;br /&gt;
*Other applications should be able to use this Silpa library to easily integrate features (as an SDK) from our modules. E.g.&lt;br /&gt;
**Transliteration - The developer can specify that a text input inside the application needs transliteration, and our SDK should take care of the transliteration process whenever the user inputs text into that field.&lt;br /&gt;
**Render module - Detect whether the necessary font is available on the system; if it is not, render the text as an image and replace the text with it.&lt;br /&gt;
**All modules can be explained like this.&lt;br /&gt;
*Investigate whether the image rendering part of the render module can be done on the device, inside the application itself. A few ways to implement that are&lt;br /&gt;
**Compiling cairo/pango with the NDK&lt;br /&gt;
**Compiling HarfBuzz from the AOSP tree with the NDK&lt;br /&gt;
**Based on the result of the rendering module investigation, we can decide whether to use server-side rendering or not.&lt;br /&gt;
**Pack popular fonts with the SDK, Use it to display text if device doesn&#039;t have required font. (there are few hacks to get better rendering in older versions of android). Developer should be able to force rendering using packaged font, to get consistency across devices.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;Better to prepare SDK with helper than preparing application itself. SDK aka library&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Hrishikesh K. B, Jishnu Mohan, Aashik S&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC -&lt;br /&gt;
*Hrishikesh K B - stultus on #smc-project and #silpa on Freenode&lt;br /&gt;
*Jishnu Mohan - jishnu7 on #smc-project and #silpa on Freenode&lt;br /&gt;
*Aashik S - irumbumoideen on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java, Android, Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Converting indic processing modules currently in SILPA into javascript modules library===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port some of the SILPA algorithms to node modules. Several modules and algorithms in the SILPA project are written in Python now, but porting them to JavaScript helps developers. For example, cross-language transliteration can be done in JavaScript too if we port the algorithm and transliteration rules. Similarly the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in JavaScript.&lt;br /&gt;
&lt;br /&gt;
Proposed javascript module pattern is https://github.com/umdjs/umd&lt;br /&gt;
&lt;br /&gt;
Student proposals should include a list of the algorithms to be ported, planned demo applications, planned documentation details, and publishing details (example: npm registry).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Santhosh Thottingal, Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - jishnu7 santhosh on #smc-project and #silpa on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Mailing List&#039;&#039;&#039;: silpa-discuss@nongnu.org&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: javascript, python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Language filter for diaspora==&lt;br /&gt;
&lt;br /&gt;
Diaspora is a Free Software, federated social networking platform. Diaspora users post in many languages. When people use more than one language in their posts, it is inconvenient for people who don&#039;t understand a language. This task is to tag every post with languages used in the post, ideally detected automatically, but with an option to override it. Once each post has a language tag, people should be able to choose their preferred language and posts in other languages should be hidden by default. Also provide an option to translate posts and comments.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Pirate Praveen, Ershad K&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentors&#039;&#039;&#039;: IRC  &lt;br /&gt;
*Pirate Praveen - j4v4m4n on #smc-project on Freenode&lt;br /&gt;
*Ershad K - ershad on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
&lt;br /&gt;
Varnam is a cross-platform predictive transliterator for Indian languages. It works mostly like Google&#039;s transliterate, but shows key differences in the way word tokenization is done. It has a learning system built in which allows Varnam to make smart predictions. &lt;br /&gt;
&lt;br /&gt;
There are varnam clients available as [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Firefox] &amp;amp; [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Chrome addon] and an [https://gitorious.org/varnamproject/libvarnam-ibus/source/d939adf50024013902c27310c03ef21a9210cdcb IBus engine].&lt;br /&gt;
&lt;br /&gt;
To try out Varnam, navigate to http://varnamproject.com/editor. Currently it supports Hindi and Malayalam.&lt;br /&gt;
&lt;br /&gt;
* [http://www.varnamproject.com/docs/faq FAQ]&lt;br /&gt;
* [http://www.varnamproject.com/docs Documentation]&lt;br /&gt;
* [http://www.varnamproject.com/docs/contributing Contributors guide &amp;amp; ideas to work on]&lt;br /&gt;
&lt;br /&gt;
Apart from the following ideas, you can propose your own idea. &lt;br /&gt;
&lt;br /&gt;
===Programming language bindings &amp;amp; varnam-daemon===&lt;br /&gt;
&lt;br /&gt;
Varnam is written in C, which makes interoperability with other languages easy. There are language bindings available for `NodeJs` and `Ruby`. Supporting Varnam in multiple languages allows projects to use varnam easily to enable Indian language input.&lt;br /&gt;
&lt;br /&gt;
To make using varnam from different languages easier, make a cross-platform standalone process which uses the `libvarnam` shared library and exposes an RPC API over the network. This allows any programming language with socket support to use libvarnam. It also makes language bindings fairly easy, because they don&#039;t have to deal with native interoperability. The protocol can be a simple text-based protocol covering all the commands that `libvarnam` supports.&lt;br /&gt;
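&lt;br /&gt;
As a rough illustration of what a simple text-based protocol could look like, here is a Python sketch of the request handling for one line of input. Everything in it is an assumption made for the example: the command names `PING` and `ECHO`, the `OK`/`ERR` response prefixes and the `handle_line` function are hypothetical, and the handlers are stand-ins for calls into `libvarnam`.&lt;br /&gt;
&lt;br /&gt;

```python
import shlex


def handle_line(line, commands):
    """Parse one request line and return one response line."""
    try:
        # shlex lets clients quote arguments that contain spaces.
        parts = shlex.split(line)
    except ValueError as error:
        return "ERR " + str(error)
    if not parts:
        return "ERR empty request"
    name, arguments = parts[0].upper(), parts[1:]
    handler = commands.get(name)
    if handler is None:
        return "ERR unknown command " + name
    try:
        return "OK " + handler(*arguments)
    except TypeError:
        return "ERR wrong number of arguments for " + name


# Stand-ins only; a real daemon would call into the shared library here.
COMMANDS = {
    "PING": lambda: "pong",
    "ECHO": lambda text: text,
}
```

&lt;br /&gt;
A daemon would read such lines from a socket and write the returned line back, so a client in any language only needs to open a connection and exchange text.&lt;br /&gt;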
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: C&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
This includes a rewrite of the current implementation in `golang` and adding support for WebSockets to improve the input experience. It also
includes making scripts that would ease embedding input on any webpage. All the changes done will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improve the learning system===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The main goal of this project is to improve how varnam tokenizes when learning words. Today, when a word is learned, varnam takes all the possible prefixes into account and learns all of them to improve future suggestions. But sometimes this is not enough to produce good suggestions. The suggested improvement is to try to infer the base form of the word being learned.&lt;br /&gt;
&lt;br /&gt;
Varnam has a learning system built-in which can learn words and it can also learn possible other ways to write a word. Consider the following example. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
learn(&amp;quot;भारत&amp;quot;) = [bharat, bhaarath, bharath]&lt;br /&gt;
transliterate(&amp;quot;bharat&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bhaarath&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bharath&amp;quot;) = भारत&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Varnam also learns a word&#039;s prefixes so that it can produce better predictions for any word which has the same prefix. So in this case, with just learning the word &amp;quot;भारत&amp;quot;, varnam can predict &amp;quot;bharateey&amp;quot; = &amp;quot;भारतीय&amp;quot;.&lt;br /&gt;
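&lt;br /&gt;
The effect of learned prefixes can be pictured with a toy Python sketch. It is not the varnam tokenizer: `LEARNED`, `learn`, `transliterate` and `apply_rules` are hypothetical names, and the two-entry rule table in the usage note is made up for this example. The sketch reuses the longest learned spelling at the start of the input and converts the remainder with plain symbol rules.&lt;br /&gt;
&lt;br /&gt;

```python
LEARNED = {}  # romanized spelling mapped to the learned word


def learn(word, spellings):
    """Remember every romanized spelling of a confirmed word."""
    for spelling in spellings:
        LEARNED[spelling] = word


def transliterate(text, rules):
    """Use the longest learned prefix, then fall back to symbol rules."""
    for end in range(len(text), 0, -1):
        head = LEARNED.get(text[:end])
        if head is not None:
            return head + apply_rules(text[end:], rules)
    return apply_rules(text, rules)


def apply_rules(text, rules):
    """Greedy longest-match conversion; unknown characters pass through."""
    output = []
    position = 0
    longest = max(map(len, rules), default=1)
    while position != len(text):
        for size in range(min(longest, len(text) - position), 0, -1):
            chunk = text[position:position + size]
            if chunk in rules:
                output.append(rules[chunk])
                position += size
                break
        else:
            # No rule matched at this position: copy the character as-is.
            output.append(text[position])
            position += 1
    return "".join(output)
```

&lt;br /&gt;
After `learn("भारत", ["bharat", "bhaarath", "bharath"])`, calling `transliterate("bharateey", {"ee": "ी", "y": "य"})` reuses the learned prefix "bharat" and only has to convert "eey" by rule.&lt;br /&gt;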
&lt;br /&gt;
The proposed idea is to make this learning better. One example is inferring the word &amp;quot;भारत&amp;quot; when learning भारतीय. This would be something like a Porter stemmer implementation, but integrated into the varnam framework so that new language support can be added easily.&lt;br /&gt;
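&lt;br /&gt;
A minimal Python sketch of the idea, for illustration only. The suffix table, the store layout and the function names are all assumptions made for this example; none of them are part of libvarnam, and a real stemmer would need proper per-language rules.&lt;br /&gt;

```python
# Toy sketch only: the suffix table and helper names are hypothetical,
# not part of libvarnam.
HINDI_SUFFIXES = ["ीय", "ों", "ें"]  # illustrative, far from complete


def prefixes(word):
    """Return every non-empty prefix of the word, shortest first."""
    return [word[:i] for i in range(1, len(word) + 1)]


def infer_base(word, suffixes=HINDI_SUFFIXES):
    """Strip the longest matching suffix, as long as a stem remains."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        stem = word[: -len(suffix)]
        if stem and word.endswith(suffix):
            return stem
    return word


def learn(word, store):
    """Record the word, its inferred base form, and all their prefixes."""
    base = infer_base(word)
    for form in (word, base):
        store["words"].add(form)
        store["prefixes"].update(prefixes(form))
    return base


store = {"words": set(), "prefixes": set()}
base = learn("भारतीय", store)
```

Here learning भारतीय also records भारत as a full word rather than only as a prefix, which is the kind of inference the project asks for.&lt;br /&gt;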
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  Knowledge in C, Ruby (basics)&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Create an Android IME===&lt;br /&gt;
&lt;br /&gt;
Android has an extensible input method system. Use that to make an IME which uses varnam internally. This includes getting `libvarnam` compiled on Android first.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  C, Android, Java&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
===Enable varnam&#039;s suggestions system to be used from Inscript or any other input system===&lt;br /&gt;
&lt;br /&gt;
Varnam knows a lot of words. This idea proposes a way to use these words to provide suggestions for other input systems. In Varnam, the API call would be something like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&amp;lt;pre&amp;gt;&lt;br /&gt;
varnam_get_suggestions (handle, &amp;quot;भारत&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This will fetch all the suggestions which have the given prefix.&lt;br /&gt;
&lt;br /&gt;
`varnam_get_suggestions` needs to keep track of the previous words and use an [http://en.wikipedia.org/wiki/N-gram n-gram] based dataset to filter the results. It should also learn the words back into the word corpus that varnam is using. Filtering suggestions won&#039;t be just a prefix search; it will use knowledge about how text is written in the target language to provide smart filtering. Searching a large corpus and providing real-time suggestions makes this a challenging task.&lt;br /&gt;
&lt;br /&gt;
Once this is implemented in `libvarnam`, it can be used in the ibus-engine.&lt;br /&gt;
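&lt;br /&gt;
The filtering described above can be sketched in Python as a prefix search ranked by bigram counts. This is a toy in-memory model under stated assumptions: the class and method names are hypothetical, and it says nothing about how `libvarnam` stores its corpus or how the real `varnam_get_suggestions` would be implemented.&lt;br /&gt;

```python
from collections import Counter, defaultdict


class SuggestionIndex:
    """Prefix lookup ranked by bigram counts (toy in-memory model)."""

    def __init__(self):
        self.unigrams = Counter()
        self.bigrams = defaultdict(Counter)

    def learn(self, sentence):
        """Learn the words of a sentence and the word pairs in it."""
        words = sentence.split()
        self.unigrams.update(words)
        for previous, current in zip(words, words[1:]):
            self.bigrams[previous][current] += 1

    def suggest(self, prefix, previous=None, limit=5):
        """Return words starting with prefix, best candidates first."""
        candidates = [w for w in self.unigrams if w.startswith(prefix)]
        following = self.bigrams.get(previous, Counter())
        # Rank by how often the word followed the previous word,
        # then by overall frequency, then alphabetically.
        candidates.sort(key=lambda w: (-following[w], -self.unigrams[w], w))
        return candidates[:limit]


index = SuggestionIndex()
index.learn("मेरा भारत महान")
index.learn("भारत एक देश है")
index.learn("वह भारतीय है")
without_context = index.suggest("भार")
with_context = index.suggest("भार", previous="वह")
```

Without context the more frequent भारत is ranked first; after the word वह the ranking puts भारतीय first, because that pair was seen in the learned text.&lt;br /&gt;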
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  C, Unicode &amp;amp; encodings&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool which can upload/download the word corpus from offline IMEs like varnam-ibus[2]. This helps to build the online word corpus easily.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  Knowledge in C/golang&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://www.varnamproject.com&lt;br /&gt;
* [2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
==Adding Braille Keyboard layouts for Indian Languages to m17n Library==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The project is to build support for Bharati Braille keyboard layouts on GNU/Linux systems. The Bharati Braille standard is the official Braille standard in India. A regular QWERTY keyboard is used for data entry, with the SDF-JKL keys used for the six dots of Braille. This support needs to be built as m17n layouts. It will enable visually challenged people who have learned Braille layouts to use GNU/Linux systems easily, with the help of audio feedback from TTS.&lt;br /&gt;
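&lt;br /&gt;
As a rough illustration of six-key entry, the Python sketch below turns a chord of pressed keys into a Unicode Braille pattern. The key-to-dot assignment is an assumption (the common Perkins-style arrangement); the text above only says that SDF-JKL are used. Mapping a Braille cell to a Bharati Braille letter needs a per-language table that is not shown here, and the real layouts would be written as m17n input method files rather than Python.&lt;br /&gt;

```python
# Assumed key-to-dot assignment (Perkins-style); not taken from the
# Bharati Braille standard or from m17n.
KEY_TO_DOT = {"f": 1, "d": 2, "s": 3, "j": 4, "k": 5, "l": 6}

BRAILLE_BASE = 0x2800  # first code point of the Unicode Braille Patterns block


def chord_to_braille(keys):
    """Convert keys pressed together into one Unicode braille cell."""
    mask = 0
    for key in keys:
        # Dot n is stored in bit n - 1 of the code point offset.
        mask |= 2 ** (KEY_TO_DOT[key.lower()] - 1)
    return chr(BRAILLE_BASE + mask)


dot_one = chord_to_braille("f")
dots_one_two = chord_to_braille("fd")
all_dots = chord_to_braille("sdfjkl")
```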
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;&lt;br /&gt;
* http://www.acharya.gen.in:8080/disabilities/bh_brl.php&lt;br /&gt;
* http://en.wikipedia.org/wiki/Bharati_Braille&lt;br /&gt;
* http://www.nongnu.org/m17n/&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Anivar Aravind&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - anivar on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Grandham ==&lt;br /&gt;
&lt;br /&gt;
=== Adding MARC21 import/export feature in Grandham ===&lt;br /&gt;
&lt;br /&gt;
We need a feature in Grandham to import and parse data from MARC21 documents. We should also be able to export existing data in MARC21.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : High&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Ershad K&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - ershad on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in Ruby/Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://dev.grandham.org&lt;br /&gt;
* [2]: https://github.com/smc/grandham&lt;br /&gt;
&lt;br /&gt;
=Projects with unconfirmed mentors=&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4641</id>
		<title>GSoC/2014/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4641"/>
		<updated>2014-02-28T18:16:29Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Varnam Based */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt; &lt;br /&gt;
&amp;lt;font color=&amp;quot;red&amp;quot;&amp;gt; &amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas, you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Potential Mentors=&lt;br /&gt;
# Santhosh Thottingal (&#039;&#039;&#039;santhosh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Baiju M (&#039;&#039;&#039;baijum&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Praveen A (&#039;&#039;&#039;j4v4m4n&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Rajeesh K Nambiar (&#039;&#039;&#039;rajeeshknambiar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Vasudev Kammath (&#039;&#039;&#039;copyninja&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jishnu Mohan (&#039;&#039;&#039;jishnu7&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Hrishikesh K.B (&#039;&#039;&#039;stultus&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anivar Aravind (&#039;&#039;&#039;anivar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anilkumar K V (&#039;&#039;&#039;anilkumar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Sajjad Anwar (&#039;&#039;&#039;geohacker&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Deepa V Gopinath (&#039;&#039;&#039;deepagopinath&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jain Basil (&#039;&#039;&#039;jainbasil&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Ershad K (&#039;&#039;&#039;ershad&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Navaneeth (&#039;&#039;&#039;nkn__&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nishan Naseer (&#039;&#039;&#039;nishan&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nandaja Varma (&#039;&#039;&#039;gem&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2014=&lt;br /&gt;
* Please read the [http://wiki.smc.org.in/SoC/2014#FAQ FAQ]&lt;br /&gt;
* If you want to propose an idea, please do it on the [http://lists.smc.org.in/listinfo.cgi/student-projects-smc.org.in student projects mailing list]&lt;br /&gt;
&lt;br /&gt;
=Projects with confirmed mentors=&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The SILPA project has a spellchecker written in Python with a not-so-simple algorithm. Still, it is not capable of handling the inflection and agglutination that occur in Indian languages, especially South Indian languages. The dictionary we have for the Malayalam spellchecker has about 150000 words. Of course we can expand the dictionary, but that doesn&#039;t have much value, since words in Malayalam, Tamil etc. can be formed by joining multiple words. In addition, words get inflected based on grammar forms (sandhi), plural, gender etc. Hunspell has a system to handle this, but so far nobody has succeeded in getting it working for the multi-level suffix stripping required for Malayalam. Sometimes a Malayalam word can be formed by joining more than 5 words together. We will need word-splitting logic or a table covering all patterns. The project is to attempt solving this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently, a Tamil project attempted to develop a spellchecker using Hunspell with multi-level suffix stripping. You can see the result here: https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. It will probably require a lot of scripting to generate suffix patterns; we can ask for help from existing language communities too. If Hunspell turns out to be limited in multi-level suffix handling (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document it (bug report and documentation) and attempt a Python-based solution on top of the SILPA framework.&lt;br /&gt;
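&lt;br /&gt;
To make multi-level suffix stripping concrete, here is a small Python sketch of the fallback approach. It is only an illustration under stated assumptions: the function name is hypothetical, the romanized word and suffixes are toy data, and sandhi changes at the joins are ignored entirely. It is not how Hunspell or the SILPA spellchecker works.&lt;br /&gt;

```python
def split_word(word, dictionary, suffixes, max_depth=6):
    """Return one split of word into a known stem plus suffixes, or None."""
    if word in dictionary:
        return [word]
    if max_depth == 0:
        return None
    for suffix in suffixes:
        stem = word[: -len(suffix)]
        if stem and word.endswith(suffix):
            # Strip one suffix, then try to explain what is left.
            rest = split_word(stem, dictionary, suffixes, max_depth - 1)
            if rest is not None:
                return rest + [suffix]
    return None


# Toy romanized data, for illustration only.
dictionary = {"kutti"}
suffixes = ["kal", "ude"]
good = split_word("kuttikalude", dictionary, suffixes)
bad = split_word("kuttikalud", dictionary, suffixes)
```

A word would be accepted by the spellchecker when a split exists and flagged when the result is None.&lt;br /&gt;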
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039;: Santhosh Thottingal&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - santhosh on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Average level understanding of grammar system of at least one Indian language&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the student will learn&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. For more information about ConTeXt, see the wiki at http://wiki.contextgarden.net/Main_Page. ConTeXt MKII has Indic language rendering support using XeTeX, but MKII is deprecated, and the new MKIV backend doesn&#039;t support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses HarfBuzz to do correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for MKIV lua code is available. ConTeXt mkii (deprecated) can work with XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using the command:&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Rajeesh K Nambiar&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - rajeeshknambiar on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt and basic understanding of Indic language rendering. MKIV uses Lua, so familiarity with Lua, the OpenType specifications or HarfBuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools, which can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that particular language. Acoustic models characterize how sound changes over time; they capture the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty taken when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims at developing an acoustic model and a language model for Malayalam.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Background Reading&#039;&#039;&#039;&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET ,Hyderabad, Prof. Dr. S. Rama Mohan Rao, A.Gopi &lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori, Hussein Hiyassat, Mostafa Harti, and Noureddine Chenfour&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Deepa P Gopinath&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - deepagopinath on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Silpa based==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Provide REST API for new flask based Silpa, including conversion of templates to this REST API from JSON RPC===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Silpa currently relies on JSONRPC. We need to either move completely to a REST API or provide a REST API as an additional feature.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Objectives&#039;&#039;&#039;:&lt;br /&gt;
*Use Flask-RESTful and write a separate module without modifying the existing JSONRPC. JSONRPC needs to remain available for backward compatibility&lt;br /&gt;
*Both GET and POST should be supported. The developer can decide which to use. (Do we need this?)&lt;br /&gt;
Many people have doubts about what the API should look like. We can give the Twitter API (https://dev.twitter.com/docs/api) as an example. &lt;br /&gt;
Sample API calls :&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/ASCII2Unicode&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
    POST api.silpa.org.in/payyans/Unicode2ASCII&lt;br /&gt;
    Parameters: text, font&lt;br /&gt;
    Response: JSON data&lt;br /&gt;
-------------------------------------------------------------&lt;br /&gt;
Generic: &lt;br /&gt;
    GET/POST (http://api.silpa.org.in/module/function_name or http://silpa.org.in/api/module/function_name)&lt;br /&gt;
    Parameters: function parameters&lt;br /&gt;
    Response: JSON encoded return value from function&lt;br /&gt;
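&lt;br /&gt;
The generic endpoint above can be sketched as a small dispatch function. This Python sketch is framework-neutral and uses only the standard library; it does not use Flask-RESTful, and the registry contents, the function name and the error format are assumptions made for illustration, not the Silpa API.&lt;br /&gt;

```python
import json

# Hypothetical registry; real Silpa modules would be registered here.
REGISTRY = {
    ("demo", "reverse"): lambda text: text[::-1],
    ("demo", "count_words"): lambda text: len(text.split()),
}


def handle_request(path, params):
    """Dispatch "/api/MODULE/FUNCTION" to a registered callable.

    Returns an (HTTP status, JSON body) pair.
    """
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "api":
        return 404, json.dumps({"error": "unknown endpoint"})
    function = REGISTRY.get((parts[1], parts[2]))
    if function is None:
        return 404, json.dumps({"error": "unknown module or function"})
    try:
        # Request parameters become keyword arguments of the function.
        result = function(**params)
    except TypeError as error:
        return 400, json.dumps({"error": str(error)})
    return 200, json.dumps({"result": result}, ensure_ascii=False)


ok = handle_request("/api/demo/reverse", {"text": "abc"})
missing = handle_request("/api/demo/missing", {})
bad_params = handle_request("/api/demo/reverse", {"font": "x"})
```

Because the same function can serve both GET and POST, the choice of HTTP method can be left to the web framework layer.&lt;br /&gt;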
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Easy&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Vasudev Kamath, Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentors&#039;&#039;&#039;: IRC -&lt;br /&gt;
*Vasudev Kamath - copyninja on #smc-project and #silpa on Freenode&lt;br /&gt;
*Jishnu Mohan - jishnu7 on #smc-project and #silpa on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python , Flask , Jinja , HTML, Javascript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Android SDK for Silpa===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port the Silpa modules that can be ported to Java and create an SDK so that other developers can use them in their apps. Modules like Indic Render, Transliteration and Payyans have really good potential on Android because of the fragmentation that exists in Android and the lack of proper Indic support. This SDK will help developers support their Indic apps on a wide range of Android devices.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Objectives&#039;&#039;&#039;:&lt;br /&gt;
&amp;lt;Please note this idea is for an SDK, not an app or just a Java port&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*All modules need to be ported to java so that it can be used inside an Android Project.&lt;br /&gt;
*Other applications should be able to use this Silpa library to easily integrate features (as an SDK) from our modules. E.g.&lt;br /&gt;
**Transliteration - The developer can specify that a text input inside the application needs transliteration, and our SDK should take care of the transliteration process whenever the user inputs text into that field.&lt;br /&gt;
**Render module - Detect whether necessary font is available in the  system, if it is not, render text as image and replace text with this.&lt;br /&gt;
**All modules can be explained like this.&lt;br /&gt;
*Investigate whether image rendering part of render module can be done in device, inside application itself. Few ways to implement that are&lt;br /&gt;
**Compiling cairo/pango with ndk&lt;br /&gt;
**Compiling HarfBuzz from the AOSP tree with ndk&lt;br /&gt;
**Based on the result of the rendering module investigation, we can decide whether to use server side rendering or not.&lt;br /&gt;
**Pack popular fonts with the SDK, Use it to display text if device doesn&#039;t have required font. (there are few hacks to get better rendering in older versions of android). Developer should be able to force rendering using packaged font, to get consistency across devices.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;Better to prepare SDK with helper than preparing application itself. SDK aka library&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Hrishikesh K. B, Jishnu Mohan, Aashik S&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC -&lt;br /&gt;
*Hrishikesh K B - stultus on #smc-project and #silpa on Freenode&lt;br /&gt;
*Jishnu Mohan - jishnu7 on #smc-project and #silpa on Freenode&lt;br /&gt;
*Aashik S - irumbumoideen on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java, Android, Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
===Converting indic processing modules currently in SILPA into javascript modules library===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port some of the Silpa algorithms to node modules. Several modules and algorithms in the SILPA project are currently implemented in Python. Porting them to JavaScript helps developers. For example, cross language transliteration can be done in JavaScript too if we port the algorithm and the transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in JavaScript.&lt;br /&gt;
&lt;br /&gt;
Proposed javascript module pattern is https://github.com/umdjs/umd&lt;br /&gt;
&lt;br /&gt;
Student proposals should include a list of the algorithms they plan to port, planned demo applications, planned documentation details, and publishing details (for example: npm registry).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - jishnu7 on #smc-project and #silpa on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: javascript, python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===  Improving cross language transliteration system.  ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Currently only Kannada and Malayalam work well; all other languages are first converted to Malayalam and then to English, because language-specific internals are lacking. For English to Indic we currently use CMUDict, so transliteration is limited to the words in CMUDict; we could probably develop a better method for English to Indic transliteration. The current Indic to English transliteration (and vice versa) depends on the CMUSphinx dictionary, which has a limited set of words, so some words are left in the native script.&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. For an intermediate representation of the scripts, either IPA or the ISO 15919 standard can be used. All of these must be supplemented with exception rules and special case handling to achieve a more accurate result.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Easy&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Vasudev Kamath, Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC -&lt;br /&gt;
*Vasudev Kamath - copyninja on #smc-project and #silpa on Freenode&lt;br /&gt;
*Jishnu Mohan - jishnu7 on #smc-project and #silpa on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
=== Internationalize SILPA project with Wikimedia jquery projects ,  Improve  the webfonts module in Silpa using jquery.webfonts and provide more Indic and complex fonts as part of it ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Internationalize SILPA&#039;&#039;&#039; :-  &lt;br /&gt;
The SILPA project has many Indic language applications, but as of now, if somebody wants to input text in Indian languages, there is no built-in tool for it. Similarly, the application is not internationalized. Both of these can be achieved by using the [//github.com/wikimedia/jquery.ime jquery.ime] and [//github.com/wikimedia/jquery.i18n jquery.i18n] libraries from Wikimedia. A sample implementation is available on our [http://smc.org.in website]. The i18n should be in the SILPA flask framework with a nice templating system. Similarly, the interface should have webfonts using the [https://github.com/wikimedia/jquery.webfonts jquery.webfonts] library.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Improve  the webfonts &#039;&#039;&#039; :- &lt;br /&gt;
* Currently Silpa provides 36 webfonts. Add more fonts to this collection.&lt;br /&gt;
* Rewrite the webfonts module to use the features of jquery.webfonts&lt;br /&gt;
* Create a repo as per jquery.webfonts specification&lt;br /&gt;
* Provide a clean api so that other websites can use our webfonts in their websites&lt;br /&gt;
* Document the usage&lt;br /&gt;
* Provide font preview and download options  &lt;br /&gt;
* &#039;&#039;&#039;This is partly done&#039;&#039;&#039; (task from last GSoC)&lt;br /&gt;
* flask-webfonts needs further improvements and fine tuning&lt;br /&gt;
* This should be integrated into our SILPA and submitted to Flask&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;&lt;br /&gt;
* [https://github.com/wikimedia/jquery.i18n jquery.i18n]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.ime jquery.ime]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.webfonts jquery.webfonts]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Vasudev Kamath, Jishnu Mohan&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC&lt;br /&gt;
*Vasudev Kamath - copyninja on #smc-project and #silpa on Freenode&lt;br /&gt;
*Jishnu Mohan - jishnu7 on #smc-project and #silpa on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: jQuery, css, html5, Python , flask , technical understanding about fonts&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
==Language filter for diaspora==&lt;br /&gt;
&lt;br /&gt;
Diaspora is a Free Software, federated social networking platform. Diaspora users post in many languages. When people use more than one language in their posts, it is inconvenient for people who don&#039;t understand a language. This task is to tag every post with languages used in the post, ideally detected automatically, but with an option to override it. Once each post has a language tag, people should be able to choose their preferred language and posts in other languages should be hidden by default. Also provide an option to translate posts and comments.&lt;br /&gt;
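&lt;br /&gt;
One cheap first step towards automatic tagging is detecting which scripts a post uses. The Python sketch below does this with Unicode character names; it is only an illustration, since Diaspora itself is written in Ruby, the function names are hypothetical, and a script is not the same thing as a language (Devanagari, for example, is shared by Hindi and Marathi), so real language detection would need more than this.&lt;br /&gt;

```python
import unicodedata
from collections import Counter


def detect_scripts(text):
    """Count letters per script, using the first word of each Unicode name."""
    counts = Counter()
    for char in text:
        if char.isalpha():
            name = unicodedata.name(char, "")
            if name:
                # For example "MALAYALAM LETTER MA" gives "MALAYALAM".
                counts[name.split()[0]] += 1
    return counts


def is_visible(text, preferred_scripts):
    """Show a post if it uses at least one of the preferred scripts."""
    return not set(detect_scripts(text)).isdisjoint(preferred_scripts)


scripts = sorted(detect_scripts("മലയാളം text"))
shown = is_visible("മലയാളം text", {"MALAYALAM"})
hidden = is_visible("only english here", {"MALAYALAM"})
```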
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentors&#039;&#039;&#039; : Pirate Praveen, Ershad K&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentors&#039;&#039;&#039;: IRC  &lt;br /&gt;
*Pirate Praveen - j4v4m4n on #smc-project on Freenode&lt;br /&gt;
*Ershad K - ershad on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
&lt;br /&gt;
Varnam is a cross-platform predictive transliterator for Indian languages. It works mostly like Google&#039;s transliterate, but shows key differences in the way word tokenization is done. It has a learning system built in which allows Varnam to make smart predictions. &lt;br /&gt;
&lt;br /&gt;
There are varnam clients available as a [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Firefox addon], a [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Chrome addon] and an [https://gitorious.org/varnamproject/libvarnam-ibus/source/d939adf50024013902c27310c03ef21a9210cdcb IBus engine].&lt;br /&gt;
&lt;br /&gt;
To try out Varnam, navigate to [http://varnamproject.com/editor varnamproject.com/editor]. Currently it supports Hindi and Malayalam.&lt;br /&gt;
&lt;br /&gt;
* FAQ - [http://www.varnamproject.com/docs/faq]&lt;br /&gt;
* Documentation - [http://www.varnamproject.com/docs]&lt;br /&gt;
* Contributors guide - [http://www.varnamproject.com/docs/contributing]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
This includes rewriting the current implementation in `golang` and adding support for WebSockets to improve the input experience. It also includes writing scripts that would ease embedding input on any webpage. All the changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improve the learning system===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The main goal of this project is to improve how varnam tokenizes when learning words. Today, when a word is learned, varnam takes all the possible prefixes into account and learns all of them to improve future suggestions. Sometimes this is not enough to produce good suggestions. The suggested improvement is to infer the base form of the word being learned.&lt;br /&gt;
&lt;br /&gt;
Varnam has a built-in learning system which can learn words as well as the other possible ways to write a word. Consider the following example.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
learn(&amp;quot;भारत&amp;quot;) = [bharat, bhaarath, bharath]&lt;br /&gt;
transliterate(&amp;quot;bharat&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bhaarath&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bharath&amp;quot;) = भारत&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Varnam also learns a word&#039;s prefixes so that it can produce better predictions for any word which has the same prefix. So in this case, having learned only the word &amp;quot;भारत&amp;quot;, varnam can predict &amp;quot;bharateey&amp;quot; = &amp;quot;भारतीय&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The proposed idea is to make this learning smarter. One example is inferring the word &amp;quot;भारत&amp;quot; when learning &amp;quot;भारतीय&amp;quot;. This could be something like a Porter stemmer implementation, but integrated into the varnam framework so that new language support can be added easily.&lt;br /&gt;
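&lt;br /&gt;
The idea can be sketched in a few lines of Python. This is only an illustration and not libvarnam code: the function names, the split between learned words and learned prefixes, and the one-entry suffix table are all assumptions, and prefixes are taken per code point rather than per varnam token.&lt;br /&gt;

```python
# Illustrative sketch only; this is not libvarnam code and every name is hypothetical.

def prefixes(word):
    """Return every non-empty prefix of word, shortest first."""
    return [word[:end] for end in range(1, len(word) + 1)]


def infer_base(word, suffixes):
    """Guess a base form by stripping the longest matching suffix, if any."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)]
    return word


def learn(word, suffixes, words, known_prefixes):
    """Store the word and its guessed base form as words, plus all their prefixes."""
    for form in (word, infer_base(word, suffixes)):
        words.add(form)
        known_prefixes.update(prefixes(form))


# Hypothetical suffix table for Hindi: vowel sign II followed by YA.
HINDI_SUFFIXES = ["\u0940\u092f"]

words, known_prefixes = set(), set()
learn("भारतीय", HINDI_SUFFIXES, words, known_prefixes)
print(sorted(words))
```

With a per-language suffix table like this, learning one inflected word also records its base form as a full word instead of only as a prefix. A real stemmer would need language-specific rules beyond plain suffix stripping.&lt;br /&gt;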
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  Knowledge in C, Ruby (basics)&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool which can upload/download the word corpus from offline IMEs like varnam-ibus[2]. This makes it easier to build the online word corpus.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - nkn__ on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:  Knowledge in C/golang&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://www.varnamproject.com&lt;br /&gt;
* [2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
==Adding Braille Keyboard layouts for Indian Languages to m17n Library==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
This project is to build support for Bharati Braille keyboard layouts on GNU/Linux systems. Bharati Braille is the official Braille standard in India. A regular QWERTY keyboard is used for data entry, with the S, D, F and J, K, L keys standing for the six Braille dots. This support needs to be built as m17n layouts. It will enable visually challenged people who have learned Braille layouts to use GNU/Linux systems easily with the help of audio feedback from TTS.&lt;br /&gt;
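&lt;br /&gt;
The chord-to-cell step can be sketched in Python as below. The key assignment (F, D, S for dots 1 to 3 and J, K, L for dots 4 to 6) is the usual six-key convention and is an assumption here, as is the function name. The sketch only produces the Unicode Braille pattern. The Bharati Braille table that maps each cell to an Indic letter is not shown, and the actual deliverable would be m17n layout files rather than Python.&lt;br /&gt;

```python
# Assumed key-to-dot assignment: f=dot 1, d=dot 2, s=dot 3, j=dot 4, k=dot 5, l=dot 6.
# In the Unicode Braille Patterns block, dot n sets bit n-1 on top of U+2800.
KEY_BITS = {"f": 0x01, "d": 0x02, "s": 0x04, "j": 0x08, "k": 0x10, "l": 0x20}


def chord_to_braille(keys):
    """Convert a chord of simultaneously pressed keys into one Braille cell."""
    mask = 0
    for key in keys.lower():
        mask |= KEY_BITS[key]
    return chr(0x2800 + mask)


print(chord_to_braille("f"))    # dot 1 only
print(chord_to_braille("fdj"))  # dots 1, 2 and 4
```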
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;&lt;br /&gt;
* http://www.acharya.gen.in:8080/disabilities/bh_brl.php&lt;br /&gt;
* http://en.wikipedia.org/wiki/Bharati_Braille&lt;br /&gt;
* http://www.nongnu.org/m17n/&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Anivar Aravind&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - anivar on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Grandham ==&lt;br /&gt;
&lt;br /&gt;
=== Adding MARC21 import/export feature in Grandham ===&lt;br /&gt;
&lt;br /&gt;
We need a feature in Grandham to import and parse data from MARC21 documents. We should also be able to export existing data in MARC21.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : High&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Confirmed Mentor&#039;&#039;&#039; : Ershad K&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;How to contact the mentor&#039;&#039;&#039;: IRC - ershad on #smc-project on Freenode&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in Ruby/Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;What the students will learn&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [1]: http://dev.grandham.org&lt;br /&gt;
* [2]: https://github.com/smc/grandham&lt;br /&gt;
&lt;br /&gt;
=Projects with unconfirmed mentors=&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4628</id>
		<title>GSoC/2014/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4628"/>
		<updated>2014-02-27T18:16:53Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Improvements to the REST API */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt; &lt;br /&gt;
&amp;lt;font color=&amp;quot;red&amp;quot;&amp;gt; &amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas, you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Potential Mentors=&lt;br /&gt;
# Santhosh Thottingal (&#039;&#039;&#039;santhosh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Baiju M (&#039;&#039;&#039;baijum&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Praveen A (&#039;&#039;&#039;j4v4m4n&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Rajeesh K Nambiar (&#039;&#039;&#039;rajeeshknambiar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Vasudev Kammath (&#039;&#039;&#039;copyninja&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jishnu Mohan (&#039;&#039;&#039;jishnu7&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Hrishikesh K.B (&#039;&#039;&#039;stultus&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anivar Aravind (&#039;&#039;&#039;anivar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anilkumar K V (&#039;&#039;&#039;anilkumar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Sajjad Anwar (&#039;&#039;&#039;geohacker&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Deepa V Gopinath (&#039;&#039;&#039;deepagopinath&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jain Basil (&#039;&#039;&#039;jainbasil&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Ershad K (&#039;&#039;&#039;ershad&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Navaneeth (&#039;&#039;&#039;nkn__&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nishan Naseer (&#039;&#039;&#039;nishan&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nandaja Varma (&#039;&#039;&#039;gem&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2014=&lt;br /&gt;
* Please read the [http://wiki.smc.org.in/SoC/2014#FAQ FAQ]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
If you want to propose an idea, please do it on the [http://lists.smc.org.in/listinfo.cgi/student-projects-smc.org.in student projects mailing list].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=Projects with confirmed mentors=&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The SILPA project has a spellchecker written in Python with a not-so-simple algorithm. Even so, it cannot handle the inflection and agglutination that occur in Indian languages, especially South Indian languages. The dictionary we have for the Malayalam spellchecker has about 150000 words. We could expand the dictionary, but that adds little value, since words in Malayalam, Tamil, etc. can be formed by joining multiple words. In addition, words get inflected based on grammatical forms (sandhi), plural, gender, etc. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words together. We will need word-splitting logic or a table that covers all patterns. The project is to attempt solving this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently the Tamil community attempted to develop a spellchecker using Hunspell with multi-level suffix stripping. You can see the result at https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. This will probably require a lot of scripting to generate suffix patterns, and we can ask for help from existing language communities too. If Hunspell turns out to be limited in handling multi-level suffixes (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document the limitation (bug report and documentation) and attempt a Python-based solution on top of the SILPA framework.&lt;br /&gt;
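&lt;br /&gt;
As a rough picture of what multi-level suffix stripping means, here is a minimal Python sketch. It is neither the SILPA spellchecker nor Hunspell: the function name is hypothetical, the dictionary and suffix list are toy English data, and real Indian-language words would also need sandhi rules that change the stem when a suffix is removed.&lt;br /&gt;

```python
def is_known(word, dictionary, suffixes, max_depth=5):
    """Return True if word is in dictionary after stripping at most max_depth suffixes."""
    if word in dictionary:
        return True
    if max_depth == 0:
        return False
    for suffix in suffixes:
        stem = word[: -len(suffix)]
        # Only recurse when the suffix really matches and something is left over.
        if stem and word.endswith(suffix):
            if is_known(stem, dictionary, suffixes, max_depth - 1):
                return True
    return False


# Toy data, purely for illustration.
DICTIONARY = {"walk"}
SUFFIXES = ["s", "er", "ing"]

print(is_known("walkers", DICTIONARY, SUFFIXES))  # strips two suffix levels
print(is_known("walkxs", DICTIONARY, SUFFIXES))
```

The recursion depth plays the role of the number of suffix levels, which is the dimension where the text above expects Hunspell to run into limits.&lt;br /&gt;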
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Intermediate understanding of the grammar system of at least one Indian language&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Santhosh Thottingal&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. To find more information about ConTeXt, see the wiki at http://wiki.contextgarden.net/Main_Page. ConTeXt MKII has Indic language rendering support through XeTeX, but MKII is deprecated and the new MKIV backend does not support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses Harfbuzz to do correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for the MKIV Lua code is available. ConTeXt MKII (deprecated) can work with the XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using the command:&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt, and a basic understanding of Indic language rendering. MKIV uses Lua, so familiarity with Lua, the OpenType specifications or Harfbuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Rajeesh K Nambiar&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools which can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that particular language. An acoustic model characterizes how sound changes over time and captures the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty taken when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims to develop an acoustic model and a language model for Malayalam.&lt;br /&gt;
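&lt;br /&gt;
To make the language model part concrete, here is a minimal bigram sketch in Python. It is not the CMU Sphinx tooling, which has its own training utilities: the function names, the START marker and the three-sentence corpus are assumptions, and no smoothing is applied.&lt;br /&gt;

```python
from collections import Counter


def train_bigrams(sentences):
    """Count how often each word follows each context word."""
    contexts = Counter()
    bigrams = Counter()
    for words in sentences:
        tokens = ["START"] + list(words)
        for prev, cur in zip(tokens, tokens[1:]):
            contexts[prev] += 1
            bigrams[(prev, cur)] += 1
    return contexts, bigrams


def bigram_probability(prev, cur, contexts, bigrams):
    """Maximum-likelihood estimate of P(cur | prev); 0.0 for unseen contexts."""
    if contexts[prev] == 0:
        return 0.0
    return bigrams[(prev, cur)] / contexts[prev]


# Toy corpus with placeholder tokens, purely for illustration.
CORPUS = [["w1", "w2"], ["w1", "w3"], ["w1", "w2"]]
contexts, bigrams = train_bigrams(CORPUS)
print(bigram_probability("w1", "w2", contexts, bigrams))
```

A usable model for Malayalam would be trained on a large text corpus and would need smoothing for unseen word sequences.&lt;br /&gt;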
&lt;br /&gt;
&#039;&#039;&#039;Background Reading&#039;&#039;&#039;&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET, Hyderabad, Prof. Dr. S. Rama Mohan Rao, A. Gopi&lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori, Hussein Hiyassat, Mostafa Harti, and Noureddine Chenfour&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;:  Deepa P. Gopinath&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Silpa based==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Provide REST API for new flask based Silpa, including conversion of templates to this REST API from JSON RPC===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Silpa currently relies on JSON-RPC. We need to either move completely to a REST API or provide a REST API as an additional feature.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python, Flask, Jinja, HTML, JavaScript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Android SDK for Silpa===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port the Silpa modules that can be ported to Java and create an SDK so that other developers can use them in their apps. Modules like Indic Render, Transliteration and Payyans have really good potential on Android because of the fragmentation that exists in Android and its lack of proper Indic support. This SDK will help developers support their Indic apps on a wide range of Android devices.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java, Android, Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Jishnu/Hrishikesh/Aashik&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Converting indic processing modules currently in SILPA into javascript modules library===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port some of the SILPA algorithms to node modules. Several modules and algorithms in the SILPA project are currently written in Python, and porting them to JavaScript will help web developers. For example, cross-language transliteration can be done in JavaScript too if we port the algorithm and the transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in JavaScript.&lt;br /&gt;
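&lt;br /&gt;
As an idea of the size of such a port, here is a small fuzzy search sketch in Python, the language SILPA currently uses. It is not the SILPA approximate search module. Plain Levenshtein edit distance is used as a stand-in technique, and the function names and word list are assumptions.&lt;br /&gt;

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def fuzzy_find(query, words, max_distance=1):
    """Return the words whose edit distance to query is at most max_distance."""
    return [w for w in words if max_distance >= edit_distance(query, w)]


print(fuzzy_find("silpa", ["silpa", "shilpa", "flask"]))
```

A function of this shape has no dependencies, so it translates almost line for line into a JavaScript module.&lt;br /&gt;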
&lt;br /&gt;
The proposed JavaScript module pattern is https://github.com/umdjs/umd&lt;br /&gt;
&lt;br /&gt;
Student proposals should list the algorithms they plan to port, the planned demo applications, the planned documentation, and publishing details (for example, the npm registry).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: JavaScript, Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Jishnu&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===  Improving cross language transliteration system.  ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Currently only Kannada and Malayalam are handled well; all other languages are first converted to Malayalam and then to English because language-internal support for them is lacking. Also, English to Indic transliteration currently uses CMUDict, so it is limited to the words present in CMUDict. We could probably develop a better method for English to Indic transliteration.&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. Either IPA or the ISO 15919 standard can be used as an intermediate representation of the scripts. All of these must be supplemented with exception rules and special-case handling to achieve more accurate results.&lt;br /&gt;
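&lt;br /&gt;
The intermediate-representation approach can be sketched in Python as follows. This is not the existing SILPA transliteration module: the table layout and function name are assumptions, and the tables cover only three independent vowels per script. A real table would cover the full script and handle the inherent vowel, vowel signs and conjuncts.&lt;br /&gt;

```python
# Tiny per-script tables mapping letters to ISO 15919 romanization.
TO_ISO = {
    "ml": {"\u0d05": "a", "\u0d06": "\u0101", "\u0d07": "i"},  # Malayalam A, AA, I
    "kn": {"\u0c85": "a", "\u0c86": "\u0101", "\u0c87": "i"},  # Kannada A, AA, I
}


def transliterate(text, source, target):
    """Map each letter to ISO 15919 and then to the target script."""
    from_iso = {iso: letter for letter, iso in TO_ISO[target].items()}
    result = []
    for letter in text:
        iso = TO_ISO[source].get(letter)
        # Letters missing from either table are passed through unchanged.
        result.append(from_iso.get(iso, letter))
    return "".join(result)


print(transliterate("\u0d05\u0d06", "ml", "kn"))
```

With a pivot like this, adding a language means writing one table to and from ISO 15919 instead of one table per language pair.&lt;br /&gt;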
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Internationalize SILPA project with Wikimedia jquery projects, improve the webfonts module in Silpa using jquery.webfonts and provide more Indic and complex fonts as part of it ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Internationalize SILPA&#039;&#039;&#039; :-  &lt;br /&gt;
The SILPA project has many Indic language applications, but as of now there is no built-in tool for somebody who wants to type in Indian languages. The application is also not internationalized. Both can be achieved by using the [//github.com/wikimedia/jquery.ime jquery.ime] and [//github.com/wikimedia/jquery.i18n jquery.i18n] libraries from Wikimedia. A sample implementation is available on our [http://smc.org.in website]. The i18n should live in the SILPA Flask framework with a nice templating system. Similarly, the interface should provide webfonts using the [https://github.com/wikimedia/jquery.webfonts jquery.webfonts] library.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Improve  the webfonts &#039;&#039;&#039; :- &lt;br /&gt;
* Currently Silpa provides 36 webfonts. Add more fonts to this collection.&lt;br /&gt;
* Rewrite the webfonts module to use the features of jquery.webfonts&lt;br /&gt;
* Create a repo as per the jquery.webfonts specification&lt;br /&gt;
* Provide a clean API so that other websites can use our webfonts&lt;br /&gt;
* Document the usage&lt;br /&gt;
* Provide font preview and download options&lt;br /&gt;
* &#039;&#039;&#039;This is partly done&#039;&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;&lt;br /&gt;
* [https://github.com/wikimedia/jquery.i18n jquery.i18n]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.ime jquery.ime]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.webfonts jquery.webfonts]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: jQuery, CSS, HTML5, Python, Flask, technical understanding of fonts&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Jishnu/Vasudev&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Language filter for diaspora==&lt;br /&gt;
&lt;br /&gt;
Diaspora is a Free Software, federated social networking platform. Diaspora users post in many languages. When people use more than one language in their posts, it is inconvenient for people who don&#039;t understand a language. This task is to tag every post with languages used in the post, ideally detected automatically, but with an option to override it. Once each post has a language tag, people should be able to choose their preferred language and posts in other languages should be hidden by default. Also provide an option to translate posts and comments.&lt;br /&gt;
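&lt;br /&gt;
One cheap first step towards automatic tagging is to look at which Unicode scripts a post uses. The Python sketch below shows the idea. It is not Diaspora code (Diaspora itself is written in Ruby on Rails) and the function name is an assumption. A script only narrows down the language, since Devanagari for instance is shared by several languages, which is one more reason to keep the manual override.&lt;br /&gt;

```python
import unicodedata
from collections import Counter


def scripts_in(text):
    """Count letters per script, using the first word of each Unicode character name."""
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    return counts


post = "hello മലയാളം"
print(scripts_in(post).most_common())
```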
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Pirate Praveen, Ershad K&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
&lt;br /&gt;
Varnam is a cross-platform predictive transliterator for Indian languages. It works mostly like Google&#039;s transliterate, but shows key differences in the way word tokenization is done. It has a learning system built in which allows Varnam to make smart predictions. &lt;br /&gt;
&lt;br /&gt;
There are Varnam clients available as [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Firefox] and [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Chrome] add-ons and as an [https://gitorious.org/varnamproject/libvarnam-ibus/source/d939adf50024013902c27310c03ef21a9210cdcb IBus engine].&lt;br /&gt;
&lt;br /&gt;
To try out Varnam, navigate to [http://varnamproject.com/editor http://varnamproject.com/editor]. Currently it supports Hindi and Malayalam.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
This project involves rewriting the current implementation in `golang` and adding WebSocket support to improve the input experience. It also includes writing scripts that make it easy to embed Varnam input on any webpage. All changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
===Improve the learning system===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The main goal of this project is to improve how varnam tokenizes words while learning them. Today, when a word is learned, varnam takes all of its possible prefixes into account and learns each of them to improve future suggestions. Sometimes this is not enough to produce good suggestions. The suggested improvement is to infer the base form of the word being learned.&lt;br /&gt;
&lt;br /&gt;
Varnam has a built-in learning system which can learn words as well as the other possible ways of writing a word. Consider the following example.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
learn(&amp;quot;भारत&amp;quot;) = [bharat, bhaarath, bharath]&lt;br /&gt;
transliterate(&amp;quot;bharat&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bhaarath&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bharath&amp;quot;) = भारत&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Varnam also learns a word&#039;s prefixes so that it can produce better predictions for any word which has the same prefix. So in this case, having learned only the word &amp;quot;भारत&amp;quot;, varnam can predict &amp;quot;bharateey&amp;quot; = &amp;quot;भारतीय&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The proposed idea is to make this learning smarter. One example is inferring the word &amp;quot;भारत&amp;quot; when learning &amp;quot;भारतीय&amp;quot;. This could be something like a Porter stemmer implementation, but integrated into the varnam framework so that new language support can be added easily.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in C, Ruby (basics)&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool which can upload/download the word corpus from offline IMEs like varnam-ibus[2]. This makes it easier to build the online word corpus.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in C/golang&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
* [1]: http://www.varnamproject.com&lt;br /&gt;
* [2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
==Adding Braille Keyboard layouts for Indian Languages to m17n Library==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
This project is to build support for Bharati Braille keyboard layouts on GNU/Linux systems. Bharati Braille is the official Braille standard in India. A regular QWERTY keyboard is used for data entry, with the S, D, F and J, K, L keys standing for the six Braille dots. This support needs to be built as m17n layouts. It will enable visually challenged people who have learned Braille layouts to use GNU/Linux systems easily with the help of audio feedback from TTS.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;&lt;br /&gt;
* http://www.acharya.gen.in:8080/disabilities/bh_brl.php&lt;br /&gt;
* http://en.wikipedia.org/wiki/Bharati_Braille&lt;br /&gt;
* http://www.nongnu.org/m17n/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Anivar Aravind&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Grandham ==&lt;br /&gt;
&lt;br /&gt;
=== Adding MARC21 import/export feature in Grandham ===&lt;br /&gt;
&lt;br /&gt;
We need a feature in Grandham to import and parse data from MARC21 documents. We should also be able to export existing data in MARC21.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in Ruby/Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : High&lt;br /&gt;
&lt;br /&gt;
* [1]: http://dev.grandham.org&lt;br /&gt;
* [2]: https://github.com/smc/grandham&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Ershad&lt;br /&gt;
&lt;br /&gt;
=Projects with unconfirmed mentors=&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4627</id>
		<title>GSoC/2014/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4627"/>
		<updated>2014-02-27T18:16:33Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Word corpus synchronization */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt; &lt;br /&gt;
&amp;lt;font color=&amp;quot;red&amp;quot;&amp;gt; &amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas, you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Potential Mentors=&lt;br /&gt;
# Santhosh Thottingal (&#039;&#039;&#039;santhosh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Baiju M (&#039;&#039;&#039;baijum&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Praveen A (&#039;&#039;&#039;j4v4m4n&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Rajeesh K Nambiar (&#039;&#039;&#039;rajeeshknambiar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Vasudev Kammath (&#039;&#039;&#039;copyninja&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jishnu Mohan (&#039;&#039;&#039;jishnu7&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Hrishikesh K.B (&#039;&#039;&#039;stultus&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anivar Aravind (&#039;&#039;&#039;anivar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anilkumar K V (&#039;&#039;&#039;anilkumar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Sajjad Anwar (&#039;&#039;&#039;geohacker&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Deepa V Gopinath (&#039;&#039;&#039;deepagopinath&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jain Basil (&#039;&#039;&#039;jainbasil&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Ershad K (&#039;&#039;&#039;ershad&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Navaneeth (&#039;&#039;&#039;nkn__&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nishan Naseer (&#039;&#039;&#039;nishan&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nandaja Varma (&#039;&#039;&#039;gem&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2014=&lt;br /&gt;
* Please read the [http://wiki.smc.org.in/SoC/2014#FAQ FAQ]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
If you want to propose an idea, please do it on the [http://lists.smc.org.in/listinfo.cgi/student-projects-smc.org.in student projects mailing list].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=Projects with confirmed mentors=&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The SILPA project has a spellchecker written in Python with a not-so-simple algorithm. Even so, it cannot handle the inflection and agglutination that occur in Indian languages, especially South Indian languages. The dictionary we have for the Malayalam spellchecker has about 150000 words. We could expand the dictionary, but that adds little value, since words in Malayalam, Tamil, etc. can be formed by joining multiple words. In addition, words get inflected based on grammatical forms (sandhi), plural, gender, etc. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words together. We will need word-splitting logic or a table that covers all patterns. The project is to attempt solving this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently the Tamil community attempted to develop a spellchecker using Hunspell with multi-level suffix stripping. You can see the result at https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. This will probably require a lot of scripting to generate suffix patterns, and we can ask for help from existing language communities too. If Hunspell turns out to be limited in handling multi-level suffixes (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document the limitation (bug report and documentation) and attempt a Python-based solution on top of the SILPA framework.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Intermediate understanding of the grammar system of at least one Indian language&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Santhosh Thottingal&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. For more information about ConTeXt, see the wiki http://wiki.contextgarden.net/Main_Page. ConTeXt MKII has Indic language rendering support using XeTeX, but MKII is deprecated and the new MKIV backend doesn&#039;t support Indic rendering yet. The aim of this project is to add support for Indic rendering to ConTeXt MKIV. XeTeX uses HarfBuzz to do correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for the MKIV Lua code is available. ConTeXt MKII (deprecated) can work with the XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using the command:&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt and a basic understanding of Indic language rendering. MKIV uses Lua; familiarity with Lua, the OpenType specifications or HarfBuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Rajeesh K Nambiar&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools, which can be used to develop speech recognition systems in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that particular language. Acoustic models characterize how sound changes over time; they capture the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty taken when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims at developing an acoustic model and a language model for Malayalam.&lt;br /&gt;
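&lt;br /&gt;
As a rough illustration of what a language model estimates, the following Python sketch computes a bigram probability from a tiny corpus. It is only a toy under stated assumptions: the function name and the sample sentences are made up, and it does not use the CMU Sphinx tools or their model formats.&lt;br /&gt;

```python
from collections import Counter


def bigram_probability(sentences, previous, current):
    """Estimate P(current | previous) by counting adjacent word pairs."""
    context_counts = Counter()  # how often a word is followed by anything
    pair_counts = Counter()     # how often a specific pair occurs
    for sentence in sentences:
        words = sentence.split()
        for first, second in zip(words, words[1:]):
            context_counts[first] += 1
            pair_counts[(first, second)] += 1
    if context_counts[previous] == 0:
        return 0.0  # unseen context: no estimate without smoothing
    return pair_counts[(previous, current)] / context_counts[previous]


# Hypothetical two-sentence corpus.
corpus = ["open the door", "open the window"]
print(bigram_probability(corpus, "the", "door"))  # prints 0.5
```

A practical model would be trained on a large text corpus and would use smoothing so that unseen word pairs do not get zero probability.&lt;br /&gt;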
&lt;br /&gt;
&#039;&#039;&#039;Background Reading&#039;&#039;&#039;&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET ,Hyderabad, Prof. Dr. S. Rama Mohan Rao, A.Gopi &lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori, Hussein Hiyassat, Mostafa Harti, and Noureddine Chenfour&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;:  Deepa P. Gopinath&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Silpa based==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Provide REST API for new flask based Silpa, including conversion of templates to this REST API from JSON RPC===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Silpa currently relies on JSON-RPC. We need to either move completely to a REST API or provide a REST API as an additional feature.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python , Flask , Jinja , HTML, Javascript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Android SDK for Silpa===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port the Silpa modules that can be ported to Java and create an SDK so that other developers can use them in their apps. Modules like Indic Render, Transliteration and Payyans have really good potential on Android because of the fragmentation that exists in Android and the lack of proper Indic support. This SDK will help developers support their Indic apps on a wide range of Android devices.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java, Android, Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Jishnu/Hrishikesh/Aashik&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Converting indic processing modules currently in SILPA into javascript modules library===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port some of the Silpa algorithms to Node modules. Several modules and algorithms in the SILPA project are currently written in Python, and porting them to JavaScript helps developers. For example, cross-language transliteration can be done in JavaScript too if we port the algorithm and the transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in JavaScript.&lt;br /&gt;
&lt;br /&gt;
Proposed javascript module pattern is https://github.com/umdjs/umd&lt;br /&gt;
&lt;br /&gt;
Student proposals should include a list of the algorithms they plan to port, planned demo applications, planned documentation details, and publishing details (example: npm registry).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: javascript, python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Jishnu&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===  Improving cross language transliteration system.  ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Currently only Kannada and Malayalam work well; all other languages are first converted to Malayalam and then to English due to lack of language internal. Also, for English to Indic we currently use CMUDict, so transliteration is limited to the words in CMUDict. We could probably develop a better method for English to Indic transliteration.&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. For an intermediate representation of the scripts, either IPA or the ISO 15919 standard can be used. All of these must be supplemented with exception rules and special-case handling to achieve better results.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Internationalize SILPA project with Wikimedia jquery projects ,  Improve  the webfonts module in Silpa using jquery.webfonts and provide more Indic and complex fonts as part of it ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Internationalize SILPA&#039;&#039;&#039; :-  &lt;br /&gt;
The SILPA project has many Indic language applications, but as of now it has no built-in tool for somebody who wants to input text in Indian languages. Similarly, the application is not internationalized. Both of these can be achieved by using the [//github.com/wikimedia/jquery.ime jquery.ime] and [//github.com/wikimedia/jquery.i18n jquery.i18n] libraries from Wikimedia. A sample implementation is available on our [http://smc.org.in website]. The i18n should be in the SILPA Flask framework with a nice templating system. Similarly, the interface should have webfonts using the [https://github.com/wikimedia/jquery.webfonts jquery.webfonts] library.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Improve  the webfonts &#039;&#039;&#039; :- &lt;br /&gt;
* Currently Silpa provides 36 webfonts. Add more fonts to this collection.&lt;br /&gt;
* Rewrite the webfonts module to use the features of jquery.webfonts&lt;br /&gt;
* Create a repo as per the jquery.webfonts specification&lt;br /&gt;
* Provide a clean API so that other websites can use our webfonts&lt;br /&gt;
* Document the usage&lt;br /&gt;
* Provide font preview and download options  &lt;br /&gt;
* &#039;&#039;&#039;This is partly done&#039;&#039;&#039;. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;&lt;br /&gt;
* [https://github.com/wikimedia/jquery.i18n jquery.i18n]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.ime jquery.ime]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.webfonts jquery.webfonts]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: jQuery, css, html5, Python , flask , technical understanding about fonts&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Jishnu/Vasudev&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Language filter for diaspora==&lt;br /&gt;
&lt;br /&gt;
Diaspora is a Free Software, federated social networking platform. Diaspora users post in many languages. When people use more than one language in their posts, it is inconvenient for people who don&#039;t understand a language. This task is to tag every post with languages used in the post, ideally detected automatically, but with an option to override it. Once each post has a language tag, people should be able to choose their preferred language and posts in other languages should be hidden by default. Also provide an option to translate posts and comments.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Pirate Praveen, Ershad K&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
&lt;br /&gt;
Varnam is a cross-platform predictive transliterator for Indian languages. It works mostly like Google&#039;s transliterate, but shows key differences in the way word tokenization is done. It has a learning system built in which allows Varnam to make smart predictions. &lt;br /&gt;
&lt;br /&gt;
Varnam clients are available as [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Firefox] and [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Chrome] addons and as an [https://gitorious.org/varnamproject/libvarnam-ibus/source/d939adf50024013902c27310c03ef21a9210cdcb IBus engine].&lt;br /&gt;
&lt;br /&gt;
To try out Varnam, navigate to [http://varnamproject.com/editor http://varnamproject.com/editor]. Currently it supports Hindi and Malayalam.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
This includes a rewrite of the current implementation in golang and adding support for WebSockets to improve the input experience. It also includes writing scripts that ease embedding input on any webpage. All the changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Navaneeth S&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improve the learning system===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The main goal is to improve how varnam tokenizes when learning words. Today, when a word is learned, varnam takes all the possible prefixes into account and learns all of them to improve future suggestions. But sometimes this is not enough to produce good suggestions. The suggested improvement is to try to infer the base form of the word being learned.&lt;br /&gt;
&lt;br /&gt;
Varnam has a built-in learning system which can learn words and also other possible ways to write a word. Consider the following example. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
learn(&amp;quot;भारत&amp;quot;) = [bharat, bhaarath, bharath]&lt;br /&gt;
transliterate(&amp;quot;bharat&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bhaarath&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bharath&amp;quot;) = भारत&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Varnam also learns a word&#039;s prefixes so that it can produce better predictions for any word which has the same prefix. So in this case, with just learning the word &amp;quot;भारत&amp;quot;, varnam can predict &amp;quot;bharateey&amp;quot; = &amp;quot;भारतीय&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The proposed idea is to make this learning better. One example is inferring the word &amp;quot;भारत&amp;quot; when learning &amp;quot;भारतीय&amp;quot;. This could be something like a Porter stemmer implementation, but integrated into the varnam framework so that new language support can be added easily.&lt;br /&gt;
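&lt;br /&gt;
The prefix behaviour described above can be illustrated with a small Python sketch. It is not libvarnam code: the class and method names are hypothetical, and it matches on raw characters, whereas varnam tokenizes words in its own way.&lt;br /&gt;

```python
class ToyLearner:
    """Hypothetical stand-in for a learning store, for illustration only."""

    def __init__(self):
        self.patterns = {}  # romanized pattern mapped to the learned word

    def learn(self, word, romanizations):
        # Remember every known way of typing the word.
        for pattern in romanizations:
            self.patterns[pattern] = word

    def longest_known_prefix(self, text):
        # Try the longest prefix first and shrink one character at a time.
        for end in range(len(text), 0, -1):
            candidate = text[:end]
            if candidate in self.patterns:
                return candidate, self.patterns[candidate]
        return None


learner = ToyLearner()
learner.learn("भारत", ["bharat", "bhaarath", "bharath"])
print(learner.longest_known_prefix("bharateey"))  # prints ('bharat', 'भारत')
```

The learned part covers &amp;quot;bharat&amp;quot;, so only the remaining &amp;quot;eey&amp;quot; would still need to be transliterated. The proposed stemmer would work in the opposite direction, recovering the base form from a longer learned word.&lt;br /&gt;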
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in C, Ruby (basics)&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool which can upload and download the word corpus from offline IMEs like varnam-ibus[2]. This helps to build the online word corpus easily.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in C/golang&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
* [1]: http://www.varnamproject.com&lt;br /&gt;
* [2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
==Adding Braille Keyboard layouts for Indian Languages to m17n Library==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The project is to build support for Bharati Braille keyboard layouts in GNU/Linux systems. The Bharati Braille standard is the official Braille standard in India. A regular QWERTY keyboard is used for data entry, with the SDF-JKL keys used for the six dots of Braille. This support needs to be built as m17n layouts. It will enable visually challenged people who have studied Braille layouts to use GNU/Linux systems easily with the help of audio feedback from TTS.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;&lt;br /&gt;
* http://www.acharya.gen.in:8080/disabilities/bh_brl.php&lt;br /&gt;
* http://en.wikipedia.org/wiki/Bharati_Braille&lt;br /&gt;
* http://www.nongnu.org/m17n/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Anivar Aravind&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Grandham ==&lt;br /&gt;
&lt;br /&gt;
=== Adding MARC21 import/export feature in Grandham ===&lt;br /&gt;
&lt;br /&gt;
We need a feature in Grandham to import and parse data from MARC21 documents. We should also be able to export existing data in MARC21.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in Ruby/Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : High&lt;br /&gt;
&lt;br /&gt;
* [1]: http://dev.grandham.org&lt;br /&gt;
* [2]: https://github.com/smc/grandham&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Ershad&lt;br /&gt;
&lt;br /&gt;
=Projects with unconfirmed mentors=&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4626</id>
		<title>GSoC/2014/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4626"/>
		<updated>2014-02-27T18:15:15Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Improve the learning system */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt; &lt;br /&gt;
&amp;lt;font color=&amp;quot;red&amp;quot;&amp;gt; &amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas , you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Potential Mentors=&lt;br /&gt;
# Santhosh Thottingal (&#039;&#039;&#039;santhosh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Baiju M (&#039;&#039;&#039;baijum&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Praveen A (&#039;&#039;&#039;j4v4m4n&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Rajeesh K Nambiar (&#039;&#039;&#039;rajeeshknambiar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Vasudev Kammath (&#039;&#039;&#039;copyninja&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jishnu Mohan (&#039;&#039;&#039;jishnu7&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Hrishikesh K.B (&#039;&#039;&#039;stultus&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anivar Aravind (&#039;&#039;&#039;anivar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anilkumar K V (&#039;&#039;&#039;anilkumar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Sajjad Anwar (&#039;&#039;&#039;geohacker&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Deepa V Gopinath (&#039;&#039;&#039;deepagopinath&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jain Basil (&#039;&#039;&#039;jainbasil&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Ershad K (&#039;&#039;&#039;ershad&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Navaneeth (&#039;&#039;&#039;nkn__&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nishan Naseer (&#039;&#039;&#039;nishan&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nandaja Varma (&#039;&#039;&#039;gem&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2014=&lt;br /&gt;
* Please read the [http://wiki.smc.org.in/SoC/2014#FAQ FAQ]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
If you want to propose an idea, please do it on the [http://lists.smc.org.in/listinfo.cgi/student-projects-smc.org.in student projects mailing list].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=Projects with confirmed mentors=&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The SILPA project has a spellchecker written in Python with a not-so-simple algorithm. Still, it cannot handle the inflection and agglutination that occur in Indian languages, especially South Indian languages. The dictionary for the Malayalam spellchecker has about 150000 words. We can of course expand the dictionary, but that adds little value, since words in Malayalam, Tamil and similar languages can be formed by joining multiple words. In addition, words get inflected based on grammatical forms (sandhi), plural, gender etc. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words. We will need word-splitting logic or a table covering all patterns. The project is to attempt to solve this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently a Tamil project attempted to develop a spellchecker using Hunspell with multi-level suffix stripping. You can see the result here https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. It will probably require a lot of scripting to generate suffix patterns; we can ask for help from existing language communities too. If Hunspell has limitations with multi-level suffixes (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document them (bug report and documentation) and attempt a Python-based solution on top of the SILPA framework.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Average-level understanding of the grammar system of at least one Indian language&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Santhosh Thottingal&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. For more information about ConTeXt, see the wiki http://wiki.contextgarden.net/Main_Page. ConTeXt MKII has Indic language rendering support using XeTeX, but MKII is deprecated and the new MKIV backend doesn&#039;t support Indic rendering yet. The aim of this project is to add support for Indic rendering to ConTeXt MKIV. XeTeX uses HarfBuzz to do correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for the MKIV Lua code is available. ConTeXt MKII (deprecated) can work with the XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using the command:&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt and a basic understanding of Indic language rendering. MKIV uses Lua; familiarity with Lua, the OpenType specifications or HarfBuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Rajeesh K Nambiar&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools, which can be used to develop speech recognition systems in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that particular language. Acoustic models characterize how sound changes over time; they capture the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty taken when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims at developing an acoustic model and a language model for Malayalam.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Background Reading&#039;&#039;&#039;&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET ,Hyderabad, Prof. Dr. S. Rama Mohan Rao, A.Gopi &lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori, Hussein Hiyassat, Mostafa Harti, and Noureddine Chenfour&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;:  Deepa P. Gopinath&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Silpa based==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Provide REST API for new flask based Silpa, including conversion of templates to this REST API from JSON RPC===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Silpa currently relies on JSON-RPC. We need to either move completely to a REST API or provide a REST API as an additional feature.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python , Flask , Jinja , HTML, Javascript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Android SDK for Silpa===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port the Silpa modules that can be ported to Java and create an SDK so that other developers can use them in their apps. Modules like Indic Render, Transliteration and Payyans have really good potential on Android because of the fragmentation that exists in Android and the lack of proper Indic support. This SDK will help developers support their Indic apps on a wide range of Android devices.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java, Android, Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Jishnu/Hrishikesh/Aashik&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Converting indic processing modules currently in SILPA into javascript modules library===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
Port some of the Silpa algorithms to Node modules. Several modules and algorithms in the SILPA project are currently written in Python, and porting them to JavaScript helps developers. For example, cross-language transliteration can be done in JavaScript too if we port the algorithm and the transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in JavaScript.&lt;br /&gt;
&lt;br /&gt;
Proposed javascript module pattern is https://github.com/umdjs/umd&lt;br /&gt;
&lt;br /&gt;
Student proposals should include a list of the algorithms they plan to port, planned demo applications, planned documentation details, and publishing details (example: npm registry).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: javascript, python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Jishnu&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===  Improving cross language transliteration system.  ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Currently only Kannada and Malayalam work well; all other languages are first converted to Malayalam and then to English due to lack of language internal. Also, for English to Indic we currently use CMUDict, so transliteration is limited to the words in CMUDict. We could probably develop a better method for English to Indic transliteration.&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. For an intermediate representation of the scripts, either IPA or the ISO 15919 standard can be used. All of these must be supplemented with exception rules and special-case handling to achieve better results.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Internationalize SILPA project with Wikimedia jquery projects ,  Improve  the webfonts module in Silpa using jquery.webfonts and provide more Indic and complex fonts as part of it ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Internationalize SILPA&#039;&#039;&#039; :-  &lt;br /&gt;
The SILPA project has many Indic language applications, but as of now it has no built-in tool for somebody who wants to input text in Indian languages. Similarly, the application is not internationalized. Both of these can be achieved by using the [//github.com/wikimedia/jquery.ime jquery.ime] and [//github.com/wikimedia/jquery.i18n jquery.i18n] libraries from Wikimedia. A sample implementation is available on our [http://smc.org.in website]. The i18n should be in the SILPA Flask framework with a nice templating system. Similarly, the interface should have webfonts using the [https://github.com/wikimedia/jquery.webfonts jquery.webfonts] library.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Improve  the webfonts &#039;&#039;&#039; :- &lt;br /&gt;
* Currently Silpa provides 36 webfonts. Add more fonts to this collection.&lt;br /&gt;
* Rewrite the webfonts module to use the features of jquery.webfonts&lt;br /&gt;
* Create a repo as per the jquery.webfonts specification&lt;br /&gt;
* Provide a clean API so that other websites can use our webfonts&lt;br /&gt;
* Document the usage&lt;br /&gt;
* Provide font preview and download options  &lt;br /&gt;
* &#039;&#039;&#039;This is partly done&#039;&#039;&#039;. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;&lt;br /&gt;
* [https://github.com/wikimedia/jquery.i18n jquery.i18n]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.ime jquery.ime]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.webfonts jquery.webfonts]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: jQuery, CSS, HTML5, Python, Flask, technical understanding of fonts&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Jishnu/Vasudev&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Language filter for diaspora==&lt;br /&gt;
&lt;br /&gt;
Diaspora is a Free Software, federated social networking platform. Diaspora users post in many languages. When people use more than one language in their posts, it is inconvenient for people who don&#039;t understand a language. This task is to tag every post with languages used in the post, ideally detected automatically, but with an option to override it. Once each post has a language tag, people should be able to choose their preferred language and posts in other languages should be hidden by default. Also provide an option to translate posts and comments.&lt;br /&gt;
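&lt;br /&gt;
The automatic tagging step could start from script detection. The following is a minimal sketch under stated assumptions, not Diaspora code: Diaspora itself is written in Ruby on Rails, Python is used here only for illustration, and the names SCRIPT_TO_TAG, detect_language_tags and visible are hypothetical. Script detection only narrows the candidates (Devanagari, for instance, is shared by several languages), so the manual override described above remains necessary.&lt;br /&gt;
&lt;br /&gt;
```python
import unicodedata

# Hypothetical mapping from Unicode script names to language tags.
# A script does not identify a language uniquely: Devanagari is shared by
# Hindi, Marathi and others, so "hi" here is a deliberate simplification.
SCRIPT_TO_TAG = {
    "LATIN": "en",
    "MALAYALAM": "ml",
    "DEVANAGARI": "hi",
    "TAMIL": "ta",
    "KANNADA": "kn",
}


def detect_language_tags(text):
    """Return the sorted language tags whose scripts occur in text."""
    tags = set()
    for ch in text:
        # Character names start with the script, e.g. "MALAYALAM LETTER MA".
        script = unicodedata.name(ch, "").split(" ")[0]
        if script in SCRIPT_TO_TAG:
            tags.add(SCRIPT_TO_TAG[script])
    return sorted(tags)


def visible(post_text, preferred_tags):
    """A post stays visible if it shares a tag with the reader's preferences."""
    return bool(set(detect_language_tags(post_text)).intersection(preferred_tags))
```
&lt;br /&gt;
A post mixing English and Malayalam would be tagged with both languages and stay visible to readers who prefer either one.&lt;br /&gt;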
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Pirate Praveen, Ershad K&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
&lt;br /&gt;
Varnam is a cross-platform predictive transliterator for Indian languages. It works mostly like Google&#039;s transliterate, but shows key differences in the way word tokenization is done. It has a learning system built in which allows Varnam to make smart predictions. &lt;br /&gt;
&lt;br /&gt;
There are Varnam clients available as [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Firefox] &amp;amp; [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Chrome addon] and an [https://gitorious.org/varnamproject/libvarnam-ibus/source/d939adf50024013902c27310c03ef21a9210cdcb IBus engine].&lt;br /&gt;
&lt;br /&gt;
To try out Varnam, navigate to [http://varnamproject.com/editor http://varnamproject.com/editor]. Currently it supports Hindi and Malayalam.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
This includes rewriting the current implementation in golang and adding support for WebSockets to improve the input experience. It also&lt;br /&gt;
includes writing scripts that ease embedding input on any webpage. All changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Navaneeth S&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Improve the learning system===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The main goal of this is to improve how varnam tokenizes when learning words. Today, when a word is learned, varnam takes all the possible prefixes into account and learns all of them to improve future suggestions. But sometimes, this is not enough to predict good suggestions. An improvement is suggested which will try to infer the base form of the word being learned.&lt;br /&gt;
&lt;br /&gt;
Varnam has a learning system built-in which can learn words and it can also learn possible other ways to write a word. Consider the following example. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
learn(&amp;quot;भारत&amp;quot;) = [bharat, bhaarath, bharath]&lt;br /&gt;
transliterate(&amp;quot;bharat&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bhaarath&amp;quot;) = भारत&lt;br /&gt;
transliterate(&amp;quot;bharath&amp;quot;) = भारत&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Varnam also learns a word&#039;s prefixes so that it can produce better predictions for any word which has the same prefix. So in this case, with just learning the word &amp;quot;भारत&amp;quot;, varnam can predict &amp;quot;bharateey&amp;quot; = &amp;quot;भारतीय&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The proposed idea is to make this learning better. One example is inferring the word &amp;quot;भारत&amp;quot; when learning &amp;quot;भारतीय&amp;quot;. This would be something like a Porter stemmer implementation, but integrated into the varnam framework so that&lt;br /&gt;
new language support can be added easily.&lt;br /&gt;
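&lt;br /&gt;
As a minimal sketch of the idea above, and not the libvarnam implementation, the following Python models learning a word together with its prefixes and inferring a base form by suffix stripping. The class ToyLearner, its methods, and the one-entry Hindi suffix table are hypothetical names introduced only for illustration, and code-point prefixes stand in for real syllable-level tokenization.&lt;br /&gt;
&lt;br /&gt;
```python
# Hypothetical per-language suffix table; "\u0940\u092f" is the Hindi suffix
# that turns भारत into भारतीय. A real table would be far larger.
SUFFIXES = {"hi": ["\u0940\u092f"]}


class ToyLearner:
    """Toy model of the learning store; an assumption, not the libvarnam API."""

    def __init__(self, lang):
        self.lang = lang
        self.words = set()     # complete words learned
        self.prefixes = set()  # prefixes kept to improve predictions

    def infer_base(self, word):
        """Strip the first matching suffix that leaves a non-empty stem."""
        for suffix in SUFFIXES.get(self.lang, []):
            if word.endswith(suffix) and len(word) != len(suffix):
                return word[: -len(suffix)]
        return word

    def learn(self, word):
        self.words.add(word)
        # Current behaviour described above: remember every prefix.
        for end in range(1, len(word)):
            self.prefixes.add(word[:end])
        # Proposed improvement: also learn the inferred base as a full word.
        self.words.add(self.infer_base(word))
```
&lt;br /&gt;
Keeping the suffix rules as per-language data rather than code is one way to meet the goal of adding new language support easily.&lt;br /&gt;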
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in C, Ruby (basics)&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039;: Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool which can upload/download the word corpus from offline IMEs like varnam-ibus[2]. This makes it easier to build the online word corpus.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in C/golang&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Navaneeth S&lt;br /&gt;
&lt;br /&gt;
* [1]: http://www.varnamproject.com&lt;br /&gt;
* [2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Adding Braille Keyboard layouts for Indian Languages to m17n Library==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The project is to build support for Bharati Braille keyboard layouts on GNU/Linux systems.  Bharati Braille is the official Braille standard in India. A regular QWERTY keyboard is used for data entry: the S, D, F and J, K, L keys are used for the six dots of Braille. This support needs to be built as m17n layouts. It will enable visually challenged people who have learned Braille layouts to use GNU/Linux systems easily with the help of audio feedback from TTS.&lt;br /&gt;
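&lt;br /&gt;
The six-key entry scheme can be sketched as follows. This is an illustration only: real m17n layouts are written in the m17n input method format, not Python, and the key-to-dot assignment below (F, D, S for dots 1 to 3 and J, K, L for dots 4 to 6) is an assumption that the actual layout may not share. The names KEY_TO_DOT and chord_to_cell are hypothetical. The sketch produces a generic Unicode Braille cell; mapping cells to the letters of each Indian language would need a further Bharati Braille table that is not shown.&lt;br /&gt;
&lt;br /&gt;
```python
# Assumed key-to-dot assignment: F, D, S are dots 1-3 and J, K, L are dots 4-6.
# The actual layout may assign the keys differently.
KEY_TO_DOT = {"f": 1, "d": 2, "s": 3, "j": 4, "k": 5, "l": 6}

BRAILLE_BASE = 0x2800  # start of the Unicode Braille Patterns block


def chord_to_cell(keys):
    """Convert the keys pressed together into one Unicode Braille cell."""
    mask = 0
    for key in keys:
        # In the Unicode block, dot n sets bit n - 1 of the code point offset.
        mask |= 2 ** (KEY_TO_DOT[key.lower()] - 1)
    return chr(BRAILLE_BASE + mask)
```
&lt;br /&gt;
Under this assumed assignment, pressing F alone gives dot 1, and pressing F and D together gives dots 1 and 2.&lt;br /&gt;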
&lt;br /&gt;
&#039;&#039;&#039;More Details&#039;&#039;&#039;&lt;br /&gt;
* http://www.acharya.gen.in:8080/disabilities/bh_brl.php&lt;br /&gt;
* http://en.wikipedia.org/wiki/Bharati_Braille&lt;br /&gt;
* http://www.nongnu.org/m17n/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Anivar Aravind&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Grandham ==&lt;br /&gt;
&lt;br /&gt;
=== Adding MARC21 import/export feature in Grandham ===&lt;br /&gt;
&lt;br /&gt;
We need a feature in Grandham to import and parse data from MARC21 documents. We should also be able to export existing data in MARC21.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in Ruby/Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : High&lt;br /&gt;
&lt;br /&gt;
* [1]: http://dev.grandham.org&lt;br /&gt;
* [2]: https://github.com/smc/grandham&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Ershad&lt;br /&gt;
&lt;br /&gt;
=Projects with unconfirmed mentors=&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4614</id>
		<title>GSoC/2014/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4614"/>
		<updated>2014-02-25T04:54:13Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Varnam Based */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt; &lt;br /&gt;
&amp;lt;font color=&amp;quot;red&amp;quot;&amp;gt; &amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas, you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Potential Mentors=&lt;br /&gt;
# Santhosh Thottingal (&#039;&#039;&#039;santhosh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Baiju M (&#039;&#039;&#039;baijum&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Praveen A (&#039;&#039;&#039;j4v4m4n&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Rajeesh K Nambiar (&#039;&#039;&#039;rajeeshknambiar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Vasudev Kammath (&#039;&#039;&#039;copyninja&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jishnu Mohan (&#039;&#039;&#039;jishnu7&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Hrishikesh K.B (&#039;&#039;&#039;stultus&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anivar Aravind (&#039;&#039;&#039;anivar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anilkumar K V (&#039;&#039;&#039;anilkumar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Sajjad Anwar (&#039;&#039;&#039;geohacker&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Deepa V Gopinath (&#039;&#039;&#039;deepagopinath&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jain Basil (&#039;&#039;&#039;jainbasil&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Ershad K (&#039;&#039;&#039;ershad&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Navaneeth (&#039;&#039;&#039;nkn__&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nishan Naseer (&#039;&#039;&#039;nishan&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2014=&lt;br /&gt;
* Please read the [http://wiki.smc.org.in/SoC/2014#FAQ FAQ]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
If you want to propose an idea, please do it on the [http://lists.smc.org.in/listinfo.cgi/student-projects-smc.org.in student projects mailing list].&lt;br /&gt;
&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The SILPA project has a spellchecker written in Python with a fairly elaborate algorithm. Still, it cannot handle the inflection and agglutination that occur in Indian languages, especially South Indian languages. The dictionary we have for the Malayalam spellchecker has about 150,000 words. We can of course expand the dictionary, but that adds little value, since words in Malayalam, Tamil, and similar languages can be formed by joining multiple words. In addition, words get inflected based on grammatical forms (sandhi), plural, gender, etc. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words. We will need word-splitting logic or a table covering all patterns. The project is to attempt solving this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently the Tamil community attempted to develop a spellchecker using Hunspell with multi-level suffix stripping. You can see the result here: https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. It will probably require a lot of scripting to generate suffix patterns; we can ask existing language communities for help too. If Hunspell has limitations with multi-level suffixes (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document them (bug report and documentation) and attempt a Python-based solution on top of the SILPA framework.&lt;br /&gt;
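&lt;br /&gt;
For the Python-based fallback, multi-level suffix stripping can be sketched as a bounded recursive search. This is a toy sketch under stated assumptions, not the SILPA spellchecker and not the Hunspell affix format: the function names are hypothetical, the example data is English, and real Malayalam words also change at the joins (sandhi), which plain stripping does not model.&lt;br /&gt;
&lt;br /&gt;
```python
def strip_suffixes(word, lexicon, suffixes, max_depth=5):
    """Return the chain of forms from word down to a lexicon entry, or None.

    Every suffix is tried at every level, so stacked suffixes are handled
    up to max_depth levels deep.
    """
    if word in lexicon:
        return [word]
    if max_depth == 0:
        return None
    for suffix in suffixes:
        # Only strip a suffix that leaves a non-empty remainder.
        if suffix and word.endswith(suffix) and len(word) != len(suffix):
            rest = strip_suffixes(word[: -len(suffix)], lexicon, suffixes, max_depth - 1)
            if rest is not None:
                return [word] + rest
    return None


def is_valid(word, lexicon, suffixes):
    """Accept a word if some chain of suffix strips reaches the lexicon."""
    return strip_suffixes(word, lexicon, suffixes) is not None
```
&lt;br /&gt;
With the lexicon {&amp;quot;walk&amp;quot;} and the suffixes s, er and ing, the word walkers is accepted through the chain walkers, walker, walk. The depth limit keeps the search bounded when many suffix levels are allowed.&lt;br /&gt;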
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Average-level understanding of the grammar system of at least one Indian language&lt;br /&gt;
* &#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039; : Santhosh Thottingal&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. To find more information about ConTeXt, see the wiki http://wiki.contextgarden.net/Main_Page. ConTeXt MKII has Indic language rendering support using XeTeX, but MKII is deprecated, and the new MKIV backend doesn&#039;t support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses HarfBuzz to do correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt, and a basic understanding of Indic language rendering. MKIV uses Lua; familiarity with Lua, OpenType specifications, or HarfBuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039; : Rajeesh K Nambiar&lt;br /&gt;
* &#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for MKIV lua code is available. ConTeXt mkii (deprecated) can work with XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using the command:&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools, which can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that language.  Acoustic models characterize how sound changes over time; they capture the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty assigned when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu, and Marathi. This project aims at developing an acoustic model and a language model for Malayalam.&lt;br /&gt;
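&lt;br /&gt;
To make the language-model part concrete, here is a minimal sketch of a bigram model with maximum-likelihood estimates. It is an illustration only and not the CMU Sphinx tooling: the function names and the BOS and EOS boundary markers are assumptions, and practical models add smoothing for unseen word pairs and are built with dedicated toolkits.&lt;br /&gt;
&lt;br /&gt;
```python
from collections import Counter


def train_bigrams(sentences):
    """Count context words and bigrams over tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["BOS"] + list(tokens) + ["EOS"]  # sentence boundary markers
        unigrams.update(padded[:-1])  # every token that has a successor
        bigrams.update(zip(padded[:-1], padded[1:]))
    return unigrams, bigrams


def bigram_probability(prev, word, unigrams, bigrams):
    """Maximum-likelihood estimate of P(word | prev); 0.0 if prev is unseen."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]
```
&lt;br /&gt;
The probability of a whole sentence is then the product of the bigram probabilities along it, which is how the model scores one candidate word sequence against another.&lt;br /&gt;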
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;:  Deepa P. Gopinath&lt;br /&gt;
=== Background Reading ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET ,Hyderabad, Prof. Dr. S. Rama Mohan Rao, A.Gopi &lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
&lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori, Hussein Hiyassat, Mostafa Harti, and Noureddine Chenfour&lt;br /&gt;
&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
==SILPA BASED==&lt;br /&gt;
&lt;br /&gt;
===Provide REST API for new flask based Silpa, including conversion of templates to this REST API from JSON RPC===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: Silpa currently relies on JSON-RPC. We need to either move completely to a REST API or provide a REST API as an additional feature.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python, Flask, Jinja, HTML, JavaScript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
===Android SDK for Silpa===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: Port suitable Silpa modules to Java and create an SDK so that other developers can use them in their apps. Modules like Indic Render, Transliteration, and Payyans have really good potential on Android because of the fragmentation that exists in Android and the lack of proper Indic support. This SDK will help developers support their Indic apps on a wide range of Android devices.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Java, Android, Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; :&lt;br /&gt;
&lt;br /&gt;
=== Converting indic processing modules currently in SILPA into javascript modules library  ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: Port some of the silpa algorithms to node modules.&lt;br /&gt;
&lt;br /&gt;
Several modules and algorithms in the SILPA project are currently written in Python. Porting them to JavaScript would help developers. For example, cross-language transliteration can be done in JavaScript too if we port the algorithm and transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in JavaScript.&lt;br /&gt;
&lt;br /&gt;
Proposed javascript module pattern is https://github.com/umdjs/umd&lt;br /&gt;
&lt;br /&gt;
Student proposals should include a list of algorithms planned for porting, planned demo applications, planned documentation details, and publishing details (example: npm registry).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: javascript, python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; :   -&lt;br /&gt;
&lt;br /&gt;
===  Improving cross language transliteration system.  ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Currently only Kannada and Malayalam are handled well; all other languages are first converted to Malayalam and then to English for lack of language-internal support. Also, English-to-Indic transliteration currently uses CMUDict, so it is limited to words found in CMUDict; we could probably develop a better method for English-to-Indic transliteration.&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. As an intermediate representation of the scripts, either IPA or the ISO 15919 standard can be used. All of these must be supplemented with exception rules and special-case handling to achieve more accurate results.&lt;br /&gt;
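&lt;br /&gt;
The intermediate-representation approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the SILPA transliteration module: the tables are tiny and hand-made, the pivot labels only loosely follow ISO 15919, and the names TO_PIVOT and transliterate are hypothetical. Real tables would cover each full script and could be derived from CLDR data.&lt;br /&gt;
&lt;br /&gt;
```python
# Tiny hand-made tables for illustration only. Each script maps its
# characters to pivot labels ("-ā" marks a dependent vowel sign).
TO_PIVOT = {
    "deva": {"\u092d": "bha", "\u093e": "-ā", "\u0930": "ra", "\u0924": "ta"},
    "mlym": {"\u0d2d": "bha", "\u0d3e": "-ā", "\u0d30": "ra", "\u0d24": "ta"},
    "knda": {"\u0cad": "bha", "\u0cbe": "-ā", "\u0cb0": "ra", "\u0ca4": "ta"},
}


def transliterate(text, source, target, exceptions=None):
    """Transliterate text from the source script to the target via the pivot."""
    if exceptions and text in exceptions:
        return exceptions[text]  # whole-word exception rules take precedence
    from_pivot = {label: ch for ch, label in TO_PIVOT[target].items()}
    out = []
    for ch in text:
        label = TO_PIVOT[source].get(ch)
        # Characters without a table entry pass through unchanged.
        out.append(from_pivot.get(label, ch))
    return "".join(out)
```
&lt;br /&gt;
With a pivot, supporting a new language needs one table to and from the intermediate form instead of one table per language pair, so no language has to be routed through Malayalam.&lt;br /&gt;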
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
=== Internationalize SILPA project with Wikimedia jquery projects ,  Improve  the webfonts module in Silpa using jquery.webfonts and provide more Indic and complex fonts as part of it ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Internationalize SILPA&#039;&#039;&#039; :-  The SILPA project has many Indic language applications, but as of now there is no built-in tool for input in Indian languages. Similarly, the application is not internationalized. Both of these can be achieved by using the [//github.com/wikimedia/jquery.ime jquery.ime] and [//github.com/wikimedia/jquery.i18n jquery.i18n] libraries from Wikimedia. A sample implementation is available on our [http://smc.org.in website]. The i18n should be implemented in the SILPA Flask framework with a nice templating system. Similarly, the interface should provide webfonts using the [https://github.com/wikimedia/jquery.webfonts jquery.webfonts] library.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Improve  the webfonts &#039;&#039;&#039; :- &lt;br /&gt;
* Currently Silpa provides 36 webfonts. Add more fonts to this collection.&lt;br /&gt;
* Rewrite the webfonts module to use the features of jquery.webfonts&lt;br /&gt;
* Create a repo as per the jquery.webfonts specification&lt;br /&gt;
* Provide a clean API so that other websites can use our webfonts&lt;br /&gt;
* Document the usage&lt;br /&gt;
* Provide font preview and download options  &lt;br /&gt;
* &#039;&#039;&#039;This is partly done&#039;&#039;&#039;. &lt;br /&gt;
&lt;br /&gt;
====More Details====&lt;br /&gt;
* [https://github.com/wikimedia/jquery.i18n jquery.i18n]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.ime jquery.ime]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.webfonts jquery.webfonts]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: jQuery, CSS, HTML5, Python, Flask, technical understanding of fonts&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Jishnu/Vasudev&lt;br /&gt;
&lt;br /&gt;
==Language filter for diaspora==&lt;br /&gt;
&lt;br /&gt;
Diaspora is a Free Software, federated social networking platform. Diaspora users post in many languages. When people use more than one language in their posts, it is inconvenient for people who don&#039;t understand a language. This task is to tag every post with languages used in the post, ideally detected automatically, but with an option to override it. Once each post has a language tag, people should be able to choose their preferred language and posts in other languages should be hidden by default. Also provide an option to translate posts and comments.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Ruby on Rails&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039;: Pirate Praveen, Ershad K&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
&lt;br /&gt;
Varnam is a cross-platform predictive transliterator for Indian languages. It works mostly like Google&#039;s transliterate, but shows key differences in the way word tokenization is done. It has a learning system built in which allows Varnam to make smart predictions. &lt;br /&gt;
&lt;br /&gt;
There are Varnam clients available as [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Firefox] &amp;amp; [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Chrome addon] and an [https://gitorious.org/varnamproject/libvarnam-ibus/source/d939adf50024013902c27310c03ef21a9210cdcb IBus engine].&lt;br /&gt;
&lt;br /&gt;
To try out Varnam, navigate to [http://varnamproject.com/editor http://varnamproject.com/editor]. Currently it supports Hindi and Malayalam.&lt;br /&gt;
&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
This includes rewriting the current implementation in golang and adding&lt;br /&gt;
support for WebSockets to improve the input experience. It also&lt;br /&gt;
includes writing scripts that ease embedding input on any webpage.&lt;br /&gt;
All changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
=== Improve the learning system===&lt;br /&gt;
&lt;br /&gt;
The main goal of this is to improve how varnam tokenizes when learning&lt;br /&gt;
words. Today, when a word is learned, varnam takes all the possible&lt;br /&gt;
prefixes into account and learns all of them to improve future&lt;br /&gt;
suggestions. But sometimes, this is not enough to predict good&lt;br /&gt;
suggestions. An improvement is suggested which will try to infer the&lt;br /&gt;
base form of the word being learned.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool which can upload/download&lt;br /&gt;
the word corpus from offline IMEs like varnam-ibus[2]. This helps to&lt;br /&gt;
build the online word corpus easily.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in C/golang&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
* [1]: http://www.varnamproject.com&lt;br /&gt;
* [2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;:  Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
== Adding Braille Keyboard layouts for Indian Languages to m17n Library==&lt;br /&gt;
&lt;br /&gt;
The project is to build support for Bharati Braille keyboard layouts on GNU/Linux systems.  Bharati Braille is the official Braille standard in India. A regular QWERTY keyboard is used for data entry: the S, D, F and J, K, L keys are used for the six dots of Braille. This support needs to be built as m17n layouts. It will enable visually challenged people who have learned Braille layouts to use GNU/Linux systems easily with the help of audio feedback from TTS.&lt;br /&gt;
&lt;br /&gt;
====More Details====&lt;br /&gt;
* http://www.acharya.gen.in:8080/disabilities/bh_brl.php&lt;br /&gt;
* http://en.wikipedia.org/wiki/Bharati_Braille&lt;br /&gt;
* http://www.nongnu.org/m17n/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Anivar Aravind&lt;br /&gt;
&lt;br /&gt;
==Grandham ==&lt;br /&gt;
=== Adding MARC21 import/export feature in Grandham ===&lt;br /&gt;
&lt;br /&gt;
We need a feature in Grandham to import and parse data from MARC21 documents. We should also be able to export existing data in MARC21.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in Ruby/Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : High&lt;br /&gt;
&lt;br /&gt;
[1]: http://dev.grandham.org&lt;br /&gt;
[2]: https://github.com/smc/grandham&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4591</id>
		<title>GSoC/2014/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4591"/>
		<updated>2014-02-13T05:37:31Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Potential Mentors */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt; &lt;br /&gt;
&amp;lt;font color=&amp;quot;red&amp;quot;&amp;gt; &amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas, you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Potential Mentors=&lt;br /&gt;
# Santhosh Thottingal (&#039;&#039;&#039;santhosh&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Baiju M (&#039;&#039;&#039;baijum&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Praveen A (&#039;&#039;&#039;j4v4m4n&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Rajeesh K Nambiar (&#039;&#039;&#039;rajeeshknambiar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Vasudev Kammath (&#039;&#039;&#039;copyninja&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jishnu Mohan (&#039;&#039;&#039;jishnu7&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Hrishikesh K.B (&#039;&#039;&#039;stultus&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anivar Aravind (&#039;&#039;&#039;anivar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Anilkumar K V (&#039;&#039;&#039;anilkumar&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Sajjad Anwar (&#039;&#039;&#039;geohacker&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Deepa V Gopinath (&#039;&#039;&#039;deepagopinath&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Jain Basil (&#039;&#039;&#039;jainbasil&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Ershad K (&#039;&#039;&#039;ershad&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Navaneeth (&#039;&#039;&#039;nkn__&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
# Nishan Naseer (&#039;&#039;&#039;nishan&#039;&#039;&#039; on irc.freenode.net)&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2014=&lt;br /&gt;
* Please read the [http://wiki.smc.org.in/SoC/2014#FAQ FAQ]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
If you want to propose an idea, please do it on the [http://lists.smc.org.in/listinfo.cgi/discuss-smc.org.in project mailing list].&lt;br /&gt;
&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The SILPA project has a spellchecker written in Python with a fairly elaborate algorithm. Still, it cannot handle the inflection and agglutination that occur in Indian languages, especially South Indian languages. The dictionary we have for the Malayalam spellchecker has about 150,000 words. We can of course expand the dictionary, but that adds little value, since words in Malayalam, Tamil, and similar languages can be formed by joining multiple words. In addition, words get inflected based on grammatical forms (sandhi), plural, gender, etc. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words. We will need word-splitting logic or a table covering all patterns. The project is to attempt solving this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently the Tamil community attempted to develop a spellchecker using Hunspell with multi-level suffix stripping. You can see the result here: https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. It will probably require a lot of scripting to generate suffix patterns; we can ask existing language communities for help too. If Hunspell has limitations with multi-level suffixes (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document them (bug report and documentation) and attempt a Python-based solution on top of the SILPA framework.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Average level understanding of grammar system of at least one Indian language&lt;br /&gt;
* &#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039; : Santhosh Thottingal&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. For more information about ConTeXt, see the wiki at http://wiki.contextgarden.net/Main_Page. ConTeXt MKII supports Indic language rendering through XeTeX, but MKII is deprecated, and the new MKIV backend does not support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses HarfBuzz for correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt, and a basic understanding of Indic language rendering. MKIV uses Lua, so familiarity with Lua, the OpenType specification or HarfBuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039; : Rajeesh K Nambiar&lt;br /&gt;
* &#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for the MKIV Lua code is available. ConTeXt MKII (deprecated) can work with the XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using the command:&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools that can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that language. The acoustic model characterizes how sound changes over time and captures the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty associated with a sequence or collection of words. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims to develop an acoustic model and a language model for Malayalam.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;:  Deepa P. Gopinath&lt;br /&gt;
=== Background Reading ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET, Hyderabad, Prof. Dr. S. Rama Mohan Rao, A. Gopi&lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
&lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori, Hussein Hiyassat, Mostafa Harti, and Noureddine Chenfour&lt;br /&gt;
&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
==SILPA BASED==&lt;br /&gt;
&lt;br /&gt;
===Provide REST API for new flask based Silpa, including conversion of templates to this REST API from JSON RPC===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: Silpa currently relies on JSON-RPC. We need to either move completely to a REST API or provide a REST API as an additional feature.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python, Flask, Jinja, HTML, JavaScript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Converting indic processing modules currently in SILPA into nodejs modules library  ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: Port some of the SILPA algorithms to Node.js modules.&lt;br /&gt;
&lt;br /&gt;
Several modules and algorithms in the SILPA project are currently implemented in Python. Porting them to JavaScript would help developers. For example, cross-language transliteration can be done in JavaScript too if we port the algorithm and the transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in JavaScript.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: JavaScript, Node.js, Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; :   -&lt;br /&gt;
&lt;br /&gt;
===  Improving cross language transliteration system.  ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Currently only Kannada and Malayalam are handled well; all other languages are first converted to Malayalam and then to English, due to the lack of language internals. Also, for English to Indic we currently use CMUDict, so transliteration is limited to words in CMUDict. We could probably develop a better method for English to Indic transliteration.&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. For an intermediate representation of the scripts, either IPA or the ISO 15919 standard can be used. All of these must be supplemented with exception rules and special-case handling to achieve better results.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
=== Internationalize SILPA project with Wikimedia jquery projects ,  Improve  the webfonts module in Silpa using jquery.webfonts and provide more Indic and complex fonts as part of it ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Internationalize SILPA&#039;&#039;&#039; :-  The SILPA project has many Indic language applications, but as of now there is no built-in tool for anyone who wants to type in Indian languages. The application is also not internationalized. Both can be achieved by using the [//github.com/wikimedia/jquery.ime jquery.ime] and [//github.com/wikimedia/jquery.i18n jquery.i18n] libraries from Wikimedia. A sample implementation is available on our [http://smc.org.in website]. The i18n should be in the SILPA Flask framework with a nice templating system. Similarly, the interface should have webfonts using the [https://github.com/wikimedia/jquery.webfonts jquery.webfonts] library.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Improve  the webfonts &#039;&#039;&#039; :- &lt;br /&gt;
* Currently Silpa provides 36 webfonts. Add more fonts to this collection.&lt;br /&gt;
* Rewrite the webfonts module to use the features of jquery.webfonts&lt;br /&gt;
* Create a repo as per the jquery.webfonts specification&lt;br /&gt;
* Provide a clean API so that other websites can use our webfonts&lt;br /&gt;
* Document the usage&lt;br /&gt;
* Provide font preview and download options&lt;br /&gt;
* &#039;&#039;&#039;This is partly done.&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
====More Details====&lt;br /&gt;
* [https://github.com/wikimedia/jquery.i18n jquery.i18n]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.ime jquery.ime]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.webfonts jquery.webfonts]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: jQuery, CSS, HTML5, Python, Flask, technical understanding of fonts&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Jishnu/Vasudev&lt;br /&gt;
&lt;br /&gt;
==Language filter for diaspora==&lt;br /&gt;
&lt;br /&gt;
Diaspora is a Free Software, federated social networking platform. Diaspora users post in many languages. When people use more than one language in their posts, it is inconvenient for people who don&#039;t understand a language. This task is to tag every post with languages used in the post, ideally detected automatically, but with an option to override it. Once each post has a language tag, people should be able to choose their preferred language and posts in other languages should be hidden by default. Also provide an option to translate posts and comments.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Ruby on Rails&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039;: Pirate Praveen, Ershad K&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
This includes rewriting the current implementation in `golang` and adding&lt;br /&gt;
support for WebSockets to improve the input experience. It also&lt;br /&gt;
includes writing scripts that ease embedding input on any web page.&lt;br /&gt;
All the changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
=== Improve the learning system===&lt;br /&gt;
&lt;br /&gt;
The main goal of this is to improve how varnam tokenizes when learning&lt;br /&gt;
words. Today, when a word is learned, varnam takes all the possible&lt;br /&gt;
prefixes into account and learns all of them to improve future&lt;br /&gt;
suggestions. But sometimes this is not enough to produce good&lt;br /&gt;
suggestions. The suggested improvement is to try to infer the&lt;br /&gt;
base form of the word being learned.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool that can upload/download&lt;br /&gt;
the word corpus from offline IMEs like varnam-ibus[2]. This helps to&lt;br /&gt;
build the online word corpus easily.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in C/golang&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
[1]: http://www.varnamproject.com&lt;br /&gt;
[2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;:  Navaneeth K N&lt;br /&gt;
&lt;br /&gt;
== Adding Braille Keyboard layouts for Indian Languages to m17n Library==&lt;br /&gt;
&lt;br /&gt;
The project is to build support for Bharati Braille keyboard layouts on GNU/Linux systems. Bharati Braille is the official Braille standard in India. A regular QWERTY keyboard is used for data entry, with the S, D, F and J, K, L keys used for the six dots of Braille. This support needs to be built as m17n layouts. It will enable visually challenged people who have learned Braille layouts to use GNU/Linux systems easily with the help of audio feedback from TTS.&lt;br /&gt;
&lt;br /&gt;
====More Details====&lt;br /&gt;
* http://www.acharya.gen.in:8080/disabilities/bh_brl.php&lt;br /&gt;
* http://en.wikipedia.org/wiki/Bharati_Braille&lt;br /&gt;
* http://www.nongnu.org/m17n/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;: Anvar Aravind&lt;br /&gt;
&lt;br /&gt;
==Grandham ==&lt;br /&gt;
=== Adding MARC21 import/export feature in Grandham ===&lt;br /&gt;
&lt;br /&gt;
We need a feature in Grandham to import and parse data from MARC21 documents. We should also be able to export existing data in MARC21.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in Ruby/Ruby on Rails&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : High&lt;br /&gt;
&lt;br /&gt;
[1]: http://dev.grandha.org&lt;br /&gt;
[2]: https://github.com/smc/grandham&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4584</id>
		<title>GSoC/2014/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4584"/>
		<updated>2014-02-09T15:29:50Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Varnam Based */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;&lt;br /&gt;
&amp;lt;font color=&amp;quot;red&amp;quot;&amp;gt;This Page is under development&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2014=&lt;br /&gt;
* Please Read the [http://wiki.smc.org.in/SoC/2013#FAQ FAQ]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas , you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to propose an idea, please do it on the [http://lists.smc.org.in/listinfo.cgi/discuss-smc.org.in project mailing list].&lt;br /&gt;
&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The SILPA project has a spellchecker written in Python that uses a fairly involved algorithm, but it still cannot handle the inflection and agglutination found in Indian languages, especially South Indian languages. The dictionary for the Malayalam spellchecker has about 150,000 words. We could expand it, but that adds little value, since words in Malayalam, Tamil and similar languages can be formed by joining multiple words. In addition, words are inflected according to grammatical form (sandhi), number, gender and so on. Hunspell has a mechanism for this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words. We will need word-splitting logic or a table that covers all the patterns. The project is to attempt to solve this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently, the Tamil community attempted to develop a spellchecker using Hunspell with multi-level suffix stripping. You can see the result here: https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. This will probably require a lot of scripting to generate suffix patterns; we can ask existing language communities for help too. If Hunspell has limitations with multi-level suffixes (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document them (bug report and documentation) and attempt a Python-based solution on top of the SILPA framework.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Average level understanding of grammar system of at least one Indian language&lt;br /&gt;
* &#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039; : Santhosh Thottingal&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. For more information about ConTeXt, see the wiki at http://wiki.contextgarden.net/Main_Page. ConTeXt MKII supports Indic language rendering through XeTeX, but MKII is deprecated, and the new MKIV backend does not support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses HarfBuzz for correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt, and a basic understanding of Indic language rendering. MKIV uses Lua, so familiarity with Lua, the OpenType specification or HarfBuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039; : Rajeesh K Nambiar&lt;br /&gt;
* &#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for the MKIV Lua code is available. ConTeXt MKII (deprecated) can work with the XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using the command:&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools that can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that language. The acoustic model characterizes how sound changes over time and captures the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty associated with a sequence or collection of words. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims to develop an acoustic model and a language model for Malayalam.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;:  Deepa P. Gopinath&lt;br /&gt;
=== Background Reading ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET, Hyderabad, Prof. Dr. S. Rama Mohan Rao, A. Gopi&lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
&lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori, Hussein Hiyassat, Mostafa Harti, and Noureddine Chenfour&lt;br /&gt;
&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
==SILPA BASED==&lt;br /&gt;
&lt;br /&gt;
===Provide REST API for new flask based Silpa, including conversion of templates to this REST API from JSON RPC===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: Silpa currently relies on JSON-RPC. We need to either move completely to a REST API or provide a REST API as an additional feature.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python, Flask, Jinja, HTML, JavaScript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Converting indic processing modules currently in SILPA into Jquery library  ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: Port some of the SILPA algorithms to jQuery.&lt;br /&gt;
&lt;br /&gt;
Several modules and algorithms in the SILPA project are currently implemented in Python. Porting them to JavaScript (jQuery) would help developers. For example, cross-language transliteration can be done in jQuery too if we port the algorithm and the transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in jQuery. Alternatively, these JS libraries can also be used server-side with Node.js.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: JavaScript, jQuery, Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; :Jishnu&lt;br /&gt;
&lt;br /&gt;
===  Improving cross language transliteration system.  ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Currently only Kannada and Malayalam are handled well; all other languages are first converted to Malayalam and then to English, due to the lack of language internals. Also, for English to Indic we currently use CMUDict, so transliteration is limited to words in CMUDict. We could probably develop a better method for English to Indic transliteration.&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. For an intermediate representation of the scripts, either IPA or the ISO 15919 standard can be used. All of these must be supplemented with exception rules and special-case handling to achieve better results.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
=== Internationalize SILPA project with Wikimedia jquery projects ,  Improve  the webfonts module in Silpa using jquery.webfonts and provide more Indic and complex fonts as part of it ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Internationalize SILPA&#039;&#039;&#039; :-  The SILPA project has many Indic language applications, but as of now there is no built-in tool for anyone who wants to type in Indian languages. The application is also not internationalized. Both can be achieved by using the [//github.com/wikimedia/jquery.ime jquery.ime] and [//github.com/wikimedia/jquery.i18n jquery.i18n] libraries from Wikimedia. A sample implementation is available on our [http://smc.org.in website]. The i18n should be in the SILPA Flask framework with a nice templating system. Similarly, the interface should have webfonts using the [https://github.com/wikimedia/jquery.webfonts jquery.webfonts] library.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Improve  the webfonts &#039;&#039;&#039; :- &lt;br /&gt;
* Currently Silpa provides 36 webfonts. Add more fonts to this collection.&lt;br /&gt;
* Rewrite the webfonts module to use the features of jquery.webfonts&lt;br /&gt;
* Create a repo as per the jquery.webfonts specification&lt;br /&gt;
* Provide a clean API so that other websites can use our webfonts&lt;br /&gt;
* Document the usage&lt;br /&gt;
* Provide font preview and download options&lt;br /&gt;
* &#039;&#039;&#039;This is partly done.&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
====More Details====&lt;br /&gt;
* [https://github.com/wikimedia/jquery.i18n jquery.i18n]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.ime jquery.ime]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.webfonts jquery.webfonts]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: jQuery, CSS, HTML5, Python, Flask, technical understanding of fonts&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Jishnu/Vasudev&lt;br /&gt;
&lt;br /&gt;
==Language filter for diaspora==&lt;br /&gt;
&lt;br /&gt;
Diaspora is a Free Software, federated social networking platform. Diaspora users post in many languages. When people use more than one language in their posts, it is inconvenient for people who don&#039;t understand a language. This task is to tag every post with languages used in the post, ideally detected automatically, but with an option to override it. Once each post has a language tag, people should be able to choose their preferred language and posts in other languages should be hidden by default. Also provide an option to translate posts and comments.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Ruby on Rails&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039;: Pirate Praveen, Ershad K&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
This includes rewriting the current implementation in `golang` and adding&lt;br /&gt;
support for WebSockets to improve the input experience. It also&lt;br /&gt;
includes writing scripts that ease embedding input on any web page.&lt;br /&gt;
All the changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
=== Improve the learning system===&lt;br /&gt;
&lt;br /&gt;
The main goal of this is to improve how varnam tokenizes when learning&lt;br /&gt;
words. Today, when a word is learned, varnam takes all the possible&lt;br /&gt;
prefixes into account and learns all of them to improve future&lt;br /&gt;
suggestions. But sometimes this is not enough to produce good&lt;br /&gt;
suggestions. The suggested improvement is to try to infer the&lt;br /&gt;
base form of the word being learned.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool that can upload/download&lt;br /&gt;
the word corpus from offline IMEs like varnam-ibus[2]. This helps to&lt;br /&gt;
build the online word corpus easily.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in C/golang&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
[1]: http://www.varnamproject.com&lt;br /&gt;
[2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;:  Navaneeth K N&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4583</id>
		<title>GSoC/2014/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4583"/>
		<updated>2014-02-09T15:28:55Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Improve the learning system */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;&lt;br /&gt;
&amp;lt;font color=&amp;quot;red&amp;quot;&amp;gt;This Page is under development&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2014=&lt;br /&gt;
* Please Read the [http://wiki.smc.org.in/SoC/2013#FAQ FAQ]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas , you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to propose an idea, please do it on the [http://lists.smc.org.in/listinfo.cgi/discuss-smc.org.in project mailing list].&lt;br /&gt;
&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The SILPA project has a spellchecker written in Python that uses a fairly involved algorithm, but it still cannot handle the inflection and agglutination found in Indian languages, especially South Indian languages. The dictionary for the Malayalam spellchecker has about 150,000 words. We could expand it, but that adds little value, since words in Malayalam, Tamil and similar languages can be formed by joining multiple words. In addition, words are inflected according to grammatical form (sandhi), number, gender and so on. Hunspell has a mechanism for this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words. We will need word-splitting logic or a table that covers all the patterns. The project is to attempt to solve this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently, the Tamil community attempted to develop a spellchecker using Hunspell with multi-level suffix stripping. You can see the result here: https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. This will probably require a lot of scripting to generate suffix patterns; we can ask existing language communities for help too. If Hunspell has limitations with multi-level suffixes (Indian languages sometimes require more than 5 levels of suffix stripping), we need to document them (bug report and documentation) and attempt a Python-based solution on top of the SILPA framework.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Average level understanding of grammar system of at least one Indian language&lt;br /&gt;
* &#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039; : Santhosh Thottingal&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. To find more information about ConTeXt, see the wiki http://wiki.contextgarden.net/Main_Page. ConTeXt MKII supports Indic language rendering through XeTeX, but MKII is deprecated, and the new MKIV backend doesn&#039;t support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses Harfbuzz for correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt and basic understanding of Indic language rendering. MKIV uses Lua, so familiarity with Lua, OpenType specifications or Harfbuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039; : Rajeesh K Nambiar&lt;br /&gt;
* &#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for the MKIV Lua code is available. ConTeXt MKII (deprecated) can work with the XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using command&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools, which can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that particular language. An acoustic model characterizes how sound changes over time and captures the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty taken when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims to develop an acoustic model and a language model for Malayalam.&lt;br /&gt;
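&lt;br /&gt;
To make the idea of a language model concrete, here is a minimal Python sketch of a maximum-likelihood bigram estimate. It only illustrates the likelihood of a word sequence; it is not the CMU Sphinx tooling, and the function name and input format are assumptions made for this example.&lt;br /&gt;

```python
from collections import Counter


def bigram_model(sentences):
    """Map each (previous, next) word pair to P(next | previous)."""
    history_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        # Every word except the last one acts as a history word.
        history_counts.update(words[:-1])
        pair_counts.update(zip(words, words[1:]))
    return {
        pair: count / history_counts[pair[0]]
        for pair, count in pair_counts.items()
    }
```

A real model for Malayalam would be trained on a large text corpus and would need smoothing for word pairs that never occur in the training data.&lt;br /&gt;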
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;:  Deepa P. Gopinath&lt;br /&gt;
=== Background Reading ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET ,Hyderabad, Prof. Dr. S. Rama Mohan Rao, A.Gopi &lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
&lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori1, 2, Hussein Hiyassat3, Mostafa Harti1, 2, and Noureddine Chenfour&lt;br /&gt;
&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
==SILPA BASED==&lt;br /&gt;
&lt;br /&gt;
===Provide REST API for new flask based Silpa, including conversion of templates to this REST API from JSON RPC===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: Silpa currently relies on JSON-RPC. We need to either move completely to a REST API or provide a REST API as an additional feature.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python , Flask , Jinja , HTML, Javascript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Converting indic processing modules currently in SILPA into Jquery library  ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: Port some of the silpa algorithms to Jquery.&lt;br /&gt;
&lt;br /&gt;
Several modules and algorithms in the SILPA project are currently implemented in Python. Porting them to JavaScript (jQuery) would help developers. For example, cross-language transliteration can be done in jQuery too if we port the algorithm and the transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in jQuery. Alternatively, these JS libraries can be used on the server side with node.js.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: javascript,JQuery, python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; :Jishnu&lt;br /&gt;
&lt;br /&gt;
===  Improving cross language transliteration system.  ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Currently only Kannada and Malayalam work perfectly; all other languages are first converted to Malayalam and then to English due to lack of language internal. Also, for English to Indic we currently use CMUDict, so transliteration is limited to the words in CMUDict. We could probably develop a better method for English to Indic transliteration.&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and see the feasibility. For an intermediate representation of the scripts either IPA can be used or ISO 15919 standard can be used. All these must be supplemented with exception rules and special case handling to achieve more perfect result.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
=== Internationalize SILPA project with Wikimedia jquery projects ,  Improve  the webfonts module in Silpa using jquery.webfonts and provide more Indic and complex fonts as part of it ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Internationalize SILPA&#039;&#039;&#039; :-  The SILPA project has many Indic language applications, but as of now it has no built-in tool for anyone who wants to type in Indian languages. Similarly, the application is not internationalized. Both of these can be achieved by using the [//github.com/wikimedia/jquery.ime jquery.ime] and [//github.com/wikimedia/jquery.i18n jquery.i18n] libraries from Wikimedia. A sample implementation is available on our [http://smc.org.in website]. The i18n should be done in the SILPA Flask framework with a nice templating system. Similarly, the interface should have webfonts using the [https://github.com/wikimedia/jquery.webfonts jquery.webfonts] library.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Improve  the webfonts &#039;&#039;&#039; :- &lt;br /&gt;
* Currently Silpa provides 36 webfonts. Add more fonts to this collection.&lt;br /&gt;
* Rewrite the webfonts module to use the features of jquery.webfonts&lt;br /&gt;
* Create a repo as per the jquery.webfonts specification&lt;br /&gt;
* Provide a clean API so that other websites can use our webfonts&lt;br /&gt;
* Document the usage&lt;br /&gt;
* Provide font preview and download options  &lt;br /&gt;
* &#039;&#039;&#039;This is partly done.&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
====More Details====&lt;br /&gt;
* [https://github.com/wikimedia/jquery.i18n jquery.i18n]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.ime jquery.ime]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.webfonts jquery.webfonts]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: jQuery, css, html5, Python , flask , technical understanding about fonts&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Jishnu/Vasudev&lt;br /&gt;
&lt;br /&gt;
==Language filter for diaspora==&lt;br /&gt;
&lt;br /&gt;
Diaspora is a Free Software, federated social networking platform. Diaspora users post in many languages. When people use more than one language in their posts, it is inconvenient for people who don&#039;t understand a language. This task is to tag every post with languages used in the post, ideally detected automatically, but with an option to override it. Once each post has a language tag, people should be able to choose their preferred language and posts in other languages should be hidden by default. Also provide an option to translate posts and comments.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Ruby on Rails&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039;: Pirate Praveen, Ershad K&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
This includes rewriting the current implementation in `golang` and adding&lt;br /&gt;
support for WebSockets to improve the input experience. It also&lt;br /&gt;
includes writing scripts that make it easy to embed input on any webpage.&lt;br /&gt;
All the changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
=== Improve the learning system===&lt;br /&gt;
&lt;br /&gt;
The main goal is to improve how varnam tokenizes when learning&lt;br /&gt;
words. Today, when a word is learned, varnam takes all the possible&lt;br /&gt;
prefixes into account and learns all of them to improve future&lt;br /&gt;
suggestions. But sometimes this is not enough to predict good&lt;br /&gt;
suggestions. The suggested improvement is to try to infer the&lt;br /&gt;
base form of the word being learned.&lt;br /&gt;
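&lt;br /&gt;
The prefix-based learning described above can be pictured with the following Python sketch. It is only an illustration and not varnam code: the function names are hypothetical, and plain string characters stand in for whatever tokens the library really uses.&lt;br /&gt;

```python
def prefixes(word):
    """Return every non-empty prefix of word, shortest first."""
    return [word[:end] for end in range(1, len(word) + 1)]


def learn(counts, word):
    """Record one more sighting of every prefix of word in counts."""
    for prefix in prefixes(word):
        counts[prefix] = counts.get(prefix, 0) + 1
    return counts
```

Learning only prefixes says nothing about how an inflected word relates to its base form, which is the gap this idea is meant to close.&lt;br /&gt;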
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Knowledge in C&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Medium&lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool which can upload/download&lt;br /&gt;
the word corpus from offline IMEs like varnam-ibus[2]. This helps to&lt;br /&gt;
build the online word corpus easily.&lt;br /&gt;
&lt;br /&gt;
[1]: http://www.varnamproject.com&lt;br /&gt;
[2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;:  Navaneeth K N&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4582</id>
		<title>GSoC/2014/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4582"/>
		<updated>2014-02-09T15:28:03Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Improvements to the REST API */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;&lt;br /&gt;
&amp;lt;font color=&amp;quot;red&amp;quot;&amp;gt;This Page is under development&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2014=&lt;br /&gt;
* Please read the [http://wiki.smc.org.in/SoC/2013#FAQ FAQ]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas , you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to propose an idea, please do so on the [http://lists.smc.org.in/listinfo.cgi/discuss-smc.org.in project mailing list].&lt;br /&gt;
&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The SILPA project has a spellchecker written in Python that uses a fairly involved algorithm. Even so, it cannot handle the inflection and agglutination found in Indian languages, especially South Indian languages. The dictionary for the Malayalam spellchecker has about 150,000 words. We could expand the dictionary, but that adds little value, because words in Malayalam, Tamil and similar languages can be formed by joining multiple words. In addition, words are inflected according to grammatical forms (sandhi), plural, gender and so on. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words. We will need word-splitting logic or a table that covers all the patterns. The project is to attempt to solve this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently there was an attempt to develop a Tamil spellchecker using Hunspell with multi-level suffix stripping. You can see the result here https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. It will probably require a lot of scripting to generate suffix patterns, and we can ask existing language communities for help. If Hunspell turns out to be limited in multi-level suffix stripping (Indian languages sometimes require more than 5 levels), we need to document the limitation (bug report and documentation) and attempt a Python-based solution on top of the SILPA framework.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Average level understanding of grammar system of at least one Indian language&lt;br /&gt;
* &#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039; : Santhosh Thottingal&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. To find more information about ConTeXt, see the wiki http://wiki.contextgarden.net/Main_Page. ConTeXt MKII supports Indic language rendering through XeTeX, but MKII is deprecated, and the new MKIV backend doesn&#039;t support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses Harfbuzz for correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt and basic understanding of Indic language rendering. MKIV uses Lua, so familiarity with Lua, OpenType specifications or Harfbuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039; : Rajeesh K Nambiar&lt;br /&gt;
* &#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for the MKIV Lua code is available. ConTeXt MKII (deprecated) can work with the XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using command&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools, which can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that particular language. An acoustic model characterizes how sound changes over time and captures the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty taken when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims to develop an acoustic model and a language model for Malayalam.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;:  Deepa P. Gopinath&lt;br /&gt;
=== Background Reading ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET ,Hyderabad, Prof. Dr. S. Rama Mohan Rao, A.Gopi &lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
&lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori1, 2, Hussein Hiyassat3, Mostafa Harti1, 2, and Noureddine Chenfour&lt;br /&gt;
&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
==SILPA BASED==&lt;br /&gt;
&lt;br /&gt;
===Provide REST API for new flask based Silpa, including conversion of templates to this REST API from JSON RPC===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: Silpa currently relies on JSON-RPC. We need to either move completely to a REST API or provide a REST API as an additional feature.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python , Flask , Jinja , HTML, Javascript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Converting indic processing modules currently in SILPA into Jquery library  ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: Port some of the silpa algorithms to Jquery.&lt;br /&gt;
&lt;br /&gt;
Several modules and algorithms in the SILPA project are currently implemented in Python. Porting them to JavaScript (jQuery) would help developers. For example, cross-language transliteration can be done in jQuery too if we port the algorithm and the transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in jQuery. Alternatively, these JS libraries can be used on the server side with node.js.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: javascript,JQuery, python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; :Jishnu&lt;br /&gt;
&lt;br /&gt;
===  Improving cross language transliteration system.  ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Currently only Kannada and Malayalam work perfectly; all other languages are first converted to Malayalam and then to English due to lack of language internal. Also, for English to Indic we currently use CMUDict, so transliteration is limited to the words in CMUDict. We could probably develop a better method for English to Indic transliteration.&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and see the feasibility. For an intermediate representation of the scripts either IPA can be used or ISO 15919 standard can be used. All these must be supplemented with exception rules and special case handling to achieve more perfect result.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;:python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
=== Internationalize SILPA project with Wikimedia jquery projects ,  Improve  the webfonts module in Silpa using jquery.webfonts and provide more Indic and complex fonts as part of it ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Internationalize SILPA&#039;&#039;&#039; :-  The SILPA project has many Indic language applications, but as of now it has no built-in tool for anyone who wants to type in Indian languages. Similarly, the application is not internationalized. Both of these can be achieved by using the [//github.com/wikimedia/jquery.ime jquery.ime] and [//github.com/wikimedia/jquery.i18n jquery.i18n] libraries from Wikimedia. A sample implementation is available on our [http://smc.org.in website]. The i18n should be done in the SILPA Flask framework with a nice templating system. Similarly, the interface should have webfonts using the [https://github.com/wikimedia/jquery.webfonts jquery.webfonts] library.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Improve  the webfonts &#039;&#039;&#039; :- &lt;br /&gt;
* Currently Silpa provides 36 webfonts. Add more fonts to this collection.&lt;br /&gt;
* Rewrite the webfonts module to use the features of jquery.webfonts&lt;br /&gt;
* Create a repo as per the jquery.webfonts specification&lt;br /&gt;
* Provide a clean API so that other websites can use our webfonts&lt;br /&gt;
* Document the usage&lt;br /&gt;
* Provide font preview and download options  &lt;br /&gt;
* &#039;&#039;&#039;This is partly done.&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
====More Details====&lt;br /&gt;
* [https://github.com/wikimedia/jquery.i18n jquery.i18n]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.ime jquery.ime]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.webfonts jquery.webfonts]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: jQuery, css, html5, Python , flask , technical understanding about fonts&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Jishnu/Vasudev&lt;br /&gt;
&lt;br /&gt;
==Language filter for diaspora==&lt;br /&gt;
&lt;br /&gt;
Diaspora is a Free Software, federated social networking platform. Diaspora users post in many languages. When people use more than one language in their posts, it is inconvenient for people who don&#039;t understand a language. This task is to tag every post with languages used in the post, ideally detected automatically, but with an option to override it. Once each post has a language tag, people should be able to choose their preferred language and posts in other languages should be hidden by default. Also provide an option to translate posts and comments.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Ruby on Rails&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039;: Pirate Praveen, Ershad K&lt;br /&gt;
&lt;br /&gt;
==Varnam Based==&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
This includes rewriting the current implementation in `golang` and adding&lt;br /&gt;
support for WebSockets to improve the input experience. It also&lt;br /&gt;
includes writing scripts that make it easy to embed input on any webpage.&lt;br /&gt;
All the changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
=== Improve the learning system===&lt;br /&gt;
&lt;br /&gt;
The main goal is to improve how varnam tokenizes when learning&lt;br /&gt;
words. Today, when a word is learned, varnam takes all the possible&lt;br /&gt;
prefixes into account and learns all of them to improve future&lt;br /&gt;
suggestions. But sometimes this is not enough to predict good&lt;br /&gt;
suggestions. The suggested improvement is to try to infer the&lt;br /&gt;
base form of the word being learned.&lt;br /&gt;
&lt;br /&gt;
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool which can upload/download&lt;br /&gt;
the word corpus from offline IMEs like varnam-ibus[2]. This helps to&lt;br /&gt;
build the online word corpus easily.&lt;br /&gt;
&lt;br /&gt;
[1]: http://www.varnamproject.com&lt;br /&gt;
[2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;:  Navaneeth K N&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4581</id>
		<title>GSoC/2014/Project ideas</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=GSoC/2014/Project_ideas&amp;diff=4581"/>
		<updated>2014-02-09T15:27:30Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* Improvements to the REST API */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;center&amp;gt;&lt;br /&gt;
&amp;lt;font color=&amp;quot;red&amp;quot;&amp;gt;This Page is under development&amp;lt;/font&amp;gt;&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=Ideas for Google Summer of Code 2014=&lt;br /&gt;
* Please read the [http://wiki.smc.org.in/SoC/2013#FAQ FAQ]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;big&amp;gt;&#039;&#039;&#039;Apart from the following ideas , you can propose your own ideas&#039;&#039;&#039;&amp;lt;/big&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to propose an idea, please do so on the [http://lists.smc.org.in/listinfo.cgi/discuss-smc.org.in project mailing list].&lt;br /&gt;
&lt;br /&gt;
== A spell checker for Indic language that understands inflections ==&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
The SILPA project has a spellchecker written in Python that uses a fairly involved algorithm. Even so, it cannot handle the inflection and agglutination found in Indian languages, especially South Indian languages. The dictionary for the Malayalam spellchecker has about 150,000 words. We could expand the dictionary, but that adds little value, because words in Malayalam, Tamil and similar languages can be formed by joining multiple words. In addition, words are inflected according to grammatical forms (sandhi), plural, gender and so on. Hunspell has a system to handle this, but so far nobody has succeeded in getting it to work for the multi-level suffix stripping that Malayalam requires. Sometimes a Malayalam word is formed by joining more than 5 words. We will need word-splitting logic or a table that covers all the patterns. The project is to attempt to solve this with Hunspell. If that is not feasible (Hunspell upstream is not active), develop an algorithm and implement it.&lt;br /&gt;
&lt;br /&gt;
Recently there was an attempt to develop a Tamil spellchecker using Hunspell with multi-level suffix stripping. You can see the result here https://github.com/thamizha/solthiruthi. &lt;br /&gt;
Our first attempt should be to use Hunspell to achieve spellchecking with agglutination and inflection. It will probably require a lot of scripting to generate suffix patterns, and we can ask existing language communities for help. If Hunspell turns out to be limited in multi-level suffix stripping (Indian languages sometimes require more than 5 levels), we need to document the limitation (bug report and documentation) and attempt a Python-based solution on top of the SILPA framework.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12558 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Average level understanding of grammar system of at least one Indian language&lt;br /&gt;
* &#039;&#039;&#039;Complexity&#039;&#039;&#039;: Advanced&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039; : Santhosh Thottingal&lt;br /&gt;
&lt;br /&gt;
==Indic rendering support in ConTeXt==&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
ConTeXt is another TeX macro system, similar to LaTeX but much more suitable for design. To find more information about ConTeXt, see the wiki http://wiki.contextgarden.net/Main_Page. ConTeXt MKII supports Indic language rendering through XeTeX, but MKII is deprecated, and the new MKIV backend doesn&#039;t support Indic rendering yet. The aim of this project is to add Indic rendering support to ConTeXt MKIV. XeTeX uses Harfbuzz for correct Indic rendering.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;[https://savannah.nongnu.org/task/index.php?12559 Savannah Task]&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Understanding of the TeX system, experience in either LaTeX or ConTeXt and basic understanding of Indic language rendering. MKIV uses Lua, so familiarity with Lua, OpenType specifications or Harfbuzz will be an added advantage.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039; : Rajeesh K Nambiar&lt;br /&gt;
* &#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;More Details&#039;&#039;&#039;: A partially working patch by Rajeesh for the MKIV Lua code is available. ConTeXt MKII (deprecated) can work with the XeTeX backend for Indic rendering. Here is a sample file:&lt;br /&gt;
 \usemodule[simplefonts]&lt;br /&gt;
 \definefontfeature[malayalam][script=mlym]&lt;br /&gt;
 \setmainfont[Rachana][features=malayalam]&lt;br /&gt;
 \starttext&lt;br /&gt;
 മലയാളം \TeX ഉപയോഗിച്ച് ടൈപ്പ്സെറ്റ് ചെയ്തത്&lt;br /&gt;
 \stoptext&lt;br /&gt;
Generate the output using the command:&lt;br /&gt;
 texexec --xetex &amp;lt;file.tex&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Language model and Acoustic model for Malayalam language for speech recognition system in CMU Sphinx==&lt;br /&gt;
&lt;br /&gt;
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools, which can be used to develop a speech recognition system in any language. To develop an automatic speech recognition (ASR) system for a language, an acoustic model and a language model have to be built for that particular language. Acoustic models characterize how sound changes over time; they capture the characteristics of the basic recognition units. The language model describes the likelihood, probability, or penalty taken when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language. Once these two models are developed, they will be useful to everyone doing research in speech processing. ASR systems have already been developed using the Sphinx engine for the Indian languages Hindi, Tamil, Telugu and Marathi. This project aims at developing an acoustic model and a language model for Malayalam.&lt;br /&gt;
&lt;br /&gt;
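To make the idea of a language model concrete, here is a minimal bigram model in Python. It is a sketch under stated assumptions and not the format or toolchain CMU Sphinx uses: the three romanized sentences are made-up sample data, BOS and EOS are placeholder boundary markers, and no smoothing is applied.&lt;br /&gt;

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count context words and word pairs over whitespace-tokenized sentences."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        # BOS/EOS are placeholder sentence-boundary tokens (assumption).
        words = ["BOS"] + sentence.split() + ["EOS"]
        unigrams.update(words[:-1])
        bigrams.update(zip(words[:-1], words[1:]))
    return unigrams, bigrams


def bigram_probability(unigrams, bigrams, previous, word):
    """Maximum-likelihood estimate of P(word | previous), without smoothing."""
    if unigrams[previous] == 0:
        return 0.0
    return bigrams[(previous, word)] / unigrams[previous]


# Made-up romanized sample corpus, for illustration only.
corpus = [
    "njan veettil pokunnu",
    "njan schoolil pokunnu",
    "njan veettil vannu",
]
unigrams, bigrams = train_bigram_model(corpus)
print(bigram_probability(unigrams, bigrams, "njan", "veettil"))
```

A recognizer combines such word-sequence probabilities with the acoustic model scores to pick the most likely transcription.&lt;br /&gt;
&lt;br /&gt;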
&#039;&#039;&#039;Mentor&#039;&#039;&#039;:  Deepa P. Gopinath&lt;br /&gt;
=== Background Reading ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.cmu.edu/~gopalakr/publications/spdatabases_specom05.pdf &#039;Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems&#039;], Gopalakrishna  Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R.N.V. Sitaram, S P Kishore&lt;br /&gt;
&lt;br /&gt;
* [http://www.aclweb.org/anthology/W/W12/W12-5808.pdf &amp;quot;Automatic Pronunciation Evaluation And Mispronunciation Detection Using CMUSphinx&amp;quot;], Ronanki Srikanth, James Salsman&lt;br /&gt;
&lt;br /&gt;
* http://www.speech.cs.cmu.edu/&lt;br /&gt;
* http://cmusphinx.sourceforge.net/wiki/tutorial&lt;br /&gt;
&lt;br /&gt;
* [http://www.ijarcsse.com &amp;quot;HTK Based Telugu Speech Recognition&amp;quot;], P. Vijai Bhaskar, AVNIET ,Hyderabad, Prof. Dr. S. Rama Mohan Rao, A.Gopi &lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.cmu.edu/~araza/Automatic_Speech_Recognition_System_for_Urdu.PDF &amp;quot;Design and  Development of an Automatic Speech Recognition System for Urdu&amp;quot;], Agha Ali Raza,  M.Sc. Thesis, FAST‐National University of Computer and Emerging Sciences &lt;br /&gt;
&lt;br /&gt;
* [http://www.ccis2k.org/iajit/PDF/vol.6,no.2/11IASRUCSS186.pdf &amp;quot;Investigation Arabic Speech Recognition Using CMU Sphinx System&amp;quot;], Hassan Satori, Hussein Hiyassat, Mostafa Harti, and Noureddine Chenfour&lt;br /&gt;
&lt;br /&gt;
* [http://www.try.idv.tw/static-resources/homework/pr/PR_Final_Report.pdf &amp;quot;Understanding the CMU Sphinx Speech Recognition System&amp;quot;], Chun-Feng Liao&lt;br /&gt;
&lt;br /&gt;
==SILPA BASED==&lt;br /&gt;
&lt;br /&gt;
===Provide REST API for new flask based Silpa, including conversion of templates to this REST API from JSON RPC===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: Silpa currently relies on JSON-RPC. We need to either move completely to a REST API or provide a REST API as an additional feature.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python, Flask, Jinja, HTML, JavaScript&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Converting indic processing modules currently in SILPA into Jquery library  ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: Port some of the SILPA algorithms to jQuery.&lt;br /&gt;
&lt;br /&gt;
Several modules and algorithms in the SILPA project are currently implemented in Python. Porting them to JavaScript (jQuery) would help developers. For example, cross-language transliteration can be done in jQuery too if we port the algorithm and the transliteration rules. Similarly, the approximate search can be ported. A flexible fuzzy search on web pages will be possible if we have the algorithm in jQuery. Alternatively, these JS libraries can be used on the server side with Node.js too.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: JavaScript, jQuery, Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; :Jishnu&lt;br /&gt;
&lt;br /&gt;
===  Improving cross language transliteration system.  ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
Currently only Kannada and Malayalam are handled well; all other languages are first converted to Malayalam and then to English, because language-specific internals are lacking for them. Also, for English to Indic we currently use CMUDict, so transliteration is limited to the words in CMUDict. We could probably develop a better method for English to Indic transliteration.&lt;br /&gt;
&lt;br /&gt;
CLDR has transliteration data for Indic languages. We can explore it and assess its feasibility. For an intermediate representation of the scripts, either IPA or the ISO 15919 standard can be used. All of these must be supplemented with exception rules and special-case handling to achieve better results.&lt;br /&gt;
&lt;br /&gt;
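The intermediate-representation idea can be sketched as a two-step lookup through ISO 15919 tokens. This is a minimal Python illustration, assuming tiny hand-written tables for three consonants with their inherent vowel; the table and function names are hypothetical, and vowel signs, conjuncts and exception rules are deliberately left out.&lt;br /&gt;

```python
# Tiny hand-written sample tables (assumption, for illustration only).
# Each value is the ISO 15919 romanization of a consonant with its inherent vowel.
DEVANAGARI_TO_ISO = {
    "\u0915": "ka",  # DEVANAGARI LETTER KA
    "\u092e": "ma",  # DEVANAGARI LETTER MA
    "\u0932": "la",  # DEVANAGARI LETTER LA
}
ISO_TO_MALAYALAM = {
    "ka": "\u0d15",  # MALAYALAM LETTER KA
    "ma": "\u0d2e",  # MALAYALAM LETTER MA
    "la": "\u0d32",  # MALAYALAM LETTER LA
}


def to_pivot(text, source_table):
    """Convert source-script text to a list of pivot (ISO 15919) tokens."""
    return [source_table[char] for char in text]


def from_pivot(tokens, target_table):
    """Convert pivot tokens to text in the target script."""
    return "".join(target_table[token] for token in tokens)


def transliterate(text, source_table, target_table):
    """Transliterate by going through the pivot representation."""
    return from_pivot(to_pivot(text, source_table), target_table)


word = "\u0915\u092e\u0932"  # three Devanagari letters: ka, ma, la
print(to_pivot(word, DEVANAGARI_TO_ISO))
print(transliterate(word, DEVANAGARI_TO_ISO, ISO_TO_MALAYALAM))
```

With a pivot, each script needs only one table to and one table from the intermediate form, instead of one table per language pair.&lt;br /&gt;
&lt;br /&gt;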
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Python&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Vasudev/Jishnu&lt;br /&gt;
&lt;br /&gt;
=== Internationalize SILPA project with Wikimedia jquery projects ,  Improve  the webfonts module in Silpa using jquery.webfonts and provide more Indic and complex fonts as part of it ===&lt;br /&gt;
&#039;&#039;&#039;Project&#039;&#039;&#039;: &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Internationalize SILPA&#039;&#039;&#039; :-  The SILPA project has many Indic language applications, but as of now there is no built-in tool for somebody who wants to type in Indian languages. Similarly, the application is not internationalized. Both of these can be achieved by using the [//github.com/wikimedia/jquery.ime jquery.ime] and [//github.com/wikimedia/jquery.i18n jquery.i18n] libraries from Wikimedia. A sample implementation is available on our [http://smc.org.in website]. The i18n should be in the SILPA Flask framework with a nice templating system. Similarly, the interface should have webfonts using the [https://github.com/wikimedia/jquery.webfonts jquery.webfonts] library.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Improve  the webfonts &#039;&#039;&#039; :- &lt;br /&gt;
* Currently Silpa provides 36 webfonts. Add more fonts to this collection.&lt;br /&gt;
* Rewrite the webfonts module to use the features of jquery.webfonts&lt;br /&gt;
* Create a repo as per the jquery.webfonts specification&lt;br /&gt;
* Provide a clean API so that other websites can use our webfonts&lt;br /&gt;
* Document the usage&lt;br /&gt;
* Provide font preview and download options  &lt;br /&gt;
* &#039;&#039;&#039;This is partly done&#039;&#039;&#039;. &lt;br /&gt;
&lt;br /&gt;
====More Details====&lt;br /&gt;
* [https://github.com/wikimedia/jquery.i18n jquery.i18n]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.ime jquery.ime]&lt;br /&gt;
* [https://github.com/wikimedia/jquery.webfonts jquery.webfonts]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: jQuery, CSS, HTML5, Python, Flask, technical understanding of fonts&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039; : Jishnu/Vasudev&lt;br /&gt;
&lt;br /&gt;
==Language filter for diaspora==&lt;br /&gt;
&lt;br /&gt;
Diaspora is a Free Software, federated social networking platform. Diaspora users post in many languages. When people use more than one language in their posts, it is inconvenient for people who don&#039;t understand a language. This task is to tag every post with languages used in the post, ideally detected automatically, but with an option to override it. Once each post has a language tag, people should be able to choose their preferred language and posts in other languages should be hidden by default. Also provide an option to translate posts and comments.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Ruby on Rails&lt;br /&gt;
* &#039;&#039;&#039;Mentor&#039;&#039;&#039;: Pirate Praveen, Ershad K&lt;br /&gt;
&lt;br /&gt;
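Automatic tagging could start from the scripts used in a post. The following Python sketch is an illustration only and is unrelated to the Diaspora codebase (which is Ruby on Rails): it counts letters per script using the first word of each character name from the standard unicodedata module. The function names are hypothetical, and a script is only a hint, since several languages can share one script.&lt;br /&gt;

```python
import unicodedata
from collections import Counter


def detect_scripts(text):
    """Count letters per script, using the first word of the Unicode name."""
    counts = Counter()
    for char in text:
        if char.isalpha():
            name = unicodedata.name(char, "")
            if name:
                counts[name.split()[0]] += 1
    return counts


def script_tags(text):
    """Return the script names found in the text, most frequent first."""
    return [script for script, _ in detect_scripts(text).most_common()]


# A post mixing Latin letters with two Malayalam letters (MA, LA).
post = "hello \u0d2e\u0d32"
print(script_tags(post))
```

A post tagged this way can then be hidden or shown according to the scripts a reader has chosen, with a manual override for cases where detection is wrong.&lt;br /&gt;
&lt;br /&gt;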
==Varnam Based==&lt;br /&gt;
===Improvements to the REST API===&lt;br /&gt;
&lt;br /&gt;
This includes rewriting the current implementation in Go (golang) and adding&lt;br /&gt;
support for WebSockets to improve the input experience. It also&lt;br /&gt;
includes writing scripts that ease embedding input on any webpage.&lt;br /&gt;
All the changes will go live on [1].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Expertise required&#039;&#039;&#039;: Basic understanding of golang and C&lt;br /&gt;
&#039;&#039;&#039;Complexity&#039;&#039;&#039; : Advanced&lt;br /&gt;
&lt;br /&gt;
=== Improve the learning system===&lt;br /&gt;
&lt;br /&gt;
The main goal of this is to improve how varnam tokenizes when learning&lt;br /&gt;
words. Today, when a word is learned, varnam takes all the possible&lt;br /&gt;
prefixes into account and learns all of them to improve future&lt;br /&gt;
suggestions. But sometimes this is not enough to produce good&lt;br /&gt;
suggestions. The suggested improvement is to try to infer the&lt;br /&gt;
base form of the word being learned.&lt;br /&gt;
&lt;br /&gt;
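The prefix learning described above can be illustrated with a short Python sketch. It is a simplification under stated assumptions and not the libvarnam implementation: the word is assumed to be already split into tokens, the romanized tokens are sample data, and the function names are hypothetical.&lt;br /&gt;

```python
def prefixes(tokens):
    """Return every non-empty prefix of a tokenized word, shortest first."""
    return ["".join(tokens[:end]) for end in range(1, len(tokens) + 1)]


def learn(word_tokens, learned):
    """Record the word and all of its prefixes in the learned set."""
    learned.update(prefixes(word_tokens))
    return learned


# Hypothetical tokenization of a romanized word, for illustration only.
tokens = ["ma", "la", "yaa", "Lam"]
print(prefixes(tokens))
```

Inferring the base form would add one more entry per word, the stem left after removing inflectional endings, which plain prefixes do not always cover.&lt;br /&gt;
&lt;br /&gt;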
=== Word corpus synchronization ===&lt;br /&gt;
&lt;br /&gt;
Create a cross-platform synchronization tool which can upload/download&lt;br /&gt;
the word corpus from offline IMEs like varnam-ibus[2]. This helps to&lt;br /&gt;
build the online words corpus easily.&lt;br /&gt;
&lt;br /&gt;
[1]: http://www.varnamproject.com&lt;br /&gt;
[2]: https://gitorious.org/varnamproject/libvarnam-ibus/&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Mentor&#039;&#039;&#039;:  Navaneeth K N&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4094</id>
		<title>വര്‍ണ്ണം</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4094"/>
		<updated>2013-06-22T06:00:07Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* ഉദാഹരണങ്ങള്‍ */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{prettyurl|varnam}}&lt;br /&gt;
===&#039;&#039;&#039;Varnam&#039;&#039;&#039; ===&lt;br /&gt;
&lt;br /&gt;
Varnam is a program for writing Malayalam and other Indian languages. &lt;br /&gt;
&lt;br /&gt;
As with Swanalekha, the user types in Manglish in Varnam too. In tools that transliterate from Manglish, the word &amp;quot;മലയാളം&amp;quot; is written by typing &amp;quot;malayaaLam&amp;quot;. Varnam uses the same scheme, but you only need to type the word this way once. Once it has been typed this way, Varnam learns the word &amp;quot;മലയാളം&amp;quot; and all the patterns that can be used to write it. After that, typing &amp;quot;malayalam&amp;quot; is enough. The words Varnam learns are available to everyone who uses Varnam.&lt;br /&gt;
&lt;br /&gt;
More information is available at [http://www.varnamproject.com www.varnamproject.com].&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Installation&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
Varnam is available as an add-on for Firefox and Chrome.&lt;br /&gt;
&lt;br /&gt;
#Chrome [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Link]&lt;br /&gt;
#Firefox [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Link]&lt;br /&gt;
&lt;br /&gt;
After downloading, right-click the text box in which you want to write Malayalam and choose Malayalam from the varnam menu. Then simply type in Manglish. Varnam shows the Malayalam words in a suggestion list. With the add-on you can type Malayalam directly in Google chat, Facebook chat and similar places.&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/varnam-browser-addons varnam-browser-addons] for the source code.&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Examples&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Input before learning !! Input after learning !! Output&lt;br /&gt;
|-&lt;br /&gt;
| malayaaLam || malayalam || മലയാളം&lt;br /&gt;
|-&lt;br /&gt;
| paTTikkuka || padikkuka || പഠിക്കുക&lt;br /&gt;
|-&lt;br /&gt;
| vaikunnEraTH || vaikunnerath || വൈകുന്നേരത്ത്&lt;br /&gt;
|-&lt;br /&gt;
| aanJanEyan || anjaneyan OR aanjaneyan || ആഞ്ജനേയൻ&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Varnam in your application&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
Varnam can be embedded in your application very easily. If your application is a stand-alone desktop application, you can link against [https://github.com/navaneeth/libvarnam libvarnam] directly. See the [https://github.com/navaneeth/libvarnam libvarnam] project for more information.&lt;br /&gt;
&lt;br /&gt;
Web applications can use the REST API. You can use Varnam in your application through http://www.varnamproject.com/tl?text=&amp;lt;text to transliterate&amp;gt;&amp;amp;lang=&amp;lt;lang-code&amp;gt;. For this you can either use varnamproject.com directly or install Varnam web on your own server and use that. Visit [https://github.com/navaneeth/varnamproject.com varnamproject source] for more information.&lt;br /&gt;
&lt;br /&gt;
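As a small illustration of the REST call, the query string can be built with the Python standard library so that the text is percent-encoded correctly. This is a sketch under stated assumptions: only the tl endpoint and its text and lang parameters come from the description above, the language code ml is assumed, the function name is hypothetical, and no request is actually sent.&lt;br /&gt;

```python
from urllib.parse import urlencode

# Endpoint as given in the text; whether it is still served is not verified here.
BASE_URL = "http://www.varnamproject.com/tl"


def build_transliteration_url(text, lang_code):
    """Build the request URL with percent-encoded query parameters."""
    return BASE_URL + "?" + urlencode({"text": text, "lang": lang_code})


# "ml" as the language code for Malayalam is an assumption.
print(build_transliteration_url("malayalam", "ml"))
```

The same function works against a self-hosted installation if BASE_URL is pointed at that server.&lt;br /&gt;
&lt;br /&gt;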
==&#039;&#039;Varnam programming API&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;C/C++&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
   #include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
   #include &amp;quot;varnam.h&amp;quot;&lt;br /&gt;
   int main(int argc, char **argv)&lt;br /&gt;
   {&lt;br /&gt;
     int rc, i;&lt;br /&gt;
     char *error;&lt;br /&gt;
     varnam *handle;&lt;br /&gt;
     varray *result;&lt;br /&gt;
     vword *word;&lt;br /&gt;
   &lt;br /&gt;
     rc = varnam_init(&amp;quot;/usr/local/share/varnam/vst/hi-unicode.vst&amp;quot;, &amp;amp;handle, &amp;amp;error);&lt;br /&gt;
     if (rc != VARNAM_SUCCESS)&lt;br /&gt;
     {&lt;br /&gt;
        printf (&amp;quot;Initialization failed. %s\n&amp;quot;, error);&lt;br /&gt;
        return 1;&lt;br /&gt;
     }&lt;br /&gt;
   &lt;br /&gt;
     rc = varnam_transliterate (handle, &amp;quot;malayalam&amp;quot;, &amp;amp;result);&lt;br /&gt;
     if (rc != VARNAM_SUCCESS)&lt;br /&gt;
     {&lt;br /&gt;
        printf (&amp;quot;Transliteration failed. %s\n&amp;quot;, varnam_get_last_error(handle));&lt;br /&gt;
        return 1;&lt;br /&gt;
     }&lt;br /&gt;
   &lt;br /&gt;
     for (i = 0; i &amp;lt; varray_length (result); i++)&lt;br /&gt;
     {&lt;br /&gt;
        word = varray_get (result, i);&lt;br /&gt;
        printf (&amp;quot;%d, %s\n&amp;quot;, word-&amp;gt;confidence, word-&amp;gt;text);&lt;br /&gt;
     }&lt;br /&gt;
   &lt;br /&gt;
     return 0;&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/libvarnam libvarnam] for C API source.&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;Java&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
   Varnam varnam = new Varnam(&amp;quot;/usr/local/share/varnam/vst/hi-unicode.vst&amp;quot;);&lt;br /&gt;
   varnam.enableSuggestions(&amp;quot;learnings.varnam.hi&amp;quot;);&lt;br /&gt;
   List&amp;lt;Word&amp;gt; words = varnam.transliterate(&amp;quot;hindi&amp;quot;);&lt;br /&gt;
   for (Word word : words) {&lt;br /&gt;
       System.out.println(word.getConfidence() + &amp;quot; - &amp;quot; + word.getText());&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/libvarnam-java libvarnam-java] for JAVA API source.&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;NodeJS&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
   var v = require(&#039;bindings&#039;)(&#039;varnam.node&#039;),&lt;br /&gt;
       file = &amp;quot;ml-unicode.vst&amp;quot;;&lt;br /&gt;
   &lt;br /&gt;
   var varnam = new v.Varnam(file, &amp;quot;learned&amp;quot;);&lt;br /&gt;
   &lt;br /&gt;
   for (var i = 0; i &amp;lt; 10; i++) {&lt;br /&gt;
       varnam.transliterate(&amp;quot;malayalam&amp;quot;, function(err, result) {&lt;br /&gt;
            console.log(result);&lt;br /&gt;
       });&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/libvarnam-nodejs libvarnam-nodejs] for NodeJS API source. The NodeJS binding uses an asynchronous API. &lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Future plans&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
* Adding support for more languages to &#039;&#039;libvarnam&#039;&#039;.&lt;br /&gt;
* IME for Linux &amp;amp; Windows.&lt;br /&gt;
* More programming language support.&lt;br /&gt;
* Improved performance for varnamproject.com&lt;br /&gt;
&lt;br /&gt;
If you are willing to contribute to the varnamproject, please write to &amp;quot;varnamproject@googlegroups.com&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Bugs and suggestions&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
Send your comments and suggestions to the e-mail address given below. &lt;br /&gt;
&lt;br /&gt;
varnamproject@googlegroups.com&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Copyright&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
The MIT License (MIT)&lt;br /&gt;
&lt;br /&gt;
Copyright (c) 2013 Navaneeth.K.N&lt;br /&gt;
&lt;br /&gt;
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &amp;quot;Software&amp;quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:&lt;br /&gt;
&lt;br /&gt;
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.&lt;br /&gt;
&lt;br /&gt;
THE SOFTWARE IS PROVIDED &amp;quot;AS IS&amp;quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4093</id>
		<title>വര്‍ണ്ണം</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4093"/>
		<updated>2013-06-22T05:53:57Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* വർണ്ണം */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{prettyurl|varnam}}&lt;br /&gt;
===&#039;&#039;&#039;Varnam&#039;&#039;&#039; ===&lt;br /&gt;
&lt;br /&gt;
Varnam is a program for writing Malayalam and other Indian languages. &lt;br /&gt;
&lt;br /&gt;
As with Swanalekha, the user types in Manglish in Varnam too. In tools that transliterate from Manglish, the word &amp;quot;മലയാളം&amp;quot; is written by typing &amp;quot;malayaaLam&amp;quot;. Varnam uses the same scheme, but you only need to type the word this way once. Once it has been typed this way, Varnam learns the word &amp;quot;മലയാളം&amp;quot; and all the patterns that can be used to write it. After that, typing &amp;quot;malayalam&amp;quot; is enough. The words Varnam learns are available to everyone who uses Varnam.&lt;br /&gt;
&lt;br /&gt;
More information is available at [http://www.varnamproject.com www.varnamproject.com].&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Installation&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
Varnam is available as an add-on for Firefox and Chrome.&lt;br /&gt;
&lt;br /&gt;
#Chrome [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Link]&lt;br /&gt;
#Firefox [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Link]&lt;br /&gt;
&lt;br /&gt;
After downloading, right-click the text box in which you want to write Malayalam and choose Malayalam from the varnam menu. Then simply type in Manglish. Varnam shows the Malayalam words in a suggestion list. With the add-on you can type Malayalam directly in Google chat, Facebook chat and similar places.&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/varnam-browser-addons varnam-browser-addons] for the source code.&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Examples&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Input before learning !! Input after learning !! Output&lt;br /&gt;
|-&lt;br /&gt;
| malayaaLam || malayalam || മലയാളം&lt;br /&gt;
|-&lt;br /&gt;
| paTTikkuka || padikkuka || പഠിക്കുക&lt;br /&gt;
|-&lt;br /&gt;
| vaikunnEraTH || vaikunnerath || വൈകുന്നേരത്ത്&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Varnam in your application&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
Varnam can be embedded in your application very easily. If your application is a stand-alone desktop application, you can link against [https://github.com/navaneeth/libvarnam libvarnam] directly. See the [https://github.com/navaneeth/libvarnam libvarnam] project for more information.&lt;br /&gt;
&lt;br /&gt;
Web applications can use the REST API. You can use Varnam in your application through http://www.varnamproject.com/tl?text=&amp;lt;text to transliterate&amp;gt;&amp;amp;lang=&amp;lt;lang-code&amp;gt;. For this you can either use varnamproject.com directly or install Varnam web on your own server and use that. Visit [https://github.com/navaneeth/varnamproject.com varnamproject source] for more information.&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Varnam programming API&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;C/C++&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
   #include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
   #include &amp;quot;varnam.h&amp;quot;&lt;br /&gt;
   int main(int argc, char **argv)&lt;br /&gt;
   {&lt;br /&gt;
     int rc, i;&lt;br /&gt;
     char *error;&lt;br /&gt;
     varnam *handle;&lt;br /&gt;
     varray *result;&lt;br /&gt;
     vword *word;&lt;br /&gt;
   &lt;br /&gt;
     rc = varnam_init(&amp;quot;/usr/local/share/varnam/vst/hi-unicode.vst&amp;quot;, &amp;amp;handle, &amp;amp;error);&lt;br /&gt;
     if (rc != VARNAM_SUCCESS)&lt;br /&gt;
     {&lt;br /&gt;
        printf (&amp;quot;Initialization failed. %s\n&amp;quot;, error);&lt;br /&gt;
        return 1;&lt;br /&gt;
     }&lt;br /&gt;
   &lt;br /&gt;
     rc = varnam_transliterate (handle, &amp;quot;malayalam&amp;quot;, &amp;amp;result);&lt;br /&gt;
     if (rc != VARNAM_SUCCESS)&lt;br /&gt;
     {&lt;br /&gt;
        printf (&amp;quot;Transliteration failed. %s\n&amp;quot;, varnam_get_last_error(handle));&lt;br /&gt;
        return 1;&lt;br /&gt;
     }&lt;br /&gt;
   &lt;br /&gt;
     for (i = 0; i &amp;lt; varray_length (result); i++)&lt;br /&gt;
     {&lt;br /&gt;
        word = varray_get (result, i);&lt;br /&gt;
        printf (&amp;quot;%d, %s\n&amp;quot;, word-&amp;gt;confidence, word-&amp;gt;text);&lt;br /&gt;
     }&lt;br /&gt;
   &lt;br /&gt;
     return 0;&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/libvarnam libvarnam] for C API source.&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;Java&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
   Varnam varnam = new Varnam(&amp;quot;/usr/local/share/varnam/vst/hi-unicode.vst&amp;quot;);&lt;br /&gt;
   varnam.enableSuggestions(&amp;quot;learnings.varnam.hi&amp;quot;);&lt;br /&gt;
   List&amp;lt;Word&amp;gt; words = varnam.transliterate(&amp;quot;hindi&amp;quot;);&lt;br /&gt;
   for (Word word : words) {&lt;br /&gt;
       System.out.println(word.getConfidence() + &amp;quot; - &amp;quot; + word.getText());&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/libvarnam-java libvarnam-java] for JAVA API source.&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;NodeJS&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
   var v = require(&#039;bindings&#039;)(&#039;varnam.node&#039;),&lt;br /&gt;
       file = &amp;quot;ml-unicode.vst&amp;quot;;&lt;br /&gt;
   &lt;br /&gt;
   var varnam = new v.Varnam(file, &amp;quot;learned&amp;quot;);&lt;br /&gt;
   &lt;br /&gt;
   for (var i = 0; i &amp;lt; 10; i++) {&lt;br /&gt;
       varnam.transliterate(&amp;quot;malayalam&amp;quot;, function(err, result) {&lt;br /&gt;
            console.log(result);&lt;br /&gt;
       });&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/libvarnam-nodejs libvarnam-nodejs] for NodeJS API source. The NodeJS binding uses an asynchronous API. &lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Future plans&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
* Adding support for more languages to &#039;&#039;libvarnam&#039;&#039;.&lt;br /&gt;
* IME for Linux &amp;amp; Windows.&lt;br /&gt;
* More programming language support.&lt;br /&gt;
* Improved performance for varnamproject.com&lt;br /&gt;
&lt;br /&gt;
If you are willing to contribute to the varnamproject, please write to &amp;quot;varnamproject@googlegroups.com&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Bugs and suggestions&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
Send your comments and suggestions to the e-mail address given below. &lt;br /&gt;
&lt;br /&gt;
varnamproject@googlegroups.com&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Copyright&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
The MIT License (MIT)&lt;br /&gt;
&lt;br /&gt;
Copyright (c) 2013 Navaneeth.K.N&lt;br /&gt;
&lt;br /&gt;
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &amp;quot;Software&amp;quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:&lt;br /&gt;
&lt;br /&gt;
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.&lt;br /&gt;
&lt;br /&gt;
THE SOFTWARE IS PROVIDED &amp;quot;AS IS&amp;quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4092</id>
		<title>വര്‍ണ്ണം</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4092"/>
		<updated>2013-06-22T05:10:18Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{prettyurl|varnam}}&lt;br /&gt;
===&#039;&#039;&#039;Varnam&#039;&#039;&#039; ===&lt;br /&gt;
&lt;br /&gt;
Varnam is a tool for writing Malayalam and other Indian languages. &lt;br /&gt;
&lt;br /&gt;
As with Swanalekha, the user types in Manglish in Varnam too. In tools that transliterate from Manglish, the word &amp;quot;മലയാളം&amp;quot; is written by typing &amp;quot;malayaaLam&amp;quot;. Varnam uses the same scheme, but you only need to type the word this way once. Once it has been typed this way, Varnam learns the word &amp;quot;മലയാളം&amp;quot; and all the patterns that can be used to write it. After that, typing &amp;quot;malayalam&amp;quot; is enough. The words Varnam learns are available to everyone who uses Varnam.&lt;br /&gt;
&lt;br /&gt;
More information is available at [http://www.varnamproject.com www.varnamproject.com].&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Installation&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
Varnam is available as an add-on for Firefox and Chrome.&lt;br /&gt;
&lt;br /&gt;
#Chrome [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Link]&lt;br /&gt;
#Firefox [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Link]&lt;br /&gt;
&lt;br /&gt;
After downloading, right-click the text box in which you want to write Malayalam and choose Malayalam from the varnam menu. Then simply type in Manglish. Varnam shows the Malayalam words in a suggestion list. With the add-on you can type Malayalam directly in Google chat, Facebook chat and similar places.&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/varnam-browser-addons varnam-browser-addons] for the source code.&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Examples&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Input before learning !! Input after learning !! Output&lt;br /&gt;
|-&lt;br /&gt;
| malayaaLam || malayalam || മലയാളം&lt;br /&gt;
|-&lt;br /&gt;
| paTTikkuka || padikkuka || പഠിക്കുക&lt;br /&gt;
|-&lt;br /&gt;
| vaikunnEraTH || vaikunnerath || വൈകുന്നേരത്ത്&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Varnam in your application&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
Varnam can be embedded in your application very easily. If your application is a stand-alone desktop application, you can link against [https://github.com/navaneeth/libvarnam libvarnam] directly. See the [https://github.com/navaneeth/libvarnam libvarnam] project for more information.&lt;br /&gt;
&lt;br /&gt;
Web applications can use the REST API. You can use Varnam in your application through http://www.varnamproject.com/tl?text=&amp;lt;text to transliterate&amp;gt;&amp;amp;lang=&amp;lt;lang-code&amp;gt;. For this you can either use varnamproject.com directly or install Varnam web on your own server and use that. Visit [https://github.com/navaneeth/varnamproject.com varnamproject source] for more information.&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Varnam programming API&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;C/C++&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
   #include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
   #include &amp;quot;varnam.h&amp;quot;&lt;br /&gt;
   int main(int argc, char **argv)&lt;br /&gt;
   {&lt;br /&gt;
     int rc, i;&lt;br /&gt;
     char *error;&lt;br /&gt;
     varnam *handle;&lt;br /&gt;
     varray *result;&lt;br /&gt;
     vword *word;&lt;br /&gt;
   &lt;br /&gt;
     rc = varnam_init(&amp;quot;/usr/local/share/varnam/vst/hi-unicode.vst&amp;quot;, &amp;amp;handle, &amp;amp;error);&lt;br /&gt;
     if (rc != VARNAM_SUCCESS)&lt;br /&gt;
     {&lt;br /&gt;
        printf (&amp;quot;Initialization failed. %s\n&amp;quot;, error);&lt;br /&gt;
        return 1;&lt;br /&gt;
     }&lt;br /&gt;
   &lt;br /&gt;
     rc = varnam_transliterate (handle, &amp;quot;malayalam&amp;quot;, &amp;amp;result);&lt;br /&gt;
     if (rc != VARNAM_SUCCESS)&lt;br /&gt;
     {&lt;br /&gt;
        printf (&amp;quot;Transliteration failed. %s\n&amp;quot;, varnam_get_last_error(handle));&lt;br /&gt;
        return 1;&lt;br /&gt;
     }&lt;br /&gt;
   &lt;br /&gt;
     for (i = 0; i &amp;lt; varray_length (result); i++)&lt;br /&gt;
     {&lt;br /&gt;
        word = varray_get (result, i);&lt;br /&gt;
        printf (&amp;quot;%d, %s\n&amp;quot;, word-&amp;gt;confidence, word-&amp;gt;text);&lt;br /&gt;
     }&lt;br /&gt;
   &lt;br /&gt;
     return 0;&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/libvarnam libvarnam] for C API source.&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;Java&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
   Varnam varnam = new Varnam(&amp;quot;/usr/local/share/varnam/vst/hi-unicode.vst&amp;quot;);&lt;br /&gt;
   varnam.enableSuggestions(&amp;quot;learnings.varnam.hi&amp;quot;);&lt;br /&gt;
   List&amp;lt;Word&amp;gt; words = varnam.transliterate(&amp;quot;hindi&amp;quot;);&lt;br /&gt;
   for (Word word : words) {&lt;br /&gt;
       System.out.println(word.getConfidence() + &amp;quot; - &amp;quot; + word.getText());&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/libvarnam-java libvarnam-java] for Java API source.&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;NodeJS&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
   var v = require(&#039;bindings&#039;)(&#039;varnam.node&#039;),&lt;br /&gt;
       file = &amp;quot;ml-unicode.vst&amp;quot;;&lt;br /&gt;
   &lt;br /&gt;
   var varnam = new v.Varnam(file, &amp;quot;learned&amp;quot;);&lt;br /&gt;
   &lt;br /&gt;
   for (var i = 0; i &amp;lt; 10; i++) {&lt;br /&gt;
       varnam.transliterate(&amp;quot;malayalam&amp;quot;, function(err, result) {&lt;br /&gt;
            console.log(result);&lt;br /&gt;
       });&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/libvarnam-nodejs libvarnam-nodejs] for NodeJS API source. The NodeJS binding uses an asynchronous API.&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Future plans&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
* Adding support for more languages to &#039;&#039;libvarnam&#039;&#039;.&lt;br /&gt;
* IME for Linux &amp;amp; Windows.&lt;br /&gt;
* More programming language support.&lt;br /&gt;
* Improved performance for varnamproject.com&lt;br /&gt;
&lt;br /&gt;
If you are willing to contribute to the varnamproject, please write to &amp;quot;varnamproject@googlegroups.com&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Bugs and suggestions&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
Send your comments and suggestions to the e-mail address given below.&lt;br /&gt;
&lt;br /&gt;
varnamproject@googlegroups.com&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Copyright&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
The MIT License (MIT)&lt;br /&gt;
&lt;br /&gt;
Copyright (c) 2013 Navaneeth.K.N&lt;br /&gt;
&lt;br /&gt;
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &amp;quot;Software&amp;quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:&lt;br /&gt;
&lt;br /&gt;
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.&lt;br /&gt;
&lt;br /&gt;
THE SOFTWARE IS PROVIDED &amp;quot;AS IS&amp;quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4091</id>
		<title>വര്‍ണ്ണം</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4091"/>
		<updated>2013-06-22T05:03:21Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{prettyurl|varnam}}&lt;br /&gt;
===&#039;&#039;&#039;Varnam&#039;&#039;&#039; ===&lt;br /&gt;
&lt;br /&gt;
Varnam is a tool for writing Malayalam and other Indian languages.&lt;br /&gt;
&lt;br /&gt;
As with Swanalekha, the user types in Manglish. In tools that transliterate from Manglish, the word &amp;quot;മലയാളം&amp;quot; has to be typed as &amp;quot;malayaaLam&amp;quot;. Varnam uses the same scheme, but you need to type the word this way only once. When you do, Varnam learns the word &amp;quot;മലയാളം&amp;quot; along with all the patterns that can be used to write it. After that, typing &amp;quot;malayalam&amp;quot; is enough. The words Varnam learns are available to everyone who uses Varnam.&lt;br /&gt;
&lt;br /&gt;
More information is available at [http://www.varnamproject.com www.varnamproject.com].&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Installation&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
Varnam is available as an add-on for Firefox and Chrome.&lt;br /&gt;
&lt;br /&gt;
#Chrome [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Link]&lt;br /&gt;
#Firefox [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Link]&lt;br /&gt;
&lt;br /&gt;
After downloading, right-click the textbox where you want to write Malayalam and choose Malayalam from the varnam menu. Then simply type in Manglish. Varnam shows the matching Malayalam words in a suggestion list. With the add-on, you can write Malayalam directly in Google chat, Facebook chat and similar places.&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/varnam-browser-addons varnam-browser-addons] for the source code.&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Examples&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Input before learning !! Input after learning !! Output&lt;br /&gt;
|-&lt;br /&gt;
| malayaaLam || malayalam || മലയാളം&lt;br /&gt;
|-&lt;br /&gt;
| paTTikkuka || padikkuka || പഠിക്കുക&lt;br /&gt;
|-&lt;br /&gt;
| vaikunnEraTH || vaikunnerath || വൈകുന്നേരത്ത്&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Varnam in your application&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
Varnam can be embedded in your application very easily. If your application is a stand-alone desktop application, it can link directly against [https://github.com/navaneeth/libvarnam libvarnam]. For more information, see the [https://github.com/navaneeth/libvarnam libvarnam] project.&lt;br /&gt;
&lt;br /&gt;
Web applications can use the REST API through the URL http://www.varnamproject.com/tl?text=&amp;lt;text to transliterate&amp;gt;&amp;amp;lang=&amp;lt;lang-code&amp;gt;. You can either use varnamproject.com directly or install Varnam web on your own server and use that. For more information, visit [https://github.com/navaneeth/varnamproject.com varnamproject source].&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Varnam programming API&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;C/C++&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
   #include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;
   #include &amp;quot;varnam.h&amp;quot;&lt;br /&gt;
   int main(int argc, char **argv)&lt;br /&gt;
   {&lt;br /&gt;
     int rc, i;&lt;br /&gt;
     char *error;&lt;br /&gt;
     varnam *handle;&lt;br /&gt;
     varray *result;&lt;br /&gt;
     vword *word;&lt;br /&gt;
   &lt;br /&gt;
     rc = varnam_init(&amp;quot;/usr/local/share/varnam/vst/hi-unicode.vst&amp;quot;, &amp;amp;handle, &amp;amp;error);&lt;br /&gt;
     if (rc != VARNAM_SUCCESS)&lt;br /&gt;
     {&lt;br /&gt;
        printf (&amp;quot;Initialization failed. %s\n&amp;quot;, error);&lt;br /&gt;
        return 1;&lt;br /&gt;
     }&lt;br /&gt;
   &lt;br /&gt;
     rc = varnam_transliterate (handle, &amp;quot;malayalam&amp;quot;, &amp;amp;result);&lt;br /&gt;
     if (rc != VARNAM_SUCCESS)&lt;br /&gt;
     {&lt;br /&gt;
        printf (&amp;quot;Transliteration failed. %s\n&amp;quot;, varnam_get_last_error(handle));&lt;br /&gt;
        return 1;&lt;br /&gt;
     }&lt;br /&gt;
   &lt;br /&gt;
     for (i = 0; i &amp;lt; varray_length (result); i++)&lt;br /&gt;
     {&lt;br /&gt;
        word = varray_get (result, i);&lt;br /&gt;
        printf (&amp;quot;%d, %s\n&amp;quot;, word-&amp;gt;confidence, word-&amp;gt;text);&lt;br /&gt;
     }&lt;br /&gt;
   &lt;br /&gt;
     return 0;&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/libvarnam libvarnam] for C API source.&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;Java&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
   Varnam varnam = new Varnam(&amp;quot;/usr/local/share/varnam/vst/hi-unicode.vst&amp;quot;);&lt;br /&gt;
   varnam.enableSuggestions(&amp;quot;learnings.varnam.hi&amp;quot;);&lt;br /&gt;
   List&amp;lt;Word&amp;gt; words = varnam.transliterate(&amp;quot;hindi&amp;quot;);&lt;br /&gt;
   for (Word word : words) {&lt;br /&gt;
       System.out.println(word.getConfidence() + &amp;quot; - &amp;quot; + word.getText());&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/libvarnam-java libvarnam-java] for Java API source.&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;NodeJS&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
   var v = require(&#039;bindings&#039;)(&#039;varnam.node&#039;),&lt;br /&gt;
       file = &amp;quot;ml-unicode.vst&amp;quot;;&lt;br /&gt;
   &lt;br /&gt;
   var varnam = new v.Varnam(file, &amp;quot;learned&amp;quot;);&lt;br /&gt;
   &lt;br /&gt;
   for (var i = 0; i &amp;lt; 10; i++) {&lt;br /&gt;
       varnam.transliterate(&amp;quot;malayalam&amp;quot;, function(err, result) {&lt;br /&gt;
            console.log(result);&lt;br /&gt;
       });&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
See [https://github.com/navaneeth/libvarnam-nodejs libvarnam-nodejs] for NodeJS API source.&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Bugs and suggestions&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
Send your comments and suggestions to the e-mail address given below.&lt;br /&gt;
&lt;br /&gt;
varnamproject@googlegroups.com&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Copyright&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
The MIT License (MIT)&lt;br /&gt;
&lt;br /&gt;
Copyright (c) 2013 Navaneeth.K.N&lt;br /&gt;
&lt;br /&gt;
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &amp;quot;Software&amp;quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:&lt;br /&gt;
&lt;br /&gt;
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.&lt;br /&gt;
&lt;br /&gt;
THE SOFTWARE IS PROVIDED &amp;quot;AS IS&amp;quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4090</id>
		<title>വര്‍ണ്ണം</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4090"/>
		<updated>2013-06-22T04:44:01Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{prettyurl|varnam}}&lt;br /&gt;
===&#039;&#039;&#039;Varnam&#039;&#039;&#039; ===&lt;br /&gt;
&lt;br /&gt;
Varnam is a tool for writing Malayalam and other Indian languages.&lt;br /&gt;
&lt;br /&gt;
As with Swanalekha, the user types in Manglish. In tools that transliterate from Manglish, the word &amp;quot;മലയാളം&amp;quot; has to be typed as &amp;quot;malayaaLam&amp;quot;. Varnam uses the same scheme, but you need to type the word this way only once. When you do, Varnam learns the word &amp;quot;മലയാളം&amp;quot; along with all the patterns that can be used to write it. After that, typing &amp;quot;malayalam&amp;quot; is enough. The words Varnam learns are available to everyone who uses Varnam.&lt;br /&gt;
&lt;br /&gt;
More information is available at [http://www.varnamproject.com www.varnamproject.com].&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Installation&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
Varnam is available as an add-on for Firefox and Chrome.&lt;br /&gt;
&lt;br /&gt;
#Chrome [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Link]&lt;br /&gt;
#Firefox [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Link]&lt;br /&gt;
&lt;br /&gt;
After downloading, right-click the textbox where you want to write Malayalam and choose Malayalam from the varnam menu. Then simply type in Manglish. Varnam shows the matching Malayalam words in a suggestion list. With the add-on, you can write Malayalam directly in Google chat, Facebook chat and similar places.&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Examples&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Input before learning !! Input after learning !! Output&lt;br /&gt;
|-&lt;br /&gt;
| malayaaLam || malayalam || മലയാളം&lt;br /&gt;
|-&lt;br /&gt;
| paTTikkuka || padikkuka || പഠിക്കുക&lt;br /&gt;
|-&lt;br /&gt;
| vaikunnEraTH || vaikunnerath || വൈകുന്നേരത്ത്&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Varnam in your application&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
Varnam can be embedded in your application very easily. If your application is a stand-alone desktop application, it can link directly against [https://github.com/navaneeth/libvarnam libvarnam]. For more information, see the [https://github.com/navaneeth/libvarnam libvarnam] project.&lt;br /&gt;
&lt;br /&gt;
Web applications can use the REST API through the URL http://www.varnamproject.com/tl?text=&amp;lt;text to transliterate&amp;gt;&amp;amp;lang=&amp;lt;lang-code&amp;gt;. You can either use varnamproject.com directly or install Varnam web on your own server and use that. For more information, visit [https://github.com/navaneeth/varnamproject.com varnamproject source].&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Varnam programming API&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Java&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
   Varnam varnam = new Varnam(&amp;quot;/usr/local/share/varnam/vst/hi-unicode.vst&amp;quot;);&lt;br /&gt;
   varnam.enableSuggestions(&amp;quot;learnings.varnam.hi&amp;quot;);&lt;br /&gt;
   List&amp;lt;Word&amp;gt; words = varnam.transliterate(&amp;quot;hindi&amp;quot;);&lt;br /&gt;
   for (Word word : words) {&lt;br /&gt;
       System.out.println(word.getConfidence() + &amp;quot; - &amp;quot; + word.getText());&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;NodeJS&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
   var v = require(&#039;bindings&#039;)(&#039;varnam.node&#039;),&lt;br /&gt;
       file = &amp;quot;ml-unicode.vst&amp;quot;;&lt;br /&gt;
   &lt;br /&gt;
   var varnam = new v.Varnam(file, &amp;quot;learned&amp;quot;);&lt;br /&gt;
   &lt;br /&gt;
   for (var i = 0; i &amp;lt; 10; i++) {&lt;br /&gt;
       varnam.transliterate(&amp;quot;malayalam&amp;quot;, function(err, result) {&lt;br /&gt;
            console.log(result);&lt;br /&gt;
       });&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Bugs and suggestions&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
Send your comments and suggestions to the e-mail address given below.&lt;br /&gt;
&lt;br /&gt;
varnamproject@googlegroups.com&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Copyright&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
The MIT License (MIT)&lt;br /&gt;
&lt;br /&gt;
Copyright (c) 2013 Navaneeth.K.N&lt;br /&gt;
&lt;br /&gt;
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &amp;quot;Software&amp;quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:&lt;br /&gt;
&lt;br /&gt;
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.&lt;br /&gt;
&lt;br /&gt;
THE SOFTWARE IS PROVIDED &amp;quot;AS IS&amp;quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4089</id>
		<title>വര്‍ണ്ണം</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4089"/>
		<updated>2013-06-22T04:32:18Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: API usage&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{prettyurl|varnam}}&lt;br /&gt;
===&#039;&#039;&#039;Varnam&#039;&#039;&#039; ===&lt;br /&gt;
&lt;br /&gt;
Varnam is a tool for writing Malayalam and other Indian languages.&lt;br /&gt;
&lt;br /&gt;
As with Swanalekha, the user types in Manglish. In tools that transliterate from Manglish, the word &amp;quot;മലയാളം&amp;quot; has to be typed as &amp;quot;malayaaLam&amp;quot;. Varnam uses the same scheme, but you need to type the word this way only once. When you do, Varnam learns the word &amp;quot;മലയാളം&amp;quot; along with all the patterns that can be used to write it. After that, typing &amp;quot;malayalam&amp;quot; is enough. The words Varnam learns are available to everyone who uses Varnam.&lt;br /&gt;
&lt;br /&gt;
More information is available at [http://www.varnamproject.com www.varnamproject.com].&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;&#039;Installation&#039;&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
Varnam is available as an add-on for Firefox and Chrome.&lt;br /&gt;
&lt;br /&gt;
#Chrome [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Link]&lt;br /&gt;
#Firefox [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Link]&lt;br /&gt;
&lt;br /&gt;
After downloading, right-click the textbox where you want to write Malayalam and choose Malayalam from the varnam menu. Then simply type in Manglish. Varnam shows the matching Malayalam words in a suggestion list. With the add-on, you can write Malayalam directly in Google chat, Facebook chat and similar places.&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;&#039;Examples&#039;&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Input before learning !! Input after learning !! Output&lt;br /&gt;
|-&lt;br /&gt;
| malayaaLam || malayalam || മലയാളം&lt;br /&gt;
|-&lt;br /&gt;
| paTTikkuka || padikkuka || പഠിക്കുക&lt;br /&gt;
|-&lt;br /&gt;
| vaikunnEraTH || vaikunnerath || വൈകുന്നേരത്ത്&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;&#039;Varnam in your application&#039;&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
Varnam can be embedded in your application very easily. If your application is a stand-alone desktop application, it can link directly against [https://github.com/navaneeth/libvarnam libvarnam]. For more information, see the [https://github.com/navaneeth/libvarnam libvarnam] project.&lt;br /&gt;
&lt;br /&gt;
Web applications can use the REST API through the URL http://www.varnamproject.com/tl?text=&amp;lt;text to transliterate&amp;gt;&amp;amp;lang=&amp;lt;lang-code&amp;gt;. You can either use varnamproject.com directly or install Varnam web on your own server and use that. For more information, visit [https://github.com/navaneeth/varnamproject.com varnamproject source].&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;&#039;Varnam programming API&#039;&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;Java&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
   Varnam varnam = new Varnam(&amp;quot;/usr/local/share/varnam/vst/hi-unicode.vst&amp;quot;);&lt;br /&gt;
   varnam.enableSuggestions(&amp;quot;learnings.varnam.hi&amp;quot;);&lt;br /&gt;
   List&amp;lt;Word&amp;gt; words = varnam.transliterate(&amp;quot;hindi&amp;quot;);&lt;br /&gt;
   for (Word word : words) {&lt;br /&gt;
       System.out.println(word.getConfidence() + &amp;quot; - &amp;quot; + word.getText());&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
==&#039;&#039;NodeJS&#039;&#039;==&lt;br /&gt;
&lt;br /&gt;
   var v = require(&#039;bindings&#039;)(&#039;varnam.node&#039;),&lt;br /&gt;
       file = &amp;quot;ml-unicode.vst&amp;quot;;&lt;br /&gt;
&lt;br /&gt;
   var varnam = new v.Varnam(file, &amp;quot;learned&amp;quot;);&lt;br /&gt;
&lt;br /&gt;
   for (var i = 0; i &amp;lt; 10; i++) {&lt;br /&gt;
       varnam.transliterate(&amp;quot;malayalam&amp;quot;, function(err, result) {&lt;br /&gt;
            console.log(result);&lt;br /&gt;
       });&lt;br /&gt;
   }&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;&#039;Bugs and suggestions&#039;&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
Send your comments and suggestions to the e-mail address given below.&lt;br /&gt;
&lt;br /&gt;
varnamproject@googlegroups.com&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;&#039;Copyright&#039;&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
The MIT License (MIT)&lt;br /&gt;
&lt;br /&gt;
Copyright (c) 2013 Navaneeth.K.N&lt;br /&gt;
&lt;br /&gt;
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &amp;quot;Software&amp;quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:&lt;br /&gt;
&lt;br /&gt;
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.&lt;br /&gt;
&lt;br /&gt;
THE SOFTWARE IS PROVIDED &amp;quot;AS IS&amp;quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4088</id>
		<title>വര്‍ണ്ണം</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4088"/>
		<updated>2013-06-22T04:14:57Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: /* വർണ്ണം */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{prettyurl|varnam}}&lt;br /&gt;
===&#039;&#039;&#039;Varnam&#039;&#039;&#039; ===&lt;br /&gt;
&lt;br /&gt;
Varnam is a tool for writing Malayalam and other Indian languages.&lt;br /&gt;
&lt;br /&gt;
As with Swanalekha, the user types in Manglish. In tools that transliterate from Manglish, the word &amp;quot;മലയാളം&amp;quot; has to be typed as &amp;quot;malayaaLam&amp;quot;. Varnam uses the same scheme, but you need to type the word this way only once. When you do, Varnam learns the word &amp;quot;മലയാളം&amp;quot; along with all the patterns that can be used to write it. After that, typing &amp;quot;malayalam&amp;quot; is enough. The words Varnam learns are available to everyone who uses Varnam.&lt;br /&gt;
&lt;br /&gt;
More information is available at [http://www.varnamproject.com www.varnamproject.com].&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;&#039;Installation&#039;&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
Varnam is available as an add-on for Firefox and Chrome.&lt;br /&gt;
&lt;br /&gt;
#Chrome [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Link]&lt;br /&gt;
#Firefox [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Link]&lt;br /&gt;
&lt;br /&gt;
After downloading, right-click the textbox where you want to write Malayalam and choose Malayalam from the varnam menu. Then simply type in Manglish. Varnam shows the matching Malayalam words in a suggestion list. With the add-on, you can write Malayalam directly in Google chat, Facebook chat and similar places.&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;&#039;Examples&#039;&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Input before learning !! Input after learning !! Output&lt;br /&gt;
|-&lt;br /&gt;
| malayaaLam || malayalam || മലയാളം&lt;br /&gt;
|-&lt;br /&gt;
| paTTikkuka || padikkuka || പഠിക്കുക&lt;br /&gt;
|-&lt;br /&gt;
| vaikunnEraTH || vaikunnerath || വൈകുന്നേരത്ത്&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;&#039;Bugs and suggestions&#039;&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
Send your comments and suggestions to the e-mail address given below.&lt;br /&gt;
&lt;br /&gt;
varnamproject@googlegroups.com&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;&#039;Copyright&#039;&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
The MIT License (MIT)&lt;br /&gt;
&lt;br /&gt;
Copyright (c) 2013 Navaneeth.K.N&lt;br /&gt;
&lt;br /&gt;
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &amp;quot;Software&amp;quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:&lt;br /&gt;
&lt;br /&gt;
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.&lt;br /&gt;
&lt;br /&gt;
THE SOFTWARE IS PROVIDED &amp;quot;AS IS&amp;quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4087</id>
		<title>വര്‍ണ്ണം</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4087"/>
		<updated>2013-06-22T04:12:17Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: Added examples&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{prettyurl|varnam}}&lt;br /&gt;
===&#039;&#039;&#039;Varnam&#039;&#039;&#039; ===&lt;br /&gt;
&lt;br /&gt;
Varnam is a tool for writing Malayalam and other Indian languages.&lt;br /&gt;
&lt;br /&gt;
As with Swanalekha, the user types in Manglish. In tools that transliterate from Manglish, the word &amp;quot;മലയാളം&amp;quot; has to be typed as &amp;quot;malayaaLam&amp;quot;. Varnam uses the same scheme, but you need to type the word this way only once. When you type it phonetically like this once, Varnam learns the word &amp;quot;മലയാളം&amp;quot; along with all the patterns that can be used to write it. After that, typing &amp;quot;malayalam&amp;quot; is enough. The words Varnam learns are available to everyone who uses Varnam.&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;&#039;Installation&#039;&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
Varnam is available as an add-on for Firefox and Chrome.&lt;br /&gt;
&lt;br /&gt;
#Chrome [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Link]&lt;br /&gt;
#Firefox [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Link]&lt;br /&gt;
&lt;br /&gt;
After downloading, right-click the textbox where you want to write Malayalam and choose Malayalam from the varnam menu. Then simply type in Manglish. Varnam shows the matching Malayalam words in a suggestion list. With the add-on, you can write Malayalam directly in Google chat, Facebook chat and similar places.&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;&#039;Examples&#039;&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Input before learning !! Input after learning !! Output&lt;br /&gt;
|-&lt;br /&gt;
| malayaaLam || malayalam || മലയാളം&lt;br /&gt;
|-&lt;br /&gt;
| paTTikkuka || padikkuka || പഠിക്കുക&lt;br /&gt;
|-&lt;br /&gt;
| vaikunnEraTH || vaikunnerath || വൈകുന്നേരത്ത്&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;&#039;Bugs and suggestions&#039;&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
Send your comments and suggestions to the e-mail address given below.&lt;br /&gt;
&lt;br /&gt;
varnamproject@googlegroups.com&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;&#039;Copyright&#039;&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
The MIT License (MIT)&lt;br /&gt;
&lt;br /&gt;
Copyright (c) 2013 Navaneeth.K.N&lt;br /&gt;
&lt;br /&gt;
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the &amp;quot;Software&amp;quot;), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:&lt;br /&gt;
&lt;br /&gt;
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.&lt;br /&gt;
&lt;br /&gt;
THE SOFTWARE IS PROVIDED &amp;quot;AS IS&amp;quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
	<entry>
		<id>https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4086</id>
		<title>വര്‍ണ്ണം</title>
		<link rel="alternate" type="text/html" href="https://wiki.smc.org.in/index.php?title=%E0%B4%B5%E0%B4%B0%E0%B5%8D%E2%80%8D%E0%B4%A3%E0%B5%8D%E0%B4%A3%E0%B4%82&amp;diff=4086"/>
		<updated>2013-06-22T03:45:09Z</updated>

		<summary type="html">&lt;p&gt;Navaneethkn: First version&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{prettyurl|varnam}}&lt;br /&gt;
===&#039;&#039;&#039;Varnam&#039;&#039;&#039; ===&lt;br /&gt;
&lt;br /&gt;
Varnam is a tool for writing Malayalam and other Indian languages.&lt;br /&gt;
&lt;br /&gt;
As with Swanalekha, Varnam users type in Manglish (Malayalam written in Latin letters). In tools that transliterate from Manglish, you type &amp;quot;malayaaLam&amp;quot; to write the word &amp;quot;മലയാളം&amp;quot;. Varnam uses the same scheme, but you only need to type a word this way once. The next time, typing &amp;quot;malayalam&amp;quot; is enough.&lt;br /&gt;
&lt;br /&gt;
===&#039;&#039;&#039;Installation&#039;&#039;&#039;===&lt;br /&gt;
&lt;br /&gt;
Varnam is available as an add-on for Firefox and Chrome.&lt;br /&gt;
&lt;br /&gt;
Chrome [https://chrome.google.com/webstore/detail/varnam-ime/abcfkeabpcanobhdmcmdabejaamephaf Link]
Firefox [https://addons.mozilla.org/en-US/firefox/addon/varnam-transliteration-base/ Link]
&lt;br /&gt;
After downloading the add-on, right-click the text box where you want to write Malayalam and choose Malayalam from the varnam menu. Then type in Manglish. Varnam shows the matching Malayalam words in a suggestion list. With the add-on you can write Malayalam directly in Google chat, Facebook chat and similar pages.&lt;br /&gt;
&lt;br /&gt;
Examples&lt;/div&gt;</summary>
		<author><name>Navaneethkn</name></author>
	</entry>
</feed>