User:Vidya

Personal Information

Email Address: vidya.vnv@gmail.com

Blog URL: http://www.quieterminal.wordpress.com

Freenode IRC Nick: wannaC

Your university and current education: Netaji Subhas Institute of Technology, Senior Year, B.E. in Information Technology

Association with SMC

Why do you want to work with the Swathanthra Malayalam Computing? I come from a Malayali family but live far away (in Delhi), where I get less exposure to my culture, so it is important for me to find ways to stay connected with it. SMC gives me a platform where I can enhance my skills while getting better acquainted with Malayalam, my native language. It also gives me an opportunity to work in open source with an incentive to learn more about the language. Finally, I would like to be associated with an Indian organisation that collaborates with the government, as SMC does.

Do you have any past involvement with the Swathanthra Malayalam Computing or another open source project as a contributor? No, this is my first attempt.

Did you participate in past GSoC programs? If so, which years and which organisations? No.

Do you have other obligations between May and August? Please note that we expect the Summer of Code to be a full-time, 40-hour-a-week commitment. I will have final-year examinations in May, but I know how to prioritise my work and will give 100% to the project.

Will you continue contributing to or supporting the Swathanthra Malayalam Computing after the GSoC 2014 program? If yes, which area(s) are you interested in? Yes, mainly in the Transliteration module.

Why should we choose you over other applicants? I have the necessary programming skills and I am dedicated to the tasks I commit to. I also believe in teamwork and want to work on an open source project. I am really excited to work on a widely used free software project.

Project Proposal

OVERVIEW: Converting the Indic processing modules currently in SILPA into a JavaScript module library involves porting all of the SILPA modules written in Python to JavaScript.
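As a rough illustration of what one ported module might look like, here is a minimal sketch of a table-driven Malayalam-to-Latin transliteration wrapped in a self-contained JavaScript module. The module name `Transliteration`, the function `toLatin`, and the tiny mapping tables are all assumptions made for this example; they are not SILPA's actual API, data, or the module pattern proposed by the project.

```javascript
// Hypothetical sketch only: names and tables are illustrative, not SILPA's.
var Transliteration = (function () {
  'use strict';

  // Very small, illustrative subset of a Malayalam-to-Latin table (assumption).
  var CONSONANTS = {
    '\u0D15': 'k', // ക
    '\u0D2E': 'm', // മ
    '\u0D32': 'l', // ല
    '\u0D30': 'r', // ര
    '\u0D28': 'n'  // ന
  };
  var VOWELS = {
    '\u0D05': 'a',  // അ
    '\u0D06': 'aa', // ആ
    '\u0D07': 'i'   // ഇ
  };
  var VOWEL_SIGNS = {
    '\u0D3E': 'aa', // ാ
    '\u0D3F': 'i',  // ി
    '\u0D41': 'u'   // ു
  };
  var VIRAMA = '\u0D4D'; // ് removes the inherent vowel of a consonant

  function has(table, key) {
    return Object.prototype.hasOwnProperty.call(table, key);
  }

  // Transliterate text one code point at a time; unknown characters pass through.
  function toLatin(text) {
    var chars = Array.from(text);
    var out = '';
    for (var i = 0; i < chars.length; i++) {
      var ch = chars[i];
      if (has(CONSONANTS, ch)) {
        out += CONSONANTS[ch];
        var next = chars[i + 1];
        if (next === VIRAMA) {
          i++;                      // consonant without a vowel
        } else if (has(VOWEL_SIGNS, next)) {
          out += VOWEL_SIGNS[next]; // explicit dependent vowel
          i++;
        } else {
          out += 'a';               // inherent vowel
        }
      } else if (has(VOWELS, ch)) {
        out += VOWELS[ch];
      } else {
        out += ch;
      }
    }
    return out;
  }

  return { toLatin: toLatin };
}());

// Export for Node.js when available; in a browser the global variable is used.
if (typeof module !== 'undefined' && module.exports) {
  module.exports = Transliteration;
}
```

A caller would use it as `Transliteration.toLatin('\u0D15\u0D32')`. A real port would need the complete character tables and the language-specific rules from the existing Python module, which this sketch does not attempt to reproduce.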

PURPOSE OF THE PROJECT


IMPLEMENTATION

RELEVANT EXPERIENCE: I am proficient in Python. I have written scripts that use several APIs to extract large amounts of data; these scripts are being used by the organisation (InfoAssembly) where I interned. My projects include sentiment analysis of IMDB movie reviews using Python's NLTK library, conversion of dates into ISO date format, and scraping 150 websites using Scrapy. I have worked with Node.js and JavaScript to build a Chrome extension. Among databases, I have experience with MongoDB and MySQL. I also have experience with software such as WEKA, MATLAB and Octave. I used to program in C/C++ but have lately adopted Python.

Tentative Timeline

21st March - 4th April Study the proposed JavaScript module pattern thoroughly and get familiar with all the modules.

5th April - 20th April Build a JavaScript prototype for the Transliteration module.


Community Bonding Period

21st April - 1st May Discuss the modules to be ported and brainstorm about the algorithms that could be used. Refine the objectives of the proposal.

2nd May - 18th May Get the JavaScript port of the Transliteration module reviewed. Test it and discuss other techniques which could be used.

19th May - 31st May Improve the ported Transliteration module. Prepare modules for ApproxSearch, SpellChecker and Payyans.

31st May - 23rd June Extensive testing; discuss the improvements required.


27th June - 12th July Prepare module for Soundex.

12th July - 5th August Prepare modules for Chardetails, TextSimilarity, Silpa Sort and Indic Stemmer.

5th August - 10th August Finish porting all modules and perform unit testing.

11th August - 18th August (Pencils Down) Improve documentation and write tests for each module.

Other Relevant Information

Tell us about something you have created. I built a Chrome extension to extract data from websites, such as hyperlinks, "About us" information, products, investor relations and other business-related information. All of this was stored in a database so that the information can be accessed at a later stage. I developed a recommendation engine for a startup using PHP and MySQL, based on item-to-item collaborative filtering. I analysed IMDB movie reviews using sentiment analysis with Python's NLTK library, reaching an accuracy of 89%.

Have you communicated with a potential mentor? If so, who? Yes. I communicated with Santosh.

SMC Wiki link: http://wiki.smc.org.in/User:Vidya