User:Abhineet

Personal Information
Email Address: agarwal.abhi93@gmail.com
Telephone: +919966551158
Freenode IRC Nick: Abhineet
GitHub: abhineet08
University and current education: 4th year, B.Tech [Computer Science & Engineering] and M.S [Computational Natural Sciences], IIIT Hyderabad, Hyderabad, India.

Why do you want to work with the Swathanthra Malayalam Computing? I have always wanted to contribute to and improve open source applications, and SMC’s projects give me a platform to do so. A GSoC mentoring organization focused on Indian languages is an even more compelling cause.

Do you have any past involvement with the Swathanthra Malayalam Computing or another open source project as a contributor? I have recently been working with Diaspora’s codebase and added support for a wide range of emoticons (emoji) to Diaspora. (Interestingly, Diaspora, despite being a social networking platform, had no emoticon support until now.)

Do you have other obligations between May and August? I have no obligations between May and August and can work 40 hours a week with full commitment.

Will you continue contributing to/supporting the Swathanthra Malayalam Computing after the GSoC 2014 program? If yes, which area(s) are you interested in? Yes, I will continue to contribute to projects that focus on cross-language experimentation and use technologies I am familiar with (RoR, C, Python, Java, JavaScript, etc.).

Why should we choose you over other applicants? I have decent experience with RoR. As a summer intern at Groupon, I wrote the backend script Nightcrawler (which performs inserts, upserts, and updates on required fields and imports data from MySQL to MongoDB) and was the UI developer for the HAWK tool (used for curating deals and plotting with the Highcharts API). I did my B.Tech project at LTRC (Linguistics Centre) at IIIT Hyderabad. The project customizes an NLIDB (natural language interface to databases) system to work with the Hindi language (a cross-language experiment). Since this proposal sits at the intersection of RoR and applied linguistics, I think I’ll be a good fit.

Proposal Overview
Project Summary: Diaspora is a social networking platform. It has features that enable users to follow their interests, but it has no language filter, i.e. users cannot post in languages that are recognized and tagged accordingly. A user who knows only certain languages will also not want posts in other languages to show up on his wall. This project is about adding this feature and allowing users to view translated posts.

Cause it fulfills: Social networking platforms are built for a user’s active involvement in the language he/she is comfortable with. Language should not be a barrier on such platforms. Users who want to understand posts in other languages better should be able to read them in the language they are used to. This project fulfills both needs.

Implementation Overview
Language Tagging: All posts and comments in Diaspora will be tagged automatically with their language, and the user will be able to change the tag manually. While posting, the user will be offered a “Tag Automatically” option, which is the ideal case; otherwise the user will choose the language with a hashtag (with auto-complete, as on the new user page). Comments will be tagged automatically.

We will use the “whatlanguage” gem for automatic language detection. The gem performs statistical language identification: it analyzes a post or comment and scores it against each candidate language, so it should be fast enough for this purpose.
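To illustrate the scoring idea, here is a minimal sketch in plain Ruby. It is not the gem's implementation: the word lists are tiny and hypothetical, and the names `COMMON_WORDS`, `language_scores` and `detect_language` are made up for this example. It only shows how a text can be scored against several languages and the best match picked.

```ruby
# Hypothetical, tiny word lists; a real detector would use far larger ones.
COMMON_WORDS = {
  english: %w[the and is of to in that it],
  french:  %w[le la et est de un une que],
  german:  %w[der die und ist das ein nicht zu]
}.freeze

# Count, for each language, how many words of the text appear in its list.
def language_scores(text)
  words = text.downcase.scan(/\p{L}+/)
  COMMON_WORDS.each_with_object({}) do |(language, list), scores|
    scores[language] = words.count { |word| list.include?(word) }
  end
end

# Return the best-scoring language, or nil when no word matched at all.
def detect_language(text)
  language, score = language_scores(text).max_by { |_, count| count }
  score.zero? ? nil : language
end

puts detect_language("The cat is in the garden") # prints "english"
```

Returning nil for text with no known words leaves room for the manual tag to take over when detection is unsure.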

The “acts-as-taggable-on” gem will be used to tag each post with the detected language or with the language specified by the user. Diaspora already uses this gem, so this follows the existing convention.
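The choice between the user's tag and the detected one could look roughly like the sketch below. It is plain Ruby and does not call the tagging gem; `SUPPORTED_LANGUAGES`, `manual_language_tag` and `language_tag` are hypothetical names, and the list of languages is only an example.

```ruby
# Hypothetical set of languages the site accepts as tags.
SUPPORTED_LANGUAGES = %w[english hindi malayalam].freeze

# Return the first hashtag in the text that names a supported language,
# or nil when the user did not tag a language.
def manual_language_tag(text)
  text.scan(/#(\p{L}+)/).flatten.map(&:downcase)
      .find { |tag| SUPPORTED_LANGUAGES.include?(tag) }
end

# Prefer the user's hashtag; fall back to the detected language.
def language_tag(text, detected)
  manual_language_tag(text) || detected
end

puts language_tag("Namaste #Hindi #fun", "english") # prints "hindi"
```

The resulting string would then be stored as the post's language tag through the tagging gem.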

Show posts in preferred languages: While signing up for a new account, the user chooses his preferred languages by entering them into a box provided in addition to the existing “What are you into” box. This box will also support auto-complete.

In addition, the user can add or remove preferred languages in the “Account Settings” section, where a new field labelled “Preferred Languages” will be added.

With these settings, the user’s stream will by default show only posts in the languages listed in his preferred language settings.
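A minimal sketch of this filter, assuming each post carries a single language tag. `Post` here is a hypothetical stand-in struct, not Diaspora's model, and treating an empty preference list as "show everything" is an assumption of this example.

```ruby
# Hypothetical minimal post record; Diaspora's real Post model is richer.
Post = Struct.new(:text, :language)

# Keep only posts whose language tag is among the user's preferred languages.
# Assumption: an empty preference list means "no filter", so all posts stay.
def filter_stream(posts, preferred_languages)
  return posts if preferred_languages.empty?
  posts.select { |post| preferred_languages.include?(post.language) }
end

stream = [
  Post.new("Hello", "english"),
  Post.new("Namaste", "hindi"),
  Post.new("Bonjour", "french")
]
puts filter_stream(stream, ["hindi"]).map(&:text).inspect # prints ["Namaste"]
```

In the real application the same selection would more likely be done in the database query that builds the stream rather than in memory.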

Translate: Because setting up our own translation system is complex, we can use the Google Translate or Bing API for translation. Since the Google Translate API is no longer freely available, we prefer the Bing API, which lets us translate 1 million characters in total, after which we would need to wait a month before translating more.
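One way to stay within that allowance is to count characters before sending anything for translation. The sketch below is only an illustration: `TranslationQuota` is a hypothetical class, it makes no API calls, and how the translation service itself counts characters is an assumption.

```ruby
# Tracks characters sent for translation against a monthly allowance.
class TranslationQuota
  attr_reader :used

  def initialize(limit = 1_000_000)
    @limit = limit
    @used = 0
  end

  # True when the text still fits into what is left of the allowance.
  def allows?(text)
    @used + text.length <= @limit
  end

  # Record the text as translated; returns false and records nothing
  # when it would exceed the allowance.
  def consume(text)
    return false unless allows?(text)
    @used += text.length
    true
  end

  # Called when a new month begins.
  def reset
    @used = 0
  end
end

quota = TranslationQuota.new
puts quota.consume("Hello, world") # prints "true"
```

When `consume` returns false, the interface could hide the translate option until the counter is reset.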

Note: In case any of these steps turns out to be resource-intensive and hurts performance, we will move it into background jobs scheduled through Redis.
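The idea of moving slow work out of the request can be sketched with an in-process queue. This is only a stand-in: `BackgroundJobs` is a hypothetical class, and a real setup would push jobs to Redis and run them in a separate worker process.

```ruby
require 'thread'

# In-process stand-in for a Redis-backed job queue (assumption: the real
# setup would store jobs in Redis and run them in a separate worker process).
class BackgroundJobs
  def initialize
    @queue = Queue.new
    @results = Queue.new
    @worker = Thread.new do
      # Run jobs one by one until the :stop marker arrives.
      while (job = @queue.pop) != :stop
        @results << job.call
      end
    end
  end

  # Enqueue a block and return immediately.
  def enqueue(&job)
    @queue << job
  end

  # Stop the worker after pending jobs finish and return their results in order.
  def drain
    @queue << :stop
    @worker.join
    Array.new(@results.size) { @results.pop }
  end
end

jobs = BackgroundJobs.new
jobs.enqueue { "detect language of post 1" }
jobs.enqueue { "translate post 2" }
puts jobs.drain.inspect
```

The caller returns as soon as a job is enqueued, so language detection or translation would not slow down posting.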

Communication with mentor: I have been in regular contact with Ershad K (IRC nick: ershad) about the implementation details and possible challenges, and I have kept him updated on my progress and ideas so far.

Other Programming Activities: I worked as a summer intern at Groupon last summer. Groupon is built on RoR.