User:Karthiksenthil

Language Filter for Diaspora

Personal information
Name: Karthik Senthil
Email Address: karthik.senthil94@gmail.com
Freenode IRC Nick: skarthik
GitHub handle: karthiksenthil
Location: Bangalore, India (UTC+5:30)

Your university and current education:
I am a second-year undergraduate pursuing Information Technology and Engineering at the National Institute of Technology Karnataka (NITK).

Why do you want to work with the Swathanthra Malayalam Computing?
I am a staunch supporter of the open source community and am always looking for an occasion to contribute back to it. Further, SMC is providing an excellent opportunity to work on a prominent open source project like Diaspora. I would like to use this chance to establish myself in the open source community, and I feel that SMC plays a critical role in this ambition.

Do you have any past involvement with the Swathanthra Malayalam Computing or another open source project as a contributor?
No, I do not have any prior contributions to SMC projects. I have worked on various open source projects at the college level. Currently, I am working on a college project to build an online tool that allows developers to collaborate and learn new technologies interactively. It is developed using Ruby on Rails.

Did you participate with the past GSoC programs, if so which years, which organizations?
No, this is my first attempt at contributing to the community through GSoC.

Do you have other obligations between May and August? Please note that we expect the Summer of Code to be a full time, 40 hour a week commitment
I do not have any other obligations between May and August, and I am prepared to devote 40 hours (or more) per week to GSoC.

Will you continue contributing/ supporting the Swathanthra Malayalam Computing after the GSoC 2014 program, if yes, which area(s), you are interested in?
Contributing to the open source community is my passion, so I will not restrict this activity to the GSoC program. I will continue to participate regularly in the activities of SMC. I am particularly interested in applications based on Ruby or Ruby on Rails.

Why should we choose you over other applicants?
My introduction to open source software was through web development using Ruby on Rails, and since then I have continued to gain knowledge and experience in that field. I feel that this foundation will simplify various tasks in the project for me. Apart from this, I am dedicated to my ambition and will not take such an opportunity to contribute to the community lightly.

Proposal Description

Overview

This project involves categorising posts by their language on the open source social networking platform Diaspora. When users post to an aspect in different languages, it is inconvenient for receivers who do not understand the language. The primary goal of this project is to address this disadvantage.

Need addressed

A social network can be described as a network of social interactions that involves cultural and moral exchanges. Language should not be a barrier to such an exchange. Hence there is a need to let end users set their language preferences and to filter incoming posts based on these preferences. A further step in this project would be to allow translation of posts into a language preferred by the user.

Implementation Details and Ideas

A solution to the problem stated above would be to tag a post with the language it is written in. This would enable filtering of posts, based on the user's preferences, when they are rendered to a user.

The implementation of this idea of tagging and filtering is described by the following 5 deliverables:
1) Add a new column called languages_preferred to the users table. This field is serialized to store multiple preferred languages per user.
2) Tag a post, using the acts-as-taggable-on (https://github.com/mbleigh/acts-as-taggable-on) gem, with the language it is written in. The language is detected automatically by a gem that works independently in the local environment and does not depend on any external services. The gem whatlanguage (https://github.com/peterc/whatlanguage) is one such option.
3) At the receiver's side, filter the incoming post by looking up his/her language preferences. This ensures that there is no breach in security or in the protocol used to federate posts in Diaspora.
4) Integrate a UI for every user to add/edit his/her language preferences.
5) Translation of posts (or comments) can also be integrated using the globalize gem.

Note: The user is also given an option to override the automatic detection of the language of posts.
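The tagging and receiver-side filtering described in deliverables 1 to 3 can be sketched in plain Ruby. This is only an illustration under stated assumptions, not Diaspora code: Post, User, detect_language, tag_post and filter_posts are hypothetical names, and detect_language is a stub that stands in for a local detector such as whatlanguage so that the example runs without any gems.

```ruby
# Hypothetical, framework-free sketch; none of these names come from Diaspora.
Post = Struct.new(:text, :language_tag)
User = Struct.new(:name, :languages_preferred)

# Stand-in for a local language detector such as the whatlanguage gem.
# It only checks for Malayalam script so the example needs no gems.
def detect_language(text)
  text.match?(/\p{Malayalam}/) ? "malayalam" : "english"
end

# Tag a post with its language; a manual override wins over auto-detection.
def tag_post(text, override: nil)
  Post.new(text, override || detect_language(text))
end

# Receiver-side filter: split incoming posts by the user's preferences.
# Posts in other languages are kept in :hidden rather than discarded.
def filter_posts(user, posts)
  visible, hidden = posts.partition do |post|
    user.languages_preferred.include?(post.language_tag)
  end
  { visible: visible, hidden: hidden }
end

reader = User.new("reader", ["english"])
stream = [
  tag_post("Hello from Bangalore"),
  tag_post("നമസ്കാരം"),
  tag_post("Bonjour tout le monde", override: "french")
]

result = filter_posts(reader, stream)
puts result[:visible].map(&:text).inspect
puts result[:hidden].map(&:language_tag).inspect
```

Because the filter runs only on the receiver's side and reads a tag already attached to the post, the sending and federation steps stay unchanged in this sketch.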

Tentative Timeline

Up to 19th May (Pre-coding phase):
- Get more familiar with the codebase of Diaspora to understand the functionality of all modules and features.
- Discuss the action plan for implementing the project with the mentor and some core developers of Diaspora. This will also include a re-assessment of the gems to be used in the application.
- Schedule meetings and hangouts on a weekly or daily basis.

Week 1 – Week 2 (Coding phase):
- Write a migration to add the new field languages_preferred to the users model.
- Design actions to add data to this field.
- Design view templates for the UI that lets users add their language preferences.
- Implement the views and integrate them with the existing UI of Diaspora.
- Test the above components using RSpec.

Week 3 – Week 4 (Coding phase):
- Add the tagging feature to a post using the acts-as-taggable-on gem, i.e. tag a new post with the language used.
- Auto-identify the post language using the gem whatlanguage and then tag the post with the identified language. Add an additional post option to override the auto-detection of the language and tag the post manually.
- Test the tagging feature and the filtering of posts based on their tags.

Week 5 (Midterm evaluation):
- Check the status of current progress and implementation.

Week 6 – Week 7 (Coding phase):
- Add the post filtering feature on the receiving side of a post. Posts that are not in the receiver's preferred languages are hidden rather than removed completely, and the user is given an option to view these hidden posts.
- Add action listeners for the options provided to the receiver.
- Test the filtering feature through RSpec tests.

Week 8 – Week 9 (Coding phase):
- Work on the translation feature for posts.
- Give an option to translate a hidden post into a language preferred by the receiver.
- Discuss details of this feature with core developers of Diaspora to decide upon its implementation and the gems (if any) to be used.

Week 10 – Week 11 (Testing phase):
- Check for security/protocol breaches caused by the above features.
- Test all the options and features added above.
- Ensure that all the existing RSpec tests and the Travis build pass.

Week 12 – Week 15 (Bug fixing phase, Pencils down):
- Fix issues (if any) caused by the language filter and translation feature.
- Code clean-up, final evaluation and retrospection.

Apart from this estimated timeline, I would personally like to contribute to the Diaspora application(through SMC) even after the GSoC period.

Note regarding Translation feature of posts/comments

After integrating the language filter feature, the translation feature will be taken up as a separate module. Due to the disadvantages of translation_tables in globalize, I have decided to replace the globalize gem with other language translation gems (which instead use APIs like Google or Bing). I have listed a few of the gems that I have explored:
1) to_lang (https://github.com/jimmycuadra/to_lang)
2) easy_translate (https://github.com/seejohnrun/easy_translate)
3) language-translator

Apart from these gems, external API calls can also be made directly for the same purpose.

There are also many client-side language translators supported by jQuery (using the Google API). However, these are third-party software and may raise concerns about the security or robustness of Diaspora.

As mentioned earlier, there will be a thorough discussion of this feature (server load, security concerns, etc.) with the core developers of Diaspora before it is implemented.
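One way to keep the translation feature a separate module while the choice of gem or API is still open is to hide the backend behind a small interface. The following is a minimal sketch under that assumption. PostTranslator and DictionaryBackend are hypothetical names, the dictionary backend is a stand-in that calls no real gem or web service, and translating into the first preferred language is an illustrative choice only.

```ruby
# Hypothetical sketch: the class and method names below are illustrative only.

# Backend used in this example; it looks phrases up in a fixed table.
# A real backend would wrap whichever gem or web API is chosen later.
class DictionaryBackend
  PHRASES = { ["hola", "english"] => "hello" }.freeze

  def translate(text, to:)
    # Fall back to the original text when no translation is known.
    PHRASES.fetch([text.downcase, to], text)
  end
end

# The translation module depends only on an object that responds to
# #translate, so the backend can be swapped without touching the filter.
class PostTranslator
  def initialize(backend)
    @backend = backend
  end

  # Translate a hidden post into the first language the receiver prefers
  # (an assumption made for this sketch).
  def translate_for(post_text, languages_preferred)
    target = languages_preferred.first
    return post_text if target.nil?

    @backend.translate(post_text, to: target)
  end
end

translator = PostTranslator.new(DictionaryBackend.new)
puts translator.translate_for("Hola", ["english"])
puts translator.translate_for("Hola", [])
```

With this shape, concerns such as server load or security can be evaluated per backend during the discussion with the core developers, without changing the filtering code.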

Workflow diagram



Communication with mentors and relevant members

I began by discussing various implementation ideas with the mentor Ershad K (IRC nick: ershad) on the smc-students-project mailing list [1]. From there, the discussion was taken a step further by involving the core developers of Diaspora (on Loomio [2]). This turned out to be a critical decision, as the developers inspected the idea from various angles. I also had a very fruitful discussion with the Diaspora developer Jonne Hass on the diaspora-dev IRC channel. A log of this discussion can be found here [3]. The final implementation plan was drafted after considering all of the above discussions.

Relevant links

[1] Mailing list: http://lists.smc.org.in/pipermail/student-projects-smc.org.in/2014-March/000076.html
[2] Loomio discussion: https://www.loomio.org/d/4vTqCj5X/language-filter-for-diaspora-as-a-gsoc-project
[3] IRC log: https://drive.google.com/file/d/0B38Iq5yT6FcCeUk1VmJKXy1wMmc/edit?usp=sharing

My open source activities

- Active member of BRUG (Bangalore Ruby Users Group) and NITK-RUG.
- Will be attending my first open source conference, RubyConf India 2014 (scheduled on 22-23 March), sponsored by the organisers.
- Currently working on an open source project (in college) to develop an online tool for developers to collaborate and learn software interactively. This application is developed using RoR.

Other programming activities and contributions

- Winter internship as a Software as a Service (SaaS) developer at BigBHK (www.bigbhk.com), working with Ruby on Rails.
- Academic project (2011-2012) to simulate online banking software using C++ and graphics.