User:Deepakrocks0009/gsoc 2014 proposal

From SMC Wiki

Google Summer of Code 2014 Proposal for Swathanthra Malayalam Computing


Personal Information

Email Address                    : deepak.kumar.ece11@iitbhu.ac.in
Blog URL                         : http://deepakiitbhu.blogspot.com/
Freenode IRC Nick                : deepak
University and current education : BTech Electronics Engineering, Indian Institute of Technology (B.H.U.) Varanasi


Why do you want to work with the Swathanthra Malayalam Computing?

Swathanthra Malayalam Computing is the best platform for me to showcase my talent and prove myself. Seeing my fellow batchmates contribute inspired me to contribute to an open source project as well. The Grandham project falls squarely within my field of interest. In addition, the mentors are very helpful and always ready to help. I am looking for a career in software and want to become a good developer, so SMC's platform is a golden opportunity.

Do you have any past involvement with the Swathanthra Malayalam Computing or another open source project as a contributor?

No. I was not familiar with SMC before, and I have not contributed to any open source project.

Did you participate with the past GSoC programs, if so which years, which organizations?

No. I have not participated in GSoC programs before.

Do you have other obligations between May and August? Please note that we expect the Summer of Code to be a full-time, 40-hour-a-week commitment.

I will try to give more than 40 hours a week, as I will work on weekends too. I am very dedicated to my studies and my passion; I used to study 12 hours daily during my preparation time. I will try to finish the project requirements 2-3 weeks before the deadline.

Will you continue contributing to/supporting the Swathanthra Malayalam Computing after the GSoC 2014 program? If yes, which area(s) are you interested in?

I will contribute to SMC even after GSoC '14, and even if I am not selected I will stay in touch with SMC and the mentors. In future I will try to develop advanced features in Grandham, such as converting records in languages other than English to MARC 21 files. I will also try to extend the project to other kinds of records besides bibliographic records.

Why should we choose you over other applicants?

I have been working with Ruby on Rails for the last year, and I did my summer internship in 2013 on Ruby on Rails as well, developing a web application for the company www.schoolmitra.com. In that project I worked on student database management, student report card management, a bulletin board feature, an alert system for mobile and email, and an event management system. I have also created a basic blog app using the Ruby on Rails framework. Swathanthra Malayalam Computing will be a great platform for me to contribute to, as the organisation's Grandham project belongs to my field of interest. I am familiar with most of the gems and tools that will be used in the project. Moreover, the project is basically an implementation of encoding and decoding algorithms, and my strength in algorithms and programming will help me. I am very enthusiastic and passionate about the project, and I will work with full dedication and determination.

Proposal Description

An overview of your proposal

Grandham is an SMC project for recording and maintaining bibliographic records. The current application allows authenticated users to create bibliographic records (books) and to edit and update them. There are separate tabs for books, authors, publishers and libraries. It also supports adding a book cover and searching for items by language. But if we want to process the records, it is difficult for computers to read them in their original form. The records need to be converted into a proper, internationally accepted format in order to make them machine readable. So we want to introduce MARC 21 support in Grandham for all types of bibliographic records. MARC 21 is a set of digital formats for describing bibliographic items such as books. There are several versions of MARC, of which MARC 21 is the most predominant. Using the MARC 21 format we will convert records to machine-readable form.
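As a rough illustration of what such a conversion involves, the sketch below maps a book's attributes to MARC 21 tag and subfield pairs. The attribute names and sample values are assumptions made for this example and are not Grandham's actual schema; the tag choices (020 $a ISBN, 100 $a author, 245 $a title, 260 $b and $c publisher and year, 300 $a extent) follow common MARC 21 bibliographic usage.

```ruby
# Hypothetical mapping from Grandham-style book attributes to MARC 21
# [tag, subfield code] pairs. Attribute names are assumptions for illustration.
BOOK_TO_MARC = {
  isbn:      ['020', 'a'], # International Standard Book Number
  author:    ['100', 'a'], # main entry, personal name
  title:     ['245', 'a'], # title statement
  publisher: ['260', 'b'], # publication: name of publisher
  year:      ['260', 'c'], # publication: date
  pages:     ['300', 'a']  # physical description: extent
}.freeze

# Turn a book hash into a list of [tag, { subfield code => value }] pairs,
# sorted by tag. Attributes that are missing (nil) are skipped.
def book_to_marc_fields(book)
  grouped = Hash.new { |hash, tag| hash[tag] = {} }
  BOOK_TO_MARC.each do |attribute, (tag, code)|
    value = book[attribute]
    next if value.nil?
    grouped[tag][code] = value.to_s
  end
  grouped.sort_by { |tag, _subfields| tag }
end

# Placeholder values, not a real catalogue entry.
book = { title: 'Sample Title', author: 'Sample Author', isbn: '0000000000',
         publisher: 'Sample Press', year: 2014, pages: 120 }

book_to_marc_fields(book).each do |tag, subfields|
  puts "#{tag} " + subfields.map { |code, value| "$#{code} #{value}" }.join(' ')
end
```

Keeping the mapping in one table means a rarely used field can be added or dropped without touching the conversion logic.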

The need you believe it fulfills

Grandham helps to maintain and manage bibliographic records. It is very difficult to manage a large number of records, especially when the records themselves are large, and Grandham helps with this. The new MARC 21 import and export feature will let computers process the records. Making the records machine readable will therefore help to analyze and process them efficiently.

Any relevant experience you have

I have experience in Ruby and its framework Ruby on Rails, in which the Grandham project is written. I know the algorithms and the different gems that will be used in the project. I also have a basic understanding of how to decode and encode data from one record format to another. I am familiar with the basic database management required in the project. Moreover, I have a basic understanding of Ajax, CSS and JavaScript, which will help to improve the front end of the project.

How you intend to implement your proposal

In order to implement the MARC 21 import/export feature in Grandham I need to understand the format thoroughly. MARC 21 is a huge specification with 800+ fields and 3000+ subfields, and it follows a particular format for encoding and decoding data.

1. First we need to import the MARC file, which can be done through Rails’ ‘file_field_tag’ helper, which accepts any type of file. Once the file has been selected by browsing and submitted, we need to read it in the controller. We will also have an option to select files from existing records. Once we have read the file as a string, we can decode it using the ruby-marc library function MARC::Reader.decode(string).

2. Once we have the decoded data (leader, tags, control fields and field values), we need to match the tags and fields against the database to find out exactly what they mean. If a field does not match anything in the database, it will go to the default row of the table and be assigned a null value. Building a large database of MARC 21 control fields and subfields will be a tough task; for this we will use Rails’ single table inheritance (STI), which lets us avoid extra tables. We can omit some of the fields that are rarely used.

3. To hold the extra columns needed for book records, we just need to create another table. This is necessary because we want to decode every field of a MARC record.

4. Converting an existing bibliographic record into a MARC 21 record is basically an encoding process, and for this we will use the MARC-8 encoding scheme. We first need to create a blank MARC record using MARC::Record.new. After this we append data to the record, and the library function MARC::Writer.encode(record) gives us the encoded MARC record. The main task in this process is appending the data. Every record has title, language_id, price, isbn, pages, year and many other fields. From these fields we need to generate the leader, the directory of tags, the fields, the subfields and the field values according to the MARC 21 format. Once we have these arrays of data, we can use the ‘record.append’ function to append them to the record.

5. To change an existing MARC record, it first needs to be decoded; after the changes are made it can be encoded back into a MARC record. Alternatively, if we only need to add a particular field and subfield value, this can be done through the append function.

6. Since a MARC 21 record contains a lot of information, and not all of it is useful to a reader, we will restrict how much data is shown in the record view. However, we still need to extract every field of the record, as it may contain useful information.

7. To improve the front end we can use bootswatch.com themes. Features such as how many people liked a record and how many viewed it can be built using the socialization gem.

8. A small profile for authors and publishers can be introduced, with fields such as name, books, qualifications etc.

9. Another feature that can be introduced for books is comments, for receiving user reviews. This can be implemented simply using the gem ‘acts_as_commentable’.
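Steps 1 and 4 above rely on ruby-marc (MARC::Reader.decode and MARC::Writer.encode) for the actual work. To show what that encoding and decoding involves, here is a minimal standard-library sketch of the MARC 21 transmission layout: a 24-character leader, a directory of 12-character entries (tag, field length, starting offset), and the field data. It is only an illustration under stated assumptions: it uses UTF-8 text rather than MARC-8 (leader position 9 set to 'a'), blank indicators, and data fields only; the function names encode_marc and decode_marc are hypothetical.

```ruby
FIELD_END  = "\x1E" # field terminator
RECORD_END = "\x1D" # record terminator
SUBFIELD   = "\x1F" # subfield delimiter

# fields: array of [tag, { subfield code => value }] pairs (data fields only).
# Assumption: UTF-8 text and blank indicators for every field.
def encode_marc(fields)
  directory = +''
  body = +''
  fields.each do |tag, subfields|
    # Two blank indicators, then each subfield, then the field terminator.
    data = '  ' + subfields.map { |code, value| "#{SUBFIELD}#{code}#{value}" }.join + FIELD_END
    # Directory entry: 3-char tag, 4-digit byte length, 5-digit start offset.
    directory << format('%s%04d%05d', tag, data.bytesize, body.bytesize)
    body << data
  end
  directory << FIELD_END
  base_address  = 24 + directory.bytesize        # where the field data starts
  record_length = base_address + body.bytesize + 1
  # Leader: length, status 'n', type 'a', level 'm', UTF-8 flag 'a', entry map '4500'.
  leader = format('%05dnam a22%05d   4500', record_length, base_address)
  leader + directory + body + RECORD_END
end

def decode_marc(raw)
  bytes = raw.b                                   # byte-oriented copy
  leader = bytes[0, 24]
  base_address = leader[12, 5].to_i
  directory = bytes[24...(base_address - 1)]      # exclude the terminator
  fields = directory.scan(/.{12}/m).map do |entry|
    tag    = entry[0, 3].force_encoding('UTF-8')
    length = entry[3, 4].to_i
    start  = entry[7, 5].to_i
    # Drop the trailing field terminator, then read the text as UTF-8.
    data = bytes[base_address + start, length - 1].force_encoding('UTF-8')
    # The first chunk holds the indicators; the rest are code + value.
    pairs = data.split(SUBFIELD).drop(1).map { |chunk| [chunk[0], chunk[1..-1]] }
    [tag, pairs.to_h]
  end
  { leader: leader, fields: fields }
end

# Placeholder values, not a real catalogue entry.
sample = [['245', { 'a' => 'Sample Title' }], ['020', { 'a' => '0000000000' }]]
raw = encode_marc(sample)
puts decode_marc(raw)[:fields] == sample
```

Lengths and offsets are counted in bytes, not characters, which matters once titles are in Malayalam script. In Grandham itself the library calls named in the steps would replace both functions.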

A rough timeline for your progress with phases

Duration | Description | Milestone
Before May 27 | Before Announcement of Candidates | Familiarize with the version control system, code, documentation of spell checker module of SILPA, hunspell working and requirements of project. Try hunspell in Malayalam and other languages.
May 28 – June 16 | Before Official Coding Period Starts | To do self coding with Python to further improve my understanding of various concepts involved. Start learning hunspell algorithm. During this period I will remain in constant touch with my mentor to be absolutely clear of my future goals.
June 17 – July 3 | Official Coding Period Starts | Coding, testing and debugging of various features in spell checker. Start scripting of various suffix and prefix patterns in Malayalam. Start writing affix file for hunspell. Presentation of components to mentor weekly.
July 3 – July 31 | Preparing for mid term evaluation | Do further scripting for inflecting and agglutinating words. Scripting for different compound words. Ask help from language communities for further scripting. Submission of files to mentor for evaluation.
Aug 1 – Aug 15 | After mid term evaluation | Refine the scripting as per mentor's suggestions. Scripting for multi level suffix stripping. Making changes so as to improve functionality.
August 16 – August 29 | Before Final stage | Implement multi suffix stripping property. Completion of affix and dictionary files. Most of the time will be used for rigorous testing.
August 30 – September 10 | Final Stage | Documentation of the project.


A buffer of one week has been kept for unpredictable delays.

Tell us about something you have created

1. Event management for www.schoolmitra.com
2. Alert notification on mobile and email
3. Bulletin board feature for schoolmitra for daily updates of school.
4. A basic beginner blog application http://deepakkeshri.herokuapp.com/

Have you communicated with a potential mentor? If so who?

Yes, I have communicated with Ershad Sir. He was very helpful and understanding.