User:Tachyons/GSoC-speech recognition
Latest revision as of 02:39, 2 May 2013
Developing Acoustic and Language Model for Malayalam Recognition
Personal information
- Email Address: aboockervyd@gmail.com
- Blog URL: abvayad.wordpress.com
- Freenode IRC Nick: tachyons
- Your university and current education: CUSAT, B.Tech Computer Science
Why do you want to work with the Swathanthra Malayalam Computing?
I heard about SMC by accident. I found its mailing list on Launchpad while I was helping with a Malayalam translation project. I joined the list and was really surprised by the dedication of SMC members to Malayalam language computing. I asked one of the members about the reason behind this dedication. His answer was simple: "I am a Malayalee; it is our responsibility." As a Malayalee, I also have a responsibility to fulfil the dream "എന്റെ കമ്പ്യൂട്ടറിനു് എന്റെ ഭാഷ" ("My language for my computer"). So I decided to contribute to SMC this summer. I will work for SMC even if I am not selected for GSoC, because it is my responsibility.
Do you have any past involvement with the Swathanthra Malayalam Computing or another open source project as a contributor?
- I have not made significant contributions to open source projects so far, but I hope to get involved in open source projects in the future.
Here is my GitHub profile: tachyons
Did you participate in past GSoC programs? If so, which years and which organizations?
- No
Do you have other obligations between May and August? Please note that we expect the Summer of Code to be a full-time, 40 hour a week commitment.
- No, I don't have any such obligations.
Will you continue contributing to/supporting Swathanthra Malayalam Computing after the GSoC 2013 program? If yes, which area(s) are you interested in?
- Yes, I will. My fields of interest are:
- Machine translation system for Malayalam
- Malayalam speech recognition
- Malayalam text to speech (improving Dhvani)
Why should we choose you over other applicants?
- I am a FOSS and technology enthusiast. I am new to the field of digital sound processing, but I have knowledge of digital signal processing and a solid background in major programming languages. I also plan to take up a similar project as my academic project, so I can spend more time and effort on this project.
Proposal Description
Problem Statement
- Malayalam (മലയാളം) is a language spoken in India, predominantly in the state of Kerala. It is one of the 22 scheduled languages of India. Malayalam has official language status in the state of Kerala and in the union territories of Lakshadweep and Puducherry. It belongs to the Dravidian family of languages and is spoken by approximately 33 million people according to the 2001 census.
- Malayalam does not have an existing speech processing system. To develop one, an acoustic model and a language model have to be created for Malayalam. This project aims to create an acoustic model and a language model for Malayalam, so that they can support future development of Malayalam speech recognition systems.
Introduction
- Speech is the oldest, yet still one of the most powerful, ways of sharing information. Even an illiterate person can share information through speech. But unlike other forms of communication such as text, speech is very difficult for a computer to process. Because of this complexity, no perfect speech recognition tool is available today. Carnegie Mellon University created an open source toolkit called CMU Sphinx, which has made considerable progress in speech recognition.
- Speech Recognition in CMU Sphinx
In CMU Sphinx, the common way to recognize speech is the following: take the waveform and split it into utterances, then process each slice and match it against a predefined database.
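The "split it into utterances" step can be illustrated with a small energy-based splitter. This is only a sketch under stated assumptions, not how CMU Sphinx itself segments audio: the function name `split_utterances`, the frame size, the energy threshold and the toy sample values are all hypothetical and chosen for illustration.

```python
def split_utterances(samples, frame_size=4, threshold=0.1, min_silence_frames=2):
    """Return (start, end) sample-index pairs for stretches of loud audio.

    Illustrative assumption: a frame is "speech" when its mean squared
    amplitude reaches `threshold`, and an utterance ends once
    `min_silence_frames` quiet frames follow each other.
    """
    utterances = []
    start = None          # sample index where the current utterance began
    last_loud_end = None  # sample index just past the last loud frame
    silent_run = 0        # consecutive quiet frames seen inside an utterance

    for pos in range(0, len(samples), frame_size):
        frame = samples[pos:pos + frame_size]
        energy = sum(s * s for s in frame) / len(frame)
        if energy >= threshold:
            if start is None:
                start = pos
            last_loud_end = pos + len(frame)
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                # Enough silence: close the utterance at the last loud frame.
                utterances.append((start, last_loud_end))
                start = None

    if start is not None:
        # The audio ended while an utterance was still open.
        utterances.append((start, last_loud_end))
    return utterances


# Toy signal: silence, a loud burst, a long pause, a second short burst.
toy_signal = [0.0] * 8 + [0.5] * 8 + [0.0] * 12 + [0.6] * 4 + [0.0] * 4
segments = split_utterances(toy_signal)
```

Each returned pair marks one slice of the waveform that would then be passed on to the matching stage.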
- Models
- A model is a mathematical object that gathers the common attributes of the spoken word.
- Acoustic model: The acoustic model establishes a mapping between phonemes and their possible acoustic manifestations. It characterizes how sound changes over time and captures the characteristics of the basic recognition units.
- Language model: The language model describes the likelihood, probability, or penalty taken when a sequence or collection of words is seen. It attempts to convey the behavior of the language and tries to predict the occurrence of specific word sequences possible in the language.
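To make the idea of a language model concrete, here is a minimal bigram sketch that estimates how likely a word sequence is from counts in a small corpus. It is an assumption-laden illustration, not the model format used by CMU Sphinx: the helper names (`train_bigram`, `bigram_prob`, `sentence_prob`) and the tiny English corpus are made up for this example, and there is no smoothing, so unseen word pairs get probability zero.

```python
from collections import Counter


def train_bigram(sentences):
    """Count histories and word pairs, with sentence boundary markers."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(words[:-1])            # every word that acts as a history
        bigrams.update(zip(words, words[1:]))  # every adjacent word pair
    return unigrams, bigrams


def bigram_prob(unigrams, bigrams, prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 for an unseen history."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]


def sentence_prob(unigrams, bigrams, sentence):
    """Probability of a whole sentence as a product of bigram probabilities."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(words, words[1:]):
        prob *= bigram_prob(unigrams, bigrams, prev, word)
    return prob


# Hypothetical toy corpus, used only for illustration.
corpus = ["open the file", "open the door", "close the door"]
unigrams, bigrams = train_bigram(corpus)

likely = sentence_prob(unigrams, bigrams, "open the door")
unlikely = sentence_prob(unigrams, bigrams, "door the open")
```

A recognizer can use such scores to prefer word sequences that are plausible in the language over ones that merely sound similar. A real model would be trained on a large Malayalam text corpus and would use smoothing for unseen word pairs.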
The need you believe it fulfills
- Acoustic model
- The language model
How and who it will benefit in society
Previous attempts
Challenges
References
Any relevant experience you have
How you intend to implement your proposal
A rough time-line for your progress with phases
Any other details you feel we should consider
Tell us about something you have created.
Have you communicated with a potential mentor? If so, who?
- Not yet
SMC Wiki link of your proposal
http://wiki.smc.org.in/User:Tachyons/GSoC-speech_recognition