GSoC/2016/IRC Meet - May 5

This is the log of the IRC meeting held on May 5, during the community bonding period.

nice, everyone's here?
hi :)
present
yup!
Hello everyone :)
anwar?
present!
Hello :D
Hi all
asdofindia, you logging?
strike that
not exactly. I'm thinking of copy paste
okay, let's start
it's connected to TG na.
it's almost halfway through community bonding period
meanwhile, everyone is free from all exams now, right?
arushi, said she has some projects going on
yes
Yes
sort of, :D ... have 2 back paper to attend to
but thats not much of an issue
Yes. Am free from exams. :)
okay, so you can use at least the next two weeks for community bonding :D
asdofinida, yes i have some honours work. I will be free by 11th.
hehe.. sure :)
present
first of all, I promise that you will not be accused of making too much noise, whatever you talk in whichever channel of our community
because, usually, when I contribute I feel like not exhibiting my code, or not talking about my ideas...
....thinking that people will think I'm self centered
that should not be a problem at least for us in GSoC period. Be enthusiastic enough to talk about every little thing in your code
at least that'll help a lot of people who have become inert in our community in getting inspired and starting coding
it also shows the world that you're thinking about your project when you eat, sleep, or do anything.
sure. :)
hehe :D
woah. okay. :D
Anwar are you there ???
that's the kind of commitment Google actually expects form GSoCers
@nalin yes I am here
speaking of community, if you notice, our mailing list isn't very active these days (because there are no active flame wars). Neither is the discourse.
so, you have more freedom to use it in any way you want
ok
like, anwar_n sent a mail with his blog link. Good going anwar_n
also, maybe, the 7 of you can form an unofficial group to keep each other going (just giving you ideas)
+1
+1
why not meet here, instead of an added channel?
coming to the main point I'm trying to put across.
:D
+1
in open source world, you haven't communicated if you haven't communicated in public
cool :D
this is my first time I very exited
so, whatever communications, chats, updates, phone calls, personal meets, anything you have doesn't exist until and unless you have documented it publicly
*am
usually, the best way to avoid having to "document" is to make the initial conversation itself public.
that's why we encourage you to discuss on discourse, mailing list, etc.
ok.. let's use the mailing list/discourse more. got it :)
yep. :D
you might get ideas from different people. You might choose to implement a particular thing in your project in one way because someone in the community told you so. But if that communication happened in private, once upon a time later, there would be no record on why that was done such.
right
for a really bad example, imagine someone gave you a really bad idea and you implemented it. You can't later point fingers
lol
and vice versa, imagine someone helping you a lot with your project, but nobody will realize that effort they put in
also, a small corollary point. Get in touch with the entire community, not just your mentor(s)
*big
There are a lot of people who are friendly and give a lot of new ideas in our community. We couldn't have made all of them mentors
and therefore, use your mentors only when there's something you absolutely *need* your mentor for.
For example, discussion of the various ideas you have and soliciting new ideas, etc can happen in public and the entire community can pitch in. But, when you finally want to settle the question and choose between two ideas that you've narrowed down to and the community keeps on confusing you even more, then you can go to your mentor and ask their help on finalizing a choice
...instead if all your discussions are with the mentor alone, then it really beats the purpose of the community and GSoC
(i hope that point makes sense)
+1+1
ok..
get in touch with the community regularly
<imSreenadh> yeah sure
yeah!
so, I hope you've gone through https://discourse.indicproject.org/t/gsoc-2016-projects/35/1 and please create threads for the remaining projects.
<imSreenadh> done :)
yes
If you're a fan of blogging, you can create a new category in your blog for GSoC updates. For example, http://gemiam.in/gsoc-community-engagement-period/, http://gemiam.in/what%20i%20learned%20today/gsoc-weekly-update-5/
actually, read through the blog of gem to look at how a successful GSoC project looks like
also, http://ershadk.com/
wish we had a planet to add all blogs to
we had
it broke down since VPS migration
let's get it back online soon
the very act of publicly writing about your project in your blog keeps you motivated to code more, so that you can brag more, and that's a virtuous cycle
hehe :D
makes sense :D
Now, about code, I think irshad was asking, where to host the GSoC code
for all projects based on existing code, it's easy. Just fork and merge in the upstream code
that makes me wonder, nobody has any problem setting up the dev environment or getting the (existing) projects running, do you?
I really hope you do, because then you can fix those documentation or make small patches :D
I am done with it
<imSreenadh> That being said puts me into a situation. :D. Where should I be putting up my works. I did my major on the same topic and used GitHub.
imSreenadh, same topic, but different work, right?
<imSreenadh> yeah
there is no specific requirement on the platform where we host the personal repo, right?
for my case fork and merge is a bit tedious
i am not going to use anything from the existing codes
stultus, help me out here. I think ideally all work should happen in a repository under our organization (because otherwise it really won't survive after GSoC)
I will be using Gitlab as my primary hosting platform (will be using GitHub as a mirror/ only for putting up PRs). All my CI will be hooked to it.
development should happen on personal repos. PRs to organization repos. Isn't that more suitable?
if you're writing brand new code, you can ask someone to create a repo under the organization with a nice name that fits and then code there.
bsc, hmm yes, for projects that already have organization repos it is straightforward
thats would work for me
I got a repo under libindic. Will follow the forking workflow
just clarifying I don't mean you should directly commit to organization repos. I just mean, all the code should eventually be in the organization repo after review
yup. Agree with that.
basically, code that is in your computer alone, doesn't count :D
asdofindia, I'm yet to read the entire chat, but 1.start a project repo under the organization(github/gitlab) 2.students clone this to their personal accounts 3. give PRs to the main repo/upstream. this is how it should work
^^ agree.
<imSreenadh> alright!
<imSreenadh> in my case, its better to start a new one? o.O
(If you had seen the google calendar invite, I put 10:30 as end time so we could rush through(
stultus, does the organization require us to maintain blogs?
(but we can continue talking all time))
asdofindia: What are we supposed to submit for mid term evaluation?
<imSreenadh> ^ +1
erm, you just deliver on your milestones till then.
<imSreenadh> :D
ah, but I have seen file uploads from last year !?
I think some projects have "discuss and figure out what to do" till almost mid-term. It might be a good idea to rethink those and put some tangible deliverables before mid-term
Alright, got it
From what I saw in the last time's GSoC process, on reaching mid-term we should be clear on how to proceed and have already started in that direction.
actually, you must have gotten many new ideas on improving your proposals by now. Now that you're accepted you have the freedom to make "positive" changes to your proposals
asdofindia, about mandatory requirement of blog updates. I think they were mandatory last year.
stultus can confirm
actually, everyone, arushi bsc irshad imSreenadh jerin malayaleecoder and anwar_n who seems to have connection issues, I missed this from the agenda. But can you make your proposals public? The best way to do so would be to transcribe the details into the discourse thread
because, till last year, the proposals were copied to the wiki so it was public
already have an abridged version there. original proposal draft is referenced there.
asdofindia, yup. I am planning to clean up, elaborate and make it public. :)
but this year it was google draft and so nobody can see.
will do it soon
Sure, will do that
and we can't actually make the pdfs public because it has your personal details
<imSreenadh> sure, it has been a difficult time since 3 days, thanks to bsnl broadband (speaking of the connectivity lately)
asdofindia: any modifications necessary?
modifications aren't *necessary*. Maybe it's a good idea to stick to the original proposal for historic purposes
asdofindia, wiki, is better for preserving history. :)
yeah, but discourse is better for quoting and sparking discussion and constructive criticism.
also, discourse does show revision history
asdofindia, ah.. didn't know that.
for example, click on the pencil icon in https://discourse.indicproject.org/t/gsoc-2016-projects/35
asdofindia, so, discourse is fine
there's a question still not answered, are blog updates mandatory? :D
let's consider it mandatory.
let's
so blog != discourse ?
here's how it goes. Deep discussions happen in discourse and/or mailing list. Quick questions in IRC. Documentation of code should go into code repostiories or official documentation. Then, what's left is documentation of the process. Discourse threads is full of discussions and is in a sense a detailed record of the process. Blog posts will give a condensed, structured overview with a logical flow in layman language where required.
please keep personal blogs and consider that as mandatory. update weekly.
just put bullet points if you are really busy on a particular week
all clear
ok
personal blog is for your own record, you should experience the feeling of reading your old blogs later.
ok
<imSreenadh> okiee
stultus, during community bonding period also?
*clarifying
or after the "actual work on project" starts
so, in a sense, use discourse like twitter for quick updates and back and forth. Descriptive commit messages are fun too. But a blog post is like nothing else when it comes to describing a week in a thousand words.
bsc, yes
bsc, actual work already started :)
bsc, these all are actual work, including this meeting
:D you know what I meant.
<imSreenadh> :v
the quotes was supposed to do something there.
bsc, and there is no specified format for the blog, you can include your personal details ans showcase your writing skills
stultus, ok. cool. :)
I think there are two takeaway messages. 1) Think, talk, debate in public 2) Involve the entire community 3) Document everything. :)
and adding to the point of making the proposals public, read other people's proposal when you get time
bsc: I'd like to see your approach to the spellchecker. :)
and this is mandatory for the ones who are working on the similar projects like bsc & jerin
actually, we should pitch them against each other .
:-o
oh my
the battle to split the word
and irshad it will be nice if you can check our existing transliteration projects, (swanalekha&mozhi scheme) and varnam.
jerin, sure. let me cleanup and elaborate my proposal with details and post it in discourse. Want to get your opinion. :)
asdofindia, :D
reminds me, in open source, competition and rivalry is a major drive to code.
<imSreenadh> hehe
asdofindia, continue
I still have an open challenge to make libindic obsolete with indicjs
asdofindia, nope. Still we will need wrappers for all languages. So, libindic won't be obsolete.
asdofindia, :P
okay it's 11 and i've covered everything I had in mind.
just wanted to ask, everyone completed all the gsoc procedures? :D
stultus, asdofindia I think we can wind up for today. If no one else have any questions/doubts to shoot. I have no more.
<imSreenadh> i ahve a topic related to shoot
<imSreenadh> can i?
<imSreenadh> :D
jerin, no. I have to scan and upload that tax form thingy tomorrow.
imSreenadh, don't ask to ask. just ask. :)
don't ask permission to ask. ask. :D
<imSreenadh> :D
jinx
<imSreenadh> lol
<imSreenadh> Going into a topic related query, how vast should the LM,AM be by the end of the program (as far as GSoC is concerned and its evaluations, though i'll be trying to chip in works later as well)
imSreenadh, what are LM and AM? Didn't get them.
Language Model
https://discourse.indicproject.org/t/language-model-and-acoustic-model-speech-recognition-development-for-malayalam-using-cmu-sphinx/33
<imSreenadh> oops.. Acoustic Model
<imSreenadh> ya
Ah. Ok.
<imSreenadh> sorry about that. my bad
i've checked the existing transliteration project, i've done some work in this field and it seems the exiting module performs very poor compared to machine learning techniques
stultus: who's the mentor
<imSreenadh> Deepa mam
never came across that name before in the org pages
I have no idea what AM and LM are technically. But, imSreenadh a rule of thumb would be, you have done a good job if the thing you build is usable
stultus, that is why i wanted to start afresh
like, at least it should detect Thrissur malayalam
<imSreenadh> umm, actually thats the point, the one i built is a sample done with 93 odd words
<imSreenadh> lol
asdofindia, do we want a full-fledged product at the end, or a Proof-of-Concept? Or something that is atleast usable?
<imSreenadh> the problem is training acoustic model and am concerned if i'l be bale get the training data, given the fact that its too dormant area
<imSreenadh> able*
oh.
irshad, okey. I was not asking to add to them.
I was just telling that you should check the projects that are trying to fix the same problem (here transliteration)
data's bottlenecking everyone
and btw nkn__ do we have any ML in varnam?
irshad: you have the data?
jerlin, yes i've the transliteration training data
<nkn__> stultus: yes. varnam has a learning model
nkn__, please have a chat with irshad when you have time :)
see, the way I look at it is with this analogy. We say that neural networks can become Artificial Intelligence. But till it does become so it doesn't become so. Or, a better example, imagine wikipedia has only 100 articles. We can say wikipedia will be the largest encyclopedia if there's enough editors. And then if we say, hence I have a proof of concept and quit our work, we don't have a largest encyclopedia
the actual work is in making things that work.
<nkn__> irshad: I'm not sure about your proposal. But did you check varnam?
<imSreenadh> by data, i meant the audio sample recording for the words the project should support. o.O
asdofindia, sorry, but I didn't get that analogy. :(
imSreenadh, I have zero knowledge in the area, so please talk with deepa teacher and fix the deliverables and all.
<imSreenadh> i'll be meeting her monday. :)
imSreenadh, cool
convey my regards
<imSreenadh> i had the very same issue when i started my major.
<imSreenadh> so i had to cut down the vocabulary to 93
<nkn__> irshad: varnam also has a learning model built in and it can do predictive text input. as far as Malayalam is concerned, the output is pretty good. Hindi also should be good
imSreenadh, and please have a look at the dhwani project if you haven't already
<imSreenadh> so, just checking for a heads up :)
(I'm not sure what to look for, but just have a look :D )
nkn: no i have not checked vernam
stultus, :D
eh, i mean, in some projects, creating a proof of concept isn't the hard part. It's collecting the data, making the proof of concept scale into a useful product.
So, by the end of your project there's no sight of a good usable product, then it'll never be.
consider making it easy to gather/train data a part of your project and then it becomes really great.
asdofindia: but I believe the organization can crowdsource and get annotated data
one individual working on it won't amount to much
<imSreenadh> yea :/
yes, jerin. Create a process by which the organization can crowdsource. If you do that much, the rest can be done by the community.
<imSreenadh> does anyone over here have exp with sphinx. (**just wondering**)
nkn: i thought there was only one module for transliteration namely 'Transliteration'. i'll check varnam as soon as possible
imSreenadh, yes. I tried to use it and failed miserably and was lazy to try again. My exp. :D
<nkn__> irshad: you can try online - http://varnamproject.com/editor
<imSreenadh> :D
irshad, varnam is not in libindic yet
<imSreenadh> i understand the lazy part
nkn: ok
<imSreenadh> :D
irshad: varnam saves frequencies of words typed in and based on that gives the most frequent one
irshad, libindic has only one transliteration module
jerin, for example, think of how Google crowd sources transliteration. It trains its AI by making people select from a set of suggestions. Here, people are unknowingly creating data. Think of building things like that to get data for you
language isn't a formal system, its glorious chaos
bsc, that's why it's interesting .. you vs jerin :D
asdofindia, :P
don't worry we are not evaluating projects against other projects. so feel free to have as a strong healthy competition :P
<imSreenadh> :D
<imSreenadh> so i guess i'll discuss the "the end goal" with Deepa mam? :)
asdofindia: google can add it in, we can't. Even here in the labs, they hire people to annotate data. The crowdsourcing idea is feasible only if you can land that much a crowd. :D
btw shall we know the (home/work/college) locations of each of you. (if you don't mind sharing it)
ok. a general doubt.
how many of the participants are Keralites
Just to know who all will understand malayalam swearings. :D
enikku manassilavum [I'll understand]
<imSreenadh> Am from Payyanur(Kannur, Kerala), final year, College of Engineering Trikaripur.
<imSreenadh> enikummm [me too]
enikkum [me too] :P
<imSreenadh> adipoli [awesome] :P
stultus: institute, hyderabad.
malayaleecoder, what is your real name? :P
<imSreenadh> :D
malayaleecoder, sorry I forgot, don't ask me to check the proposal
****** it is ...
cool
malayaleecoder, and you are studying in ?
<anwar_n> i can understand malayalam
irshad, jerin and arushi are in iiit-H right?
IITB
stultus: yes
malayaleecoder, cool
jerin, I see there's a "Semi-automated annotation tool" in your proposals. Just make it as convenient as possible to make a set of volunteers build annotated data with that.
asdofindia, dush.. jerin is also trying to make it less dependant on data.. Am going through his proposal. :D
https://github.com/jerinphilip/sandhi-splitter
bsc: but it requires data to train.
what do you mean by data here? :-o
bsc: A dictionary of sorts?
jerin, word corpus for spellchecker. not training data. mia culpa
asdofindia: when copying the irc chat, if possible, do delete out my name :P
I try my best not to let it out through this account _/\_
<imSreenadh> :v
malayaleecoder: why? what? :-o
<imSreenadh> anonymous aye? :P
jerin, we can use wikisource to create the training data, since we are proofreading there, it should be mostly error free
malayaleecoder, I think there's a search result from bugzilla or something that shows your name with your nick
actually, that's where I'm copy pasting from to test the tool :=D
stulus: yes, but i am in J&K at this moment. I came to attend my cousins wedding. I'll be back to iiith 19th may.
jerin, Nice.. Am selecting wikisource and random news paper portals. for testing purpose.
irshad, cool :) I want to visit J&K someday :)
asdofindia: How come you dwell so much into bugzilla :\ thats impressive
arushi, there?
malayaleecoder, use the search engine luke
erm, malayaleecoder I was googling to make sure I was not making your identity public before sending that congrats mail to the list
stulus: you are welcome anytime :)
<imSreenadh> fellow participants, is there anything that v should be concerned about, considering the hell lot of queries going on in gsoc-student-mail list
irshad, :)
asdofindia, stultus : thats weird, I don't get it in my search ?! anyway, forget it
imSreenadh: Just let payoneer take 2%, upload tax form and you're good.
imSreenadh, I finally decided to refer only the official site. ML became too noisy
malayaleecoder, https://encrypted.google.com/search?hl=en&q=malayaleecoder%20******
<imSreenadh> tax form is done. there is a lot going on, i can't keep track [rolling eyes]
<anwar_n> okey good night guyz. sweet dreams.......
shall we wind up??
<imSreenadh> bsc, lol
night is still young, folks
<imSreenadh> still [rolling eyes]
<imSreenadh> :D
stultus: so, any fixed meeting schedule?
I am slowly dialing down the nightowlness.. :D
:|
stultus, you got your laptop back?
bsc, rain is over, this is trees showering
bsc, fixed the old one :P
stultus, Masha Dinka.
stultus, ask me next year also what my laptop model is. ok?
bsc, (and ordered a new one)
Ah!!
bsc, na ordered :P
Ahankari! [You arrogant one!]
stultus, that order still didn't arrive yet?
<imSreenadh> eh!
bsc, read as asked mothalali [the boss] to order
stultus, :D
So, am going. Gn8 everyone...
jerin, I haven't followed up as I got my old one working :P
stultus, asdofindia jerin imSreenadh anwar_n arushi irshad nkn__
<imSreenadh> nice meeting guys. :D
I'll wait 10 more minutes to ping bsc good night so he wakes up from sleep
<nkn__> good night guys...
<imSreenadh> :v
<imSreenadh> bye then. GN
> jerin: stultus: so, any fixed meeting schedule?
still not answered
biweekly?
as in fortnightly?
asdofindia: that would be good :D
 * stultus is back
 * bsc is trying to make the spell checker less dependent on data. The irony. :D
 * bsc from Kalady (Ernakulam, Kerala).
 * bsc may fall on the keyboard any moment.