Sometimes, when walking in the park near my house, I think about big ideas, like AI for Good. It should be much easier to think about small good ideas that actually help people, instead of being simply boring exercises for young programmers. But it isn't very easy to find small projects that are helpful to society, stretch and showcase a student's competence, and are also fun to deal with! Yeah, I know I am asking for a lot, but really? It ought to be simple!
I have come up with a few ideas, but mostly things work the other way round. Students have their own ideas of what they want to do, we discuss issues (sometimes quite a bit), and sometimes things emerge that work nicely. One such example was Hugo Machado's app, AplicAi! Hugo just finished his undergrad degree at the Dept of Informatica at PUC-Rio, and this app was his end-of-course project and thesis.
Hugo came up with a very interesting idea. He noticed that students at PUC-Rio tend to be worried about not knowing enough about new technologies and about not being ready for the market when they finish their degrees. He thought these students might want to do "tasks" for employers, to add some professional experience to their resumes. Clearly this is a good deal for employers (e.g. start-up workers or researchers), who might get extremely clever and dedicated workers for no money at all. Hugo wanted to create an app to match students and their competencies to employers and their proposed tasks.
This is an interesting AI task in itself: recommender systems are used everywhere, from big places like Netflix (which wants to serve you the best films for you) to small outlets that need to know where to spend their restricted sales budget. Because Hugo wanted to build a mobile app, he could showcase his coding abilities with Flutter, BLoC, Flask and Firebase. And he could showcase his abilities with AI technologies like word embeddings, word2vec, Word Mover's Distance, etc. Together with Hugo's advisor, Prof Markus Endler, I worked on getting his recommender system as good as we could and on showing (through evaluation) that his system worked as well as the ones we were emulating.
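To make the matching idea a bit more concrete, here is a minimal sketch of embedding-based matching between a student's skills and proposed tasks. It is not Hugo's implementation: the tiny hand-written vectors, the skill names, and the helper functions `mean_vector`, `cosine` and `rank_tasks` are all assumptions made up for illustration, and averaging vectors plus cosine similarity is a simpler stand-in for something like Word Mover's Distance. A real system would learn the vectors (for instance with word2vec) from job-offer data.

```python
import math

# Hypothetical 3-dimensional skill embeddings, invented for illustration only.
# A real system would learn such vectors (e.g. with word2vec) from job data.
TOY_EMBEDDINGS = {
    "python": [0.9, 0.1, 0.0],
    "flask": [0.8, 0.2, 0.1],
    "flutter": [0.1, 0.9, 0.1],
    "dart": [0.0, 0.9, 0.2],
    "firebase": [0.3, 0.6, 0.3],
    "sql": [0.5, 0.1, 0.8],
}


def mean_vector(skills):
    """Average the embeddings of the known skills in a list."""
    vectors = [TOY_EMBEDDINGS[s] for s in skills if s in TOY_EMBEDDINGS]
    if not vectors:
        raise ValueError("none of the given skills has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_tasks(student_skills, tasks):
    """Return (task name, score) pairs, best match for the student first."""
    student_vec = mean_vector(student_skills)
    scored = [
        (name, cosine(student_vec, mean_vector(required)))
        for name, required in tasks.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical tasks posted by employers, each with its required skills.
    example_tasks = {
        "mobile app": ["flutter", "firebase"],
        "web backend": ["python", "flask", "sql"],
    }
    for name, score in rank_tasks(["flutter", "dart"], example_tasks):
        print(f"{name}: {score:.3f}")
```

With these toy vectors, a student who lists Flutter and Dart is ranked closer to the mobile task than to the backend one, even though "dart" never appears in the task description. That is the main appeal of embeddings over exact keyword matching.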
Hugo was very lucky to be able to find cleaned up data of a reasonable size to bootstrap his recommender system: an employment agency had collected data about 50K job offers and respective developer profiles. We would have preferred data about a wider range of jobs and profiles, but the data about developers should be able to get us a baseline for many technical areas.
This is a socially useful project, as the demand from PUC students for 'internships' is real. It is also extremely helpful to both small companies and researchers within the University, who can count on specific help for their projects from students with the requested competencies. Of course, a prototype for an undergraduate thesis can only do so much, but since the code is open (https://github.com/hugomachado93/aplicai) and based on open data, it can be re-used with newer NLP technology (perhaps BERT instead of word2vec) and, hopefully, more data (to extend it to other domains of skills besides software development).
Hugo's thesis was a fun project for all, I think. More importantly, it has many possible developments, so I hope the video about it will appear soon. And I'm looking forward to the writing of our joint paper!
Main sources:
1. Van-Duyet Le, Vo Minh Quan, and Dang Quang An. Skill2vec: Machine Learning Approaches for Determining the Relevant Skill from Job Description. CoRR abs/1707.09751, 2017. URL: http://arxiv.org/abs/1707.09751
2. Simon Hughes. How We Data-Mine Related Tech Skills. 2015. URL: http://insights.dice.com/2015/03/16/how-we-data-mine-related-tech-skills/ (visited on 09/12/2017)