Monday, October 13, 2014

World domination Brain-style

As I was discussing in "A modest proposal for Portuguese", we have so far two open-source lexical resources for Portuguese (OpenWordNet-PT and NomLex-PT), but I can see many different applications for these and many interesting ways of building on them to create new resources and systems.

Despite the lack of official funding, our group (Alexandre Rademaker, Gerard de Melo, Livy Real, Claudia Freitas, Dario Oliveira, Suemi Higuchi, Fabricio Chalub) is growing, and we even managed to finally get a paper into one of the most important conferences in Computational Linguistics in Brazil, the 2014 edition of PROPOR. This was about the use of corpora to improve NomLex-PT; the slides (presented by Alexandre) are here.

This last September, apart from our Workshop on Logics and Ontologies for Natural Languages, we managed to have at least three informal working meetings (one in the Livraria Argumento, one at FGV after my talk, and one at PUC with Dario, who's usually in Sao Paulo).

Besides the FGV talk, I spoke both at PUC-Rio and at COPPE about our generic plan of world domination, Brain-style... Joking aside, I tried to describe how we can work on several fronts (building lexical resources, using these resources for information extraction, ontology building, reasoning with logic from text and question answering, amongst others) in an informal but tightly integrated collaboration.

I think it's working. This year we had some eight papers in total. Two appeared in the Global WordNet Association meeting proceedings: one a progress report on OpenWordNet-PT, the other a description of how we created NomLex-PT from a translation of the original NomLex, enriched with electronic dictionaries and manually verified. Then came one poster at LREC, explaining how we grew NomLex-PT and then integrated it with OpenWordNet-PT. Then the PROPOR paper mentioned above, which extends NomLex-PT with corpus information from the AC/DC collection. And two extra posters for the TorPorEsp workshop, one on the Verb Lexicon of OpenWordNet-PT and one completing our stock of Portuguese nominalizations, this time using the Spanish lexicon AnCora-Nom.

To this list we need to add the paper on applying our work on Portuguese lexical resources to our historical corpus, the Dictionary of Historic Brazilian Biographies, "Fun Information Extraction from a Historical Dictionary", and the work in English on the Implicative Lexicon (ImpLex) at CICLing, "Sense-Specific Implicative Commitments".

Finally, there is the paper with Vivek Nigam on using term rewriting in Maude to reason with natural language representations, "Towards a Rewriting Framework for Textual Entailment". I gave a very short talk about it in Brasilia, at LSFA 2014, and a more leisurely one at COPPE, where after lunch Celina, Simone, Petrucio and I went to see this beautiful colonial church.

