Monday, December 28, 2020

Portuguese on my mind



Marcelo Finger and Thiago Pardo just scored high for NLP and AI in Brazil. They now head a big, new, and shiny project with IBM and FAPESP in the new Center for AI, launched in October 2020.

The picture above is from their project, NLP2: Resources to Bring NLP of Portuguese to State-of-Art; see the description at http://c4ai.inova.usp.br/nlp2-en/.

This is great news! I have been saying and writing for over ten years now that we need lexical and semantic open-source resources for Portuguese. 

In particular, I'm very pleased because the two blue links in the picture above both refer to my work.

The Universal Dependencies for Portuguese is my work with Alexandre Rademaker, Fabricio Chalub, Claudia Freitas, Livy Real, and Eckhard Bick from 2017. The second blue link, SICK-BR, is my work with Livy Real, Ana Rodrigues, Andressa Vieira e Silva, Beatriz Albiero, Bruna Thalenberg, Bruno Guide, Cindy Silva, Guilherme de Oliveira Lima, Igor CS Câmara, Miloš Stanojević and Rodrigo Souza.

I have mentioned to both Marcelo and Thiago that it would be nice to get our papers on their page. These are our goals too and we have been working on them for quite a while. They are working on it!



Sunday, December 27, 2020

von Neumann and understanding

So the joke goes something like this: a student asked John von Neumann when he would start understanding things. And von Neumann replied, "Young man, in mathematics you don't understand things. You just get used to them."
 
I like this way of putting things. I can recall vividly all the ways in which things that I now hold dear in mathematics were once completely impenetrable.

From the time I was first taught about matrices in high school, when I kept asking myself and the teacher `what difference does it make if I write numbers on shapes or not?' (it shouldn't make any difference how we lay out the numbers!), to the time I finally got the hang of Linear Algebra, after months of resisting it: the notion that a vector space could be defined by the properties it has, instead of by what it is. This was mind-boggling, and it still is, a little.

More disturbing still was when I first learned about Natural Deduction from my first Logic teacher (Luiz Carlos Pereira) and I had the bad idea of telling him that I couldn't see the point of having axioms, sequents and natural deduction to define the "same" system. It seemed to me that logicians hadn't gotten their formalizations sorted out yet. Ah, the foolishness of youth. Now I can see that some of this Bourbakianism, of believing in "the most" perfect formalization of mathematical concepts, is not only silly, it's bad for mathematics and for science in general. (Also, the most embarrassing detail is that Luiz Carlos told Prof. Prawitz about my stupid remarks.)

Oh well, everyone knows I have very strong convictions, which, translated, means that I am as stubborn as a mule. Mostly my stupidity is harmless. For instance, it took me forever to learn how to ride a bike, because I was convinced that it was not possible for bikes to stay upright. Or there was the time I decided that people could not swim, as otherwise, why would anyone drown? Again, it took me forever to learn how to swim, since I believed it was impossible to float.

But despite all my blunders, I still think that it's important to have your own ideas on the mathematical concepts you're taught, on what they mean and don't mean, and on whether they `have legs' (will go far or not). So I hope to re-ignite the more abstract side of this blog and to discuss a bit more mathematics, while I can. The worst thing that can happen is that I am wrong. I have plenty of experience in this department.

PS: Check out the Wikipedia page on "The Martians"; I had no idea all these guys were from Budapest!

Sunday, December 13, 2020

Things I am proud of

 


In these times of pandemic, I (and I believe everyone else too) get depressed about what I have (not) been doing with my life: all the ways that I could be a socially more useful person and I am not. All the infinite hours that I spent fighting bugs in programs or bugs in my understanding of things, that I could and should have spent fighting the bad guys in the actual world.

So as a way to cheer myself up I thought I'd write a bit about stuff I have been doing that I think is cool, that, to put it the Marie Kondo way, gives me joy. Funnily enough, these things are hard to put in resumes or curricula vitae. But since this is "candy for the soul" and too much candy does make you sick, I think I will do it in small doses, in several posts, with a decent amount of time between them.

Because I was looking for something else, I found the message from 2002 in which Bob Rosebrugh invited me to be an editor of TAC (Theory and Applications of Categories). TAC was one of the cleverest things that category theorists did very early on (together with managing to keep a mailing list going). We've had one of the first open-source journals in Mathematics since 1995. And I was lucky enough to be invited to its Editorial Board in 2002. Maybe it was the allure of "industrial mathematics"; I was at Xerox PARC then. Who knows?

When I joined TAC's Editorial Board there was only one woman there, Susan Niefield. She had been the only woman there since 1995. Now there are six of us, which is better. TAC has published around 770 articles in its entire career, so far. More recently I have been active in NLP, where a single conference, EMNLP 2020, reported the following data:

After receiving 3677 submissions, 3359 of these went through review, of which 754 were accepted to EMNLP and 520 were accepted to Findings of EMNLP. This gives an acceptance rate of 22.4% for EMNLP and a further 15.5% for Findings.

I found out the extent to which the numbers can be different. Very different indeed.
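The percentages in the EMNLP figures quoted above can be re-derived from the counts given there. A quick sketch in Python, assuming (as the quoted rates suggest) that both rates are computed over the 3359 reviewed submissions rather than the 3677 received; the variable and function names are made up for this illustration:

```python
# Counts quoted in the EMNLP 2020 announcement above.
received = 3677
reviewed = 3359
accepted_main = 754
accepted_findings = 520


def rate(accepted: int, total: int) -> float:
    """Acceptance rate as a percentage, rounded to one decimal place."""
    return round(100 * accepted / total, 1)


# Assumption: the denominator is the number of reviewed submissions.
main_rate = rate(accepted_main, reviewed)
findings_rate = rate(accepted_findings, reviewed)
total_accepted = accepted_main + accepted_findings

print(f"EMNLP: {main_rate}%  Findings: {findings_rate}%")
print(f"Papers accepted by this one conference: {total_accepted}")
```

So one edition of one NLP conference accepts well over a thousand papers, compared with the roughly 770 articles TAC has published since 1995.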

Calling a `citation' the minimum unit of measurement of productivity in Academia is very misleading too. Everyone knows this! But as we are always reminded (e.g. Dunne's short summary), people measure what they can, or "You get what you measure". But beyond individual researchers gaming the system and/or groups of scientists or publishers ganging up in 'citation farms' (which Dunne discusses), there are also the societal prejudices and old structures conspiring against women, black or brown researchers, gender non-conforming researchers, researchers not from the Global North, etc., that change the landscape of academic fields. And to keep pointing out these things can, in the long run, be extremely tiring. Deadlines always coincide (Murphy's Law), disease and small (and big) disasters always occur, and constructing things (even simple, small ones like a workshop) always takes much more work and time than you could possibly estimate.

So, to begin with, a list of things I'm proud of, and I might (or not) discuss these in future blog posts, as time permits:

1. Editorial Boards of TAC, Logical Methods in Computer Science, Logica Universalis, Compositionality.

2. Industry Advisory Board of the Masters in NLP program of UC Santa Cruz.

3. Scientific Advisory Board of the Institute of Logic, Language and Computation (ILLC), University of Amsterdam.

4. Council of the Division of Logic, Methodology and Philosophy of Science and Technology of the International Union of History and Philosophy of Science and Technology, 2020-2023.

5. Ambassador for Logic, Vienna Centre for Logic and Algorithms. 

Special mention to the "Encontro Brasileiro de Mulheres Matematicas" at IMPA in 2019, where I talked about how Applied Category Theory is the way I want to connect algebra, programming, and logic, but especially why I think we need to pay attention to gender gaps in maths, computer science, and logic.


Wednesday, November 18, 2020

Counting Intuitions

 This post is an exercise in thinking about vague things.

I think everyone can agree that there are infinitely more ways for things to be bad, to not work, than there are for them to work. For things to work, you need a big conjunction of things: you need to be healthy, you need a decent house, you need good food, you need friends, you need amusements, you need purpose, you need things to be good for your friends, etc. You know, the list just keeps growing. While for things to not work, only one of them needs to be missing. So clearly it's much easier for things to not work than for them to work, regardless of whichever priority order you put on your personal list.

Even people like me, who do not like probabilities and who have difficulties with them, can see that the bad scenarios are much more probable in the big scheme of things than the good ones.

But there is some help to be had. By and large it seems that verifying that something is true is much easier than discovering that it is true. This (possibly very large) gap between the ease of checking a given answer and the hardness of coming up with a possible answer is one of the mathematician's most used tools. Note that we do not have a proof of it; it is just obvious for, say, equations of second degree or systems of linear equations. But we extrapolate: we reckon it might happen in all branches of mathematics.
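A tiny illustration of that gap, for the equations of second degree just mentioned. This is only a sketch; the function names are made up for the example. Checking a proposed root takes one substitution, while finding the roots needs the quadratic formula (or some other real idea):

```python
import math
from typing import List


def check_root(a: float, b: float, c: float, x: float, tol: float = 1e-9) -> bool:
    """Verifying: substitute x and see whether a*x^2 + b*x + c vanishes."""
    return abs(a * x * x + b * x + c) <= tol


def find_roots(a: float, b: float, c: float) -> List[float]:
    """Discovering: use the quadratic formula (real roots only, a != 0)."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []  # no real roots
    sq = math.sqrt(disc)
    # A set removes the duplicate when the discriminant is zero.
    return sorted({(-b - sq) / (2 * a), (-b + sq) / (2 * a)})


roots = find_roots(1, -5, 6)  # x^2 - 5x + 6
print(roots)
print([check_root(1, -5, 6, x) for x in roots])
print(check_root(1, -5, 6, 4))  # 4 is not a root
```

The checker needs no knowledge of how the candidate was found, which is exactly why it is so much cheaper.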

Another helping tool is symmetry: we believe it is everywhere and we love it. It does help to halve the work in many situations, and the Universe does seem to have a penchant for it. Or maybe it's just us humans who see it everywhere, when it's only there in some very special places. I don't know of any attempt to measure the amount of symmetry of the Universe, but I know that mathematicians, if they can, will make things symmetric: symmetric things are prettier.

And a third helping tool is `adversarial thinking', in whichever way you may want to think about it. So this might be games, where there are proponents and opponents and they battle with their wits over the truth or falsity of propositions (the mathematician thinking about it might play both sides and hence, perhaps, see more clearly the weaknesses of the arguments of the other side). Or it might be adversarial training in machine learning, which I don't really know enough about to pass judgement on.

In any case, these generic tools are about trying to make the problem easier, about simplifying problems by trying to see what would happen, if they were indeed simpler and easier. 

But of course we know that many times things that are simpler, that look intuitive and clear, are just plainly wrong. The Sun does not move around the Earth; heavier things do not fall faster; things that seem to stop unless a force keeps pushing them would actually keep going forever if friction and other forces did not stop them. Similarly, lovely diagrams in Geometry prove wrong things, because you cannot trust diagrams (Escher pictures, anyone?).


So one of the main points of the apprenticeship in Mathematics is learning to distrust your intuitions. A bit like philosophers who start asking "why" about any and everything, mathematicians have to learn to read the books of their favorite authors doubting every word and trying to prove or disprove every sentence. Being a mathematician is about verifying always; trusting only on special occasions, if at all.



Saturday, November 14, 2020

Partiality Insights

 No, I don't mean political nepotism nor do I mean favoritism within families. 

By partiality, I just mean the prosaic fact that, sometimes, functions are not defined everywhere. 

From humdrum 'step functions' 


to the beautiful poetry of '1/x':
Anyone, mathematician or not, can see the beautiful poetry of missing a zero, but gaining two infinities!

However, Category Theory has only total functions, and we need to deal with partial functions "with grace", as Cockett and Garner point out. What can we do?

It turns out that several partial solutions are available and here are some of the ones I know about:

1. We can use some 3-valued logic, where the third truth-value is some sort of undefined (and there are a few extra choices to be made here);

2. We can use the exceptions monad T(A) = A+1, where A stands for the normal values and 1 is the error of type A;

3. We can talk, like Fourman and Scott do, of "existentials that are uniquely defined";

4. We can try to choose between Di Paola and Heller's 'dominical categories', Rosolini's 'P-categories' or Cockett's restriction categories.
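To make option 2 a bit more concrete, here is a minimal sketch in Python of the exceptions monad T(A) = A+1, applied to the partial function 1/x from the beginning of the post. It assumes we model the extra point 1 as Python's `None` (which only works when `None` is not itself a normal value of A); the names `unit`, `bind`, `reciprocal` and `shifted_reciprocal` are made up for the illustration.

```python
from typing import Callable, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def unit(x: A) -> Optional[A]:
    """Embed a normal value of A into T(A) = A + 1."""
    return x


def bind(m: Optional[A], f: Callable[[A], Optional[B]]) -> Optional[B]:
    """Sequence two partial computations; 'undefined' (None) propagates."""
    return None if m is None else f(m)


def reciprocal(x: float) -> Optional[float]:
    """The partial function 1/x as a total function into T(float)."""
    return None if x == 0 else 1 / x


def shifted_reciprocal(x: float) -> Optional[float]:
    """1 / (1/x - 1): undefined at x = 0 and at x = 1."""
    return bind(bind(reciprocal(x), lambda y: unit(y - 1)), reciprocal)


for x in (2, 0.5, 1, 0):
    print(x, shifted_reciprocal(x))
```

The point of the encoding is that every function is now total, with the domain of definition recorded in the output type instead of in a side condition.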

Now, if we were to do dialectica constructions, paying attention to partiality, which of these alternatives would be easiest for us? Are there other constructions that are better?

Logic: a quote or two



Confirming once again that nothing is black or white, but an infinitude of greyness, TU Wien is celebrating World Logic Day 2021. As they wrote:

To enhance public understanding of logic and its implications for science, technology and innovation, in 2019 UNESCO proclaimed the 14 of January the "World Logic Day". The date was selected in honour of Alfred Tarski (born on January 14th) and Kurt Gödel (who died on this date). We, as the Vienna Center for Logic and Algorithms (VCLA at TU Wien), would like to celebrate World Logic Day 2021.

[...]

We would be honoured if you would be an Ambassador and Supporter of World Logic Day 2021. In the affirmative case, please send us a quote about logic not exceeding 50 words. 

I do want to be an Ambassador and Supporter of Logic, not of Logic Day, so I have been thinking about it on and off for some days. 

I'm not good with epigrams and such-like. I wish I could make slogans like some of my friends. Maybe I should crowdsource this task on Twitter. 

But I do feel that one of the best things ever said about logic is the cartoon from the New Yorker above. This cartoon used to hang on Martin Hyland's door in the old DPMMS building in Mill Lane, when I was doing my PhD.

Monday, October 26, 2020

Perils and tribulations of fake Publishing

 


I really must be doing my real work, instead of worrying about the misdeeds in the publishing world.

But given that finding old things on the internet is so difficult and that remembering fraudsters' names is so hard, here goes a quick post with a collection of links that hopefully won't disappear too quickly.

First my favourite fraud, from 2010

[PDF] Ike Antkare one of the great stars in the scientific firmament

to the new version in the book

Gaming the Metrics: Misconduct and Manipulation in Academic Research

https://escholarship.org/uc/item/6096m1sp

ISBN: 9780262356565

Publication Date: 2020-01-28

Then the Japanese health science scandal, which is more serious as the science our doctors practice comes from these faulty clinical trials and meta-studies.

Researcher at the center of an epic fraud remains an enigma to those who exposed him

A little note (from 2014) in the Atlantic:

More Computer-Generated Nonsense Papers Pulled From Science Journals



 

Tuesday, October 13, 2020

Ada Lovelace Day 2020: Andrea Loparic


This year I want to celebrate a female logician from Brazil that I admire a lot. There are many female logicians from Brazil that I admire and I am always worried that choosing any one of them to start from might cause difficulties with the others. This is a reasonable worry, methinks.

But I am taking a leaf from Tim Gowers' book. Gowers was the person who initiated the boycotting of Elsevier in 2012. When people asked him why he was singling out Elsevier as a bad player and boycotting them instead of, say, Springer, he replied that the boycott had to start somewhere and that Elsevier were really egregious in their behavior. Dualizing all these bad things, it seems clear that, if I am going to celebrate several female logicians from Brazil, I might as well start with Andrea Loparic, as she is definitely and clearly very good.

PhilPapers has only four of her papers, so far:

  1. Semantical Analysis of Arruda-da Costa P Systems and Adjacent Non-Replacement Relevant Systems. Richard Routley & Andréa Loparić - 1978 - Studia Logica 37 (4):301-320.

  2. Two Systems of Deontic Logic. Andréa Loparić & L. Puga - 1986 - Bulletin of the Section of Logic 15 (4):137-141.

  3. Valuation Semantics for Intuitionistic Propositional Calculus and Some of its Subcalculi. Andréa Loparić - 2010 - Principia: An International Journal of Epistemology 14 (1):125-33.
    In this paper, we present valuation semantics for the Propositional Intuitionistic Calculus (also called Heyting Calculus) and three important subcalculi: the Implicative, the Positive and the Minimal Calculus (also known as Kolmogoroff or Johansson Calculus). Algorithms based in our definitions yields decision methods for these calculi. DOI:10.5007/1808-1711.2010v14n1p125.

  4. The Method of Valuations in Modal Logic. Andréa Loparić - 1978 - Bulletin of the Section of Logic 7 (2):91-91.

    (Google Scholar has more, but it's difficult to know which ones are hers)

    I really would like to do some logic work with her and we kind of started something. We were checking out the Kleene constructive propositional theorems. For me this was the basis of the project to benchmark linear logic, described in 

    Carlos Olarte, Valeria de Paiva, Elaine Pimentel, Giselle Reis. The ILLTP Library for Intuitionistic Linear Logic. arXiv preprint arXiv:1904.06850, 01 February 2019, after Linearity 2018.
     
    She told me that she had taught these exercises in Kleene's book so many times that it would be easy to reproduce their proofs. I still want those proofs, but I'd prefer them in ND or sequent calculus, instead of axiomatic proofs. 

    I am told that Andrea will not like this celebration, as she apparently does not approve of feminist claims and demands, and Ada Lovelace Day is about demanding that attention be paid to our achievements. I hope this is not so, or that at least she's amused, instead of irritated, by the celebration!

Friday, October 9, 2020

Mathematistan for All

I do love this map from the Zoog video https://www.youtube.com/watch?v=XqpvBaiJRHo.

Yes, Proof Theory does not show up, neither do Recursion or Set Theory, but the Axiom of Choice is a bright lighthouse in the Ocean of Logic, which at least has a long coast (beach, anyone?) of Category Theory. 

But I hope we can get a nice map of Logic along similar principles and with a similar aesthetic, one of these days. Andres Villaveces tells me he and his students are building one. Meanwhile here's the announcement of my talk at his seminar. The video of me talking pure CT to Andres Villaveces's students about Dialectica constructions seems to have disappeared; the link he sent me goes nowhere now. Sad.
 

Thursday, October 8, 2020

Everyone needs a YouTube channel

But mine is not working quite yet. 

Somehow simply copying videos doesn't seem to work. 

Everything that I have tried ends up missing audio or only recording a few seconds or some other disaster. 

So here is a YouTube list:

Benchmarking Linear Logic Theorems (Oct 2020), talk at the Augusta Colloquium

Natal and Modal Type Theory (2015)

At the MIT Seminar talking about relevance logic (July 2020)

Talking at FGV about semantic parsing in Portuguese

Talking in the SuperGroup about Lambek Dialectica categories

Talk at Logicos em Quarentena (February 2020) on Structural and Distributional Representations

Talking at IMPA on the First Meeting of Brazilian Mathematics Women (2019)

Sharing the TUTORIA with Jaqueline

PARC Forum invited by Craig Eldershaw, 2009

Monday, August 31, 2020

What did you do in your Summer vacation?



This is kind of silly. It would make more sense to have this list hanging off the homepage of OWN-PT itself, and maybe it will get there soon, but for the time being I need a list of our publications about OpenWordnet-PT and so this is it. Mostly copied from Alexandre's list of publications (thanks for being so organized, Alexandre!!!).

  1. de Paiva, Valeria, and Alexandre Rademaker. 2012. “Revisiting a Brazilian WordNet.” In Proceedings of Global Wordnet Conference. Matsue: Global Wordnet Association.
  2. de Paiva, Valeria, Alexandre Rademaker, and Gerard de Melo. 2012. “OpenWordNet-PT: An Open Brazilian Wordnet for Reasoning.” In Proceedings of COLING 2012: Demonstration Papers, 353–60. Mumbai, India: The COLING 2012 Organizing Committee. http://www.aclweb.org/anthology/C12-3044.
  3. Rademaker, Alexandre, Valeria de Paiva, Gerard de Melo, Livy Real, and Maira Gatti. 2014. “OpenWordNet-PT: A Project Report.” In Proceedings of the 7th Global WordNet Conference, edited by Heili Orav, Christiane Fellbaum, and Piek Vossen. Tartu, Estonia. http://globalwordnet.org/global-wordnet-conferences-2/
  4. Real, Livy, Alexandre Rademaker, Valeria de Paiva, and Gerard de Melo. 2014. “Embedding NomLex-BR Nominalizations into OpenWordnet-PT.” In Proceedings of the 7th Global WordNet Conference, edited by Heili Orav, Christiane Fellbaum, and Piek Vossen, 378–82. Tartu, Estonia. http://globalwordnet.org/global-wordnet-conferences-2/
  5. de Paiva, Valeria, Livy Real, Alexandre Rademaker, and Gerard de Melo. 2014. “NomLex-PT: A Lexicon of Portuguese Nominalizations.” In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014), edited by Nicoletta Calzolari (Conference Chair), Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, and Stelios Piperidis. Reykjavik, Iceland: European Language Resources Association (ELRA).
  6. Freitas, Cláudia, Valeria de Paiva, Alexandre Rademaker, Gerard de Melo, Livy Real, and Anne de Araujo Correia da Silva. 2014. “Extending a Lexicon of Portuguese Nominalizations with Data from Corpora.” In Computational Processing of the Portuguese Language, 11th International Conference, PROPOR 2014, edited by Jorge Baptista, Nuno Mamede, Sara Candeias, Ivandré Paraboni, Thiago A. S. Pardo, and Maria das Graças Volpe Nunes. São Carlos, Brazil: Springer.
  7. de Paiva, Valeria, Cláudia Freitas, Livy Real, and Alexandre Rademaker. 2014. “Improving the Verb Lexicon of OpenWordnet-PT.” In Proceedings of Workshop on Tools and Resources for Automatically Processing Portuguese and Spanish (ToRPorEsp), edited by Laura Alonso Alemany, Muntsa Padró, Alexandre Rademaker, and Aline Villavicencio. São Carlos, Brazil: Biblioteca Digital Brasileira de Computação, UFMG, Brazil. http://www.lbd.dcc.ufmg.br/bdbcomp/servlet/Evento?id=755.
  8. de Paiva, Valeria, Dário Oliveira, Suemi Higuchi, Alexandre Rademaker, and Gerard de Melo. 2014. “Exploratory Information Extraction from a Historical Dictionary.” In IEEE 10th International Conference on e-Science (e-Science), 2:11–18. IEEE. https://doi.org/10.1109/eScience.2014.50.
  9. Oliveira, Hugo Gonçalo, Valeria de Paiva, Cláudia Freitas, Alexandre Rademaker, Livy Real, and Alberto Simões. 2015. “As Wordnets Do Português.” Oslo Studies in Language 7 (1): 397–424
  10. Rademaker, Alexandre, Dário Augusto Borges Oliveira, Valeria de Paiva, Suemi Higuchi, Asla Medeiros e Sá, and Moacyr Alvim. 2015. “A Linked Open Data Architecture for the Historical Archives of the Getulio Vargas Foundation.” International Journal on Digital Libraries 15 (2-4): 153–67. https://doi.org/10.1007/s00799-015-0147-1
  11. Real, Livy, Fabricio Chalub, Valeria de Paiva, Claudia Freitas, and Alexandre Rademaker. 2015. “Seeing Is Correcting: Curating Lexical Resources Using Social Interfaces.” In Proceedings of 53rd Annual Meeting of The Association for Computational Linguistics and The 7th International Joint Conference on Natural Language Processing of Asian Federation of Natural Language Processing - Fourth Workshop on Linked Data in Linguistics: Resources and Applications (LDL 2015). Beijing, China.
  12. de Paiva, Valeria, Livy Real, Hugo Gonçalo Oliveira, Alexandre Rademaker, Cláudia Freitas, and Alberto Simões. 2016. “An Overview of Portuguese WordNets.” In Global Wordnet Conference 2016. Bucharest, Romania.
  13. Real, Livy, Valeria de Paiva, Fabricio Chalub, and Alexandre Rademaker. 2016. “Gentle with Gentilics.” In Joint Second Workshop on Language and Ontologies (LangOnto2) and Terminology and Knowledge Structures (TermiKS) (Co-Located with LREC 2016). Slovenia.
  14. Chalub, Fabricio, Livy Real, Alexandre Rademaker, and Valeria de Paiva. 2016. “Semantic Links for Portuguese.” In 10th Edition of the Language Resources and Evaluation Conference (LREC). Portoroz, Slovenia.
  15. de Paiva, Valeria, Fabricio Chalub, Livy Real, and Alexandre Rademaker. 2016. “Making Virtue of Necessity: a Verb Lexicon.” In PROPOR – International Conference on the Computational Processing of Portuguese. Tomar, Portugal.
  16. Rademaker, Alexandre, Valeria de Paiva, Fabricio Chalub, Livy Real, and Claudia Freitas. 2016. “Introducing OpenWordnet-PT: an Open Portuguese Wordnet for Reasoning.” In International FrameNet Workshop Part of 9th International Conference on Construction Grammar (ICCG9), edited by Tiago Timponi Torrent. Juiz de Fora, Brazil: Universidade Federal de Juiz de Fora - UFJF.
  17. Rademaker, Alexandre, Fabricio Chalub, Livy Real, Cláudia Freitas, Eckhard Bick, and Valeria de Paiva. 2017. “Universal Dependencies for Portuguese.” In Proceedings of the Fourth International Conference on Dependency Linguistics (Depling), 197–206. Pisa, Italy.
  18. Muniz, Henrique, Fabricio Chalub, Alexandre Rademaker, and Valeria de Paiva. 2018. “Extending Wordnet to Geological Times.” In Global Wordnet Conference 2018. Singapore.
  19. Real, Livy, Alexandre Rademaker, Fabricio Chalub, and Valeria de Paiva. 2018. “Towards Temporal Reasoning in Portuguese.” In Proceedings of 6th Workshop on Linked Data in Linguistics. Miyazaki, Japan. http://lrec-conf.org/workshops/lrec2018/W23/summaries/8_W23.html.
  20. de Paiva, Valeria, Alexandre Rademaker, Livy Real, Fabricio Chalub, and Gerard de Melo. 2018. “OpenWordNet-PT: Taking Stock.” Proceedings of Fifth Workshop on Natural Language and Computer Science (Affiliated with Federated Logic Conference 2018). Oxford, UK. https://doi.org/10.29007/tvgw
  21. Cid, Alessandra, Alexandre Rademaker, Bruno Cuconato, and Valeria de Paiva. 2018. “Linguistic Legal Concept Extraction in Portuguese.” In Legal Knowledge and Information Systems, edited by Monica Palmirani. Vol. 313. Frontiers in Artificial Intelligence and Applications. http://ebooks.iospress.nl/volumearticle/50848
  22. de Paiva, Valeria, and Alexandre Rademaker. 2019. “Portuguese Manners of Speaking.” In Proceedings of the 10th Global Wordnet Conference. Global Wordnet Association.


Sunday, August 30, 2020

Wildfires near us

This has been a difficult week: heatwave, pandemic, possible power cuts and, to cap it all, wildfires! The possible need to evacuate, the need to prepare go-bags, to find documents and photos. Our bags are still sitting by the porch. The wildfires are only 40% contained, as I type this. Very unsettling! And we're the lucky ones: the ones who did not have to evacuate, who only had to consider it.

It took quite a lot of determination to keep making jokes about it. Like the one I made on Facebook, stealing from someone I don't know on Twitter.

"It’s raining ash in California, forcing us to wear a different kind of mask than we wear for the pandemic when we go buy the generator we need for either rolling blackouts or preemptive outages so we can work from home if we haven’t been evacuated, if we have work or our house hasn't burned down. well, I don't need a generator, I just need some more wine to keep it going!"


But all is fine, thank you! The heatwave has gone away, the air is clear again, there's wine in the house and I've managed to chair sessions at AiML and WBL, and watch lots of interesting talks. I even managed to produce slides and talk for some 20 min on Friday. So we're back to the old worries about all the deadlines missed and all the work not delivered yet. But hey, this is normal! Wildfires, no, they're not normal.

And this picture is from 2016, not now, but somehow it is even more frightening that I had no idea it was happening!


Sunday, August 9, 2020

Understanding Portuguese

(Illustration by Jana Walczyk)

It might be possible to find students and programmers to develop an old dream of mine, I am told. This dream is a project about producing logic from texts in Portuguese. I have been giving talks about this project since 2010 (paper from 2011). Thus I want to explain (to possible volunteers) what this project entails, what work we should be doing, and why.

Explaining why we should be doing this work is very easy. 

The amount of information published in scientific articles, preprints, news, blog posts, fiction, as well as unstructured data has increased many-fold in the last few years. A major bottleneck in the discovery of relevant information for business and researchers alike arises when connecting new results with the previously established state-of-the-art. A potential solution to this problem is to transform the unstructured raw text of the novel information into structured database entries, which would allow us to reason with this new information in the same way that one already organizes and reasons with the previous content, using Knowledge Graphs. This would allow programmatic querying of the content, checking it for contradictions, checking for new changes, as well as all manner of analytics of this content. The fact that one can do most of this processing in English, but not in Portuguese (or for that matter not in many other languages), should be a reason for concern. Brazilian science, as well as its industry, cannot progress as well as others, if our native language is not processed as well as others.

Semantic Parsing Portuguese

Now explaining exactly what the work on a semantic parser for Portuguese amounts to is somewhat harder. The project of transforming unstructured text into knowledge is very hard: language is way too ambiguous and difficult to deal with. While many open-source tools and resources for processing English texts exist, very few can be used for Portuguese. So we describe in parallel what we do have for English and what we need to build for Portuguese.

The project of extracting semantic information from English sentences is very hard. Our best shot can be seen at the moment in the preliminary demo. This prototype, developed by Katerina Kalouli and Dick Crouch, goes over ideas developed when I worked with Crouch at Xerox PARC, but re-implements these ideas from scratch, using new technologies in place of all the software that is proprietary technology of either Xerox PARC or Microsoft. (There is a paper explaining the system and a version showing how this can be hybridized with machine learning systems.)

This new semantic parser project has a pipeline that depends on several other open-source projects: we discuss these several "steps" below. 

Steps for Semantic Parser in Portuguese

Semantic parsers for English abound, but we are following a specific line of work that starts with Daniel Bobrow and  Ronald Kaplan at PARC.

1. Grammatical parsing is improving every year. A recent development is the new Stanford system called "Stanza". Stanza is multilingual (it includes Portuguese), is written in Python, and has a better (less restrictive) license than the previous Stanford CoreNLP systems. We need to fine-tune it for our experiments.

2. The semantic parser we have in English depends on the grammatical parsing of sentences using the Stanford-Google based project "Universal Dependencies". Actually, it uses "enhanced dependencies"; we need to check how they behave for Portuguese.

The Universal Dependencies project has been going on since 2016. It already has a branch for Portuguese, with which I am associated through my work with Alexandre Rademaker and Livy Real, but the corpus we have in Portuguese is small and there are still many issues with the Portuguese Universal Dependencies. These need expanding, possibly with some annotation effort to increase the size of the corpus.

3. The semantic parser also depends essentially on Princeton WordNet. Building up the Portuguese version of the WordNet thesaurus and dictionary has been a much harder task than we had anticipated, but our system (for browsing and downloading) has been in operation since 2012; here's the original description. It is still being constructed, or is "in progress", but it is getting close to the end of its first (translation only) phase.

4. The semantic parser also depends on a tool for disambiguation. We have been using JIGSAW (available from GitHub), but it has not been updated since 2012 and it will not work for Portuguese. We need a tool for the disambiguation of Portuguese that can be plugged into this pipeline.

5. The system also depends on a generic upper ontology, for which we are using SUMO in English. But an upper ontology is not enough to provide the world knowledge necessary for our applications. The project of expanding SUMO into an appropriate ontology for Brazilian culture, a Knowledge Graph for Brazil and its different facets (be they history, culture, geology, tourism, etc.), is another major undertaking.

6. Finally, we need a reasoner on top of the representations that the semantic parser produces. This could be an off-the-shelf system like Lean or Isabelle, or it could be an NLI (Natural Language Inference) system like the ones produced via neural nets and/or hybrid methods described in this SEMEVAL meeting special issue proceedings.
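To make steps 1 and 2 a little more concrete: Universal Dependencies treebanks are distributed in the CoNLL-U format, one token per line with ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). Below is a minimal sketch, in plain Python, of reading one sentence and listing its head-dependent relations, which is the kind of input a semantic parser would start from. The Portuguese sentence and its annotation are hand-written for illustration (they do not come from any released treebank), and the names `Token`, `parse_conllu_sentence` and `dependency_triples` are hypothetical helpers, not part of Stanza or of our system.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One syntactic word of a CoNLL-U sentence (only the columns used here)."""
    id: int
    form: str
    lemma: str
    upos: str
    head: int      # 0 means this token is the root of the sentence
    deprel: str


# Hand-written illustration, NOT taken from any released treebank.
ROWS = [
    ["1", "Maria", "Maria", "PROPN", "_", "_", "2", "nsubj", "_", "_"],
    ["2", "leu", "ler", "VERB", "_", "_", "0", "root", "_", "_"],
    ["3", "o", "o", "DET", "_", "_", "4", "det", "_", "_"],
    ["4", "livro", "livro", "NOUN", "_", "_", "2", "obj", "_", "_"],
    ["5", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"],
]
SAMPLE = "# text = Maria leu o livro.\n" + "\n".join("\t".join(r) for r in ROWS)


def parse_conllu_sentence(block: str) -> list[Token]:
    """Parse the basic-dependency columns of a single CoNLL-U sentence."""
    tokens = []
    for line in block.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines such as '# text = ...'
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        tokens.append(
            Token(int(cols[0]), cols[1], cols[2], cols[3], int(cols[6]), cols[7])
        )
    return tokens


def dependency_triples(tokens: list[Token]) -> list[tuple[str, str, str]]:
    """Return (head form, relation, dependent form) for every token."""
    form_by_id = {t.id: t.form for t in tokens}
    return [(form_by_id.get(t.head, "ROOT"), t.deprel, t.form) for t in tokens]


if __name__ == "__main__":
    for head, rel, dep in dependency_triples(parse_conllu_sentence(SAMPLE)):
        print(f"{rel}({head}, {dep})")
```

In a real pipeline the CoNLL-U text would come from a parser such as Stanza rather than from a hand-written string, and the later steps (WordNet lookup, disambiguation, ontology mapping) would attach further information to each token.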

I need to emphasize that these steps can be done in any scientific or commercial field that one is interested in. We could do it for History, Chemistry, or Mathematics, for example. We could do it to help integrate IoT (Internet of Things) appliances or to help design automated customer service systems. Of course, an application to dialogue will require a further module, a dialogue manager, which orchestrates the possible conversations and actions of the automated system. The different domains should correspond to different Knowledge Graphs.

However, each one of these steps is a considerable amount of work, possibly worth a master's thesis, or maybe even a PhD. Putting them all together should also be a major engineering feat. I hope we will find people willing to take up this challenge.

Sunday, July 26, 2020

Editing books

The picture above is what Amazon knows about the books I've edited. There are a few other things (e.g. special issues of journals), but of course Amazon doesn't sell them, so they wouldn't know about those. Now anyone in the least competent would have their edited books as part of their curriculum and webpage, right? Oh well, I'm failing this test too, so far. Need to add them.
Actually, Amazon also knows about the cover of the last book above, first edition in 1993, re-issued as a paperback in 2006.

Now for special issues, I guess I need to create my own picture.

The curious incident of the dropped streaming

This is a picture of Zoom minutes before Women in Logic 2020, the workshop associated with FSCD/IJCAR, started on 30 June 2020. This year I am *not* one of the organizers of "Women in Logic".
I had promised myself to try to do it for three years and then pass on the ball. I was thrilled to be able to pass the ball to the very competent hands of Sandra Alves, Sandra Kiefer and Ana Sokolova, the organizers this year! They did a splendid job and the workshop had 145 attendees during these trying pandemic times, a wonderful feat, if you ask me.

But we had a bit of an incident during Women in Logic 2020 this time. The workshop was going really well when, during Alexandra Silva's invited talk ("An algebraic framework to reason about concurrency"), my chat started blipping with the organizers of FSCD/IJCAR asking "what's going on on your workshop? everything ok? YouTube took down the streaming!!! they say someone complained about the workshop". What?

I explained that there was nothing wrong happening, no zoom bombing, no glitches that we (I or the real organizers) could see, and urged them to complain to YT to get the stream back up. YouTube eventually restarted the stream (the next day, though the workshop was one day only) and sent an unapologetic message, see below.

So yes, we don't know at all what happened: whether they thought there was a trademark infringement, or whether some human being triggered the complaints procedure to annoy us. (Some of our friends seem to think that the latter was the case!)

We have some reasons to believe that this was a childish act of sabotage, because Alexandra had finished the CS part of her presentation and had started the discussion on why we need meetings like "Women in Logic". Initially firmly convinced that it was some sort of glitch of automatic algorithms, I took to Twitter and asked:

OK, a small typo in "down", but nothing too controversial. Belnap's lattice, Kleene algebras and nominal type theory are perfectly good subjects in logic and computer science. The workshop was running on Zoom and was being streamed on YouTube, so the meeting carried on with further talks and a discussion at the end. But the reason for streaming the meeting was to also support people who didn't want to use Zoom, and these people could not participate then.

Quite a number of people responded to my tweet. Ian Stark asked "Do you get any indication of what YouTube judge you've infringed?" and we were told that something similar happened with POPL 2019, so I wrote to Fritz Henglein to ask for information. (There wasn't much info to be had.)

Sara Kalvala commented "It is completely bizarre that anyone would feel threatened by a bunch of women having a workshop on logic and complain to @youtube. Even more bizarre that @youtube would delete the video. But it won't stop us having more meetings". To this I replied "yes, totally bizarre! a small correction is that YT didn't delete the video, they simply took down the streaming. Since stopping the streaming is immediate, but reinstatement takes lots of human intervention, they put it back the next day, but the workshop was one day only!".

Anyways a small consolation (for me) was to see the comments from colleagues in FSCD/IJCAR saying "I thought you were exaggerating, guess you're right and doing the right thing!! Keep doing it!!".

And yes, I think we are doing the right thing. To begin with I was a little skeptical. I am used to being in a very masculine world, a world of very few women. I `grew up' in research being treated like one of the "lads" and not worrying too much about it. I was expecting things to improve, as numbers of women improved. But the numbers of women in logic and Computer Science not only did not improve, some of them got decidedly much worse.

Many of the young women finishing PhDs in CS I talked to feel that a place like "Women in Logic" made them feel less attacked, more protected and better able to speak and be themselves. And the fact that many of our sisters have been doing these meetings for more than twelve years (e.g. Women in Machine Learning, Women in Machine Learning and Data Science, Women in Biology, etc.), with huge numbers in attendance, showed me that I was wrong, that meetings where only women present work are sensible and helpful and a "good thing" altogether.




Monday, July 20, 2020

Slideshare is my friend. Sometimes.

When I first had issues with Google sites, Slideshare turned out to be an easy place to add pdfs of slides to. (Btw, I still have issues with Google sites, still need to find a few hours to try to debug what's wrong with my old webpage!)

I wish I had made more use of Slideshare all along, as by now I have no idea where most of my PowerPoint talks are. I hope they are somewhere between my Dropbox, my Google Drive storage, my Apple storage, or any of the four hard drives sitting on my desk. But yes, trying to find anything at all in all these possible places is quite hard, while the Slideshare stuff is easy to find.


I can see that this year, so far, I have given five talks -- actually I have given six, as I have forgotten to add the slides for "Logicians in Quarantine", which were very similar indeed to the ones for SRI. I can see that I need some change in my beamer style -- which is lovely and matches well my favorite PowerPoint template, so I can import between the platforms. But you can have too much of a good thing like pale purplish-blue! And I swear that it took me much more than 3 hours to get those slides uploaded, because one deck had a glitch and insisted on failing its upload over and over again.

Anyways some effort will be happening here to recover the talks and the writings associated with them. Because, yes, I am failing miserably this year on my "just ship it" approach to paper writing. We're at the end of July, so I needed to have 7 papers submitted (yes, we cannot guarantee acceptance, but we can make sure that submission is done!). Instead, I have one paper to appear with Paul Tarau and one submitted with Samuel Gomes da Silva. Definitely not good! need to change that asap.

Sunday, July 19, 2020

Logicians in Quarantine



This post is a short shout-out to Bruno Lopes and Petrucio Viana for the brilliant idea of creating "Logicians in Quarantine/Logicos em Quarentena", the Brazilian Logic Society online seminar. I was invited to help bootstrap it and I gave the second talk, "Between a Rock and a Hard Place: Structural and Distributional Meaning representations", mostly because I had just given this talk at SRI Menlo Park (on 5th March), so it was ready. But also because I wanted to show my Brazilian friends how my work with language does connect to my work with logic. The transition is not so obvious.

Joao Marcos gave the first talk, "On classes of structures axiomatizable by universal d-Horn sentences and universal positive distinctions". The seminar seems to be working extremely well, with all sorts of interesting talks. These are recorded, so if something comes up, you can always watch a talk later on.

The seminar is now part of the Logic SuperGroup, another splendid idea! Many kudos to Shay Logan, Shawn Standefer, and many others for the brilliant implementation of the idea of connecting all the logic seminars.