Saturday, December 24, 2022

Dialectica Categories in Computing

 

In the first half of the year we had an AMS Mathematical Research Community (MRC) on Applied Category Theory, in Beaver Hollow, NY. John Baez blogged about it in “Learn Applied Category Theory!” Here’s a poster advertising the whole program.

We also had a paper in the Notices of the American Mathematical Society, describing our proposal:

Applied Category Theory in Chemistry, Computing, and Social Networks, by John Baez, Simon Cho, Daniel Cicala, Nina Otter, and Valeria de Paiva.

An MRC is a bit like a version “on steroids” of the ACT Adjoint School, as there are many more researchers in an MRC research subgroup than in an Adjoint School group. While my ACT Adjoint School group had four students and a TA, my MRC group had thirteen researchers: more than three times as many people to do interesting research with. Plenty of fun, but it can be unsettling to think up research problems for so many people you don’t know, especially for me (well, I haven’t had that much experience directing other people’s research).

I was very worried to begin with, but once we got going, it was exhilarating. It is great fun to discuss serious math problems with people interested in similar problems, with tastes similar to yours and with plenty of ideas of their own. The program had three mentors (John, Nina and myself), and I ended up with a group of thirteen, subdivided into four smaller groups. Things were going very well (you can check some of the discussions in the Zulip channel [practice:Dialectica]) until, two days before the meeting in upstate New York, I got ill and ended up in the ER at the Stanford hospital. Twice.

I couldn’t go to the research meeting in Beaver Hollow, but Jérémie Koenig, my trusted deputy, managed to get everyone working well together into the four groups we had established before:

  1. Dialectica and Polynomials,
  2. Dialectica and Games,
  3. Dialectica and Lenses,
  4. Dialectica Petri net Processes.

They sent me a lovely picture from Beaver Hollow.



The neat thing about MRCs is that they give the participants some funds to make academic visits, and I was very touched when my group decided to use their funds to come and visit me in California. This meant that we had another, smaller meeting in Cupertino in the middle of the summer. Well, it was already September, and the weather over here is usually good most of the time, but we had some bad luck: that week brought a heatwave, the worst one of the summer. Still, we had some fun. Now the problem is reworking what happened in the many discussions.

A third component of the MRC program is still to come. Of course we all hope that the math we discussed during the summer will mature nicely into preprints, papers and programs, and these are supposed to be presented at the AMS Joint Mathematics Meetings (JMM) in Boston in early January. One organizer was chosen from each mentor’s group, and I am very happy that Charlotte Aten accepted to be the organizer for ours, despite being in the last weeks of her PhD and interviewing for jobs. Then, sometime in September, we were told everyone needed to write up the ideas they hoped could make solid papers before January 2023. I’m glad to report that every one of our four subgroups was able to produce an abstract, and I hope the papers are in the process of being written as I write this.

Now this post was meant to be about the four subgroups above and why I am so excited about the mathematics we’re doing. But I am a bit behind with my processing of the knowledge we shared when people visited here (and over Zulip!). So, while I could wait for the JMM to get to grips with more of the new math that people are producing, I will try to explain, at least a little, how the themes above interact.

I started my research career working on Gödel’s Dialectica Interpretation. On bad days I think that not only have I never left the Dialectica, but that even today, after all these years, I hardly understand the stuff. Then I console myself with the thought that it really is fascinating and full of surprising consequences and applications!

Someone else who’s awed by the Dialectica Interpretation is Jules Hedges. His blog post “Lenses for Philosophers” describes some of the still poorly understood consequences and connections of the work on the categorical dialectica construction. In some ways the MRC “Dialectica Categories in Computing” is an attempt to clarify some of the interrelations described in Hedges’ post.

Gödel’s Dialectica Interpretation is a proof interpretation of intuitionistic arithmetic into a quantifier-free theory of functionals of finite type, called System T.

The interpretation’s original use was to show the consistency of Heyting (or intuitionistic) arithmetic. When combined with Gödel’s double-negation interpretation, which reduces classical arithmetic to intuitionistic arithmetic, the Dialectica interpretation yields a reduction of the classical theory as well.
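To give a flavour of how the interpretation works, here is its basic shape, recalled from the standard presentations (this is a sketch, not a full definition): each formula A is assigned a translation of the form “there exist witnesses u such that for all challenges x, a quantifier-free matrix holds,” and the clause doing most of the work is the one for implication.

```latex
% Each formula A is assigned a translation
%   A^D = \exists u\, \forall x\, A_D(u, x),  with A_D quantifier-free.
% Given A^D = \exists u\, \forall x\, A_D(u, x) and
%       B^D = \exists v\, \forall y\, B_D(v, y),
% the implication clause asks for witnessing functionals f and F:
(A \to B)^D \;=\; \exists f, F\; \forall u, y\;
  \bigl( A_D(u, F(u, y)) \to B_D(f(u), y) \bigr)
```

The functional F turns a challenge y against the conclusion into a challenge against the premise, which is exactly the shape that reappears in the categorical constructions below.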

As explained by Fernando Ferreira,

The stated purpose of Gödel’s Dialectica paper is to present a consistency proof of first-order Peano arithmetic PA by way of an extension of the finitary viewpoint according to which the requirement that constructions must be of and over “concrete” objects is relaxed to be of and over certain abstracta, namely computable functionals of finite type over the natural numbers.

Thus the Dialectica interpretation can be considered a relaxed (or liberalized) form of Hilbert’s program, where instead of insisting on the finitary viewpoint we accept ‘computable functionals of finite type’ as (sort-of) concrete objects. To understand the interpretation itself, I recommend A. S. Troelstra’s “Introductory Note to the Dialectica paper” in Gödel’s Collected Works, Volume II, where we can read that “Gödel presents his results as a contribution to a liberalized version of Hilbert’s program: to justify classical systems, in particular arithmetic, in terms of notions as intuitively clear as possible.” From there one can graduate to the more comprehensive work of Troelstra himself (Metamathematical Investigation of Intuitionistic Arithmetic and Analysis, 1973), of Avigad and Feferman (“Gödel’s Functional (‘Dialectica’) Interpretation”) and of Kohlenbach and his school.

The categorical internal versions of the Dialectica construction, starting in de Paiva’s thesis in 1988 (out as a technical report in 1991), have spelled out some of the requirements and consequences of the interpretation, especially its connection to Linear Logic. The categorical dialectica construction was generalized to fibrations by Hyland (2002), Biering (2008) and Hofstra (2011). Generalizations to dependent type theories appeared in von Glehn (2016) and Moss (2018), and then in their joint work (2018). More recently Trotta et al., building on this work, introduced Gödel fibrations and Gödel doctrines, rekindling the connections of the categorical dialectica construction to logic and making sure that the three dialectica principles (Independence of Premise, the Markov Principle and the Axiom of Choice) get their categorical explanation.
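For readers who have not seen the construction, here is its simplest shape, recalled in a simplified form (the category of dialectica objects over Sets):

```latex
% Dialectica category over Sets (simplified): an object is a triple
% (U, X, \alpha), where \alpha \subseteq U \times X is a relation.
% A morphism (U, X, \alpha) \to (V, Y, \beta) is a pair of functions
%   f : U \to V,   F : U \times Y \to X
% subject to the condition, for all u \in U and y \in Y:
\alpha(u, F(u, y)) \;\Rightarrow\; \beta(f(u), y)
```

Note how a morphism is precisely a pair of witnessing functions of the same shape as in the dialectica interpretation of implication.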

The main connections of the Dialectica construction discussed in the MRC, as anticipated above, are between ‘lenses’ and dialectica categories, between ‘polynomials’ and dialectica categories (see Nelson Niu’s blog post and Sean Moss’ talk), between ‘games’ and dialectica categories (already hinted at in Blass’ original paper on games for linear logic) and between concurrency theories and dialectica categories, in the shape of Dialectica Petri net processes and their implementation in Agda. I hope the abstracts for the extended discussions of these four sets of connections will be available soon.
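To give a flavour of the ‘lenses’ connection: a dialectica morphism is a pair of maps, a forward map f : U → V and a backward map F : U × Y → X, which is exactly the shape of a lens (a ‘get’ plus a ‘put’). A minimal sketch in Python of how such pairs compose (all names here are illustrative, not taken from any of the group’s papers):

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Lens:
    """A dialectica-style morphism: a forward map and a backward map."""
    fwd: Callable[[Any], Any]        # f : U -> V
    bwd: Callable[[Any, Any], Any]   # F : U x Y -> X

    def compose(self, other: "Lens") -> "Lens":
        """Composite of self : (U,X) -> (V,Y) with other : (V,Y) -> (W,Z).

        The forward maps compose in order; the backward map first pushes
        forward with f, gets feedback from the second lens, then feeds it
        back through the first.
        """
        return Lens(
            fwd=lambda u: other.fwd(self.fwd(u)),
            bwd=lambda u, z: self.bwd(u, other.bwd(self.fwd(u), z)),
        )


# Toy example: numbers for states, arithmetic for the two directions.
double = Lens(fwd=lambda u: 2 * u, bwd=lambda u, y: u + y)
inc = Lens(fwd=lambda v: v + 1, bwd=lambda v, z: v * z)
composite = double.compose(inc)

print(composite.fwd(3))      # (2*3)+1 = 7
print(composite.bwd(3, 10))  # double.bwd(3, inc.bwd(6, 10)) = 3 + 60 = 63
```

The asymmetry of the backward map (it needs the original input u as well as the feedback) is exactly what the dialectica condition and the lens ‘put’ law both keep track of.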

But it is clear that we’re only scratching the surface of this rich story of interconnections. Hopefully much more will come to light in the near future.


Extracting Mathematical Concepts

 

(Reposted from the Topos Institute blog, by Valeria de Paiva and Jacob Collard [originally posted Wednesday, 16 Nov 2022])

Humans are very good at recognizing the words they do not know, the concepts they haven’t met yet. To help with these, they invented dictionaries, glossaries, encyclopedias, wikipedias, crib sheets, etc. Initially this was done through laborious manual work; more recently, through automated construction techniques. Natural language processing (NLP) tools have improved impressively in the last decade. Most of this incredible improvement happened on newspaper text, so named entities of the kind found in news text can now be detected and classified (usually into people, organizations, places and dates) with much greater accuracy than in the recent past.



Dr Johnson, the author of the 1755 A Dictionary of the English Language.

However, for domain specific text, like the academic literature in different sciences (e.g. Medicine, Biology, Chemistry), or in the Humanities (History, Sociology, Philosophy, etc.) things are more complicated. Thus extracting technical terms/concepts from papers is important, interesting and usually difficult.

How do you detect mathematical concepts in text? As a human mathematician (or not), you probably recognize that the ovals in the snippet in this picture correspond to mathematical concepts. But there is some uncertainty in this choice, and several metrics might be used to decide whether the automated system used is getting it right or not.

We wanted to see how well ‘automatic term extraction’ (ATE) systems work for mathematical text, especially for Category Theory, if we simply use them out of the box. It seems reasonable to establish baselines using available, open-source systems, and this is what we do in our paper “Extracting Mathematical Concepts from Text”. The paper was presented at the 8th Workshop on Noisy User-generated Text (W-NUT) at COLING 2022, and you can find a short, technical video describing the work below.

This work is part of a larger project on Networked Mathematics, which we started describing in “Introducing the MathFoldr Project” and whose discussion we broadened in “The many facets of Networked Mathematics”. We hope to provide more information on the project soon.

Thursday, December 1, 2022

Networked Mathematics

 

The many facets of Networked Mathematics

by Valeria de Paiva, Monday, 18 Apr 2022

Categories: [research]
Tags: [MathFoldr] , [AI] , [NLP]

Last year we started a discussion on the production of mathematics: the pre-industrial kind of process that we all still follow, of solving problems and taking the solutions to the market of ideas (conferences, seminars, blog posts, Twitter, coffee breaks in the department, etc.) as drafts/preprints/submissions, and all the difficulties this process involves. In particular, we discussed the difficulty of checking whether a piece of work had been done before, and the difficulty of searching for other, similar work that might help us with a given question.

As a group we settled on a path of trying to apply the new tools of NLP and knowledge representation (neural nets, word embeddings, transformers, ontologies, semantic web tools, etc.) to what might be called mathematical English. We reasoned that, yes, eventually we will need to get to the semantics of equations and diagrams, but it would be useful to know how much we can learn simply by using NLP tools to extract mathematical concepts from the abstracts of articles.

We also decided that Category Theory would be our prototype domain. First, because category theory is about the organization of Mathematics in general, discovering the backbones of the subject and the hidden similarities between, in principle, unrelated sub-areas. But also because we are category theorists, after all, so this is the domain in which we are domain experts.

We realized early on that corpora of mathematical text were needed, and we hoped this wouldn’t be a problem. Category theory was, after all, one of the first areas of Mathematics to have a free open-access journal, namely TAC (Theory and Applications of Categories), from 1995. But we need much more data than a single journal can provide. While we also have the nLab and the whole of the Wikipedia Math Portal as beautiful open mathematical resources, even the licensing of the arXiv is more problematic than we knew.

As we explained in the earlier blog post, we have chosen to work with a corpus of TAC abstracts and a snapshot of the nLab from the beginning of 2021. The TAC abstracts are clean, with not too much LaTeX thrown in, and form a small corpus of 3.2K sentences. The nLab is noisier: one needs to remove the wiki markup, and sections and headings get lost, a traditional nightmare of processing scientific text.
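To illustrate the kind of cleanup involved (a toy sketch, not our actual pipeline), stripping the most common wiki markup constructs can be done with a few regular expressions:

```python
import re


def strip_wiki_markup(text: str) -> str:
    """Remove a few common wiki markup constructs (a toy cleaner)."""
    # [[target|label]] or [[target]] -> keep only the visible label
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", text)
    # '''bold''' and ''italic'' quote runs -> plain text
    text = re.sub(r"'''?", "", text)
    # == Heading == lines -> bare heading text
    text = re.sub(r"^=+\s*(.*?)\s*=+\s*$", r"\1", text, flags=re.M)
    return text


print(strip_wiki_markup("A '''monad''' is a [[monoid|monoid object]] in ..."))
# -> A monad is a monoid object in ...
```

A real cleaner has to deal with templates, tables and nested markup too, which is where the headings and section structure tend to get lost.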

We now have spaCy processing of all this data, as well as a collection of experiments on term extraction using different tools: TextRank, OpenTapioca, DyGIE++ and Parmenides, a special tool of our NIST collaborators. There are plenty of interesting observations to make about the processing and the comparison between these tools, and we now have a write-up about it in preparation. However, the major problem with domain-specific NLP strikes again: we have no gold standard for mathematical concepts in mathematical texts, much less for category theory concepts. Our predicament is not ours alone: many others have the same issue and have had to make do with results that are not guaranteed to be good.
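As a rough illustration of what even the simplest frequency-based baseline looks like (a toy sketch, not TextRank, OpenTapioca or DyGIE++), one can rank candidate multi-word terms by counting repeated stopword-free bigrams across abstracts:

```python
import re
from collections import Counter

# A tiny illustrative stopword list; real systems use much larger ones.
STOPWORDS = {"a", "an", "the", "of", "in", "is", "and", "for", "to", "we", "give"}


def candidate_terms(sentences, top_k=5):
    """Rank bigrams containing no stopwords by raw frequency."""
    counts = Counter()
    for s in sentences:
        words = re.findall(r"[a-z]+", s.lower())
        for w1, w2 in zip(words, words[1:]):
            if w1 not in STOPWORDS and w2 not in STOPWORDS:
                counts[f"{w1} {w2}"] += 1
    return [term for term, _ in counts.most_common(top_k)]


abstracts = [
    "We study monoidal categories and enriched categories.",
    "Monoidal categories give a semantics for linear logic.",
    "Enriched categories generalise ordinary categories.",
]
print(candidate_terms(abstracts))
# 'monoidal categories' and 'enriched categories' rank first
```

The gap between such baselines and a real gold standard is exactly the evaluation problem mentioned above: without annotated mathematical text, it is hard to say how wrong any of these rankings are.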

Meanwhile, we have been working on the idea that formalized mathematics will eventually be the way of producing mathematics, which is a different kettle of fish altogether. To discuss the issues of formalization (an eventual goal of sorts) we have had talks in the Topos Colloquium by several people involved with automated deduction: Jeremy Avigad, Kevin Buzzard, Larry Paulson and Florian Rabe, for example. We hope to have more talks this year. Finally, we’re happy to be helping to organize a Hausdorff Institute of Mathematics trimester on “Prospects of Formal Mathematics” in 2024.

The “Prospects of formal mathematics” trimester program will run at the Hausdorff Institute of Mathematics from the 6th of May to the 16th of August 2024.