Thursday, February 2, 2023

The road to Networked Mathematics

 [Source]

Networked mathematics is one of my main projects at the Topos Institute (the other is Dialectica categories and its extended family of formalisms). We have not had much luck getting the funding agencies to pay attention to it, yet. But the work is progressing, albeit slowly.

Brendan and I discussed some initial ideas in the blog post Introducing the MathFoldr Project. Then I updated some of the previous ideas in the blog post The many facets of Networked Mathematics, as we consider MathFoldr a key component of Networked Mathematics.

We have been working on the suggestion that formalized mathematics will eventually be one of the main ways of producing mathematics. It seems sensible to believe that proof assistants will become really useful tools for mathematicians, and if so, formalized math will become the usual practice, not the exception.

To discuss the issues of formalized mathematics and to prepare for the Hausdorff Institute of Mathematics trimester on “Prospects of Formal Mathematics” in 2024 we have had in the Topos Colloquium talks by several people involved with proof assistants and automated deduction.

July 2021

We started with Larry Paulson on Formalising Contemporary Mathematics in Simple Type Theory. A recent talk by Larry, Formalised Mathematics: Obstacles and Achievements might be a better introduction to his research as far as mathematical formalization in Isabelle is concerned.

I worked for Larry as a postdoc in the Computer Lab, Cambridge University many moons ago, which is possibly the reason I am so fascinated by formalized mathematics. His blog, Machine Logic is super interesting and I always loved the idea that Isabelle was not conceived as a single logic, but as a generic logical framework, and it can represent reasoning in several different systems. (Together with Sara Kalvala I did some small experiments in Linear Logic in Isabelle.) I still want to develop the ideas of a project I wrote with Larry and Prof Roger Needham on modal logics for authentication and privacy, but it is getting late.

The vast library of mathematical results in Isabelle is called AFP, the Archive of Formal Proofs, or as they say, more precisely: A vast collection of Isabelle examples and applications is available from the Archive of Formal Proofs.

August 2021

We had Kevin Buzzard talking to us about What is the point of Lean’s maths library? Kevin’s style of online demo and talk was a huge success with the audience of applied category theorists, which helped to consolidate my ability to invite other speakers not exactly categorically minded. Kevin has this awesome sentence on his website:

“I believe that computers will be able to help humans with proofs within my lifetime, and I am actively trying to make this happen sooner by (a) helping to build a database of modern mathematical theorems and definitions and (b) trying to teach mathematics undergraduates how to use the software.”

Inspiring stuff!

September 2021

We had Florian Rabe talking about MMT: A UniFormal Approach to Knowledge Representation. Florian is interested in the more general problem of implementing “a universal framework for the formal representation of knowledge”, not only mathematical knowledge, but also other generic computing systems. His system is described in https://uniformal.github.io/. Florian’s system MMT has been in development since 2006 and he has worked with Prof. Dr. Michael Kohlhase in several versions of knowledge management at KWARC, Erlangen, Germany. One of Florian’s main goals is to obtain foundation-independent mechanized results, thus much effort is spent on languages and meta-languages. (I also worked with Florian and other colleagues on extending the Curry-Howard correspondence to Goguen’s institutions a long time ago.)

November 2021

We had Jeremy Avigad talking about Formal mathematics, dependent type theory, and the Topos Institute and Lean. This was similar in some ways to Buzzard’s talk, but coming from a more philosophical perspective, was easier to understand. As Avigad says “Formal mathematics is finally getting some recognition” and the press is ready to acknowledge the big pull of the area. Recent good press exemplified by Quanta: “Building the mathematical library of the future” (2020). Also at Quanta “At the Math Olympiad, computers prepare to go for the gold” (2021), and in Nature “Mathematicians welcome computer-assisted proof in ‘grand unification’ theory”" (2021).

June 2022

Christoph Benzmueller talked to us about Logico-pluralistic exploration of foundational theories with computers. Chris discussed his work on formalization of logic, particularly in formalizing computational metaphysics (instead of mathematics) with collaborators, including Dana Scott, Topos’ senior advisor. Personally I am very interested in Chris’ work on modal logic, also in the benchmarking of modal theorems, less so in God’s ontological existence arguments.

July 2022

Josef Urban talked about Combining learning and deduction over formal math corpora and his work on combining deep learning with automated deduction. Josef, unlike many others, has an agnostic attitude to the different logical foundations and merits of different provers and implementations, so it’s interesting to discuss those kinds of issues with him.

September 2022

Jacques Carette talked to us about What I learned from formalizing Category Theory in Agda, which was particularly interesting to me, as together with Eric Bond and Charlotte Aten I wanted to implement dialectica categories in Agda.

September also had Johan Commelin talking to us about Breaking the one-mind-barrier in mathematics using formal verification. Johan is doing impressive work on formalization of extremely recent work (2020) in Mathematics, the Tensor Liquid Experiment, of Peter Scholze. (if this sounds a bit like a band name, well, it is.)

October 2022

We had Angeliki Koutsoukou Argyraki talking about the Project Alexandria in Cambridge (the one on mathematical knowledge, not the Microsoft one on business knowledge). Her talk was entitled The new era of formalised mathematics and the ALEXANDRIA Project. This was very exciting because the project has a strong component in the direction we are working, that is the direction to do with mathematical language. Mathematical English poses special challenges for NLP (Natural Language Processing) tools.

 
And also in October we had Richard Zanibbi, in the most relevant talk for the directions we’re aiming at with Networked Mathematics. Richard talked about Mathematical Information Retrieval: Searching with Formulas and Text. His project MathSeer and the several demos he showed us were wonderful!

November 2022

We had Gilles Dowek talking about Dedukti in From the Universality of Mathematical Truth to the Interoperability of Proof Systems. Dedukti emphasizes the importance of rewriting in proofs. Gilles’ talk made a series of important points, and it reminded me of other work I’m doing with Luiz Carlos Pereira and Elaine Pimentel on Ecumenical Systems, based on an idea of Dag Prawitz.

So slowly, but surely we’re investigating the issues of formalization of mathematics, as well as its affordances. I hope we will continue to do so in the near future, as there are many loose ends that need tying up.

  • Goguen, J.A., Mossakowski, T., De Paiva, V., Rabe, F. and Schröder, L., 2007. An Institutional View on Categorical Logic. Int. J. Softw. Informatics, 1(1), pp.129-152.
  • E Pimentel, LC Pereira, V de Paiva,2021, An ecumenical notion of entailment, Synthese 198 (22), 5391-541.


Saturday, December 24, 2022

Dialectica Categories in Computing

 

In the first half of the year we had an AMS Mathematical Research Community (MRC) on Applied Category Theory, in Beaver Hollow, NY. John Baez blogged about it in “Learn Applied Category Theory!” Here’s a poster advertising the whole program.

We also had a paper in the Notices of American Mathematical Society, describing our proposal:

Applied Category Theory in Chemistry, Computing, and Social Networks John Baez, Simon Cho, Daniel Cicala, Nina Otter, and Valeria de Paiva.

An MRC is a bit like a version “on steroids” of the ACT Adjoint School, as there are many more researchers in a research subgroup in the MRC than in a Adjoint School group. While my ACT Adjoint School group had four students and a TA, my MRC group had thirteen researchers. So from four to thirteen people you have more than three times the number of people to do interesting research with. Plenty of fun, but it can be unsettling to think about research problems for so many people you don’t know, especially for me (well, I haven’t had that much experience directing other people’s research).

I was very worried to begin with, but when we got going, it was exhilarating. Great fun to discuss serious math problems with people interested in similar problems, with similar tastes to yours and with plenty of ideas of their own. The program had three mentors (John, Nina and myself), and I ended up with a group of 13, subdivided into 4 smaller groups. Things were going very well (you can check some of the discussions in the Zulip channel [practice:Dialectica]) until two days before the meeting in upper New York state I got ill, and ended up in the ER in the Stanford hospital. Twice.

I couldn’t go to the research meeting in Beaver Hollow, but Jérémie Koenig, my trusted deputy, managed to get everyone working well together into the four groups we had established before:

  1. Dialectica and Polynomials,
  2. Dialectica and Games,
  3. Dialectica and Lenses,
  4. Dialectica Petri net Processes.

They sent me a lovely picture from Beaver Hollow.



The neat thing about MRCs is that they give the participants some funds to make academic visits and I was very touched when my group decided to come and visit me in California, using their funds. This meant that we had another smaller meeting in Cupertino, in the middle of the Summer. Well, it was already September and usually the weather over here is good most of the time, but we had some bad luck, and this week was a heatwave, the worst one this summer. But we had some fun. Now the problem is reworking what happened in the many discussions.

A third component of the MRC program is still to come. Of course we all hope that the math we discussed during the summer will mature nicely into preprints and papers and programs, and these are supposed to be presented in the AMS Joint Mathematics Meetings (JMM) in Boston in early January. So one organizer was chosen from each mentor’s group and I am very happy that Charlotte Aten accepted to be the organizer for our group, despite being in the last weeks of her PhD and interviewing for jobs. Then sometime in September we were told everyone needed to write up some ideas that they hoped could make solid papers before January 2023. I’m glad to report that every one of our four subgroups was able to produce an abstract and I hope the papers are in the process of being written as I write this.

Now this post was meant to be about the four subgroups above and why I am so excited for the mathematics we’re doing. But I am a bit behind with my processing of the knowledge we shared when people visited here (and over Zulip!). So, while I could possibly wait for the JMM to get to grips with more of the new math that people are producing, I think I will try to explain, at least a little, how the themes above interact.

I started my research career working on Gödel’s Dialectica Interpretation. On bad days I think not only have I never left the dialectica, but even today I hardly understand the stuff, after all these many years. Then I console myself with the thought that the stuff is really fascinating and full of surprising consequences and applications!

Someone else who’s awed by the Dialectica Interpretation is Jules Hedges. His blog post Lenses for Philosophers shows some of the badly understood consequences and connections of the work on categorical dialectica construction. In some ways the MRC Dialectica categories for Computing is an attempt to clarify some of the interrelations described in Hedges’ post.

Gödel’s Dialectica Interpretation is a proof interpretation of intuitionistic arithmetic into a quantifier-free theory of functionals of finite type, called System T.

The interpretation’s original use was to show the consistency of Heyting (or intuitionistic) arithmetic. When combined with Gödel’s double-negation interpretation, which reduces classical arithmetic to intuitionistic arithmetic, the Dialectica interpretation yields a reduction of the classical theory as well.

As explained by Fernando Ferreira,

The stated purpose of Gödel’s Dialectica paper is to present a consistency proof of first-order Peano arithmetic PA by way of an extension of the finitary viewpoint according to which the requirement that constructions must be of and over “concrete” objects is relaxed to be of and over certain abstracta, namely computable functionals of finite type over the natural numbers.

Thus the Dialectica interpretation can be considered a relaxed (or liberalized) form of Hilbert’s program, where instead of insisting on the finitary viewpoint, we accept ‘computable functionals of finite type’ as (sort-of) concrete objects. To understand the interpretation itself, I recommend A.S. Troelstra’s “Introductory Note to the Dialectica paper” in Gödel’s Collected works, volume II, where we can read that “Gödel presents his results as a contribution to a liberalized version of Hilbert’s program: to justify classical systems, in particular arithmetic, in terms of notions as intuitively clear as possible.” From there one can graduate to more complete work of Troelstra himself (Metamathematical investigation of intuitionistic arithmetic and analysis (1973)), of Avigad and Feferman (Godel’s Functional (“Dialectica”) Interpretation) and of Kohlenbach and his school.

The categorical internal versions of the Dialectica construction, starting in de Paiva’s thesis in 1988 (out as a technical report in 1991), have spelled out some of the requirements and consequences of the interpretation, especially its connection to Linear Logic. The categorical dialectica construction was generalized to fibrations, by Hyland (2002), Biering (2008) and Hofstra (2011). Generalizations to dependent type theories appeared in van Glehn (2016) and Moss (2018) and then together (2018). More recently Trotta et al, building on this work introduced Gödel fibrations and Gödel doctrines, rekindling the connections of the categorical dialectica construction to logic and making sure that the three dialectica principles (Independence of Premise, Markov Principle and Axiom of Choice) get their categorical explanation.

The main connections of the Dialectica construction discussed in the MRC, as anticipated above, are between ‘lenses’ and dialectica categories, between ‘polynomials’ and dialectica categories (see Nelson Niu’s blog post and Sean Moss’ talk), between ‘games’ and dialectica categories (already hinted at in Blass’ original paper on games for linear logic) and between concurrency theories and dialectica categories in the shape of Dialectica Petri nets Processes and their implementation in Agda. I hope the abstracts for the extended discussions of these four sets of connections will be available soon.

But it is clear that we’re only scratching the surface of this rich story of interconnections. Hopefully much more will come to light in the near future.


Extracting Mathematical Concepts

 

(Reposted from the Topos Institute blog, by Valeria de Paiva and Jacob Collard [originally posted Wednesday, 16 Nov 2022])

Humans are very good at recognizing the words they do not know, the concepts they haven’t met yet. To help with these they invented dictionaries, glossaries, encyclopedias, wikipedias, crib sheets, etc. Initially, they did it via laborious manual work, more recently through automated construction techniques. Natural language processing (NLP) tools have improved impressively in the last decade. Most of this incredible improvement happened to newspaper text and hence named entities of the kind found in news text can be detected and classified (usually into people, organizations, places and dates) with much greater accuracy than in the recent past.



Dr Johnson, the author of the 1755 A Dictionary of the English Language.

However, for domain specific text, like the academic literature in different sciences (e.g. Medicine, Biology, Chemistry), or in the Humanities (History, Sociology, Philosophy, etc.) things are more complicated. Thus extracting technical terms/concepts from papers is important, interesting and usually difficult.

How do you detect mathematical concepts in text? As a human mathematician (or not), you probably recognize that the ovals in the snippet in this picture correspond to mathematical concepts. But there is some uncertainty in this choice, and several metrics might be used to decide whether the automated system used is getting it right or not.

We wanted to see how well ‘automatic term extractors’ (ATE) systems work for mathematical text, especially for Category Theory, if we simply use them out-the-box. It seems reasonable to establish baselines using available, open source systems and this is what we do in our paper “Extracting Mathematical Concepts from Text”. The paper was presented at the 8th Workshop on Noisy User-generated Text (W-NUT) at Coling 2022 and you can find a short, technical video describing the work below.

This work is part of a larger project on Networked Mathematics that we started describing in “Introducing the MathFoldr Project” and broadened the discussion in “The many facets of Networked Mathematics”. We hope to provide more information on the project soon.

Thursday, December 1, 2022

Network Mathematics

 

The many facets of Networked Mathematics

by Valeria de PaivaMonday, 18 Apr 2022

Categories: [research]
Tags: [MathFoldr] , [AI] , [NLP]

Last year we started a discussion on the production of mathematics, the pre-industrial kind of process that we all still follow of solving problems and taking the solutions to the market of ideas (conferences, seminars, blog posts, twitter, coffee breaks in the department, etc.) as drafts/preprints/submissions and all the difficulties that this process involves. In particular we discussed the difficulties of checking whether the work had been done before or not, and the difficulties of searching for other, similar work that might help us with a given question.

As a group we settle on a path of trying to apply the new tools of NLP and knowledge representation (neural nets, word embeddings, transformers, ontologies, semantic web tools, etc.) to what might be called mathematical English. We reasoned that, yes eventually we will need to get to the semantics of equations and diagrams, but it would be useful to know how much we can learn simply from using the tools of NLP to extract mathematical concepts from abstracts of articles.

We also decided that Category Theory would be our prototype domain. First because category theory is about the organization of Mathematics in general, discovering the backbones of the subject and the hidden similarities between, in principle, unrelated sub-areas. But secondly because we are category theorists, after all. So this is the domain in which we are domain experts.

We realized early on that corpora of mathematical text were needed, and we hoped this wouldn’t be a problem. Category theory was, after all, one of the first areas of Mathematics to have a free open-source journal, namely TAC (Theory and Applications of Categories) in 1995. But we need much more data than simply one journal. While we also have both the nLab and the whole of the Wikipedia Math Portal, as beautiful mathematical open resources,  even the licensing of the arXiv is more problematic than we knew.

As we explained in the earlier blog post, we have chosen to work with a corpus of TAC abstracts and one (snapshot) of the nLab from the beginning of 2021. The TAC abstracts are clean, with not so much LaTex thrown in, and form a small corpus of 3.2K sentences. The nLab is a bit more noisy, one needs to remove the wiki markup and sections and headings get lost, a traditional nightmare of processing scientific text.

We now have spaCy processing of all this data, as well as a collection of experiments on term extraction, using different tools: TextRank, OpenTapioca, DyGIE++ and Parmenides, a special tool of our NIST collaborators. There are plenty of interesting observations to make about the processing and the comparison between these tools, and we are preparing have now a write-up about it. However, the major problem with domain specific NLP strikes again: we have no golden standard for mathematical concepts in mathematical texts, much less for category theory concepts. Our predicament is not only ours: many others have the same issue and have tried to make do with not guaranteed good results. 

Meanwhile, we have been working on the idea that formalized mathematics will eventually be the way of producing mathematics, which is a new kettle of fish altogether. To discuss the issues of formalization (somewhat an eventual goal) we have had in the Topos Colloquium talks by several people involved with automated deduction: Jeremy Avigad, Kevin Buzzard, Larry Paulson, Florian Rabe, for example. We hope to have more talks this year. Finally, we’re happy to be helping to organize a Hausdorff Institute of Mathematics trimester on “Prospects of Formal Mathematics” in 2024.

The “Prospects of formal mathematics” trimester program will run at the Hausdorff Institute of Mathematics from the 6th of May to the 16th of August 2024.

Wednesday, November 30, 2022

Women in Logic website


 If there is anything the internet has shown me, it is that repetition is useful. More duplication of posts from the Topos Institute blog.

Women in Logic website launched

by Valeria de PaivaThursday, 17 Mar 2022

Categories: [announcement]
Tags: [logic]

World Logic Day is an international day proclaimed by UNESCO in association with the International Council for Philosophy and Human Sciences (CIPSH) in 2019. The idea is to celebrate it on the 14th of January every year.

Last year we celebrated Logic Day. Not this Logic:

 

but more like the Logic of the ideas described below

Also logic.

This year we decided to celebrate Logic Day by launching the 

 Women in Logic website:

Women in Logic (WiL) is a collective of women working in logic, as well as a forum that brings together women conducting research in logic and closely related areas, with the goals of enhancing the experience of women in these communities, making their achievements known, and increasing the number of women in logic.

The flagship activity of the collective is the Women in Logic workshop that has happened every year since 2017. This workshop is usually associated with one of the big conferences in logic for Computer Science, so either FLoC (Federated Logic conference), LiCS (Logic in Computer Science) or FSCD (Conference on Formal Structures for Computation and Deduction), so far. The workshop has happened in Reykjavík (Iceland), Oxford (UK), Vancouver (Canada), Paris (online due to covid) and Rome (online due to covid). The next workshop should be happening happened in Haifa, Israel, as part of FLoC, on July 31st, 2022.

Women are chronically underrepresented in the Logic communities in Computer Science, in Philosophy, in Cognitive Science and in Mathematics; consequently they sometimes feel both conspicuous and isolated, and hence there is a risk that the under-representation is self-perpetuating. The WiL workshops provide an opportunity for women to increase awareness of one another and one another’s work, to combat the feelings of isolation. It provides an environment where women can present to an audience comprising mostly women, replicating the experience that most men have at most CS meetings, and lowering the stress of the occasion; we hope that this will be particularly attractive to early-career women.

The workshop Women in Logic (WiL) is supported by small grants provided by ACM’s SIGLOG, the Vienna Center for Logic and Algorithms, the Institute of Logic, Language and Computation of the University of Amsterdam and the Topos Institute. Thank you friends!!


Sunday, November 27, 2022

Duplicating blog posts?

 


Content gets old and sometimes not accessible any more. Hence, I've decided to duplicate here my blogposts in the Topos blog. At Topos, I have to follow a style that sometimes clashes with mine, while here I can do things my own way. And this is a huge plus. 

Below you can see what was the blog post we wrote about our MathFoldr project 17 months ago. Much can happen in 17 months, but also much can not happen. Despite a very enthusiastic response from our ACT community, we were not able to convince the funding agencies that this is an important project. Thus, the work was carried on, but at a slow pace. I thought I'd update a little the original post with comments, after reproducing the original.

Introducing the MathFoldr Project

by Brendan Fong , Valeria de PaivaSunday, 11 Jul 2021

Categories: [research]
Tags: [MathFoldr] , [AI] , [NLP]

a telegram tape
How do we encode, connect, and share knowledge in the 21st Century?

At Topos we believe knowledge empowers people, and that our community’s expertise should be available to all who seek it out. This is why, for example, we broadcast most of our seminars live on YouTube, and actively support numerous open publishing projects, such as the journal Compositionality, the nLab community wiki, or simply making our books freely available online.

But simple availability is not enough. True access is more than an open door: it’s clear, legible street signs, elevators, and gently sloping on-ramps. And with modern AI and natural language processing tools, we believe it’s beyond time to build these accessibility tools for science and mathematics.

This blog post provides an overview of our nascent MathFoldr project, sharing our dreams and our approach so far—and, at the end, a way for you to help just by doing a concrete, 5 minute activity!

1How do we organise mathematics?

A cornerstone of accessibility is search, and math is not easy to search. A striking, recent example comes from Quanta Magazine, November 2019. A group of physicists discovered a useful identity relating eigenvectors and eigenvalues, and did not know if it was novel. To check, they emailed a number of mathematicians, including Fields Medallist Terence Tao. Despite believing the result was “so short and simple—it should have been in textbooks already”, Tao had not previously heard of it. This led to a paper submitted for publication and, soon after, the article in Quanta. In the weeks after the story emerged, more than three dozen previously published instances of the result were reported, dating back to 1934. How can it be that even eminent mathematicians cannot find a widely published, basic result within their field of expertise?

The simple answer is that the mathematical literature has grown far too vast for even an expert to keep track of it all. A recent analysis finds over 120,000 math papers published in 2017 alone, with this rate growing exponentially at 3% a year.

An image from 'Looking at the mathematics literature', by E. Dunne
E. Dunne, “Looking at the mathematics literature”, Notices of the American Mathematical Society, vol. 66, no. 2, pp. 227–230, 2019. ams.org/journals/notices/201902/rnoti-p227.pdf.

Our infrastructure for organizing and communicating these results has not kept up. The ramifications are significant: wasted search time, duplication of research, and missed connections between fields.

2The MathFoldr vision and path so far

We seek to address this with our MathFoldr project, part of our Networked Mathematics theme. MathFoldr will provide search and literature curation tools that will make mathematics more accessible, with the ultimate goal of transforming the way mathematics is created and navigated.

Today mathematics is rather artisanal: mathematicians craft pdfs of new knowledge, and share these via posting on websites and advertising them in talks. Many recent technologies, from Google, to GitHub, to materials discovered via NLP over materials science literature, show the potential for something much more efficient and effective.

Our strategy for improving this begins with improving the organization and dissemination of math via NLP-powered tools, such as search engines, knowledge graphs, and glossaries, as an entry way to shift publication practices towards ever more formal representations. So the first task is to build ontologies, and to get the community excited and involved with good UI/visualizations.

Right now, we’re doing pilot studies with two corpora, both available on our GitHub:

These pilot studies aim to create a synthesis of machine- and community-driven methods of extracting and curating an ontology of categorical concepts, which will then be maintained via WikiData. From this ontology, we will then build tools to do concept recognition and other semantic processing tasks. A prototype tool, led by Antonin Delpeuch, is nLab.OpenTapioca.

A screen capture from nLab.OpenTapioca
A screen capture from nLab.OpenTapioca. nLab.OpenTapioca takes text and annotates it by identifying categorical concepts. The concepts are WikiData entities, and each is linked to an nLab article.

To extract ontologies, we’ve been collaborating with Jacob Collard and Eswaran Subrahmanian at NIST (the US National Institute of Standards and Technology), who have built an exciting pipeline that preprocesses the text with spaCy, and then uses a root- and rule-based linguistics method (R&R) to extract concepts. You can navigate the results with their Parmesan interface:

A screen capture from Parmesan
A screen capture from Parmesan TAC, which displays subject-verb-object triples automatically extracted from the TAC corpus using the R&R method.

3What’s next?

At present, we’re thinking about how to further clean these corpora, and refine the R&R methods to extract more accurate and precise concepts.

Simultaneously, we’re also thinking about the word embedding models that are both used by these toolkits, and that we could use separately to refine search and other methods. A central problem is that mathematical text is quite different from standard newswire English, and so, as with any domain-specific text, we’re seeing a number of processing errors. Can we tune these better for mathematical text?

A visualisation of a word-embedding semantic model for category theory concepts, trained on the nLab corpus. The model extracts, from analysis of statistical patterns in text alone, the strong similarity between schemes, varieties, and manifolds.
A visualisation of a word-embedding semantic model for category theory concepts, trained on the nLab corpus. The model extracts, from analysis of statistical patterns in text alone, the strong similarity between schemes, varieties, and manifolds. (Click here for a larger version of this image.)

But ultimately, as with any data-driven enterprise, the quality of the output depends crucially on the quality of the input. And so throughout this all, we’re working to improve our corpora, to more accurately capture the expertise and intuitions of mathematicians. Here, we have a request of you: please contribute your expertise and intuitions.

More precisely, we’d love some help identifying concepts in abstracts from Theory and Applications of Categories, to produce what’s known as an “annotated corpus”, which we will share openly for NLP experiments to benefit the scientific community. To contribute, just choose your favorite TAC abstract, and click the button below!

========================

well, I now would say

>This is why, for example, we broadcast and record most of our seminars live on YouTube.

because having the recordings, after the event, seems to me, more important than broadcasting the talks. our permantly available library of exciting talks

Tuesday, October 25, 2022

Some weeks are really hard


 Our friend Veronica left us last Tuesday. Veronica was a wonderful friend, a person one could count on, for real. One who did not complain about things in life, and it was always ready for a lunch, a book discussion, a woman's march and/or anything else she thought was important. 

You really had to know her very well to realize she was fighting a cancer for many years. She was a true engineer, very proud of her work on improving performance of systems (at Samsung), but she rarely talked to us about her work. We have a women's lunch, the BWITCH (something like `Brazilian Women in Tech' lunch), which, like so many of these meetings, sometimes increased, sometimes waned a bit, depending on how busy people were. Lately, we were in a not-so-active phase, after many zoom meetings over the pandemic. And I thought everything was getting better for Veronica's health, I really don't know where I find this misplaced optimism of mine. 

On Sunday there was a memorial for her, and many people discovered many facets of Veronica's personality that we knew nothing about, like I learned that she did martial arts! The hiker, the pottery artist, the devoted wife and mother, the woman committed to improving society I knew well, but there were many other sides that I knew nothing about. I used to tease her quite a bit about being an engineer thru and thru and in the last weeks I had been planning to talk to her about persimmons. We have a persimmon tree and we don't like the fruit, which Veronica likes, so I always write to her to come and collect some, which she organizes carefully to make sure they finish ripening at the right time, in the right way. This year the fruit is ready to be picked up much earlier than usual, and I had been meaning to write her for a while saying it, when the horrible news arrived that she was in a coma. Some days later she was gone.  I did not manage to say goodbye. So I'm saying it here: Rest in power, Veronica!