Monday, June 2, 2014

Chocolate Boxes



Looking at noun-noun compounds in the representation language associated with TIL (Textual Inference Logic, described in Contexts for Quantification) and trying to decide which modifications, if any, one should make to the Abstract Knowledge Representation (AKR) treatment.



1. AKR takes the view that, given a noun-noun compound (like 'chocolate box'), there is a relation between the two nouns, but we cannot tell what that relation is from the language alone, so we leave it unspecified.

2. So, whichever of the following we want to talk about:
{a Brad Pitt movie}
{a Tarantino movie}
{a James Bond movie}
{a tv program}
{a tv show}
{a birth certificate}
{a chocolate box}

We want to say that the representation of the noun-noun compound should be something like:
role(nn_element, HEAD, MOD)
instantiable(HEAD, cxt)
instantiable(MOD, cxt)
where HEAD is the head noun, MOD is the modifier noun, and cxt is the context in which both are instantiable.
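
The template above can be sketched as a small function that emits the three facts for any compound. This is only an illustration, not the AKR implementation: the function name and the skolem-style identifiers (`box-3`, `chocolate-2`) are made up for the example.

```python
def nn_compound_facts(head: str, mod: str, cxt: str = "t") -> list[str]:
    """Return AKR-style facts for a noun-noun compound.

    The relation between head and modifier is deliberately left
    unspecified: all we record is the generic nn_element role.
    """
    return [
        f"role(nn_element,{head},{mod})",
        f"instantiable({head},{cxt})",
        f"instantiable({mod},{cxt})",
    ]


# Hypothetical identifiers for {a chocolate box}
for fact in nn_compound_facts("box-3", "chocolate-2"):
    print(fact)
```

Note that nothing here distinguishes a box containing chocolates from a box made of chocolate; that is the point of leaving the relation unspecified.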

3. Certain noun-noun compounds are lexicalized by WordNet, such as {birth_certificate} or {tv_show}, so they correspond to a single concept as far as WordNet is concerned.
Clearly the line between noun-noun compounds that should be lexicalized in any generic ontology and the ones that should not is a very fluid one. Many people complain that WordNet does not have all the nn-compounds it should, and the literature on nn-compounds (and more generally on multi-word expressions) is huge.

4. AKR embraces ambiguity, so it says that sometimes a lexicalization is a good idea and sometimes it isn't. (The chocolate box above is not merely a box containing chocolates; it can also be a box made of chocolate.) Hence AKR produces two solutions for compounds that are lexicalized, like {a birth certificate}:

a birth certificate
% Choices:
[choice([A1,A2], 1)
Conceptual Structure:
      role(cardinality_restriction,certificate-5,sg)
      role(nn_element,certificate-5,birth-4)
      subconcept(birth-4,[bear#v#1,...,have_a_bun_in_the_oven#v#1])
A1:
      subconcept(certificate-5,[birth_certificate#n#1])
A2:
      subconcept(certificate-5,[certificate#n#1,security#n#4])
Contextual Structure:
      context(t)
      instantiable(bear-4,t)
      instantiable(certificate-5,t)
      top_context(t)


The representation above has two solutions. In solution A1, {birth_certificate} is a single lexicalized entity. In the more generic solution A2 there are two concepts, a noun "birth" and a noun "certificate", and we don't know exactly what the relationship between the two is.
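
One way to picture the two solutions is as a packed structure: facts shared by every reading, plus facts that hold only under one choice label. The sketch below is a toy model of that idea, assuming a hypothetical `PackedAKR` class; the fact strings are copied from the output above, but the data structure is not how AKR itself stores them.

```python
from dataclasses import dataclass


@dataclass
class PackedAKR:
    shared: list[str]                # facts true under every choice
    by_choice: dict[str, list[str]]  # facts true only under one label

    def reading(self, label: str) -> list[str]:
        """Unpack a single solution: shared facts plus that choice's facts."""
        return self.shared + self.by_choice[label]


birth_certificate = PackedAKR(
    shared=[
        "role(cardinality_restriction,certificate-5,sg)",
        "role(nn_element,certificate-5,birth-4)",
    ],
    by_choice={
        "A1": ["subconcept(certificate-5,[birth_certificate#n#1])"],
        "A2": ["subconcept(certificate-5,[certificate#n#1,security#n#4])"],
    },
)

for label in ("A1", "A2"):
    print(label, birth_certificate.reading(label))
```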

5. It's clear that if I say {a Tarantino movie} I want to have two concepts: one for 'Tarantino', which someone else should be able to say is a movie director, and one for a movie that, hopefully, is at least vaguely associated with Tarantino. Again, hopefully someone else will decide what the relationship between the two concepts is.

When nn-compounds are lexicalized (like birth-certificate or tv-show), then maybe there is only a single concept. But it seems that there should be a range: some expressions really are a single concept, others have only a very weak relation between the nouns, and others fall between the extremes.

6. This is more complicated for "media genres", as sometimes one of the nouns stands in for the whole compound. Thus {a documentary movie} should maybe be equivalent to {a documentary}. But documentary is a genre that also applies to other media (e.g. a radio documentary), so how should the mapping make documentary the same as documentary movie, while keeping it independent enough to produce a sensible mapping for {radio documentary}?

It seems to me that, in principle, the semantic mapping should produce for any nn-compound a pair of concepts, the ones associated with the HEAD noun and the MOD(ifier) noun, plus an {underspecified relation between these} concepts that then gets resolved in one of three ways:

A. It transforms the two concepts into a single one (if the compound is lexicalized), e.g. {a tv show} should produce simply the ontological concept TVShow and not an ambiguity between the generic relation between nouns
(role(nn_element,show-7,TV-4)) and the lexicalized concept
(subconcept(show-7,[television_program#n#1]))

B. Or it produces the right relationship in the ontology between the concepts, if such a relationship exists.

C. Or it produces a clear placeholder relation (Rel X) between the concepts, so that we know that we still need to find out what this relation is.
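
The three options A, B and C can be read as a fallback chain. Here is a minimal sketch of that reading, assuming two toy lookup tables. Both tables and the `directedBy` relation name are invented for the example; only TVShow and Rel X come from the text above.

```python
# Hypothetical toy resources; a real system would consult an ontology.
LEXICALIZED = {("tv", "show"): "TVShow"}
ONTOLOGY_RELATIONS = {("tarantino", "movie"): "directedBy"}


def resolve_nn(mod: str, head: str) -> tuple[str, ...]:
    """Resolve the underspecified relation of an nn-compound."""
    key = (mod.lower(), head.lower())
    if key in LEXICALIZED:
        # Case A: the compound collapses into a single concept.
        return ("concept", LEXICALIZED[key])
    if key in ONTOLOGY_RELATIONS:
        # Case B: the ontology knows how the two concepts are related.
        return ("relation", ONTOLOGY_RELATIONS[key], head, mod)
    # Case C: keep an explicit placeholder so the gap stays visible.
    return ("relation", "Rel_X", head, mod)


print(resolve_nn("tv", "show"))
print(resolve_nn("Tarantino", "movie"))
print(resolve_nn("chocolate", "box"))
```

The order of the checks is a design choice in this sketch: trying the lexicalized reading first means {a tv show} never surfaces the generic nn_element ambiguity, which is what option A asks for.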

This is to be contrasted with an adjective-noun phrase such as {a French actress} or {a hungry boy}, which should always produce only one concept (the one for the HEAD noun); the modifier does not get a concept of its own, it just modifies the head.

Thus the mapping to the ontology for a phrase like {a French actress} should always go to a concept for actress, suitably modified by whatever meaning we think the adjective brings in.
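
The contrast between the two kinds of modifier comes down to how many concepts the phrase introduces. A tiny sketch, with a hypothetical function name and a boolean flag standing in for real part-of-speech information:

```python
def concepts_introduced(modifier: str, head: str, modifier_is_noun: bool) -> list[str]:
    """List the concepts a modified noun phrase should map to."""
    if modifier_is_noun:
        # Noun-noun compound: head and modifier each get a concept.
        return [head, modifier]
    # Adjective-noun phrase: only the head gets a concept.
    return [head]


print(concepts_introduced("Tarantino", "movie", modifier_is_noun=True))
print(concepts_introduced("French", "actress", modifier_is_noun=False))
```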
