Category acquisition in an LoT#

NOTE: This is based on Piantadosi’s example here

try:
    # in colab
    import google.colab
    print('In colab, downloading LOTlib3')
    !git clone https://github.com/thelogicalgrammar/LOTlib3
except ImportError:
    # not in colab
    print('Not in colab!')
Not in colab!

Imports#

First we need to import a bunch of stuff:

import numpy as np
from LOTlib3.Miscellaneous import q, random
from LOTlib3.Grammar import Grammar
from LOTlib3.DataAndObjects import FunctionData, Obj
from LOTlib3.Hypotheses.LOTHypothesis import LOTHypothesis
from LOTlib3.Hypotheses.Priors.RationalRules import RationaRulesPrior
from LOTlib3.Hypotheses.Likelihoods.BinaryLikelihood import BinaryLikelihood
from LOTlib3.Eval import primitive
from LOTlib3.Miscellaneous import qq
from LOTlib3.TopN import TopN
from LOTlib3.Samplers.MetropolisHastings import MetropolisHastingsSampler

The model#

This is a version of the model in the paper we have discussed in class. The learner sees a series of objects and learns a posterior over logical expressions, where each logical expression encodes a category.

Data#

First, we need a function to generate the data. This is the function that Piantadosi uses:

def make_data(n=1, alpha=0.999):
    return [
        FunctionData(input=[Obj(shape='square', color='red')], output=True, alpha=alpha),
        FunctionData(input=[Obj(shape='square', color='blue')], output=False, alpha=alpha),
        FunctionData(input=[Obj(shape='triangle', color='blue')], output=False, alpha=alpha),
        FunctionData(input=[Obj(shape='triangle', color='red')], output=False, alpha=alpha)
    ]*n

But we can easily write other functions, e.g., one that randomly samples datapoints (with replacement):

def make_data(n=1, alpha=0.999):
    return np.random.choice(
        [
            FunctionData(input=[Obj(shape='square', color='red')], output=True, alpha=alpha),
            FunctionData(input=[Obj(shape='square', color='blue')], output=False, alpha=alpha),
            FunctionData(input=[Obj(shape='triangle', color='blue')], output=False, alpha=alpha),
            FunctionData(input=[Obj(shape='triangle', color='red')], output=False, alpha=alpha)
        ],
        size=n
    )

Let’s test it:

make_data(10)
array([<<OBJECT: shape=square color=blue > -> False>,
       <<OBJECT: shape=triangle color=red > -> False>,
       <<OBJECT: shape=triangle color=red > -> False>,
       <<OBJECT: shape=square color=blue > -> False>,
       <<OBJECT: shape=square color=red > -> True>,
       <<OBJECT: shape=triangle color=red > -> False>,
       <<OBJECT: shape=triangle color=blue > -> False>,
       <<OBJECT: shape=triangle color=blue > -> False>,
       <<OBJECT: shape=triangle color=blue > -> False>,
       <<OBJECT: shape=square color=blue > -> False>], dtype=object)

Logical primitives#

These primitives are also defined in LOTlib3.Primitives.Features, but I am reproducing them here for completeness:

@primitive
def is_color_(x,y): 
    # simply check that the color attribute of
    # the object is y
    return (x.color == y)

@primitive
def is_shape_(x,y): 
    # analogously, check that the shape attribute
    # of the object is y
    return (x.shape == y)

As described in the introductory file for LOTlib3, the decorator @primitive allows us to use a function as a terminal in our grammar.
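
Since the decorated functions remain ordinary Python functions, we can sanity-check them directly on an Obj. This is a quick illustrative test, not part of Piantadosi’s original code:

# quick sanity check: call the primitives directly on an object
test_obj = Obj(shape='square', color='red')
print(is_color_(test_obj, 'red'))       # True
print(is_shape_(test_obj, 'triangle'))  # False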

Grammar#

The grammar that encodes ‘Disjunctive Normal Form’ can be found in DefaultGrammars.py (and imported with from LOTlib3.DefaultGrammars import DNF), but I am reproducing it here for completeness:

DEFAULT_FEATURE_WEIGHT = 5.

grammar = Grammar()
# NOTE: an empty name argument is interpreted as a function with an
# 'empty' name, which effectively just wraps brackets around DISJ
grammar.add_rule('START', '', ['DISJ'], 1.0)
grammar.add_rule('START', '', ['PRE-PREDICATE'], DEFAULT_FEATURE_WEIGHT)
grammar.add_rule('START', 'True', None, DEFAULT_FEATURE_WEIGHT)
grammar.add_rule('START', 'False', None, DEFAULT_FEATURE_WEIGHT)

grammar.add_rule('DISJ', '',     ['CONJ'], 1.0)
grammar.add_rule('DISJ', '',     ['PRE-PREDICATE'], DEFAULT_FEATURE_WEIGHT)
grammar.add_rule('DISJ', 'or_',  ['PRE-PREDICATE', 'DISJ'], 1.0)

grammar.add_rule('CONJ', '',     ['PRE-PREDICATE'], DEFAULT_FEATURE_WEIGHT)
grammar.add_rule('CONJ', 'and_', ['PRE-PREDICATE', 'CONJ'], 1.0)

# A pre-predicate is how we treat negation
grammar.add_rule('PRE-PREDICATE', 'not_', ['PREDICATE'], DEFAULT_FEATURE_WEIGHT)
grammar.add_rule('PRE-PREDICATE', '',     ['PREDICATE'], DEFAULT_FEATURE_WEIGHT)
PRE-PREDICATE -> ['PREDICATE']	w/ p=5.0

We also need some predicates so that the grammar can interact with the observations:

# Two predicates for checking x's color and shape
# Note: per style, functions in the LOT end in _
grammar.add_rule('PREDICATE', 'is_color_', ['x', 'COLOR'], 1.0)
grammar.add_rule('PREDICATE', 'is_shape_', ['x', 'SHAPE'], 1.0)

# A few colors and shapes (for this simple demo)
# These are written in quotes so they can be evaled
grammar.add_rule('COLOR', q('red'), None, 1.0)
grammar.add_rule('COLOR', q('blue'), None, 1.0)
grammar.add_rule('COLOR', q('green'), None, 1.0)
grammar.add_rule('COLOR', q('mauve'), None, 1.0)

grammar.add_rule('SHAPE', q('square'), None, 1.0)
grammar.add_rule('SHAPE', q('circle'), None, 1.0)
grammar.add_rule('SHAPE', q('triangle'), None, 1.0)
grammar.add_rule('SHAPE', q('diamond'), None, 1.0)
SHAPE -> 'diamond'	w/ p=1.0

Let’s look at some examples of sentences generated by the grammar:

hyp = grammar.generate()
hyp
not_(is_color_(x, 'red'))
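
Generation is stochastic, so it is worth drawing a few more samples to get a feel for the distribution the grammar defines (the output will differ from run to run):

# sample a few more expressions; rules with higher weights
# (e.g., bare predicates) should show up more often
for _ in range(5):
    print(grammar.generate())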

Hypothesis#

The hypothesis class in Piantadosi’s code also inherits from RationaRulesPrior (imported above), which implements the model from Goodman et al. (2008), A Rational Analysis of Rule-Based Concept Learning. However, we found that it wasn’t working correctly, so I am removing it here:

class MyHypothesis(BinaryLikelihood, LOTHypothesis):
    def __init__(self, **kwargs):
        
        # note that our grammar defined above is passed to 
        # MyHypothesis here
        LOTHypothesis.__init__(self, grammar=grammar, **kwargs)
        
        # this is a parameter from the model in Goodman et al. (2008);
        # it is unused here now that the rational-rules prior is removed
        self.rrAlpha=2.0

With this, we can generate hypotheses from the grammar:

hypothesis = MyHypothesis()
hypothesis
lambda x: is_color_(x, 'green')
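
A hypothesis is itself callable: applying it to an object evaluates the logical expression on that object’s features. A small illustrative check (whether it prints True or False depends on whichever expression was sampled):

# evaluate the hypothesis on a single object
obj = Obj(shape='square', color='red')
print(hypothesis(obj))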

And use them, e.g., to calculate the likelihood of some data (here, a fresh sample from make_data):

data = make_data()
hypothesis.compute_likelihood(data)
-0.0005001250416822429
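
Hypotheses also expose compute_posterior which, in the standard LOTlib3 Hypothesis interface, combines prior and likelihood and stores all three scores on the hypothesis object:

# compute_posterior sets h.prior, h.likelihood, and h.posterior_score
hypothesis.compute_posterior(data)
print(hypothesis.prior, hypothesis.likelihood, hypothesis.posterior_score)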

Run inference#

Finally, we can run inference and keep the 10 hypotheses with the highest posterior scores:

h0 = MyHypothesis()
data = make_data()
top = TopN(N=10)

for h in MetropolisHastingsSampler(h0, data, steps=10000):
    top << h

And print them:

for h in top:
    print(h.posterior_score, h.prior, h.likelihood, qq(h))
-11.761285918026795 -4.158883083359667 -7.602402834667128 "lambda x: is_color_(x, 'red')"
-10.08780960949681 -10.085809109330082 -0.0020005001667289714 "lambda x: and_(is_color_(x, 'red'), not_(is_shape_(x, 'triangle')))"
-10.08780960949681 -10.085809109330082 -0.0020005001667289714 "lambda x: and_(is_shape_(x, 'square'), not_(is_color_(x, 'blue')))"
-10.08780960949681 -10.085809109330082 -0.0020005001667289714 "lambda x: and_(not_(is_shape_(x, 'triangle')), is_color_(x, 'red'))"
-10.08780960949681 -10.085809109330082 -0.0020005001667289714 "lambda x: and_(not_(is_color_(x, 'blue')), is_shape_(x, 'square'))"
-9.682344501388645 -9.680344001221917 -0.0020005001667289714 "lambda x: and_(not_(is_shape_(x, 'triangle')), not_(is_color_(x, 'blue')))"
-9.682344501388645 -9.680344001221917 -0.0020005001667289714 "lambda x: and_(not_(is_color_(x, 'blue')), not_(is_shape_(x, 'triangle')))"
-9.682344501388645 -9.680344001221917 -0.0020005001667289714 "lambda x: and_(is_color_(x, 'red'), is_shape_(x, 'square'))"
-9.682344501388645 -9.680344001221917 -0.0020005001667289714 "lambda x: and_(is_shape_(x, 'square'), is_color_(x, 'red'))"
-8.988697195787019 -1.386294361119889 -7.602402834667129 "lambda x: False"
top.best()
lambda x: False
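
The posterior scores above are unnormalized log probabilities. To compare the top hypotheses more directly, we can normalize within the top set; note that this ignores all hypotheses outside the top 10, so the resulting numbers are only relative probabilities:

# normalize the posterior scores of the top hypotheses
# (subtracting the max first for numerical stability)
scores = np.array([h.posterior_score for h in top])
probs = np.exp(scores - scores.max())
probs /= probs.sum()
for h, p in zip(top, probs):
    print(round(p, 3), qq(h))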

This is a nice scaffolding, but much more can be done with this model. Here are some suggestions for the remainder of the class:

EXERCISES

  • Add a feature to the grammar and objects, e.g., is_size_.

  • Add what we saw in the Piantadosi et al. paper, namely a system where whether an object belongs to the category can depend on which other objects are present in the situation.

    • In order to do this, you’ll have to expand the expressive power of the grammar to include (higher order) functions.

    • NOTE: While this might seem a bit strange, we have similar phenomena in natural language. For instance, whether an object counts as ‘large’ or ‘small’ might depend on what other objects are relevant in the context. A ‘large’ mouse is smaller than a ‘small’ elephant.

  • Create and plot a ‘learning curve’ (a minimal sketch follows after this list):

    • Produce a dataset with, e.g., 200 examples generated from a known true hypothesis.

    • Create cumulative subsets of the data: [data[:10], data[:20], data[:30], ...] (which can also be written as [data[:n] for n in range(10, 201, 10)]). This simulates an experiment where the participant sees more and more of the data.

    • Train the model on each subset, so that we have a series of posteriors trained on increasingly large portions of the data.

    • Plot the posterior probabilities of the overall most common hypotheses over time.
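
Here is a minimal sketch of the learning-curve exercise. It assumes the random make_data defined above and uses matplotlib (not imported earlier); the target hypothesis string is hypothetical, and matching hypotheses by their printed form is brittle, since logically equivalent expressions print differently:

import matplotlib.pyplot as plt

# hypothetical target concept: red squares
target = "lambda x: and_(is_color_(x, 'red'), is_shape_(x, 'square'))"

data = make_data(200)
sizes = list(range(10, 201, 10))

curve = []
for n in sizes:
    subset = data[:n]
    top_n = TopN(N=10)
    for h in MetropolisHastingsSampler(MyHypothesis(), subset, steps=2000):
        top_n << h
    # normalize posterior scores within the top 10
    scores = np.array([h.posterior_score for h in top_n])
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    # posterior mass on hypotheses printing exactly as the target
    curve.append(sum(p for h, p in zip(top_n, probs) if str(h) == target))

plt.plot(sizes, curve)
plt.xlabel('Number of observations')
plt.ylabel('Posterior probability of target (within top 10)')
plt.show()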