Some equivalent Atomese program representations for manipulation and evaluation #11

Open
ngeiswei opened this issue Jun 19, 2018 · 2 comments
Labels
enhancement New feature or request

Comments

@ngeiswei
Member

ngeiswei commented Jun 19, 2018

Overview

This issue contains some considerations regarding various ways Atomese
programs could be represented, manipulated and evaluated.

Warning: it is not a plan for immediate actions, just some considerations.

Motivation

As suggested by @Bitseat, in order to avoid hacking the Atomese
interpreter too much
https://github.com/opencog/atomspace/blob/master/opencog/atoms/execution/Instantiator.h#L141
one option would be to unfold an Atomese program so that it is readily
interpretable by Instantiator::execute.

For instance given the data set represented as

(Similarity (stv 1 1)
  (List (Schema "o") (Schema "i1") (Schema "i2"))
  (Set
    (List (Node "r1") (List (Number 1) (Number 0) (Number 1)))
    (List (Node "r2") (List (Number 1) (Number 1) (Number 0)))
    (List (Node "r3") (List (Number 0) (Number 0) (Number 0)))))

and the combo program

(Plus (Schema "i1") (Schema "i2"))

it could be unfolded into

(Set
  (List (Node "r1") (Plus (Number 0) (Number 1)))
  (List (Node "r2") (Plus (Number 1) (Number 0)))
  (List (Node "r3") (Plus (Number 0) (Number 0))))

which, passed to the Atomese interpreter, would return the desired
result

(Set
  (List (Node "r1") (Number 1))
  (List (Node "r2") (Number 1))
  (List (Node "r3") (Number 0)))
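
To make the unfolding step concrete, here is a minimal sketch in plain Python (not Atomese, and not using any OpenCog API). It models atoms as nested tuples and assumes only `Plus` and `Number` need evaluating; the names `unfold` and `evaluate` are hypothetical helpers introduced for illustration.

```python
# Plain-Python model of the unfolding above; not the OpenCog API.
# Column order and row values mirror the example data set.
COLUMNS = ["o", "i1", "i2"]
TABLE = {
    "r1": [1, 0, 1],
    "r2": [1, 1, 0],
    "r3": [0, 0, 0],
}

# The combo program (Plus (Schema "i1") (Schema "i2")) as a nested tuple.
PROGRAM = ("Plus", ("Schema", "i1"), ("Schema", "i2"))


def unfold(program, row_values):
    """Replace each (Schema name) by the (Number value) it takes on this row."""
    head = program[0]
    if head == "Schema":
        return ("Number", row_values[COLUMNS.index(program[1])])
    if head == "Number":
        return program
    return (head,) + tuple(unfold(arg, row_values) for arg in program[1:])


def evaluate(expr):
    """Evaluate an unfolded expression; only Plus and Number are modelled."""
    head = expr[0]
    if head == "Number":
        return expr[1]
    if head == "Plus":
        return sum(evaluate(arg) for arg in expr[1:])
    raise ValueError(f"unsupported operator: {head}")


unfolded = {name: unfold(PROGRAM, values) for name, values in TABLE.items()}
results = {name: evaluate(expr) for name, expr in unfolded.items()}

print(unfolded["r1"])  # ('Plus', ('Number', 0), ('Number', 1))
print(results)         # {'r1': 1, 'r2': 1, 'r3': 0}
```

The point of the sketch is that after unfolding, the evaluator never needs to know about the table: every row carries its own fully grounded expression.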

However, I think we can probably take a middle-ground approach where
the unfolding is much lighter and still doesn't involve hacking the
interpreter so that Plus, etc. support higher-level inputs. Such
support is probably fine and ultimately desired, but since we are at an
exploratory stage we want to avoid too much potentially unnecessary
and complicated hacking. Also, I suspect that this sort of
lightweight unfolding will be beneficial for subsequent Atomese
program processing, such as finding patterns in a population of
programs and evaluating them on new inputs.

Proposal

So here it goes. For instance, given (Plus (Schema "i1") (Schema "i2")), the first level of unfolding could be (using the unimplemented FunMapLink)

(FunMap
  (List
    (Variable "$R")
    (Lambda
      (Variable "$R")
      (Plus
        (ExecutionOutput
          (Schema "f1")
          (Variable "$R"))
        (ExecutionOutput
          (Schema "f2")
          (Variable "$R")))))
  (Domain))

where FunMap is to be distinguished from
http://wiki.opencog.org/w/MapLink as it doesn't assume that its first
argument is a pattern but rather a function, and thus has the same
semantics as
https://hackage.haskell.org/package/base-4.11.1.0/docs/Prelude.html#v:map
or in scheme
https://srfi.schemers.org/srfi-1/srfi-1.html#FoldUnfoldMap

And Domain is just something that retrieves the row names, r1 to
r3, and should probably be written

(Domain (List (Schema "f1") (Schema "f2")))

but is just written (Domain) here for simplicity.

Written in a more casual functional programming style, it would be

(map (lambda (r) (cons r (+ (f1 r) (f2 r)))) (domain))
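
The same map-over-rows semantics can be sketched in plain Python. This is only an illustrative model: `fun_map`, `domain`, `f1` and `f2` are hypothetical stand-ins for FunMap, (Domain) and the feature schemata, and the feature values are copied from the i1 and i2 columns of the data set above.

```python
# Feature values per row, copied from the i1/i2 columns of the example data set.
FEATURES = {
    "r1": {"f1": 0, "f2": 1},
    "r2": {"f1": 1, "f2": 0},
    "r3": {"f1": 0, "f2": 0},
}


def f1(row):
    """Feature function: row name -> value of the first input feature."""
    return FEATURES[row]["f1"]


def f2(row):
    """Feature function: row name -> value of the second input feature."""
    return FEATURES[row]["f2"]


def domain():
    """Stand-in for (Domain): the row names the feature functions are defined on."""
    return list(FEATURES)


def fun_map(fn, items):
    """Stand-in for FunMap: apply a function (not a pattern) to every item."""
    return [fn(item) for item in items]


pairs = fun_map(lambda r: (r, f1(r) + f2(r)), domain())
print(pairs)  # [('r1', 1), ('r2', 1), ('r3', 0)]
```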

Alternatively, as suggested by @kasimebrahim, one could use PutLink

(Put
  (Variable "$R")
  (List
    (Variable "$R")
    (Put
      (Lambda
        (Variable "$R")
        (Plus
          (ExecutionOutput
            (Schema "f1")
            (Variable "$R"))
          (ExecutionOutput
            (Schema "f2")
            (Variable "$R"))))
      (Variable "$R")))
  (Domain))

The next unfolding, which is probably the most interesting, is

(FunMap
  (List
    (Variable "$R")
    (Put
      (Lambda
        (VariableList
          (Variable "$X")
          (Variable "$Y"))
        (Plus
          (Variable "$X")
          (Variable "$Y")))
      (Lambda
        (Variable "$R")
        (List
          (Schema "f1")
          (Schema "f2")))))
  (Domain))

because it exposes the heart of the program

      (Lambda
        (VariableList
          (Variable "$X")
          (Variable "$Y"))
        (Plus
          (Variable "$X")
          (Variable "$Y")))

then links it to the input features using Put (f1 and f2 here, that
is, i1 and i2 of the data set), then applies it to the domain r1 to
r3. The good thing about this representation is that it abstracts
away the features (which can make it easier to reason about some
patterns), and it also makes it easier to evaluate the program on new
inputs, because you only need to change one place: replace (Domain)
by, say, (NewDomain).
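
As a rough illustration of that separation, the following plain-Python sketch keeps the program core, the feature binding and the domain apart. All names (`core`, `bind_features`, `put`, `run`) are hypothetical, and the row r4 is made-up data added only to show evaluation on a new domain.

```python
# Feature values per row; r1..r3 mirror the example data set,
# r4 is invented here purely to illustrate a new domain.
FEATURES = {
    "f1": {"r1": 0, "r2": 1, "r3": 0, "r4": 2},
    "f2": {"r1": 1, "r2": 0, "r3": 0, "r4": 3},
}


def core(x, y):
    """Heart of the program: (Lambda (VariableList $X $Y) (Plus $X $Y))."""
    return x + y


def bind_features(row):
    """Model of (Lambda $R (List f1 f2)): the feature values for one row."""
    return (FEATURES["f1"][row], FEATURES["f2"][row])


def put(fn, binder):
    """Model of Put: feed the values produced by the binder into the core."""
    return lambda row: fn(*binder(row))


def run(domain):
    """Model of FunMap over a domain, pairing each row with its result."""
    program = put(core, bind_features)
    return [(row, program(row)) for row in domain]


print(run(["r1", "r2", "r3"]))  # [('r1', 1), ('r2', 1), ('r3', 0)]
print(run(["r4"]))              # [('r4', 5)]
```

Only the argument of `run` changes between the two calls, which is the analogue of swapping (Domain) for (NewDomain); `core` can also be inspected or compared across programs without reference to any feature.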

@ngeiswei ngeiswei changed the title Some equivalent Atomese program representation for manipulation and evaluation Some equivalent Atomese program representations for manipulation and evaluation Jun 19, 2018
@kasimebrahim
Collaborator

; get the node containing the row name from the row 'R' [(car row)]
(define (rowname R)
    ....)

; this yields a list containing the values of each feature
; for a given row 'R' [(cdr row)]
(define (row R)
    ....)

; get all the features as List of Variable Nodes from the problem data
(define (featureVariables problemData)
    ....)

; this is the program over all the features
(DefineLink
    (DefinedSchemaNode "programOne")
    (Lambda
        (VariableList
            (ExecutionOutputLink
                 (GroundedSchemaNode "featureVariables")
                 (Node "ProblemData")))
        (Plus
            (VariableNode "$f1")
            (VariableNode "$f2"))))

; get Domain
(DefineLink
    (DefinedSchemaNode "domain")
    (SetLink
        (ListLink (Node "r1") (ListLink (Number 1) (Number 0)))
        (ListLink (Node "r2") (ListLink (Number 0) (Number 1)))))

; this is the unfolding of the program
(PutLink
    (VariableNode "$R")
    (ListLink
        (ExecutionOutputLink
            (GroundedSchemaNode "rowname")
            (VariableNode "$R"))
        (ExecutionOutputLink
                (DefinedSchemaNode "programOne")
                (ExecutionOutputLink
                    (GroundedSchemaNode "row")
                    (VariableNode "$R"))))
    (DefinedSchemaNode "domain"))


This could be an alternative for the second unfolding, but as you can see it already has some problems. For instance, in order to have a generic program that works on a generic domain, I needed a procedure, "featureVariables", to declare the variable names. That won't work because a VariableList can't be created from a List containing VariableNodes, and it may not be a good way to do this in general. Hopefully you will have better ideas.

@ngeiswei
Member Author

That's a possible alternative. A few comments:

  1. You should probably rename domain to input-table or something similar. What I meant by domain was the input class of the functions corresponding to the features f1, etc., including the target feature o. That is, the domain of the feature functions is the set of rows, and the codomains are the types of values the features hold (usually Boolean or Number).
  2. I think (VariableList (ExecutionOutputLink ... should be ill-formed. If one wants to auto-generate variable lists, one may instead use Put and Quote, like
(Put
  (Quote
    (Lambda
      (Unquote
        (Variable "$vardecl"))
      (Unquote
        (Plus (Variable "$X1") (Variable "$X2")))))
  (ExecutionOutput
    (GroundedSchema "featureVariables")
    (Node "ProblemData")))
  3. Having the program take only the variables it needs, and clearly separating those variables from the input features, is a slightly higher-level abstraction, but I don't know if it would really be beneficial. The pattern matcher, for instance, would be able to catch common patterns about the programs in both representations. It's just a consideration.

@ngeiswei ngeiswei added the enhancement New feature or request label Sep 17, 2018