Skip to content

Learning of Context‐Free Grammars

Edi Muškardin edited this page Dec 13, 2023 · 5 revisions

AALpy supports the learning of visibly deterministic pushdown automata, which can be a subset of all context-free grammars.

Visibly deterministic pushdown automata

  • visibly refers to the fact that the user has to specify, in the alphabet, which symbols are internal, call, and return
  • call symbols push on the stack
  • return symbols pop from stack
  • internal symbols do not affect the stack
  • deterministic pushdown automata refers to the fact that the behavior of learned pushdown automata is deterministic, so it cannot, for example, learn languages like palindromes

Note that call, return, internal are disjoint sets.

In the following example, we demonstrate how to learn a model that can recognize regular expressions:

Learning follows same template as any other active learning:

  • define SUL
  • select one of the equivalence oracles.
    • Supported for VPA learning: RandomWordEqOracle, RandomWalkEqOracle, StatePrefixEqOracle
    • Initialize with alphabet as sevpa_alphabet.get_merged_alphabet()
  • Run KV learning as usually, just with automaton_type='vpa'

Biggest difference is in the definition of input alphabet: it has to be an instance of SevpaAlphabet with defined internal (can be empty), call, and return alphabets (use lists for reproducabilty).

from aalpy.base import SUL
from aalpy.automata import SevpaAlphabet
from aalpy.oracles import RandomWordEqOracle
from aalpy.learning_algs import run_KV
import warnings
import ast

warnings.filterwarnings("ignore")


class ArithmeticSUL(SUL):
    def __init__(self):
        super().__init__()
        self.string_under_test = ''

    def pre(self):
        self.string_under_test = ''

    def post(self):
        pass

    def step(self, letter):
        if letter:
            self.string_under_test += ' ' + letter if len(self.string_under_test) > 0 else letter

        try:
            # Parse the expression using ast
            parsed_expr = ast.parse(self.string_under_test, mode='eval')
            # Check if the parsed expression is a valid arithmetic expression
            is_valid = all(isinstance(node, (ast.Expression, ast.BinOp, ast.UnaryOp, ast.Num, ast.Name, ast.Load))
                           or isinstance(node, ast.operator) or isinstance(node, ast.expr_context)
                           or (isinstance(node, ast.BinOp) and isinstance(node.op, ast.operator))
                           for node in ast.walk(parsed_expr))
            return is_valid

        except SyntaxError:
            return False


sul = ArithmeticSUL()

alphabet = SevpaAlphabet(internal_alphabet=['1', '+'], call_alphabet=['('], return_alphabet=[')'])

eq_oracle = RandomWordEqOracle(alphabet.get_merged_alphabet(), sul, min_walk_len=2,
                               max_walk_len=10, num_walks=2000)

learned_model = run_KV(alphabet, sul, eq_oracle, automaton_type='vpa')

learned_model.visualize()

Resulting in the following model: