-
Notifications
You must be signed in to change notification settings - Fork 27
SUL Interface, or How to Learn Your Systems
The system under learning (SUL) is a basic abstract class that you have to implement for all systems you would like to learn. When inferring loaded or randomly generated automata, one only has to pass it to the appropriate SUL found in aalpy/SULs/AutomataSUL.py. However, if you would like to learn any other system, you have to implement a reset and step method.
Reset is divided in
- pre() : initialization/startup of the system
- post() : graceful shutdown of the system/memory cleanup
The step method received an element of the input alphabet. Based on the received element, an action associated with said input is performed on the system under learning. The membership query is implemented as a sequence of steps.
For further examples of how to implement SUL refer to: aalpy/SULs/
Input alphabet is a list of any hashable object, most often strings or integers. In the step method of the SUL interface receives an element from the input alphabet and performs a system action based on it.
From standard automatons (loaded from file/randomly generated), you can obtain input alphabet with:
automaton.get_input_alphabet()
To see how class methods and their arguments can be used as the elements of the input alphabet, check out aalpy/SULs/PyMethodSUL and learnPythonClass in Examples.py.
Suppose that you would like to learn a regular expression. More precisely, you would like to learn a grammar that accepts or rejects the same language as the regular expression.
In Python, to check if the regular expression is matched, one can use re
module and its match
method.
So how do we go from re.match(some_regex, some_string)
to the SUL that can be used to learn a DFA representing such regex?
Let us examine the following code snippet that achieves such a task. In the constructor of the RegexSUL we pass a regular expression we would like to learn.
In the pre
method, we initialize the system. In our case, we will reset the string that we will use to form membership queries to the empty string.
In the step method, an element of the input alphabet, denoted by 'letter' will be received as an argument. When learning DFAs, Moore machines, or MDPs, we have to ensure that we account for the None
, as None
is used to query the current state. For Mealy machines, non-deterministic or stochastic formalism None
will never be passed to the step method unless implicitly declared in the input alphabet.
In our example, we will append an element of the input alphabet, a character, to the self. string
and return whether the regular expression accepts the resulting string.
The step
method is called for all inputs in the membership query.
import re
class RegexSUL(SUL):
"""
An example implementation of a system under learning that can be used to learn any regex expression.
Note that the $ is added to the expression as in this SUL only exact matches are learned.
"""
def __init__(self, regex: str):
super().__init__()
self.regex = regex if regex[-1] == '$' else regex + '$'
self.string = ""
def pre(self):
self.string = ""
pass
def post(self):
pass
def step(self, letter):
if letter is not None:
self.string += str(letter)
return True if re.match(self.regex, self.string) else False
Once such SUL is implemented, let us learn a concrete regular expression. Let's define random regex over ['a', 'b', 'c'] input alphabet. Furthermore, we use StatePrefixEqOracle, as it ensures good test coverage.
regex = 'abc(b|c)+(ab|c)*'
alphabet = ['a', 'b', 'c']
regex_sul = RegexSUL(regex)
eq_oracle = StatePrefixEqOracle(alphabet, regex_sul, walks_per_state=100,
walk_len=20)
learned_regex = run_Lstar(alphabet, regex_sul, eq_oracle, automaton_type='dfa', print_level=2)
visualize_automaton(learned_regex)
When we run the presented example, our result is:
Hypothesis 1 has 1 states.
Hypothesis 2 has 7 states.
Hypothesis 3 has 8 states.
-----------------------------------
Learning Finished.
Learning Rounds: 3
Number of states: 8
Time (in seconds)
Total : 0.05
Learning algorithm : 0.0
Conformance checking : 0.05
Learning Algorithm
# Membership Queries : 85
# MQ Saved by Caching : 120
# Steps : 582
Equivalence Query
# Membership Queries : 800
# Steps : 18173
-----------------------------------
and
Suppose that you would like to learn an MQTT protocol. What would be the input alphabet, and how would you implement the SUL interface? You downloaded some clients and want to learn the behavior of the broker. The usual client setup would have the following signature:
class MockMqttExample:
def __init__(self):
self.state = 'CONCLOSED'
self.topics = set()
def subscribe(self, topic: str):
...
def unsubscribe(self, topic):
...
def connect(self):
...
def disconnect(self):
...
def publish(self, topic):
...
Mock implementation of the MQTT protocol can be found in BenchmarksSULs.py under the name "MockMqttExample". We can approach setting up the SUL in two ways.
Usually, we would define our input alphabet to be the abstract representation of possible actions.
The input alphabet for this experiment would be: {connect, disconnect, publish, subscribe, unsubscribe}
.
Each element of the abstract input alphabet will be mapped to the concrete method.
For example:
'connect' -> self.connect()
'publish' -> self.publish(topic='testTopic')
...
The complete learning procedure would then look like this:
from aalpy.base import SUL
from aalpy.oracles import RandomWalkEqOracle
from aalpy.learning_algs import run_Lstar
from aalpy.utils import visualize_automaton, MockMqttExample
class MQTT_SUL(SUL):
def __init__(self):
super().__init__()
self.mqtt = MockMqttExample()
def pre(self):
self.mqtt.state = 'CONCLOSED'
def post(self):
self.mqtt.topics.clear()
def step(self, letter):
if letter == 'connect':
return self.mqtt.connect()
elif letter == 'disconnect':
return self.mqtt.disconnect()
elif letter == 'publish':
return self.mqtt.publish(topic='test')
elif letter == 'subscribe':
return self.mqtt.subscribe(topic='test')
else:
return self.mqtt.unsubscribe(topic='test')
sul = MQTT_SUL()
input_al = ['connect', 'disconnect', 'publish','subscribe', 'unsubscribe']
eq_oracle = RandomWalkEqOracle(input_al, sul, num_steps=2000, reset_after_cex=True, reset_prob=0.15)
learned_mqtt= run_Lstar(input_al, sul, eq_oracle=eq_oracle, automaton_type='mealy', cache_and_non_det_check=True,
print_level=2)
visualize_automaton(learned_mqtt)
When run, the following output is observed:
Hypothesis 1 has 3 states.
-----------------------------------
Learning Finished.
Learning Rounds: 1
Number of states: 3
Time (in seconds)
Total : 0.01
Learning algorithm : 0.01
Conformance checking : 0.0
Learning Algorithm
# Membership Queries : 75
# MQ Saved by Caching : 5
# Steps : 225
Equivalence Query
# Membership Queries : 324
# Steps : 2000
-----------------------------------
Visualized model saved to Model_Visualization.pdf.
and
Let us now consider this example as a challenge where we would like to learn a Python class's behavior.
In this case, we would wrap class methods and their arguments in the FunctionDecorator
and use PyMethodSUL
.
Reset is then implemented by reinitializing the class.
from aalpy.SULs import PyClassSUL, FunctionDecorator
from aalpy.oracles import RandomWalkEqOracle
from aalpy.learning_algs import run_Lstar
from aalpy.utils import visualize_automaton, MockMqttExample
# class under learning (do not instantiate it)
mqtt = MockMqttExample
# methods weapped in the function decorators
input_al = [FunctionDecorator(mqtt.connect), FunctionDecorator(mqtt.disconnect),
FunctionDecorator(mqtt.subscribe, 'topic'), FunctionDecorator(mqtt.unsubscribe, 'topic'),
FunctionDecorator(mqtt.publish, 'topic')]
# SUL
sul = PyClassSUL(mqtt)
eq_oracle = RandomWalkEqOracle(input_al, sul, num_steps=2000, reset_after_cex=True, reset_prob=0.15)
learned_mqtt = run_Lstar(input_al, sul, eq_oracle=eq_oracle, automaton_type='mealy', cache_and_non_det_check=True)
visualize_automaton(learned_mqtt)
Execution yields the same results as the previously stated MQTT example.