- Tools
- Important Classes
- Protobuf
- Design Patterns
- Coroutines
- Finite State Machines
- Conventions
- Architecture Overview
A few commonly-used terms and tools to be familiar with:
- This is the shared vision system used by the Small Size League. It is what connects to the cameras above the field, does the vision processing, and transmits the positional data of everything on the field to our AI computers.
- The GitHub repository can be found here
- Sometimes referred to as the "Referee", this is another shared piece of Small Size League software that is used to send gamecontroller and referee commands to the teams. A human controls this application during the games to send the appropriate commands to the robots. For example, some of these commands are what stage the gameplay is in, such as HALT, STOP, READY, or PLAY.
- The GitHub repository can be found here
These are classes that are either heavily used in our code, or are very important for understanding how the AI works, but are not core components of the AI or other major modules. To learn more about these core modules and their corresponding classes, check out the sections on the Backend, Sensor Fusion, AI, and Thunderscope.
The World class is what we use to represent the state of the world at any given time. In this context, the world includes the positions and orientations of all robots on the field, the position and velocity of the ball, the dimensions of the field being played on, and the current referee commands. Altogether, it's the information we have at any given time that we can use to make decisions.
A team is a collection of Robots.
A Robot class represents the state of a single robot on the field. This includes its position, orientation, velocity, angular velocity, and any other information about its current state.
The Ball class represents the state of the ball. This includes its position and velocity, and any other information about its current state.
The Field class represents the state of the physical field being played on, which is primarily its physical dimensions. The Field class provides many functions that make it easy to get points of interest on the field, such as the enemy net, friendly corner, or center circle. Also see the coordinate convention we use for the field (and all things on it).
These represent the current state of the game as dictated by the Gamecontroller. These provide functions like isPlaying() and isHalted(), which tell the rest of the system what game state we are in so that it can make decisions accordingly. We need to obey the rules!
An Intent represents a simple thing the AI wants (or intends for) a robot to do, but is at a level that requires knowledge of the state of the game and the field (e.g. Referee state, location of the other robots). It does not represent or include how these things are achieved. Some examples are:
- Moving to a position without colliding with anything on its way and while following all rules
- Pivoting around a point
- Kicking the ball at a certain direction or at a target
There are two types of Intents: DirectPrimitiveIntents and NavigatingIntents. DirectPrimitiveIntents directly represent the Primitives that the AI is trying to send to the robots. NavigatingIntents are intents that require moving while avoiding obstacles, so they contain extra parameters to help with Navigation.
Dynamic Parameters are the system we use to change values in our code at runtime. The reason we want to change values at runtime is primarily because we may want to tweak our strategy or aspects of our gameplay very quickly. During games we are only allowed to touch our computers and make changes during halftime or a timeout, so every second counts! Using Dynamic Parameters saves us from having to stop the AI, change a constant, recompile the code, and restart the AI.
Additionally, we can use Dynamic Parameters to communicate between Thunderscope and the rest of our system. Thunderscope can change the values of Dynamic Parameters when buttons or menu items are clicked, and these new values will be picked up by the rest of the code. For example, we can define a Dynamic Parameter called run_ai that is a boolean value. Then when the "Start AI" button is clicked in Thunderscope, it sets the value of run_ai to true. The "main loop" for the AI checks whether run_ai is true before running its logic.
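The run_ai idea can be sketched in a few lines of C++. This is a minimal sketch under stated assumptions, not our actual Dynamic Parameters implementation: the `DynamicParameter` class template and the `runAiTick` function are hypothetical names invented for illustration, and the sketch assumes the value type is small enough to live in a `std::atomic`.

```cpp
#include <atomic>

// Hypothetical, simplified stand-in for a Dynamic Parameter: a value that one
// thread (e.g. the GUI) can change while another thread (e.g. the AI) reads it.
template <typename T>
class DynamicParameter
{
   public:
    explicit DynamicParameter(T initial_value) : value_(initial_value) {}

    T value() const
    {
        return value_.load();
    }

    void setValue(T new_value)
    {
        value_.store(new_value);
    }

   private:
    std::atomic<T> value_;
};

// Hypothetical single iteration of the AI "main loop": the logic only runs
// while run_ai is true. Returns whether the logic ran on this tick.
bool runAiTick(const DynamicParameter<bool>& run_ai, int& ticks_run)
{
    if (!run_ai.value())
    {
        return false;
    }
    ++ticks_run;  // stand-in for the real AI logic
    return true;
}
```

Because the loop re-reads the parameter on every tick, a change made from the GUI takes effect on the next iteration without restarting or recompiling anything.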
Here's a slightly more relevant example of how we used Dynamic Parameters during a game in RoboCup 2019. We had a parameter called enemy_team_can_pass, which indicates whether or not we think the enemy team can pass. This parameter was used in several places in our defensive logic, and specifically affected how we would shadow enemy robots when we were defending them. If we assumed the enemy team could pass, we would shadow between the robots and the ball to block any passes, otherwise we would shadow between the enemy robot and our net to block shots. At the start of a game, we had enemy_team_can_pass set to false, but the enemy did start to attempt some passes during the game. However, we didn't want to use one of our timeouts to change the value. Luckily, later during the half, the enemy team took a timeout. Because Dynamic Parameters can be changed quickly without stopping the AI, we were able to change enemy_team_can_pass to true while the enemy team took their timeout. This made our defence much better against that team and didn't take so much time that we had to burn our own timeout. Altogether this is an example of how we use Dynamic Parameters to control our AI and other parts of the code.
It is worth noting that constants are still useful, and should still be used whenever possible. If a value realistically doesn't need to be changed, it should be a constant (with a nice descriptive name) rather than a Dynamic Parameter. Having too many Dynamic Parameters is overwhelming because there are too many values to understand and change, and this can make it hard to tune values to get the desired behaviour while under pressure during a game.
Protobufs or protocol buffers are used to pass messages between components in our system.
After building using Bazel, the .proto files are generated into .pb.h and .pb.cc files, which are found in bazel-out/k8-fastbuild/bin/proto.
To include these files in our code, we simply include proto/<protobuf_filename>.pb.h.
These are protobuf messages that we define and that are important for understanding how the AI works.
TbotsProto::Primitives represent simple actions that robots blindly execute (e.g. send signals to motor drivers), so it's up to the AI to send Primitives that follow all the rules and avoid collisions with obstacles. Some examples are:
- Moving in a straight line to a position
- Pivoting around a point
- Kicking the ball at a certain direction
Primitives act as the abstraction between our AI and our robot firmware. This abstraction splits the responsibility such that the AI is responsible for sending a Primitive to a robot telling it what it wants it to do, and the robot is responsible for making sure it does what it's told. For every Primitive protobuf message, there is an equivalent Primitive implementation in our robot firmware. When robots receive a Primitive command, they perform their own logic and control in order to perform the task specified by the Primitive.
The TbotsProto::RobotStatus protobuf message contains information about the status of a single robot. Examples of the information it includes are:
- Robot battery voltage
- Whether or not the robot senses the ball in the breakbeam
- The capacitor charge on the robot
- The temperature of the dribbler motor
Information about the robot status is communicated and stored as RobotStatus protobuf messages. Thunderscope displays warnings from incoming RobotStatus messages so we can take appropriate action. For example, during a game we may get a "Low battery warning" for a certain robot, and then we know to substitute it and replace the battery before it dies on the field.
Below are the main design patterns we use in our code, and what they are used for.
Abstract classes let us define interfaces for various components of our code. Then we can implement different objects that obey the interface, and use them interchangeably, with the guarantee that as long as they follow the same interface we can use them in the same way.
Read https://www.geeksforgeeks.org/inheritance-in-c/ for more information.
Examples of this can be found in many places, including:
The Singleton pattern is useful for having a single, global instance of an object that can be accessed from anywhere. Though it's generally considered an anti-pattern (aka bad), it is useful in specific scenarios.
Read https://refactoring.guru/design-patterns/singleton for more information.
We use the Singleton pattern for our logger. This allows us to create a single logger for the entire system, and code can make calls to the logger from anywhere, rather than us having to pass a logger object literally everywhere.
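For readers who have not seen the pattern before, here is a minimal sketch of a Singleton in C++. It is not our real logger: the `Logger` class, its `log` method, and the in-memory message list are hypothetical and exist only to show the shape of the pattern.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical logger used only to illustrate the Singleton pattern
class Logger
{
   public:
    // The function-local static is constructed once, on first use.
    // Since C++11 this initialization is thread-safe.
    static Logger& getInstance()
    {
        static Logger instance;
        return instance;
    }

    // Deleting the copy operations prevents accidental second instances
    Logger(const Logger&) = delete;
    Logger& operator=(const Logger&) = delete;

    // Note: this sketch does not protect messages_ against concurrent calls
    void log(const std::string& message)
    {
        messages_.push_back(message);
    }

    std::size_t messageCount() const
    {
        return messages_.size();
    }

   private:
    // Private constructor: only getInstance() can create the Logger
    Logger() = default;

    std::vector<std::string> messages_;
};
```

Any code can then call `Logger::getInstance().log("...")` without a logger object being passed to it.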
The Factory pattern is useful for hiding or abstracting how certain objects are created.
Read the Refactoring Guru articles on the Factory Method pattern and the Abstract Factory pattern for more information.
Because the Factory needs to know about what objects are available to be created, it can be taken one step further to auto-register these object types. Rather than a developer having to remember to add code to the Factory every time they create a new class, this can be done "automatically" with some clever code. This helps reduce mistakes and saves developers work.
Read http://derydoca.com/2019/03/c-tutorial-auto-registering-factory/ for more information.
The auto-registering factory is particularly useful for our PlayFactory, which is responsible for creating Plays. Every time we run our AI we want to know what Plays are available to choose from. The Factory pattern makes this really easy, and saves us having to remember to update some list of "available Plays" each time we add or remove one.
The Factory pattern is also used to create different Backends.
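The auto-registration trick can be sketched as follows. This is an illustration of the general technique, not the code of our PlayFactory: `PlayRegistry`, `Registrar`, and the trimmed-down `Play` and `HaltPlay` classes are hypothetical names chosen for the example.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Trimmed-down, hypothetical Play interface
class Play
{
   public:
    virtual ~Play() = default;
    virtual std::string name() const = 0;
};

class PlayRegistry
{
   public:
    using Creator = std::function<std::unique_ptr<Play>()>;

    // A function-local static avoids static initialization order problems
    static std::map<std::string, Creator>& creators()
    {
        static std::map<std::string, Creator> registry;
        return registry;
    }

    // Returns nullptr if no Play was registered under the given name
    static std::unique_ptr<Play> create(const std::string& name)
    {
        auto it = creators().find(name);
        if (it == creators().end())
        {
            return nullptr;
        }
        return it->second();
    }
};

// Constructing a Registrar<T> adds a creator for T to the registry
template <typename T>
struct Registrar
{
    explicit Registrar(const std::string& name)
    {
        PlayRegistry::creators()[name] = [] { return std::make_unique<T>(); };
    }
};

class HaltPlay : public Play
{
   public:
    std::string name() const override
    {
        return "HaltPlay";
    }
};

// This single line, placed next to the class, is all a new Play needs.
// It runs during static initialization, before main() starts.
static Registrar<HaltPlay> halt_play_registrar("HaltPlay");
```

The factory itself never has to be edited when a Play is added or removed, which is the property the text above describes.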
The Visitor pattern is useful when we need to perform different operations on a group of "similar" objects, like objects that inherit from the same parent class (e.g. Tactic). We might only know all these objects are a Tactic, but we don't know specifically which type each one is (e.g. AttackerTactic vs ReceiverTactic). The Visitor Pattern helps us "recover" that type information so we can perform different operations on the different types of objects. It is generally preferred to a big if-block with a case for each type, because the compiler can help warn you when you've forgotten to handle a certain type, and therefore helps prevent mistakes.
Read https://refactoring.guru/design-patterns/visitor for more information.
An example of where we use the Visitor pattern is in our MotionConstraintVisitor. This visitor allows us to update the current set of motion constraints based on the types of tactics that are currently assigned.
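The mechanics of the Visitor pattern (often called double dispatch) can be sketched as below. The classes are deliberately empty stand-ins: `TacticVisitor` and `TacticNameVisitor` are hypothetical, and the real Tactic classes and MotionConstraintVisitor have different interfaces.

```cpp
#include <string>

class AttackerTactic;
class ReceiverTactic;

class TacticVisitor
{
   public:
    virtual ~TacticVisitor() = default;
    // One pure virtual overload per concrete type. Adding a new Tactic type
    // here makes every visitor that forgets to handle it fail to compile.
    virtual void visit(const AttackerTactic& tactic) = 0;
    virtual void visit(const ReceiverTactic& tactic) = 0;
};

class Tactic
{
   public:
    virtual ~Tactic() = default;
    virtual void accept(TacticVisitor& visitor) const = 0;
};

class AttackerTactic : public Tactic
{
   public:
    // *this has the concrete type here, so the right overload is chosen
    void accept(TacticVisitor& visitor) const override
    {
        visitor.visit(*this);
    }
};

class ReceiverTactic : public Tactic
{
   public:
    void accept(TacticVisitor& visitor) const override
    {
        visitor.visit(*this);
    }
};

// Example visitor that records which kind of Tactic it was given
class TacticNameVisitor : public TacticVisitor
{
   public:
    void visit(const AttackerTactic&) override
    {
        name = "attacker";
    }
    void visit(const ReceiverTactic&) override
    {
        name = "receiver";
    }

    std::string name;
};
```

A caller that only holds a `const Tactic&` calls `tactic.accept(visitor)` and the visitor ends up in the overload for the concrete type, with no if-block or cast.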
The Observer pattern is useful for letting components of a system "notify" each other when something happens. Read https://refactoring.guru/design-patterns/observer for a general introduction to the pattern.
Our implementation of this pattern consists of two classes, Observer and Subject. Observers can be registered with a Subject, after which new values will be sent from each Subject to all of its registered Observers. Please see the headers of both classes for details. Note that a class can extend both Observer and Subject, thus receiving and sending out data. In this way we can "chain" multiple classes.
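The registration and chaining described above can be sketched with a simplified, single-threaded version of the two classes. The method names (`registerObserver`, `receiveValue`, `sendValueToObservers`) and the example classes `Source`, `Doubler`, and `Recorder` are assumptions made for this sketch; check the real headers for the actual interface.

```cpp
#include <vector>

template <typename T>
class Observer
{
   public:
    virtual ~Observer() = default;
    virtual void receiveValue(const T& value) = 0;
};

template <typename T>
class Subject
{
   public:
    // The Subject does not own its observers; they must outlive it
    void registerObserver(Observer<T>& observer)
    {
        observers_.push_back(&observer);
    }

   protected:
    void sendValueToObservers(const T& value)
    {
        for (Observer<T>* observer : observers_)
        {
            observer->receiveValue(value);
        }
    }

   private:
    std::vector<Observer<T>*> observers_;
};

// A pure Subject that produces values
class Source : public Subject<int>
{
   public:
    void publish(int value)
    {
        sendValueToObservers(value);
    }
};

// Both an Observer and a Subject: receives a value and forwards a new one,
// which is how classes can be "chained"
class Doubler : public Observer<int>, public Subject<int>
{
   public:
    void receiveValue(const int& value) override
    {
        sendValueToObservers(2 * value);
    }
};

// A pure Observer at the end of the chain
class Recorder : public Observer<int>
{
   public:
    void receiveValue(const int& value) override
    {
        last_value = value;
    }

    int last_value = 0;
};
```

Registering a `Doubler` with a `Source`, and a `Recorder` with the `Doubler`, forms a chain in which a published value flows through both stages.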
In our system, we need to be able to do multiple things (receive camera data, run the AI, send commands to the robots) at the same time. In order to facilitate this, we extend the Observer to the ThreadedObserver class. The ThreadedObserver starts a thread with an infinite loop that waits for new data from the Subject and performs some operation with it.
WARNING: If a class extends multiple ThreadedObservers (for example, AI could extend ThreadedObserver<World> and ThreadedObserver<RobotStatus>), then there will be two threads running, one for each observer. We do not check for data races between observers, so it's entirely possible that one ThreadedObserver thread could read or write data at the same time as the other ThreadedObserver thread is reading or writing the same data. Please make sure any data accessed from multiple ThreadedObservers is thread-safe.
One example of this is SensorFusion, which extends Subject<World>, and the AI, which extends ThreadedObserver<World>. SensorFusion runs in one thread and sends data to the AI, which receives and processes it in another thread.
The publisher-subscriber pattern ("pub-sub") is a messaging pattern for facilitating communication between different components. It is closely related to the message queue design pattern.
In this pattern, Publishers send messages without knowing who the recipients (Subscribers) are. Subscribers express interest in specific types of messages by subscribing to relevant topics; when a Publisher sends a message on a topic, the messaging system ensures that all interested subscribers receive the message.
Read https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern for an introduction to the pub-sub pattern.
We use the pub-sub pattern to facilitate inter-process communication in our system. Through a class called ProtoUnixIO, components can subscribe to receive certain Protobuf message types sent out by other processes or system components.
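To make the decoupling concrete, here is a minimal in-process sketch of pub-sub in C++. It is not ProtoUnixIO and does no inter-process communication: the `MessageBus` class is hypothetical, topics are plain strings, and messages are plain strings instead of Protobuf messages.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-process message bus used only to illustrate pub-sub
class MessageBus
{
   public:
    using Callback = std::function<void(const std::string&)>;

    // A subscriber expresses interest in one topic
    void subscribe(const std::string& topic, Callback callback)
    {
        subscribers_[topic].push_back(std::move(callback));
    }

    // The publisher only names the topic; it never sees the subscribers
    void publish(const std::string& topic, const std::string& message) const
    {
        auto it = subscribers_.find(topic);
        if (it == subscribers_.end())
        {
            return;  // nobody is interested in this topic
        }
        for (const Callback& callback : it->second)
        {
            callback(message);
        }
    }

   private:
    std::map<std::string, std::vector<Callback>> subscribers_;
};
```

A subscriber registered for the "World" topic receives every message published on "World" and nothing published on other topics, and the publisher's code does not change when subscribers are added or removed.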
While debatably not a design pattern depending on who you ask, templating in C++ is a powerful tool that is very useful to understand. https://www.geeksforgeeks.org/templates-cpp/ gives a great explanation and example.
We use templating in a few places around the codebase, with the most notable examples being our Factory Design Patterns, and our Gradient Descent optimizer.
Coroutines are a general control structure where the flow control is cooperatively passed between two different routines without returning, by allowing execution to be suspended and resumed. This is very similar to the yield statement and generators in Python.
Rather than using the return keyword to return data, coroutines use the yield keyword. The main difference is that when return is encountered, the data is returned and the function terminates. If the function is called again, it starts back from the beginning. On the other hand, when yield is encountered some data is returned, but the state of the function / coroutine is saved and the function does not terminate. This means that when the function is called again, execution resumes immediately after the yield statement that previously returned the data, with all the previous context (variables, etc) as if the function never stopped running. This is the "suspend and resume" functionality of coroutines.
See the following C++ pseudocode for an example. This coroutine function computes and returns the fibonacci sequence.
int fib(Coroutine::push_type& yield) {
    int f1 = 1;
    int f2 = 0;
    while(true) {
        int fn = f1 + f2; // Compute the next value in the sequence
        f2 = f1; // Save the previous 2 values
        f1 = fn;
        yield(fn);
    }
}
int main() {
    // Coroutine setup stuff
    // Let's pretend that we have created the Coroutine and called it `yield`
    std::cout << fib(yield) << std::endl; // Prints 1
    std::cout << fib(yield) << std::endl; // Prints 2
    std::cout << fib(yield) << std::endl; // Prints 3
    std::cout << fib(yield) << std::endl; // Prints 5
    std::cout << fib(yield) << std::endl; // Prints 8
    // and so on...
}
Let's walk through what's happening here:
- The first time the fib function is called, the variables f1 and f2 are initialized, and we go through the first iteration of the loop until yield is encountered
- The yield statement is going to return the currently computed value of the fibonacci sequence (the variable fn) and save the state of the fib function
  - "yielding" the data here is effectively returning it so that the code in the main function can print the result
- The second time main() calls the fib() function, the function will resume immediately after the yield() statement. This means that execution will go back to the top of the loop, and still remember the values of f1 and f2 from the last time the function was called. Since the coroutine saved the function state, it still has the previous values of f1 and f2 which it uses to compute the next value in the sequence.
- Once again when the yield() statement is reached, the newly computed value is returned and the function state is saved. You can think of this as "pausing" the function.
- As main() keeps calling the fib() function, it is computing and returning the values of the fibonacci sequence, and this only works because the coroutine "remembers" the values from each previous fibonacci computation which it uses to compute the next value the next time the function is called.
  - If the yield was replaced with a regular return statement, the function would only ever return the value 1. This is because using return would not save the function state, so the next time it's called the function would start at the beginning again, and only ever compute the first value of the sequence.
This example / pseudocode does hide away some details about how coroutines are set up and how we extract values from them, but it's most important to understand how coroutines change the flow of control in the program.
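One way to see exactly what state the coroutine preserves is to write the same sequence by hand, without any coroutine library. The sketch below is not how Boost coroutines work internally, and `FibGenerator` is a hypothetical name: it simply stores f1 and f2 as members so they survive between calls, which is the bookkeeping that yield does for us automatically.

```cpp
// Hand-written equivalent of the fib example above. The member variables
// play the role of the local variables that a coroutine keeps alive
// across each yield.
class FibGenerator
{
   public:
    // Each call performs one loop iteration and "yields" the next value
    int next()
    {
        int fn = f1_ + f2_;  // Compute the next value in the sequence
        f2_    = f1_;        // Save the previous 2 values
        f1_    = fn;
        return fn;
    }

   private:
    int f1_ = 1;
    int f2_ = 0;
};
```

Calling `next()` five times on one `FibGenerator` produces 1, 2, 3, 5, 8, matching the pseudocode. For a two-variable loop this is easy to write by hand, but for multi-stage logic the saved state includes where in the function execution stopped, which is where coroutines become much more convenient.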
We use the boost Coroutine2 library. Specifically, we use Asymmetric Coroutines.
This Stack Overflow answer gives a decent explanation of the difference between Symmetric and Asymmetric Coroutines, but understanding the difference is not critical for our purposes. We use Asymmetric Coroutines because boost does not provide Symmetric Coroutines, and the hierarchical structure of Asymmetric Coroutines is more useful to us.
We use Coroutines to write our strategy logic. The "pause and resume" functionality of Coroutines makes it much easier to write Plays.
Specifically, we use Coroutines as a way to break down our strategy into "stages". Once a "stage" completes we generally don't want to re-evaluate it, and would rather commit to a decision and move on. Coroutines make it much easier to write "stages" of strategy without requiring complex state machine logic to check what stage we are in, and it's easier for developers to see what the intended order of operations is (e.g. "Line up to take the shot" -> "shoot").
In the past, our gameplay logic had trouble "committing" to decisions when we were near certain edge cases. This caused robots to behave oddly, and sometimes get significantly slowed down in "analysis paralysis". Coroutines solve this problem by allowing us to write "stages" that execute top-to-bottom in a function, and once we make a decision we commit to it and move on to the next stage.
Here's a more specific example. In this example we are going to pretend to write a Tactic that will pass the ball.
void executeStrategy(IntentCoroutine::push_type& yield, Pass pass) {
    do {
        yield(/* align the robot to make the pass */);
    } while(current_time < pass.start_time);
    do {
        yield(/* kick the ball at the pass location */);
    } while(/* robot has not kicked the ball */);
}
We will pretend that this function is getting called 30 times per second to get the most up-to-date gameplay decision.
In this example, each do while() loop is a "stage". When the function is first called, we enter the first stage. In this stage, we will keep telling the robot to line up behind the ball to be ready to make the pass. The robot will continue to do this until it is time to start the pass.
Once it is time to start the pass, the condition for the loop will become false and we will exit the loop. Then we enter the second loop / stage. The second stage tells the robot to kick the ball, and this continues until the ball has been kicked. Once the ball has been kicked, the loop will terminate and the function will end because the execution reaches the end of the function.
Once we have entered the second stage, we know we don't have to look at the first stage again. Because the coroutine "remembers" where the execution is each time the function is called, we will resume inside the second stage and therefore never execute the first stage again! This makes it much easier to write and read this strategy code, because we can clearly see the 2 stages of the strategy, and we know they will be executed in order.
Coroutines are a complex feature, and the boost coroutines we use don't always behave in ways we expect. We have done extensive testing on how coroutines are safe (or not safe) to use, and derived some best practices from these examples. See coroutine_test_exmaples.cpp for the full code and more detailed explanations.
To summarize, the best practices are as follows:
- Avoid moving coroutines. If they absolutely must be moved, make sure they are not moved between the stack and heap.
- Avoid using coroutines with resizable containers. If they must be used, make sure that the coroutines are allocated on the heap.
- Pass data to the coroutine on creation as much as possible, avoid using member variables.
A finite state machine (FSM) is a system with a finite number of states with defined transitions and outputs based on the inputs to the system. In particular, we are interested in hierarchical state machines where we can transition between states in terms of when states should transition (guards) and what should happen when transitions occur (actions), given a specific input (event). Hierarchical state machines are state machines that are composed of one or more FSMs, which we call sub-FSMs. The parent FSM can treat a sub-FSM as a state with guards and actions when transitioning to and from the sub-FSM. When the sub-FSM enters a terminal state, the parent FSM is able to automatically transition to another state.
We use the Boost-Ext SML, short for State Machine Library, to manage our finite state machines. This library defines state machines through a transition table, where a row indicates the transition from one state to another subject to guards, actions and events. The syntax of a row of the transition table looks like this:
src_state + event [guard] / action = dest_state
where the src_state transitions to the dest_state, while performing the action, only if the event is processed and the guard is true. Events are structs of new information that FSMs receive, so guards and actions take events as arguments. Guards must return a boolean and actions must return void. An asterisk (*) at the start of a row indicates that the state is an initial state. The rows of the transition table are processed in order and the first row to match is executed.
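The row semantics described above (guards, actions, and "first matching row wins") can be illustrated with a small hand-rolled table in plain C++. This sketch deliberately does not use the SML library or its syntax: `TinyFSM`, `Transition`, `Update`, `makeMoveFsm`, and the 0.05 distance threshold are all hypothetical and exist only to show how a transition table is evaluated.

```cpp
#include <functional>
#include <utility>
#include <vector>

enum class State { MOVE, DONE };

// Event: the new information the FSM receives on every update
struct Update
{
    double distance_to_destination;
};

// One row of the table: src_state + event [guard] / action = dest_state
struct Transition
{
    State src;
    std::function<bool(const Update&)> guard;   // guards return a boolean
    std::function<void(const Update&)> action;  // actions return void
    State dest;
};

class TinyFSM
{
   public:
    explicit TinyFSM(std::vector<Transition> table) : table_(std::move(table)) {}

    void process(const Update& event)
    {
        // Rows are checked in order and only the first match is executed
        for (const Transition& row : table_)
        {
            if (row.src == state_ && row.guard(event))
            {
                row.action(event);
                state_ = row.dest;
                return;
            }
        }
    }

    State state() const { return state_; }

   private:
    std::vector<Transition> table_;
    State state_ = State::MOVE;  // the initial state
};

// Builds a move-like FSM. move_commands counts how often the move action ran
// and must outlive the returned FSM.
TinyFSM makeMoveFsm(int& move_commands)
{
    const auto at_destination = [](const Update& e) { return e.distance_to_destination < 0.05; };
    const auto always         = [](const Update&) { return true; };
    const auto do_nothing     = [](const Update&) {};
    const auto command_move   = [&move_commands](const Update&) { ++move_commands; };

    std::vector<Transition> table = {
        {State::MOVE, at_destination, do_nothing, State::DONE},
        {State::MOVE, always, command_move, State::MOVE},
    };
    return TinyFSM(std::move(table));
}
```

Row order matters here: because the "at destination" row comes first, the catch-all second row only fires while the robot is still far away. Once the state is DONE no row matches, so further events are ignored.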
The library also supports hierarchical FSMs. Sub-FSMs are treated as states where an unconditional transition occurs when the sub-FSM is in the terminal state, X.
/* omitted rows of transition table */
SubFSM = next_state, // Transitions to next_state only when the SubFSM is in the terminal state, X
/* omitted rows of transition table */
In order to update a subFSM with an event, we need to do the following:
const auto update_sub_fsm_action =
    [](auto event, back::process<TypeOfSubFSMEvent> processEvent) {
        TypeOfSubFSMEvent sub_fsm_event = // initialize the subFSM event
        processEvent(sub_fsm_event);
    };
The convenience of this syntax comes at the cost of hard to read error messages due to the functor and templating system.
We use SML to manage our Tactics. Each state represents a stage in the tactic where the robot should be doing a particular action or looking for certain conditions to be true. An example of this is the MoveFSM. While the robot is not at the destination and oriented correctly, the FSM is in the move state. Once the robot reaches its destination, it enters the terminal state, X, to indicate that it's done. SML also allows us to easily reuse FSMs in other tactics. For example, if a shadowing tactic needs to move to a particular destination with a certain orientation, then it can use the MoveFSM as a sub-FSM state.
Boost-ext SML is a library that supports complex functionality with similarly complex syntax and semantics. If complex syntax is misused, the complicated error messages can make development difficult. Thus, we need to carefully choose a standardized subset of the library's syntax to implement our functionality while maintaining high readability.
- Only use one event per FSM: In gameplay, we react to changes in the World, so since there's only one source of new information, we should only need one event
- Only one guard or action per transition: For readability of the transition table, we should only have one guard or action per transition. This can always be achieved by defining a guard or action outside of the transition table that checks multiple conditions or performs multiple actions if that's required.
- Define guards and actions outside of the transition table: The names of guards and actions should be succinct so that transition tables rows fit on one line and readers can easily understand the FSM from the transition table. In other words, no lambdas/anonymous functions in transition tables.
- States should be defined as classes in the FSM struct so that users of the FSM can check what state the FSM is in:
// inside the struct
class KickState;
// inside the operator()()
const auto kick_s = state<KickState>;
// allows for this syntax
fsm.is(boost::sml::state<MyFSM::MyState>)
- Avoid entry and exit conditions: Everything that can be implemented with entry and exit conditions can easily be implemented as actions, so this rule reduces a source of confusion for the reader
- Avoid self transitions, i.e. src_state + event [guard] / action = src_state: self transitions call entry and exit conditions, which complicates the FSM. If we want a state to stay in the same state while performing an action, then we should use an internal transition, i.e. src_state + event [guard] / action.
- Avoid orthogonal regions: Multiple FSMs running in parallel is hard to reason about and isn't necessary for implementing single robot behaviour. Thus, only prefix one state with an asterisk (*)
- Use callbacks in events to return information from the FSM: Since the SML library cannot directly return information, we need to return information through callbacks. For example, if we want to return a double from an FSM, we can pass in std::function<void(double)> callback as part of the event and then make the action call that function with the value we want returned.
- When a variable needs to be shared between multiple states or can be initialized upon construction of the FSM, then define a private member and constructor in the FSM struct, and pass that in when constructing the FSM. Here's a code snippet:
(drive_forward_fsm.h)
struct DriveForwardFSM
{
   public:
    DriveForwardFSM(double max_speed): max_speed(max_speed){}
   private:
    double max_speed;
};
(drive_forward_tactic.h)
FSM<DriveForwardFSM> fsm;
(drive_forward_tactic.cpp: constructor)
fsm(DriveForwardFSM(10.0))
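The callback convention from the list above can be sketched without SML. In this illustration the `DriveForwardSketch` struct, its nested `Update` event, and the `driveForward` action are hypothetical: the point is only that an action returns void and hands its result back through a std::function carried inside the event, while max_speed is a private member set at construction.

```cpp
#include <algorithm>
#include <functional>

struct DriveForwardSketch
{
   public:
    // The event carries both the input and a way to report the result
    struct Update
    {
        double requested_speed;
        std::function<void(double)> callback;
    };

    explicit DriveForwardSketch(double max_speed) : max_speed(max_speed) {}

    // Action: returns void, so the commanded speed is "returned" by
    // invoking the callback stored in the event
    void driveForward(const Update& event) const
    {
        event.callback(std::min(event.requested_speed, max_speed));
    }

   private:
    double max_speed;
};
```

The caller typically passes a lambda that captures a local variable by reference, for example `[&speed](double s) { speed = s; }`, and reads that variable after the event has been processed.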
Various conventions we use and follow that you need to know.
We use a slightly custom coordinate convention to make it easier to write our code in a consistent and understandable way. This is particularly important for any code handling gameplay logic and positions on the field.
The coordinate system is a simple 2D x-y plane. The x-dimension runs between the friendly and enemy goals, along the longer dimension of the field. The y-dimension runs perpendicular to the x-dimension, along the short dimension of the field.
Because we have to be able to play on either side of a field during a game, this means the "friendly half of the field" will not always be in the positive or negative x part of the coordinate plane. This inconsistency is a problem when we want to specify points like "the friendly net", or "the enemy corner". We can't simply say the friendly net is (-4.5, 0) all the time, because this would not be the case if we were defending the other side of the field where the friendly net would be (4.5, 0).
In order to overcome this, our convention is that:
- The friendly half of the field is always negative x, and the enemy half of the field is always positive x
- y is positive to the "left" of someone looking at the enemy goal from the friendly goal
- The center of the field (inside the center-circle) is the origin / (0, 0)
This is easiest to understand in the diagram below.
Based on what side we are defending, Sensor Fusion will transform all the coordinates of incoming data so that it will match our convention. This means that from the perspective of the rest of the system, the friendly half of the field is always negative x and the enemy half is always positive x. Now when we want to tell a robot to move to the friendly goal, we can simply tell it to move to (-4.5, 0) and we know this will always be the friendly side. All of our code is written with this assumption in mind.
Going along with our coordinate convention, we have a convention for angles as well. An Angle of 0 is along the positive x-axis (facing the enemy goal), and positive rotation is counter-clockwise (from a perspective above the field, looking at it like a regular x-y plane where +y is "up"). See the diagram below.
Because of our Coordinate Conventions, this means that an angle of 0 will always face the enemy net regardless of which side of the field we are actually defending.
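As a rough illustration of what such a transformation involves, the sketch below converts a raw point and a raw orientation into the friendly-is-negative-x convention. It assumes the raw data arrives in a fixed field frame and that we know whether we are defending the positive-x side of that frame; the function names and the `Point` struct are hypothetical and this is not the actual Sensor Fusion code.

```cpp
#include <cmath>

struct Point
{
    double x;
    double y;
};

// If we defend the positive-x side of the raw frame, rotate everything by
// 180 degrees about the origin. Negating both x and y (rather than only x)
// keeps +y on the "left" when looking from the friendly goal to the enemy goal.
Point toFriendlyFrame(Point raw, bool defending_positive_x)
{
    if (!defending_positive_x)
    {
        return raw;
    }
    return Point{-raw.x, -raw.y};
}

// Orientations rotate by the same 180 degrees. The result is normalized
// to the range [-pi, pi].
double angleToFriendlyFrame(double raw_radians, bool defending_positive_x)
{
    const double pi      = std::acos(-1.0);
    const double rotated = defending_positive_x ? raw_radians + pi : raw_radians;
    return std::atan2(std::sin(rotated), std::cos(rotated));
}
```

With this in place, a friendly net that the raw data reports at (4.5, 0) becomes (-4.5, 0), and a robot that was facing along the raw positive x-axis (angle 0) is reported as facing our own net (angle of magnitude pi).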
At a high-level, our system is split into several independent processes that communicate with each other. Our architecture is designed in this manner to promote decoupling of different features, making our system easier to expand, maintain, and test.
- Thunderscope is the main entry point of our system and provides the GUI for our software.
- Fullsystem is the "backend" that processes data and makes decisions for a team of robots. It manages Sensor Fusion, which is responsible for processing and filtering raw data, and the AI that makes gameplay decisions.
- The Simulator provides a physics simulation of the World, enabling testing of our gameplay when we don't have access to a real field. This process is optional and used only for development and testing purposes; in a real match, our system will receive data from SSL-Vision.
- Thunderloop is responsible for coordinating communication between our AI computer and the motor and power boards in our robots. It is part of our robot software architecture, which is documented here.
Fullsystem processes data and makes decisions for a team of robots. It manages Sensor Fusion, which is responsible for processing and filtering raw data, and the AI that makes gameplay decisions.
Data within Fullsystem is shared between components using the observer pattern; for instance, Sensor Fusion and the Backend are Subjects that the AI observes.
Fullsystem contains a Backend responsible for all communication with the "outside world". The responsibilities of the Backend can be broken down into communication using SensorProto and Primitives messages:
- Upon receiving the following messages from the network, the Backend will store them in a SensorProto message and send it to Sensor Fusion:
  - Robot status messages
  - Vision data about where the robots and ball are (typically from SSL-Vision)
  - Referee commands (typically from the SSL-Gamecontroller)
- Upon receiving Primitives from the AI, the Backend will send the primitives to the robots or the Simulator.
The Backend was designed to be a simple interface that handles all communication with the "outside world", allowing for different implementations that can be swapped out in order to communicate with different hardware/protocols/programs.
Sensor Fusion is responsible for processing the raw data contained in SensorProto into a coherent snapshot of the World that the AI can use. It invokes filters to update components of World, and then combines the components to send out the most up-to-date version.
Filters take the raw data from SensorProto and return an updated version of a component of the World. For example, the BallFilter
takes BallDetection
s and returns an updated Ball
.
Why we need to do this: Programs that provide data like SSL-Vision only provide raw data. This means that if there are several orange blobs on the field, SSL-Vision will tell us the ball is in several different locations. It is up to us to filter this data to determine the "correct" position of the ball. The same idea applies to robot positions and other data we receive.
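To make the idea concrete, here is a deliberately simplified sketch of a filter that picks one detection out of several candidates and keeps state between updates. `ToyBallFilter`, its `update` method, and the fields on `BallDetection` and `Ball` are assumptions made for this example. The selection rule (closest detection to the previous estimate) and the finite-difference velocity are illustrative choices, not a description of how our real BallFilter works.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class BallDetection:
    x: float  # metres
    y: float  # metres
    timestamp: float  # seconds


@dataclass
class Ball:
    position: Tuple[float, float]
    velocity: Tuple[float, float]


class ToyBallFilter:
    """Hypothetical filter: keeps the last accepted detection as its state."""

    def __init__(self) -> None:
        self._previous: Optional[BallDetection] = None

    def update(self, detections: List[BallDetection]) -> Optional[Ball]:
        if not detections:
            return None
        if self._previous is None:
            # No history yet: accept the first candidate, velocity unknown.
            chosen = detections[0]
            velocity = (0.0, 0.0)
        else:
            prev = self._previous
            # Several "orange blobs": keep the one closest to the last estimate.
            chosen = min(
                detections,
                key=lambda d: math.hypot(d.x - prev.x, d.y - prev.y),
            )
            dt = chosen.timestamp - prev.timestamp
            if dt > 0:
                velocity = ((chosen.x - prev.x) / dt, (chosen.y - prev.y) / dt)
            else:
                velocity = (0.0, 0.0)
        self._previous = chosen
        return Ball((chosen.x, chosen.y), velocity)
```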
Filters provide a flexible way to modularize the processing of raw data, making it easy to update filters and add new ones. Filters are sometimes stateful. For example, the BallFilter
"remembers" previous locations of the ball in order to estimate the ball's current velocity.
The AI
is the part of the Fullsystem where all of our gameplay logic takes place, and it is the main "brain" of our system. It uses the information received from Sensor Fusion to make decisions, and then sends Primitives to the Backend for the robots to execute. Altogether, this feedback loop is what allows us to react to what's happening on the field and play soccer in real-time.
The two main components of the AI
are strategy and navigation.
We use a framework called STP (Skills, Tactics, Plays)
to implement our strategy. The STP
framework was originally proposed by Carnegie Mellon University back in 2004. The original paper can be found here.
STP
is a way of breaking down roles and responsibilities into a simple hierarchy, making it easier to build up more complex strategies from simpler pieces. This is the core of where our strategy is implemented.
When the AI is given new information and asked to make a decision, our STP
strategy is what is executed first. It takes in a World and returns Intents.
The STP diagram shows how this works. Functions to assign tactics to robots and build motion constraints are passed into a Play
's get
function, which the Play
uses to generate tactics with assigned robots and with updated motion constraints.
The T
in STP
stands for Tactics
. A Tactic
represents a single robot's role on a team. Examples include:
- Being a goalie
- Being a passer or pass receiver
- Being a defender that shadows enemy robots
- Being a defender that tries to steal the ball from enemies
They can also represent lower level behaviours, such as
- Moving to a position (without colliding with anything)
- Shooting the ball at a target
- Intercepting a moving ball
The high level behaviours can use the lower level behaviours in a hierarchical way.
Tactics use Intents to implement their behaviour, which decouples strategy from the Navigator.
The P
in STP
stands for Plays
. A Play
represents a "team-wide goal" for the robots. They can be thought of much like Plays in real-life soccer. Examples include:
- A Play for taking friendly free kicks
- A Play for defending enemy kickoffs
- A general defense play
- A passing-based offense play
- A dribbling-based offense play
Plays are made up of Tactics
. Plays can have "stages" and change what Tactics
are being used as the state of the game changes, which allows us to implement more complex behaviour. Read the section on Coroutines to learn more about how we write strategy with "stages".
Furthermore, every play specifies an Applicable
and Invariant
condition. These are used to determine what plays should be run at what time, and when a Play should terminate.
Applicable
indicates when a Play
can be started. For example, we would not want to start a Defense Play
if our team is in possession of the ball. The Invariant
condition is a condition that must always be met for the Play
to continue running. If this condition ever becomes false, the current Play
will stop running and a new one will be chosen. For example, once we start running a friendly Corner Kick
play, we want the Play
to continue running as long as the enemy team does not have possession of the ball.
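The interaction between the two conditions can be sketched as follows. This is a toy model: `ToyWorld`, `ToyPlay`, the two example plays, and `select_play` are hypothetical names invented for illustration, and the possession checks are simplified versions of the examples in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyWorld:
    friendly_has_ball: bool
    enemy_has_ball: bool


class ToyPlay:
    """Hypothetical base class: each play defines both conditions."""

    def is_applicable(self, world: ToyWorld) -> bool:
        raise NotImplementedError

    def invariant_holds(self, world: ToyWorld) -> bool:
        raise NotImplementedError


class DefensePlay(ToyPlay):
    def is_applicable(self, world: ToyWorld) -> bool:
        # Do not start defending while we have the ball.
        return not world.friendly_has_ball

    def invariant_holds(self, world: ToyWorld) -> bool:
        return not world.friendly_has_ball


class CornerKickPlay(ToyPlay):
    def is_applicable(self, world: ToyWorld) -> bool:
        return world.friendly_has_ball

    def invariant_holds(self, world: ToyWorld) -> bool:
        # Keep running until the enemy team gains possession.
        return not world.enemy_has_ball


def select_play(
    current: Optional[ToyPlay], candidates: List[ToyPlay], world: ToyWorld
) -> Optional[ToyPlay]:
    # The running play continues for as long as its Invariant holds.
    if current is not None and current.invariant_holds(world):
        return current
    # Otherwise choose the first play whose Applicable condition is met.
    for play in candidates:
        if play.is_applicable(world):
            return play
    return None
```

Note that Applicable is only consulted when a new play has to be chosen, so a play can keep running in a World where it could not have been started.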
The Navigator
is responsible for path planning and navigation. Once our strategy has decided what it wants to do, it passes the resulting Intents to the Navigator
. The Navigator
is then responsible for breaking down the Intents and turning them into Primitives.
DirectPrimitiveIntents are easy to break down into Primitives, and can be converted directly without having to do any extra work.
However, NavigatingIntents like the MoveIntent
rely on the Navigator to implement more complex behaviour like obstacle avoidance. In order for a robot to move to the desired destination of a NavigatingIntent, the Navigator
will use various path-planning algorithms to find a path across the field that does not collide with any robots or violate any restrictions set on the NavigatingIntent. The NavigatingPrimitiveCreator
then translates this path into a series of Primitives, which are sent to the robot sequentially so that it follows the planned path across the field.
The Path Manager
is responsible for generating a set of paths that don't collide. It is given a set of Path Objectives and a Path Planner, and it will generate paths using the given Path Planner and arbitrate between paths to prevent collisions.
A Path Objective is a simple data structure used to communicate between the Navigator and the Path Manager. It conveys information for generating one path, such as start, destination, and obstacles. Path Objectives use very simple data structures so that Path Planners do not need to know about any world-specific data structures, such as Robots or the Field.
The Path Planner
is an interface responsible for planning a path for a single robot around a single set of obstacles, from a given start to a given destination. The interface allows us to easily swap out path planners.
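The separation between Path Objectives and Path Planners might look something like the sketch below. All names here (`PathObjective`, `CircleObstacle`, `PathPlanner.find_path`, `StraightLinePlanner`) are assumptions for illustration, and the straight-line planner is a trivial placeholder rather than one of our actual planning algorithms. The point is that the objective contains only plain geometry, so a planner never needs to know about Robots or the Field.

```python
import math
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class CircleObstacle:
    center: Point
    radius: float


@dataclass(frozen=True)
class PathObjective:
    """Everything needed to plan one path, using only plain geometry."""

    start: Point
    end: Point
    obstacles: Tuple[CircleObstacle, ...] = ()


class PathPlanner(ABC):
    """Hypothetical interface: plans for one robot and one set of obstacles."""

    @abstractmethod
    def find_path(self, objective: PathObjective) -> Optional[List[Point]]:
        ...


def _distance_to_segment(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from point p to the segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Project p onto the segment and clamp to its endpoints.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


class StraightLinePlanner(PathPlanner):
    """Placeholder planner: a direct path, or None if an obstacle blocks it."""

    def find_path(self, objective: PathObjective) -> Optional[List[Point]]:
        for obstacle in objective.obstacles:
            distance = _distance_to_segment(
                obstacle.center, objective.start, objective.end
            )
            if distance < obstacle.radius:
                return None
        return [objective.start, objective.end]
```

Because callers depend only on `PathPlanner`, a more capable planner can replace `StraightLinePlanner` without any change to the code that builds objectives.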
AI Diagram
Thunderscope Main
serves as the main entry point for our entire system. It starts up the Thunderscope GUI and other processes, such as a Fullsystem for each AI team.
Thunderscope is the main visualizer for our AI. It provides a GUI that shows us the state of the World, and it is also able to display extra information that the AI would like to show. For example, it can show the planned paths of each friendly robot on the field, or highlight which enemy robots it thinks are a threat. Furthermore, it displays any warnings or status messages from the robots, such as if a robot is low on battery.
Thunderscope also lets us control the AI by setting Dynamic Parameters. The GUI lets us choose what strategy the AI should use, what colour we are playing as (yellow or blue), and tune more granular behaviour such as how close an enemy must be to the ball before we consider them a threat.
Thunderscope is implemented using PyQtGraph, a Python graphics and GUI library. PyQtGraph is built upon PyQt, which provides Python bindings for Qt, a C++ library for creating cross-platform GUIs. The general documentation for Qt can be found here. The most important parts of Qt to understand are:
Thunderscope has a field visualizer that uses PyQtGraph's 3D graphics system to render 3D graphics with OpenGL. PyQtGraph handles all the necessary calls to OpenGL for us, and as an abstraction, provides a scenegraph to organize and manipulate entities/objects within the 3D environment (the scene).
software/thunderscope/gl
is the main directory for the 3D visualizer. The "GL" prefix lets us identify 3D graphics related code and keeps namings consistent with pyqtgraph.opengl
class names. Inside this directory:
- GLWidget is the widget that displays our 3D visualization. It wraps a PyQtGraph GLViewWidget that renders all the GLGraphicsItems that have been added to its scenegraph.
- /graphics contains custom "graphics items". Graphics items (or just "graphics" for short) are objects that can be added to the 3D scenegraph. Graphics should inherit from GLGraphicsItem and represent 3D objects that can be visualized in the scene (e.g. a robot, a sphere, a circle, etc.).
- /layers contains all the layers we use to organize and group together graphics.
We organize our graphics into "layers" so that we can toggle the visibility of different parts of our visualization. Each layer is responsible for visualizing a specific portion of our AI (e.g. vision data, path planning, passing, etc.). A layer can also handle layer-specific functionality; for instance, GLWorldLayer
lets the user place or kick the ball using the mouse. The base class for a layer is GLLayer
.
A GLLayer
is in fact a GLGraphicsItem
that is added to the scenegraph. When we add or remove GLGraphicsItem
s to a GLLayer
, we're actually setting the GLLayer
as the parent of the GLGraphicsItem
; this is because the scenegraph has a hierarchical tree-like structure. GLLayer
s can also be nested within one another, i.e. a GLLayer
can be added as a child of another GLLayer
.
The Simulator
is what we use for physics simulation to do testing when we don't have access to a real field. In terms of the architecture, the Simulator
"simulates" the following components' functionalities:
- SSL-Vision by publishing new vision data
- the robots by accepting new Primitives
Using the current state of the simulated world, the Simulator
simulates the new Primitives over some time step and publishes new SSL-Vision data based on the updated simulated world. The Simulator
is designed to be "perfect", which means that
- the vision data it publishes exactly reflects the state of the simulated world
- the simulation perfectly reflects our best understanding of the physics (e.g. friction) with no randomness.
The Simulator
uses Box2D
, which provides 2D physics simulation for free. While this simplifies the simulator greatly, it means that we manually implement the physics for "3D effects", such as dribbling and chipping.
The Standalone Simulator
is a wrapper around the Simulator
so that we can run it as a standalone application that publishes and receives data over the network. The Standalone Simulator
is designed to interface with the WifiBackend over the network, and so it is essentially indistinguishable from robots receiving Primitives and an SSL-Vision client publishing data over the network. The Standalone Simulator
also has a GUI that provides user-friendly features, such as moving the ball around.
When it comes to gameplay logic, it is very difficult if not impossible to unit test anything higher-level than a Tactic (and even those can be a bit of a challenge). Therefore if we want to test Plays we need a higher-level integration test that can account for all the independent events, sequences of actions, and timings that are not possible to adequately cover in a unit test. For example, testing that a passing play works is effectively impossible to unit test because the logic needed to coordinate a passer and receiver relies on more time-based information like the movement of the ball and robots. We can only validate that decisions at a single point in time are correct, not that the overall objective is achieved successfully.
Ultimately, we want a test suite that validates our Plays are generally doing the right thing. We might not care exactly where a robot receives the ball during a passing play, as long as the pass was successful overall. The solution to this problem is to use simulation to allow us to deterministically run our entire AI pipeline and validate behaviour.
The primary design goals of this test system are:
- Determinism: We need tests to pass or fail consistently
- Test "ideal" behaviour: We want to test the logic in a "perfect world", where we don't care about all the exact limitations of our system in the real world with real physics. Eg. we don't care about modelling robot wheels slipping on the ground as we accelerate.
- Ease of use: It should be as easy and intuitive as possible to write tests, and to understand what they are testing.
The SimulatedTestFixture
consists of three components:
- A Simulator that does physics simulations based on the Primitives it receives
- A Sensor Fusion that processes raw data for the AI
- The AI under test
The components of this system are run in a big loop. The Simulator publishes new vision data with a fixed time increment, and that data is passed through Sensor Fusion to produce an updated World for the AI. The SimulatedTestFixture
will wait to receive Primitives before triggering simulation and publishing the next World. This means that no matter how much faster or slower the simulation runs than the rest of the system, everything will always happen "at the same speed" from the POV of the rest of the system, since each newly published World will be a fixed amount of time newer than the last. See the section on Component Connections and Determinism for why this is important.
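The lockstep loop described above can be sketched in a few lines. Everything in this example is a toy stand-in: `ToySimulator`, `toy_sensor_fusion`, `toy_ai`, `run_fixture`, and the 10 ms step are assumptions chosen for illustration, and a "Primitive" is reduced to a single velocity command. What it aims to show is that simulated time only advances once the AI has responded, so each World is exactly one fixed step newer than the last regardless of how long any component takes to run.

```python
from dataclasses import dataclass
from typing import List

TIME_STEP_MS = 10  # fixed simulation increment (assumed value)


@dataclass
class ToyWorld:
    time_ms: int
    robot_x: float


class ToySimulator:
    """Hypothetical 1D simulator with a single robot."""

    def __init__(self) -> None:
        self.time_ms = 0
        self.robot_x = 0.0

    def publish_vision(self) -> ToyWorld:
        return ToyWorld(self.time_ms, self.robot_x)

    def step(self, velocity_command: float, dt_ms: int) -> None:
        self.robot_x += velocity_command * dt_ms / 1000.0
        self.time_ms += dt_ms


def toy_sensor_fusion(raw: ToyWorld) -> ToyWorld:
    # The simulated vision data is "perfect", so nothing to filter here.
    return raw


def toy_ai(world: ToyWorld, target_x: float = 1.0) -> float:
    # Drive forward at 1 m/s until the target is reached, then stop.
    return 1.0 if world.robot_x < target_x else 0.0


def run_fixture(num_ticks: int) -> List[ToyWorld]:
    simulator = ToySimulator()
    history = []
    for _ in range(num_ticks):
        world = toy_sensor_fusion(simulator.publish_vision())
        history.append(world)
        # The simulator is not stepped until the AI has returned Primitives.
        primitive = toy_ai(world)
        simulator.step(primitive, TIME_STEP_MS)
    return history
```

Running `run_fixture` twice yields identical histories, which is the property the tests rely on.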
Validation Functions
are the way we check that the behaviour of the AI is as we expect. They are essentially functions that contain Google Test ASSERT
statements, and use Coroutines to maintain state. We can create a list of Validation Functions
that are run every time SimulatedTestFixture
produces an updated World, so that we can continuously check that the behaviour of the AI is as we expected. See the section on Component Connections and Determinism for how this is used and why this is important.
A Validation Function
comes in two types:
- Terminating: Some sequence of states or actions that must occur in a given order and then terminate. (E.g. a robot moves to point A, then point B, and then finally kicks the ball.)
- Non-terminating: Some condition that must be met for the entire duration of the test, i.e. "for all time". (E.g. the ball never leaves the field, or all friendly robots follow SSL rules.)
Benefits of Validation Functions
:
- They are reusable. We can write a function that validates the ball never leaves the field, and use it for multiple tests.
- They can represent assertions for more complex behaviour. For example, we can build up simpler Validation Functions until we have a single Validation Function for a specific Tactic.
- They let us validate independent sequences of behaviour. For example, we create a different Validation Function for each Robot in the test. This makes it easy to validate that each Robot is doing the right thing, regardless of whether their actions are dependent or independent.
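The two kinds of Validation Function can be approximated with Python generators, which play the role of coroutines here. This is only an analogy: the sketch uses plain `assert` rather than Google Test `ASSERT` statements, the World is a plain dictionary, and the names `robot_reaches_then_kicks`, `ball_stays_in_field`, and `run_validation` are invented for the example.

```python
from typing import Dict, Generator, Iterable, List

World = Dict[str, object]  # toy stand-in for the real World
ValidationFunction = Generator[None, World, None]


def robot_reaches_then_kicks(target_x: float) -> ValidationFunction:
    """Terminating: states must occur in order, then the function finishes."""
    world = yield
    while world["robot_x"] < target_x:
        world = yield  # stage 1: wait until the robot reaches the target
    while not world["kicked"]:
        world = yield  # stage 2: wait until the robot kicks


def ball_stays_in_field(half_length: float) -> ValidationFunction:
    """Non-terminating: the condition is checked on every World."""
    while True:
        world = yield
        assert abs(world["ball_x"]) <= half_length, "ball left the field"


def run_validation(
    worlds: Iterable[World],
    terminating: List[ValidationFunction],
    non_terminating: List[ValidationFunction],
) -> bool:
    """Returns True once every terminating function has finished."""
    for function in terminating + non_terminating:
        next(function)  # advance each generator to its first yield
    remaining = list(terminating)
    for world in worlds:
        for function in non_terminating:
            function.send(world)  # raises AssertionError on a violation
        still_running = []
        for function in remaining:
            try:
                function.send(world)
                still_running.append(function)
            except StopIteration:
                pass  # this function reached the end of its sequence
        remaining = still_running
        if not remaining:
            return True
    return False
```

Because each generator keeps its own position between calls, a multi-stage check reads as straight-line code instead of an explicit state machine.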
When testing, determinism is extremely important. With large integration tests with many components, there are all kinds of timings and execution speed differences that can change the behaviour and results. We make use of a few assumptions and connect our components in such a way that prevents these timing issues.
The Validation Functions are run before updating AI with a new World and getting new primitives so that we stop as soon as we know that there's incorrect behaviour. See the diagram.
Now we have a nice loop: Simulator -> Sensor Fusion -> Validation Functions -> AI -> Simulator -> ...
As mentioned in their own sections, the Simulator waits to receive Primitives from the AI before publishing a new World, and the Validation Functions wait to receive and validate a World before allowing the new World to be published. The final assumption we make to complete this loop is that the AI waits to receive a new World before publishing new Primitives.
What this means is that each component in the loop waits for the previous one to finish its task and publish new data before executing. No matter how fast each component is able to run, we will not have any issues related to speed or timing because each component is blocked by the previous one. This gives us deterministic behaviour, because every component runs at the same speed relative to the others.
Notice this is very similar to the Architecture Overview Diagram, with the Backend replaced by the Simulator and with Validation Functions in the loop between Sensor Fusion and AI.
Thunderscope and connections to it are marked with dashed lines, since they are optional and are not run during the tests unless we are debugging.
Since Thunderscope runs in a separate process from Fullsystem, we use Unix domain sockets to facilitate communication between Fullsystem and Thunderscope. Unix sockets have high throughput and are very performant; we simply bind the unix socket to a file path and pass data between processes, instead of having to deal with TCP/IP overhead just to send and receive data on the same computer.
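For readers unfamiliar with Unix domain sockets, the following sketch shows the basic mechanism using Python's standard `socket` module. It assumes a datagram socket and a temporary file path, and it runs the sender and receiver in one process for brevity. The function name `send_over_unix_socket` and the socket file name are made up for the example, and the code needs a platform with `AF_UNIX` support (e.g. Linux or macOS).

```python
import os
import socket
import tempfile


def send_over_unix_socket(payload: bytes) -> bytes:
    """Binds a receiver to a file path, sends one datagram, and reads it back."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # The socket is addressed by a path on the filesystem, not an IP/port.
        socket_path = os.path.join(tmp_dir, "example_socket")
        receiver = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        sender = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        try:
            receiver.bind(socket_path)
            sender.sendto(payload, socket_path)
            data, _ = receiver.recvfrom(65536)
            return data
        finally:
            sender.close()
            receiver.close()


received = send_over_unix_socket(b"serialized protobuf bytes")
```

In practice the payload would be a serialized protobuf, and the receiver would live in a different process from the sender.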
The data sent between Fullsystem and Thunderscope is serialized using protobufs. Some data, such as data that goes through our Backend (vision data, game controller commands, Worlds from Sensor Fusion, etc.), is sent using unix senders owned by those parts of the Fullsystem directly. In other higher level components of the Fullsystem (such as FSMs, pass generator, navigator, etc.), we want to delegate away the responsibility of managing unix senders directly and have a lightweight way of sending protobufs to Thunderscope. To avoid needing to dependency inject a "communication" object in places we have visualizable data to send to Thunderscope, we take advantage of the g3log
logger already used throughout the codebase to log and send visualizable data.
g3log
is a fast and thread-safe way to log data with custom handlers called "sinks". Importantly, it gives us a static singleton that can be called anywhere. Logging a protobuf will send it to our custom protobuf g3log
sink, which lazily initializes unix senders based on the type of protobuf that is logged. The sink then sends the protobuf over the socket to listeners.
Aside: calling g3log
to log protobuf data
Logging protobufs is done at the VISUALIZE
level (e.g. LOG(VISUALIZE) << some_random_proto;
). Protobufs need to be converted to strings in order to log them with g3log
. We've overloaded the stream (<<
) operator to automatically pack protobufs into a google::protobuf::Any
and serialize them to a string, so you don't need to do the conversion yourself.
In Thunderscope, the ProtoUnixIO
is responsible for communicating protobufs over unix sockets. ProtoUnixIO
utilizes a variation of the publisher-subscriber ("pub-sub") messaging pattern. Through ProtoUnixIO
, clients can register as a subscriber by providing a type of protobuf to receive and a ThreadSafeBuffer
in which to place those incoming protobuf messages. The ProtoUnixIO
can then be configured with a unix receiver to receive protobufs over a unix socket and place those messages onto the ThreadSafeBuffer
s of that proto's subscribers. Classes can also publish protobufs via ProtoUnixIO
by configuring it with a unix sender.
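The subscription side of this pattern can be illustrated with a small in-process sketch. `ToyProtoIO`, `register_observer`, and `publish` are hypothetical names, `queue.Queue` stands in for a ThreadSafeBuffer, and plain dataclasses stand in for protobuf messages. No sockets are involved here, so this only shows how messages are routed to subscribers by type.

```python
import queue
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, List


@dataclass
class VisionMessage:
    ball_x: float


@dataclass
class RefereeMessage:
    command: str


class ToyProtoIO:
    """Hypothetical router: delivers each message to subscribers of its type."""

    def __init__(self) -> None:
        self._buffers: DefaultDict[type, List[queue.Queue]] = defaultdict(list)

    def register_observer(self, message_type: type, buffer: queue.Queue) -> None:
        self._buffers[message_type].append(buffer)

    def publish(self, message: object) -> None:
        # Only buffers registered for this exact message type receive it.
        for buffer in self._buffers[type(message)]:
            buffer.put(message)


proto_io = ToyProtoIO()
vision_buffer: queue.Queue = queue.Queue()
proto_io.register_observer(VisionMessage, vision_buffer)
proto_io.publish(RefereeMessage("HALT"))  # no subscriber, so it is dropped
proto_io.publish(VisionMessage(1.0))
```

Each subscriber reads from its own buffer at its own pace, so a slow consumer does not block the publisher or other subscribers.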
The Estop
allows us to quickly and manually command physical robots to stop what they are doing. It is a physical push button that is connected to the computer via a USB cable. When Thunderscope is launched, a ThreadedEstopReader
is initialized (within RobotCommunication
) that is responsible for communicating and reading values from the Estop
via UART. While running, it will poll the status of the Estop
to determine whether it is in the STOP
or PLAY
state:
- If the Estop is in the STOP state, it overrides the Primitives sent to the robots with Stop primitives. On the robot, Thunderloop is responsible for handling the primitive message and ensuring that the power & motor boards receive the correct inputs for the robot to stop.
- If the Estop is in the PLAY state, primitives are communicated as normal.
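The override behaviour amounts to a small gate in front of the outgoing primitives, sketched below. `EstopState`, `apply_estop`, and the use of strings to represent primitives are simplifications invented for this example; the real implementation works with protobuf messages and a polling reader thread.

```python
from enum import Enum
from typing import Dict


class EstopState(Enum):
    STOP = "STOP"
    PLAY = "PLAY"


def apply_estop(state: EstopState, primitives: Dict[int, str]) -> Dict[int, str]:
    """Maps robot id to the primitive that should actually be sent."""
    if state is EstopState.STOP:
        # Every robot receives a Stop primitive, whatever the AI requested.
        return {robot_id: "Stop" for robot_id in primitives}
    return dict(primitives)
```

Placing the check at the last step before transmission means the AI itself does not need to know about the Estop.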