Localization

What is Localization?

In simple terms, localization is just figuring out where the robot is. In the case of our rover, we need to know its position in 3D space as well as its rotation about each axis. Together, this information is known as a pose (see this page). We need to have this data at all times, and it needs to be updated frequently enough to keep up with the rover's motion. To get this information, we employ a variety of sensors:

GPS

GPS (Global Positioning System) is a system that uses signals from an array of satellites orbiting Earth to figure out where you are on the planet (read more about how this works here). We use a GPS unit mounted to the rover, which outputs our position on Earth in degrees latitude and degrees longitude.

TODO: copy/adapt David's RTK wiki page (and any other useful pages) from the old mrover workspace into here

IMU

An IMU (Inertial Measurement Unit) is a device that contains several sensors that are used in combination to measure orientation and movement. A 9-DoF (nine degrees of freedom) IMU like the one we use consists of:

A 3-axis gyroscope, which measures angular velocity about each axis
A 3-axis accelerometer, which measures linear acceleration along each axis
A 3-axis magnetometer, which measures the magnetic field along each axis and so essentially acts as a compass pointing to magnetic north

All of this data is then combined using sensor fusion algorithms (more on this later) to produce a bearing measurement, i.e. the angle of rotation around the Z axis.

TODO: add section about cameras and visual odometry

TODO: section about GPS linearization and how accurate it is

GPS Driver

We use an ArduSimple RTK2B Budget GPS receiver. Once configured, this receiver outputs NMEA messages (full reference here) over USB, which contain all of the GPS data we need for our autonomy system. In order to get them from serial messages on a USB port to ROS messages sent to our autonomy nodes, we need a GPS driver.

Fortunately, there is already a ROS package that handles this for us, called nmea_navsat_driver. This package can be configured to read from a certain USB port at a certain baud rate, and to accept messages that produce either estimated position covariance or estimated velocity. These parameters are configured in config/esw.yaml. The driver will then publish NavSatFix messages to the /fix topic, which contain the information we want.
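
To give a feel for what the driver has to do with each NMEA message, here is a minimal sketch of turning a GGA sentence into decimal degrees. This is not the nmea_navsat_driver implementation; the function names are hypothetical and the sample sentence is made up for illustration. It relies only on two properties of NMEA: the checksum after the `*` is the XOR of every character between `$` and `*`, and coordinates are encoded as (d)ddmm.mmmm.

```python
from functools import reduce
from typing import Tuple


def nmea_checksum(body: str) -> int:
    """XOR of every character between '$' and '*' (hypothetical helper)."""
    return reduce(lambda acc, ch: acc ^ ord(ch), body, 0)


def nmea_checksum_ok(sentence: str) -> bool:
    """Compare the computed checksum with the two hex digits after '*'."""
    body, _, checksum = sentence.strip().lstrip("$").partition("*")
    return checksum != "" and nmea_checksum(body) == int(checksum, 16)


def ddmm_to_degrees(value: str, hemisphere: str) -> float:
    """Convert NMEA (d)ddmm.mmmm plus N/S/E/W into signed decimal degrees."""
    raw = float(value)
    degrees = int(raw // 100)        # leading digits are whole degrees
    minutes = raw - degrees * 100    # remainder is minutes of arc
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal


def parse_gga(sentence: str) -> Tuple[float, float]:
    """Return (latitude, longitude) in decimal degrees from a GGA sentence."""
    if not nmea_checksum_ok(sentence):
        raise ValueError("bad NMEA checksum")
    fields = sentence.split("*")[0].split(",")
    if not fields[0].endswith("GGA"):
        raise ValueError("not a GGA sentence")
    return (ddmm_to_degrees(fields[2], fields[3]),
            ddmm_to_degrees(fields[4], fields[5]))


# Build a sample sentence (illustrative values) with a matching checksum.
body = "GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,"
sample = "${}*{:02X}".format(body, nmea_checksum(body))
print(parse_gga(sample))
```

The sample encodes 48 degrees 7.038 minutes north and 11 degrees 31.000 minutes east, i.e. roughly 48.1173 and 11.5167 in decimal degrees.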

IMU Driver

We use an Adafruit BNO055 IMU. It is centered around the Bosch BNO055 chip, which is what contains all of the sensors. This IMU has onboard sensor fusion capabilities, as it can (allegedly) produce a globally accurate 3D orientation measurement using its own onboard black box sensor fusion algorithms. It can provide a wide range of data over I2C and UART.

In order to access the data it measures, both the raw sensor readings and the filtered orientation estimate, we need to read them from the sensor into our computer. Since the only good library for this sensor only supports I2C, and ordinary computers generally can't read I2C directly, we need an intermediary processor to read it for us and convert the data to serial data that the computer can actually read. For this we are using an Arduino Nano Every microcontroller. We have an Arduino sketch kept in the embedded-testbench repository which uses the Adafruit BNO055 Unified Sensor Library to read the IMU data over I2C and then publish it over a serial connection. The Arduino is then plugged into our main Jetson computer over USB so the data is accessible on a serial port.

To get this data from serial to the ROS network, we have an IMU driver node. This node reads the IMU data over serial and then publishes it as several standard ROS messages, which other nodes can then subscribe to. The IMU driver node is configured in config/esw.yaml. The specific data published includes:

IMU: Imu messages on the /imu/data topic
Magnetometer: MagneticField messages on the /imu/magnetometer topic
IMU Temperature: Temperature messages on the /imu/temp topic
IMU Calibration: custom CalibrationStatus messages on the /imu/calibration topic
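
Imu messages carry orientation as a quaternion (x, y, z, w), so a subscriber that only wants the bearing has to extract the rotation about the Z axis. The sketch below shows that conversion in plain Python; the function name is hypothetical, and it assumes a unit quaternion and the usual yaw-pitch-roll (ZYX) convention.

```python
import math


def yaw_from_quaternion(x: float, y: float, z: float, w: float) -> float:
    """Return the rotation about Z (yaw) in radians, in the range [-pi, pi].

    Assumes (x, y, z, w) is a unit quaternion, in the same field order
    used by the orientation field of an Imu message.
    """
    siny_cosp = 2.0 * (w * z + x * y)
    cosy_cosp = 1.0 - 2.0 * (y * y + z * z)
    return math.atan2(siny_cosp, cosy_cosp)


# A rotation of 90 degrees about Z is the quaternion (0, 0, sin(45°), cos(45°)).
half = math.radians(90.0) / 2.0
print(math.degrees(yaw_from_quaternion(0.0, 0.0, math.sin(half), math.cos(half))))
```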

For information on the design process and decisions behind the IMU driver, read this discussion.

Guide to Localization Frames

When determining and defining where the robot is, we have to deal with an inherent tradeoff in data quality. Usually data sources can be either locally accurate or globally accurate, but it is much more difficult to get a single source of data that is both globally and locally accurate.

This problem is addressed by the standards introduced in REP 105 (which you should read to get a general overview). Unfortunately, the REP uses some misleading terms and gives a somewhat confusing explanation, so here is a hopefully clearer way of explaining it.

First, a few important notes and definitions:

local vs global accuracy

In the context of localization, we often talk about local and global accuracy, so we need to be very clear about what these things mean. Locally accurate data must change in a smooth way, i.e. no discrete/discontinuous jumps. However, it may drift over time without bounds, meaning it can gradually accumulate large errors. On the other hand, globally accurate data must not drift significantly over time, but may have discrete/discontinuous jumps at any time. These definitions are of course a little bit vague, since they somewhat depend on the idea of a short vs long period of time, but this isn't usually a problem since we don't deal with a lot of significant edge cases.

abstraction of sensor data sources into transforms

For the purposes of this explanation, we will abstract away any sensor processing/fusion algorithms and just imagine all localization data sources as providing a transform telling us where the robot is located. A global sensor transform will be relative to a fixed frame, in this case the map frame. A local sensor transform on the other hand may be relative to any arbitrary starting point, so long as the transform obeys the rules of local accuracy.

`map`

Since pose is relative (see SE3 wiki entry), we need to define where the robot is relative to some "fixed" frame. Fixed in this case means it does not move relative to the thing/place we are navigating in, which we will call the world. We will use the "map" frame as our fixed frame, and the pose of the robot will be defined relative to that frame, i.e. it will be defined as a transform from map to the robot.

`base_link`

The base_link frame is simply the frame of the robot. It is rigidly attached to the robot, and in our case is located at the center of the chassis (TODO: is this true?).

`odom`

The odom frame is an intermediate frame in between map and base_link. It doesn't really have a physical representation, but it gives us a good way to separate local and global data sources. Odom essentially acts as a local reference frame for the robot. This means that the pose of the robot relative to the odom frame should always be locally accurate, but doesn't need to be globally accurate.

There are two transforms we need to define in order to connect these three frames, which thereby defines the pose of the robot:

`odom_to_base_link`

The transform from the odom frame to the base_link frame, which we will call odom_to_base_link, will be defined using locally accurate sensor data. This means the pose of the robot in the odom frame (which is the exact same thing as this transform) will be locally accurate, i.e. it can drift over time, but must always change smoothly and continuously without discrete jumps.

`map_to_odom`

The transform from the map frame to the odom frame, which we will call map_to_odom in this case, will be defined using globally accurate sensor data. Our goal here is to use this data to make the transform from map to base_link (which is equal to map_to_odom * odom_to_base_link) globally accurate, i.e. it will not drift over time, but it may change non-continuously with discrete jumps. We want to do this by only changing the map_to_odom transform, which means we have to first figure out what the odom_to_base_link transform is (by asking the TF tree) and then "subtract" that from our global sensor transform, i.e. multiply the global sensor transform by the inverse of odom_to_base_link, in order to determine the correct map_to_odom transform.
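
The "subtraction" here is composition with an inverse: map_to_odom = map_to_base_link * inverse(odom_to_base_link). The sketch below works this out in 2D (x, y, heading) to keep the math short. That is a simplifying assumption, since the rover's real pose is 3D, and the SE2 class and function names are hypothetical rather than taken from our codebase.

```python
import math
from dataclasses import dataclass


def wrap(angle: float) -> float:
    """Wrap an angle to [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


@dataclass(frozen=True)
class SE2:
    """A 2D rigid transform: translation (x, y) and rotation theta."""
    x: float
    y: float
    theta: float

    def __mul__(self, other: "SE2") -> "SE2":
        # Apply `other` inside the frame described by `self`.
        c, s = math.cos(self.theta), math.sin(self.theta)
        return SE2(self.x + c * other.x - s * other.y,
                   self.y + s * other.x + c * other.y,
                   wrap(self.theta + other.theta))

    def inverse(self) -> "SE2":
        # (R, t)^-1 = (R^T, -R^T t)
        c, s = math.cos(self.theta), math.sin(self.theta)
        return SE2(-(c * self.x + s * self.y),
                   s * self.x - c * self.y,
                   wrap(-self.theta))


def compute_map_to_odom(map_to_base_link: SE2, odom_to_base_link: SE2) -> SE2:
    """Solve map_to_base_link = map_to_odom * odom_to_base_link for map_to_odom."""
    return map_to_base_link * odom_to_base_link.inverse()


# Global sensor says the robot is at (5, 5) facing 180 degrees in the map,
# while local odometry says it is at (2, 0) facing 90 degrees in odom.
gps_pose = SE2(5.0, 5.0, math.pi)
odom_pose = SE2(2.0, 0.0, math.pi / 2.0)
map_to_odom = compute_map_to_odom(gps_pose, odom_pose)
print(map_to_odom * odom_pose)  # recovers the global pose (up to rounding)
```

Publishing only map_to_odom this way leaves odom_to_base_link untouched, so any jump in the global data is absorbed by map_to_odom and the local transform stays smooth.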

Using these frames in practice

The main benefit of this system is having an organized way of separately accessing local and global localization data. For applications where you need a locally accurate localization source, you can get the odom_to_base_link transform from the TF tree and use that as your robot pose. Similarly, for applications where you need globally accurate localization, you can get the transform from map to base_link from the TF tree (which chains map_to_odom and odom_to_base_link for you) and use it as your robot pose (for exact syntax, see the wiki section about the TF tree or just google it).

Here's a simple example of using these localization frames: suppose we have a simple 4 wheeled robot equipped with a GPS unit and wheel encoders. The GPS provides a globally accurate source of localization data, while the wheel encoder data can be fed into a sensor processing algorithm to produce a locally accurate source of localization data.

In order to use these two data sources, our first step is to publish the wheel encoder localization data as the odom_to_base_link transform to the TF tree. Similarly, we will also use the GPS data to publish the map_to_odom transform to the TF tree, as described in the map_to_odom section. Once this is done, we can actually use this data for some autonomous routines.

Suppose we want to write a function that rotates the robot 90 degrees in place. Since this is a routine that won't take very much time, is only based on relative measurements, and needs to be quite accurate on a local scale, we want to use a locally accurate data source. To do this, we will get the odom_to_base_link transform from the TF tree, look at its rotation, and then send the robot drive commands until the rotation we read from odom_to_base_link has increased by 90 degrees.
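
A minimal sketch of that rotation routine is below. It is not our drive code: `read_yaw` and `send_angular_velocity` are hypothetical callbacks standing in for "look up the rotation of odom_to_base_link" and "send a drive command", and `FakeRobot` is a toy simulation so the example runs on its own. The one detail worth noticing is the angle wrapping, which keeps the error correct when the heading crosses from +180 to -180 degrees.

```python
import math


def wrap_angle(angle: float) -> float:
    """Wrap an angle to [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def rotate_by(read_yaw, send_angular_velocity, delta,
              tolerance=0.01, gain=1.0, max_steps=10000):
    """Turn in place by `delta` radians using a simple proportional controller.

    read_yaw: callable returning the current yaw from odom_to_base_link.
    send_angular_velocity: callable taking a turn rate in rad/s.
    Returns True if the target was reached within `tolerance`.
    """
    target = wrap_angle(read_yaw() + delta)
    for _ in range(max_steps):
        error = wrap_angle(target - read_yaw())
        if abs(error) < tolerance:
            send_angular_velocity(0.0)  # stop once we are close enough
            return True
        send_angular_velocity(gain * error)
    send_angular_velocity(0.0)
    return False


class FakeRobot:
    """Toy stand-in for the rover: integrates the commanded turn rate."""

    def __init__(self, yaw=0.0, dt=0.05):
        self.yaw = yaw
        self.dt = dt

    def read_yaw(self):
        return self.yaw

    def send_angular_velocity(self, omega):
        self.yaw = wrap_angle(self.yaw + omega * self.dt)


# Start near +180 degrees so the turn has to cross the wrap-around point.
robot = FakeRobot(yaw=3.0)
reached = rotate_by(robot.read_yaw, robot.send_angular_velocity, math.pi / 2.0)
print(reached, robot.yaw)
```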

Now suppose we want to write a function to drive to a specific waypoint on our map, a mile away. Since this routine will take a while and is based on absolute measurements (relative to the map frame), we need to use a globally accurate data source to avoid drift and so we can measure relative to the map frame. To do this, we will get the transform from map to base_link from the TF tree (i.e. map_to_odom * odom_to_base_link), look at its position, and then use a drive controller to drive the robot in the direction of the waypoint until the position we read is close enough to our target waypoint.
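
As a rough sketch of one step of such a drive controller, the code below chains map_to_odom and odom_to_base_link to get the robot's pose in the map frame, then computes how far away the waypoint is and how far the robot has to turn to face it. Poses are simplified to 2D tuples (x, y, theta), and the function names and the 0.5 m arrival radius are assumptions made for illustration.

```python
import math


def wrap_angle(angle):
    """Wrap an angle to [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def compose(a, b):
    """Chain two 2D transforms given as (x, y, theta) tuples: result = a * b."""
    ax, ay, ath = a
    bx, by, bth = b
    c, s = math.cos(ath), math.sin(ath)
    return (ax + c * bx - s * by, ay + s * bx + c * by, wrap_angle(ath + bth))


def steer_to_waypoint(map_to_odom, odom_to_base_link, waypoint, arrive_radius=0.5):
    """Return (distance, heading_error, arrived) for a waypoint in the map frame."""
    # Globally accurate robot pose: map -> odom -> base_link.
    x, y, theta = compose(map_to_odom, odom_to_base_link)
    dx, dy = waypoint[0] - x, waypoint[1] - y
    distance = math.hypot(dx, dy)
    heading_error = wrap_angle(math.atan2(dy, dx) - theta)
    return distance, heading_error, distance < arrive_radius


# odom sits at (10, 0) in the map, rotated 90 degrees; the robot has driven
# 1 m forward in odom, so it is at (10, 1) in the map, facing +Y.
print(steer_to_waypoint((10.0, 0.0, math.pi / 2.0), (1.0, 0.0, 0.0), (10.0, 5.0)))
```

A controller would call this repeatedly, turning to reduce the heading error and driving forward until the arrived flag becomes true.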