
Mapping


Mapping refers to the ability of a robot to construct a representation or model of its environment. This is crucial for various robotic tasks, including localization, planning, and navigation. To do this, we use sensor data to build a model of the robot's environment. In this document, we'll dive deep into the Occupancy Grid Mapping (OGM) model.

Overview of mapping in robotics. \(z\) refers to sensor observations and \(x\) refers to the state and location of the robot.

Occupancy Grid Mapping (OGM)

OGM is a mapping method where the environment is represented as a discrete grid. Each cell in this grid represents a patch of space in the real world and can be marked as occupied, free, or unknown.

An example occupancy grid. The black boxes indicate occupied regions, the white boxes indicate unoccupied (free) regions, and the gray boxes indicate unexplored regions.
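
As a concrete illustration, here is a minimal sketch of how such a grid could be represented in code; the cell encoding, grid size, and resolution below are arbitrary choices for this example, not a fixed convention.

```python
import numpy as np

# Cell states; this encoding is an arbitrary choice for the sketch
FREE, OCCUPIED, UNKNOWN = 0, 1, -1

# Example: a 20 m x 20 m world discretized into 0.25 m cells (an 80 x 80 grid),
# with every cell initially unexplored
resolution = 0.25  # meters per cell
grid = np.full((80, 80), UNKNOWN, dtype=np.int8)
```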

Naive OGM

In a naive OGM, measurements from sensors (e.g., LiDAR or ultrasonic sensors) are used directly to update the grid. If a sensor measurement indicates that there's an obstacle at a specific location, the corresponding cell in the grid is marked as occupied. Conversely, if there's no obstacle, it's marked as free. In this tutorial we will mainly be working with LiDAR (Light Detection and Ranging) sensors. LiDARs use lasers and their reflections to estimate the distance of objects around the robot; their measurements are represented as rays emanating out of the robot. LiDARs are one of the most accurate distance sensors we can use and appear in applications like self-driving cars, drones, Building Information Modeling (BIM), etc.

The major drawback of this approach is its binary nature. In other words, there's no room for uncertainty, which can lead to inaccuracies in the map, especially in dynamic environments or with noisy sensors.

Our implementation of OGM uses a 2D LiDAR. The LiDAR scans give the distance to points around the robot.
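
For reference, a 2D LiDAR scan can be thought of as a set of range readings taken at known bearings. A hypothetical scan might be stored like this (the angular resolution and the placeholder range values are assumptions for illustration):

```python
import numpy as np

# A hypothetical 360-degree scan with 1-degree angular resolution:
# angles[i] is the bearing of ray i in the robot frame (radians) and
# ranges[i] is the measured distance to the first obstacle along that ray (meters)
angles = np.deg2rad(np.arange(0.0, 360.0, 1.0))
ranges = np.full_like(angles, 5.0)  # placeholder readings standing in for real sensor data
```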

How does this work in detail?

Imagine we have a robot that has never been turned on before, and we initialize it so it can start performing tasks. It does not know anything about its surroundings. If we assume a grid representation of the environment, it would look something like this:

However, we know that the actual environment has some obstacles and looks like this:

To understand its surroundings, we incorporate a naive OGM approach. Let's say the robot is initialized like the following:

Red squares indicate robot location

We plan to use a LiDAR sensor to determine whether obstacles exist and whether each cell of the discretized space is empty or not. For simplicity, imagine that we have a LiDAR that only emits two rays.

The initial ray cast would be emitted.

Our sensors will detect that both emitted rays have hit an object. Given the length and angle of emission of these rays, we can determine which grid cells are occupied. We also know that all the other grid cells each ray has traversed are free, since if one of them were occupied, the ray would have hit an obstacle there and would not have passed through.
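
As a rough sketch of that step (the pose variables, frame conventions, and grid layout are assumptions made for illustration), a single ray return can be converted into the grid cell it hit as follows:

```python
import math

def ray_endpoint_to_cell(robot_x, robot_y, robot_theta, ray_angle, ray_range, resolution):
    """Map one LiDAR return to the index of the grid cell it hit.

    robot_x, robot_y, robot_theta: robot pose in the world frame (m, m, rad)
    ray_angle: bearing of the ray in the robot frame (rad)
    ray_range: measured distance along the ray (m)
    resolution: side length of one grid cell (m); the world origin is assumed
                to coincide with cell (0, 0)
    """
    # Endpoint of the ray in world coordinates
    hit_x = robot_x + ray_range * math.cos(robot_theta + ray_angle)
    hit_y = robot_y + ray_range * math.sin(robot_theta + ray_angle)
    # Discretize the endpoint to grid indices
    return int(hit_x // resolution), int(hit_y // resolution)
```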

How do we find the free grids in between?

This is where the Bresenham algorithm comes into play. The Bresenham algorithm is a line-drawing algorithm that determines the points that should be selected to form a close approximation to a straight line between two points. The idea is to identify which grid cells lie on this line and classify them as free. The algorithm is relatively simple:

Given:

Start point: \((x_1, y_1)\)

Endpoint: \((x_2, y_2)\)

Horizontal Difference: \(\Delta x = |x_2 - x_1|\)

Vertical Difference: \(\Delta y = |y_2 - y_1|\)

We want to find the grid cells that the line \(y = mx + b\) passes through.

The method's core revolves around finding the dominant axis. That is, if \(\Delta x > \Delta y\), the algorithm moves along the x grid, and if \(\Delta y > \Delta x\), it moves along the y grid.

To quantify this, the algorithm uses a decision variable \(D\). For the case where \(\Delta x > \Delta y\), it is initialized as:

\[ D = 2 \cdot \Delta y - \Delta x \]

The algorithm then steps along the dominant axis, one cell at a time, until it reaches the endpoint. Whenever \(D > 0\), the line has drifted a full cell along the minor axis, so the minor coordinate is incremented and the decision variable is reduced:

\[ D = D - 2 \cdot \Delta x \]

At every step along the dominant axis, the decision variable is then increased:

\[ D = D + 2 \cdot \Delta y \]
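
A minimal Python sketch of the algorithm is shown below; the variable names and the explicit handling of the two cases (x-dominant and y-dominant) are our own choices:

```python
def bresenham(x1, y1, x2, y2):
    """Return the grid cells on the line from (x1, y1) to (x2, y2), inclusive."""
    cells = []
    dx, dy = abs(x2 - x1), abs(y2 - y1)
    step_x = 1 if x2 >= x1 else -1
    step_y = 1 if y2 >= y1 else -1
    x, y = x1, y1

    if dx >= dy:  # x is the dominant axis
        D = 2 * dy - dx
        for _ in range(dx + 1):
            cells.append((x, y))
            if D > 0:          # the line drifted a full cell vertically
                y += step_y
                D -= 2 * dx
            D += 2 * dy
            x += step_x
    else:         # y is the dominant axis
        D = 2 * dx - dy
        for _ in range(dy + 1):
            cells.append((x, y))
            if D > 0:          # the line drifted a full cell horizontally
                x += step_x
                D -= 2 * dy
            D += 2 * dx
            y += step_y
    return cells
```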

Once that is complete, we can update our robot map to represent those occupied and unoccupied grids.
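
Putting the pieces together, a single ray could update the map as sketched below. This reuses the bresenham helper and the FREE/OCCUPIED constants from the earlier sketches; the hit_obstacle flag is an assumption used to distinguish a real hit from a maximum-range reading.

```python
def update_map_with_ray(grid, robot_cell, hit_cell, hit_obstacle):
    """Naive OGM update for one ray.

    grid: 2D array of UNKNOWN / FREE / OCCUPIED cells
    robot_cell: grid index of the robot
    hit_cell: grid index where the ray ended
    hit_obstacle: False if the ray simply reached its maximum range
    """
    cells = bresenham(*robot_cell, *hit_cell)
    # Every cell the ray passed through before its endpoint must be free
    for (r, c) in cells[:-1]:
        grid[r, c] = FREE
    # The endpoint is occupied only if the ray actually returned a hit;
    # a maximum-range reading means no obstacle was detected along the ray
    grid[hit_cell] = OCCUPIED if hit_obstacle else FREE
```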

Now imagine if the robot moved to another spot and used its LiDAR again:

Keep in mind that if the robot does not move and its sensor readings do not change, the map will remain static.

In this case, we can see that the left ray has collided with an object, while the right ray has traveled to the edge of our environment; this could be another obstacle or simply the maximum range of the LiDAR. Once again, we identify the free cells between the robot and each ray's endpoint.

Using the Bresenham algorithm, we can classify the points and update our map:

We can continue iterating over the environment and over time, improving the mapped representation, until every cell has been determined to be either occupied or free.

What are the drawbacks of using this method? At first glance, it seems to work well and builds an accurate map of the robot's environment. However, this is only true in ideal settings with perfectly accurate sensors and a completely static environment, and these assumptions don't hold in realistic scenarios. Realistically, sensors will have noise that affects their accuracy and objects may be moving in the environment, so a probabilistic approach is usually more accurate and more widely used to implement OGM.

Demo

To try out building a naive OGM, run this Google Colab notebook:

Open in Colab