Multi-modal Environment Understanding

Recent years have seen impressive progress in Simultaneous Localization And Mapping (SLAM), which has been instrumental in transitioning robots from factory floors to unstructured environments. State-of-the-art SLAM can track visual-inertial systems over long trajectories in real time, while providing a metric reconstruction of the environment. Surprisingly, however, SLAM has advanced mostly in isolation from the impressive progress in object recognition and scene understanding, enabled by structured (deformable part) models and deep learning. Few approaches combine spatial and semantic information, despite the tremendous scientific and practical promise of multi-modal representations. The objective of our research is to develop perception, inference, and learning algorithms that unify metric (geometry, temperature, chemical concentration), semantic (object identities and functions), topological, and temporal properties in a common representation of the environment and other entities in it.


Unifying Geometry, Semantics, and Data Association in SLAM

Traditional approaches for simultaneous localization and mapping (SLAM) rely on geometric features such as points, lines, and planes to infer the environment structure. They make hard decisions about the (data) association between observed features and mapped landmarks to update the environment model. Our work makes two contributions to the state of the art in SLAM. First, it generalizes the purely geometric model by introducing semantically meaningful objects, represented as structured models of mid-level part features. Second, instead of making hard, potentially wrong associations between semantic features and objects, it shows that SLAM inference can be performed efficiently with probabilistic data association via matrix permanent computations. The approach not only allows building meaningful maps (containing doors, chairs, cars, etc.) but also offers significant advantages in ambiguous environments. Finally, one can go beyond SLAM and formulate high-level robot missions in terms of the objects on the map. Our work proposes algorithms for motion planning under temporal logic constraints in probabilistic semantic maps.

Relevant publications: [IROS'19] [IJCAI'18] [ICRA'17] [ICRA'16] [IJRR'15] [RSS'14]


Event-based Visual Inertial Odometry

Event-based cameras provide a new visual sensing model by detecting changes in image intensity asynchronously at almost unlimited frame rates. This opens the possibility for visual-inertial localization and mapping in extremely high speed and high dynamic range situations where traditional cameras fail. This new frame-less mode of operation, however, prohibits intensity gradient computations and necessitates new techniques for feature tracking and visual odometry. Our work proposes event-based features by grouping events in small spatiotemporal windows of duration determined by the optical flow length. Our feature tracking method alternates between probabilistic data association of events to features and optical flow computation based on expectation maximization (EM) of a translation model over all data associations. To enable long feature tracks, we also compute an affine deformation with respect to the initial feature point and use the resulting residual as a measure of persistence. We proposed a visual-inertial odometry algorithm that fuses the event-based features with inertial measurements to provide 6-D camera state estimates at a rate proportional to the camera velocity. The inferred trajectory is used to reduce the dimensionality of affine template matching during feature tracking, while events from the previous time step are used to reduce the complexity of optical flow estimation.
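As a rough illustration of the alternation described above, the sketch below runs EM for a pure 2-D translation between a set of events and a set of feature template points. It is a simplified stand-in, not the published tracker: the function name `em_translation`, the isotropic Gaussian association model, and the parameters `sigma` and `iters` are all assumptions made for this example, and the affine refinement and inertial fusion are omitted.

```python
import math

def em_translation(events, template, sigma=1.0, iters=20):
    """Estimate a 2-D translation v such that event - v matches a template point.

    Hypothetical simplification: every event is softly associated with every
    template point under an isotropic Gaussian with standard deviation sigma.
    """
    vx, vy = 0.0, 0.0
    for _ in range(iters):
        num_x = num_y = 0.0
        used = 0
        for ex, ey in events:
            # E-step: association weights of this event to each template point
            weights = [
                math.exp(-((ex - vx - tx) ** 2 + (ey - vy - ty) ** 2)
                         / (2.0 * sigma ** 2))
                for tx, ty in template
            ]
            total = sum(weights)
            if total == 0.0:
                continue  # event is too far from every template point
            used += 1
            for w, (tx, ty) in zip(weights, template):
                num_x += (w / total) * (ex - tx)
                num_y += (w / total) * (ey - ty)
        if used == 0:
            break
        # M-step: translation that minimizes the weighted squared residuals
        vx, vy = num_x / used, num_y / used
    return vx, vy

# Toy data: four template points and the same points shifted by (0.3, -0.2).
template = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0), (5.0, 5.0)]
events = [(tx + 0.3, ty - 0.2) for tx, ty in template]
flow = em_translation(events, template)
```

On this toy input the estimate settles close to the applied shift. With real event data, the choice of `sigma` relative to the feature spacing decides how ambiguous the associations are.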

Relevant publications: [CVPR'17] [ICRA'17]


Global Localization using Object Recognition

Visual localization under a wide range of operational conditions is a fundamental problem in robotics. It is of critical importance for autonomous operation in GPS-denied or complex urban environments. Our work considers robot and vehicle localization with respect to a semantic map composed of objects. The objective is to provide global (re-)localization based on recognized objects in streaming visual input. Our contribution is a sensor measurement model for set-valued observations (i.e., detections) that captures both metric and semantic information (range, bearing, and object identities) and incorporates missed and false detections and unknown data association. We proved that obtaining the likelihood of a set-valued observation is equivalent to a matrix permanent computation, which leads to an efficient polynomial-time approximation of Bayesian inference with set-valued observations. More generally, we consider continuous estimation problems with discrete observations (e.g., binary signals, object detections, context information) and develop scalable inference algorithms.
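To make the matrix permanent concrete, the sketch below computes it exactly with Ryser's formula. This is only a small-scale illustration: it takes exponential time and is not the polynomial-time approximation mentioned above. The matrix `Q` is a hypothetical example in which entry `Q[i][j]` stands for the likelihood that detection `i` was generated by object `j`, so the permanent sums the likelihoods of all one-to-one associations. Missed and false detections are left out.

```python
from itertools import combinations

def permanent(matrix):
    """Exact permanent of a square matrix via Ryser's formula, O(2^n * n^2)."""
    n = len(matrix)
    if n == 0:
        return 1.0
    total = 0.0
    for k in range(1, n + 1):
        sign = (-1) ** (n - k)
        for cols in combinations(range(n), k):
            prod = 1.0
            for row in matrix:
                # sum of this row restricted to the chosen column subset
                prod *= sum(row[c] for c in cols)
            total += sign * prod
    return total

# Hypothetical per-pair likelihoods for two detections and two objects.
Q = [[0.9, 0.1],
     [0.2, 0.8]]
likelihood = permanent(Q)  # sums 0.9 * 0.8 and 0.1 * 0.2
```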

Relevant publications: [IJRR'15] [RSS'14] [TAC'17] [Allerton'15]


Autonomous Information Acquisition

Autonomous operation in unknown, complex, unstructured environments requires maintaining and reducing a measure of uncertainty over the robot and environment models in order to recognize sudden, dynamic, or even adversarial changes. The goal of our research is to characterize and efficiently solve optimal control and reinforcement learning problems in which the usual mission-specific cost function is augmented with an information-theoretic measure of uncertainty over the robot and environment models. We aim to compute uncertainty-reducing control policies that allow the robots to autonomously decide how to improve the accuracy of their models and learn about new characteristics by exploring unknown areas or situations.


Fast Autonomous Flight in Cluttered GPS-Denied Environments

The video shows a report of our work on the DARPA Fast Lightweight Autonomy (FLA) program. The team is led by Prof. Vijay Kumar at the University of Pennsylvania. Our team is creating autonomous multi-rotor flying robots that are able to navigate in challenging indoor and outdoor environments without GPS at speeds up to 20 m/s. Our work focuses on enabling completely autonomous, real-time, on-board operation, including state estimation, mapping and object recognition, planning, and control. We develop methods for visual-inertial odometry that remain accurate over kilometer-long trajectories and for reliable detection of obstacles and objects of interest in varying lighting conditions. We also develop efficient, long-range planning and geometric control approaches that take into account the dynamic feasibility and safety of intended quadrotor trajectories and compensate for aerodynamic disturbances.

Relevant publications: [JFR'18] [RAL'18] [IROS'17]


Search-based Motion Planning for Aggressive Flight in SE(3)

Collecting information in cluttered, unknown environments necessitates precise and safe maneuvering. Our work develops efficient trajectory planning techniques for differentially flat systems (e.g., multi-rotor and car-like robots) that account for the system's attitude and dynamics and guarantee global (sub)optimality. Differential flatness allows converting time-parameterized flat output (e.g., position and yaw) trajectories into control inputs. To plan time-parameterized trajectories, we consider a linear quadratic minimum time (LQMT) optimal control problem. Our idea is to generate short-duration dynamically feasible motion primitives by discretizing the input space (e.g., acceleration or jerk), which reduces the LQMT problem to graph search. A key insight is that, if the obstacle and input constraints are relaxed, the LQMT problem can be solved in closed form. This allows us to design an accurate and consistent heuristic function that enables extremely efficient search-based planning in the discretized control space. The desired vehicle orientation can also be computed from the motion primitives, allowing us to check collisions for robots with complex shapes.
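The primitive-generation step can be sketched for the simplest case, a double integrator driven by piecewise-constant acceleration. Everything specific here is an assumption for illustration: the function name `motion_primitives`, the grid of `num` acceleration levels per axis, the duration `tau`, and the edge cost of control effort plus a time weight `rho` are not taken from the publications. Collision checking and the graph search itself are omitted.

```python
from itertools import product

def motion_primitives(pos, vel, u_max=1.0, num=3, tau=0.5, rho=1.0):
    """Successor states of a double integrator under constant accelerations.

    Returns a list of (acc, new_pos, new_vel, cost) tuples, one per
    discretized acceleration. Requires num >= 2.
    """
    # num evenly spaced acceleration levels per axis in [-u_max, u_max]
    levels = [-u_max + 2.0 * u_max * i / (num - 1) for i in range(num)]
    successors = []
    for acc in product(levels, repeat=len(pos)):
        # closed-form integration of constant acceleration over tau seconds
        new_pos = tuple(p + v * tau + 0.5 * a * tau ** 2
                        for p, v, a in zip(pos, vel, acc))
        new_vel = tuple(v + a * tau for v, a in zip(vel, acc))
        # assumed edge cost: integrated squared input plus weighted duration
        cost = (sum(a * a for a in acc) + rho) * tau
        successors.append((acc, new_pos, new_vel, cost))
    return successors

# Expand one graph node: a robot at the origin moving along x at 1 m/s.
prims = motion_primitives((0.0, 0.0), (1.0, 0.0))
```

Each returned tuple is one outgoing edge of the search graph. A planner would discard edges that collide or violate velocity limits before inserting the successors into its open list.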

Relevant publications: [JFR'18] [RAL'18] [IROS'17]


Active Multi-robot SLAM

Autonomous exploration and mapping of unknown environments can be formulated rigorously as an active SLAM problem. Our work considers the design of control policies for n robots that aim to minimize the uncertainty (entropy) in the joint map-robot representation. We proved that the classic separation principle between estimation and control holds for information theoretic objectives, as long as the robot observation and motion models are linearized. As a result, the optimal estimator is the Kalman filter/smoother, the uncertainty (entropy) is, up to an additive constant, proportional to the log-determinant of the map-robot covariance matrix, and the problem can be reduced to deterministic optimal control. We developed approximation algorithms that manage the complexity of computing optimal information acquisition policies with respect to the length of the planning horizon T and the number of robots n. We achieve efficient performance by detecting and discarding uninformative robot trajectories from the search and by decentralizing both the control and estimation processes. Theoretically, we can show that our algorithms have bounded suboptimality regardless of the length of the planning horizon and obtain at least 50% of the (mutual) information achievable by the optimal centralized solution.
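The reduction to deterministic optimal control can be seen from the standard Gaussian entropy and Kalman covariance recursions, written here for a generic linear-Gaussian model. The symbols are assumptions introduced for this illustration and do not appear in the text above: state dimension d, covariance \Sigma_t, motion matrix A, motion noise covariance W, measurement noise covariance V, and a measurement Jacobian H_{t+1} that depends on the chosen robot trajectory.

```latex
\begin{align}
h(x) &= \tfrac{1}{2}\log\det\!\left(2\pi e\,\Sigma\right)
      = \tfrac{d}{2}\log(2\pi e) + \tfrac{1}{2}\log\det\Sigma, \\
\Sigma_{t+1\mid t} &= A\,\Sigma_{t}\,A^{\top} + W, \\
\Sigma_{t+1} &= \left(\Sigma_{t+1\mid t}^{-1}
      + H_{t+1}^{\top} V^{-1} H_{t+1}\right)^{-1}.
\end{align}
```

Neither recursion involves the measurement values themselves, only the matrices selected by the robot trajectory. The future entropy can therefore be evaluated before any measurement is taken, and trajectories can be compared by the log-determinant of the covariance they would produce.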

Relevant publications: [IROS'19] [RSS'19] [ACC'19] [RAL'18] [ICRA'15] [ICRA'14] [CDC'14] [ACC'17]


Active Tactile Sensing

Another example of an active information acquisition problem is object recognition using tactile feedback. Our work focuses on adaptive selection of sequences of wrist poses and enclosure grasps in order to achieve accurate touch-only recognition. We formulate an optimal control problem to minimize the number of touches and the probability of an incorrect object classification. The classic separation principle does not hold for such active classification and hypothesis testing problems because the discrete measurements and states (object classes) violate the necessary linear Gaussian assumptions. To enable efficient, yet non-greedy closed-loop planning, our work develops Monte Carlo tree search algorithms that approximate the optimal sequence of wrist poses.

Relevant publications: [IROS'17] [ICRA'15]


Active Deformable Part Models

This work proposes an active approach for part-based object detection, which optimizes the order of part filter evaluations and the time at which to stop and make a prediction. Statistics describing the part responses are learned from training data and are used to formalize part scheduling as an offline optimal control problem. Dynamic programming is applied to obtain a policy that balances the number of part evaluations against the classification accuracy. During inference, the policy is used as a look-up table to choose the part order and the stopping time based on the observed filter responses. The method is faster than cascade detection with deformable part models (which does not optimize the part order) with negligible loss in accuracy when evaluated on the PASCAL VOC 2007 and 2010 datasets.

Relevant publications: [ECCV'14] [T-RO'14] [ICRA'13]


Active Object Recognition with a Mobile Depth Camera

Active information acquisition can also be used to optimize the trajectory of a depth camera in 3-D space in order to improve the performance of object classification and pose estimation. We can formulate a stochastic optimal control problem in which the objective is to minimize the probability of misclassification subject to constraints from the camera observation and motion model. Our work proposes an exact planning algorithm based on dynamic programming that obtains the optimal camera control policy but scales poorly with the size of the state and measurement spaces. To provide scalability, we developed an approximation algorithm based on Monte Carlo tree search with a rollout policy that exploits the structure of the probability of error objective function. Our experiments suggest that active approaches for camera view planning provide significant improvements over static object recognition. Also, an advantage of non-greedy planning is that it allows high-confidence object recognition with an adaptive decision threshold that depends on the observations received online.

Relevant publications: [T-RO'14] [ECCV'14] [ICRA'13]


Distributed Intelligence

Collaboration among multiple robots with heterogeneous sensing, memory, computation, and motion capabilities offers increased efficiency, accuracy, and robustness of environment modeling, information collection, and mission execution compared to any individual agent. Yet, existing tools for multi-robot collaboration are completely ineffective in the face of heterogeneity and large distributed teams. The goal of our research is to establish principles for composition of heterogeneous models residing at different agents and for collaborative inference and decision making among robots with heterogeneous capabilities.


Distributed Localization, Estimation, and Control

Designing localization, estimation, and navigation approaches for large heterogeneous robot teams requires rethinking many state-of-the-art algorithms to enable distributed computation and storage while relying only on local neighborhood communication with robot teammates. Our work considers three fundamental problems: (1) distributed localization: how to estimate the robot poses using relative measurements (e.g., range or bearing) of the poses of neighboring robots; (2) distributed estimation: how should sensor information be distributed across the robot network when the robots are estimating a common phenomenon of interest such as a map of the environment or the concentration of pollution in the air; and (3) distributed control: how should the robots collaborate to achieve collective tasks such as environment exploration or task assignment without using all-to-all communication or global network knowledge. Our distributed localization and estimation work develops algorithms, along with theoretical analysis, for utilizing relative position, range-only, or bearing-only measurements. We developed a distributed Jacobi algorithm for localization from relative-position measurements and a distributed Kalman filter for mobile target tracking. We proved that the two algorithms can be used in conjunction to achieve joint localization and estimation in sensor networks with arbitrarily small asymptotic mean square error. We also proposed Gaussian Mixture Multi Dimensional Scaling (GM-MDS) and barycentric-coordinate-based algorithms for range-only localization using ultra-wideband radio measurements. Our work on distributed navigation is specific to the active information gathering problem and exploits submodularity of information measures to prove that decentralized control schemes that scale linearly with the number of robots can provide guaranteed performance of at least 50% of the performance of the optimal fully-centralized control law.
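A Jacobi-style iteration for localization from relative-position measurements can be sketched as follows. This is a minimal illustration under stated assumptions, not the published algorithm: it assumes an unweighted least-squares objective, at least one anchor node with a known position, a connected measurement graph, and synchronous updates. The function name `jacobi_localization` and its data layout are hypothetical. Each node only needs the current estimates of its neighbors, which is what makes the update distributable.

```python
def jacobi_localization(anchors, measurements, num_iters=200):
    """Estimate node positions from relative-position measurements.

    anchors:      dict node -> known position (tuple); must be non-empty
    measurements: dict (i, j) -> measured value of x_j - x_i (tuple)
    """
    dim = len(next(iter(anchors.values())))
    # For node n, each entry (j, offset) predicts x_n = x_j + offset.
    neighbors = {}
    for (i, j), z in measurements.items():
        neighbors.setdefault(i, []).append((j, tuple(-c for c in z)))
        neighbors.setdefault(j, []).append((i, tuple(z)))
    est = {n: anchors.get(n, (0.0,) * dim) for n in neighbors}
    for _ in range(num_iters):
        new_est = {}
        for n, nbrs in neighbors.items():
            if n in anchors:
                new_est[n] = anchors[n]  # anchors keep their known position
                continue
            # Jacobi update: average the predictions made by all neighbors,
            # using only the estimates from the previous iteration
            new_est[n] = tuple(
                sum(est[j][d] + off[d] for j, off in nbrs) / len(nbrs)
                for d in range(dim)
            )
        est = new_est
    return est

# Toy network: node 0 is an anchor; nodes 1 and 2 are at (1, 0) and (1, 1).
anchors = {0: (0.0, 0.0)}
measurements = {(0, 1): (1.0, 0.0), (1, 2): (0.0, 1.0), (0, 2): (1.0, 1.0)}
estimate = jacobi_localization(anchors, measurements)
```

With noise-free measurements the estimates approach the true positions. With noisy measurements the same iteration approaches the least-squares solution of the unweighted problem.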

Relevant publications: [ACC'19] [RAL'18] [IPSN'17] [ICRA'15] [CDC'14]


Distributed Source Seeking

Our work considers the problem of localizing the source of a physical signal of interest, such as magnetic force, heat, radio or chemical concentration, using a robot team. We proposed distributed control strategies for source seeking specific to two scenarios: one in which the robots have a noisy model of the signal formation process and one in which a signal model is not available. In the model-free scenario, the robot team follows a stochastic gradient of the signal field. Our approach is robust to deformations in the group geometry, does not necessitate global localization, and is guaranteed to lead the robots to a neighborhood of a local maximum of the field. The performance of the model-free algorithm is demonstrated in the video using a single robot to localize the source of a wireless radio signal. In the model-based scenario, the robots follow a stochastic gradient of the mutual information between predicted signal measurements and the predicted source location. In contrast with existing work which insists on improving the quality of the gradient estimate as much as possible, we show that good performance can be achieved by using only a few predicted signal measurements.
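The model-free idea of climbing a signal field from measurements alone can be illustrated with a generic simultaneous-perturbation stochastic gradient ascent for a single robot. This swapped-in technique is a stand-in and not necessarily the gradient estimator used in the publications. The function name `source_seeking`, the parameters `probe`, `gain`, and `seed`, and the quadratic toy signal are assumptions for this example.

```python
import random

def source_seeking(measure, start, steps=200, probe=0.1, gain=0.1, seed=0):
    """Climb a scalar field using two signal measurements per step.

    measure: callable mapping a position (list of floats) to a signal value
    """
    rng = random.Random(seed)
    pos = list(start)
    for _ in range(steps):
        # random +/-1 perturbation direction for every coordinate
        delta = [rng.choice((-1.0, 1.0)) for _ in pos]
        plus = measure([p + probe * d for p, d in zip(pos, delta)])
        minus = measure([p - probe * d for p, d in zip(pos, delta)])
        # finite-difference slope of the signal along the perturbation
        slope = (plus - minus) / (2.0 * probe)
        # move along the perturbation direction, scaled by the slope
        pos = [p + gain * slope * d for p, d in zip(pos, delta)]
    return pos

# Toy field: signal strength falls off quadratically with distance to source.
source = (3.0, -4.0)

def signal(p):
    return -sum((a - b) ** 2 for a, b in zip(p, source))

estimate = source_seeking(signal, start=(0.0, 0.0))
```

On this noise-free toy field the estimate ends close to `source`. With a noisy `measure`, the step size `gain` would typically need to shrink over time for the iterates to settle.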

Relevant publications: [JDSMC'15] [ICRA'12]