Collaborative Scene Perception with Multiple Sensing Modalities
Abstract
With the increasing reliance on autonomous systems, there is a critical need for robots to perceive the world at least as well as a human does. This requires taking advantage of all the sensing modalities available to the robot and fusing them to produce the best estimate of the state of its observable surroundings. However, despite extensive research in robot perception, robots still have a long way to go before they can serve as reliable teammates for humans in the wild. This dissertation explores gaps in four key areas related to collaborative perception: choosing an apt feature representation, active perception, shared autonomy, and perception-enabled planning. First, a human-subject study is presented that reveals the challenges current fusion models face when a human is in the loop. The study shows that certain feature representations become unreliable due to human error, which must be accounted for in subsequent decision-making steps. To facilitate active perception, a multi-stage question-answering scheme is proposed that helps the robot seek targeted human input with the goal of maximizing situational awareness. The algorithm is implemented on a ground robot and tested in a crowded environment, demonstrating its robustness. To develop a shared understanding of the surroundings in a search and rescue (SaR) mission, a deep learning-based approach is presented that fuses information from the visual and language domains. The fused knowledge is used to intelligently plan paths for a team of heterogeneous agents, resulting in safer paths while maintaining performance in terms of time to locate the victim. The approach is tested on the Gazebo simulation platform. Finally, to bridge the gap between simulation and reality, specifically in the context of SaR missions, a dataset of photo-realistic online images is developed. A Bayesian fusion framework is developed for assessing danger from photo-realistic images and human language input. An extensive simulation campaign reveals that a danger-aware planner achieves a higher mission success rate than a naive shortest-path planner.
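To make the last two ideas concrete, the sketch below illustrates, in generic form, (1) Bayesian fusion of danger evidence from a visual cue and a human language report, and (2) a danger-aware planner that penalizes risky cells, compared against a naive shortest-path planner. This is a minimal illustration, not the dissertation's implementation: the prior, the likelihoods, the toy grid, and the danger_weight parameter are assumptions introduced purely for demonstration.

import heapq

def fuse_danger(prior, p_vis_given_danger, p_vis_given_safe,
                p_lang_given_danger, p_lang_given_safe):
    """Posterior probability of danger after observing a visual cue and a
    verbal report, assuming the two observations are conditionally independent."""
    num = prior * p_vis_given_danger * p_lang_given_danger
    den = num + (1.0 - prior) * p_vis_given_safe * p_lang_given_safe
    return num / den

def plan(grid_danger, start, goal, danger_weight=0.0):
    """Dijkstra search over a 4-connected grid with edge cost
    1 + danger_weight * P(danger); danger_weight = 0 recovers the naive
    shortest-path planner."""
    rows, cols = len(grid_danger), len(grid_danger[0])
    dist, parent = {start: 0.0}, {}
    pq = [(0.0, start)]
    while pq:
        d, cell = heapq.heappop(pq)
        if cell == goal:
            break
        if d > dist.get(cell, float("inf")):
            continue
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + 1.0 + danger_weight * grid_danger[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    parent[(nr, nc)] = cell
                    heapq.heappush(pq, (nd, (nr, nc)))
    # Reconstruct the path by walking parents back from the goal.
    path, cell = [goal], goal
    while cell != start:
        cell = parent[cell]
        path.append(cell)
    return path[::-1]

if __name__ == "__main__":
    # Fused danger for one map cell: a weak visual cue plus a strong verbal warning.
    p = fuse_danger(prior=0.1,
                    p_vis_given_danger=0.6, p_vis_given_safe=0.4,
                    p_lang_given_danger=0.9, p_lang_given_safe=0.2)
    print(f"fused P(danger) = {p:.2f}")

    # Toy 3x4 danger map; the second column is risky.
    danger = [[0.0, 0.9, 0.0, 0.0],
              [0.0, 0.9, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0]]
    print("naive path:       ", plan(danger, (0, 0), (0, 3), danger_weight=0.0))
    print("danger-aware path:", plan(danger, (0, 0), (0, 3), danger_weight=10.0))

With danger_weight set to zero the planner cuts straight through the risky column; with a positive weight it detours around it, which mirrors the trade-off the abstract describes between path safety and time to locate the victim.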
Committee Member
Ferrari, Silvia