Learning From Demonstrations in Changing Environments: Learning Cost Functions and Constraints for Motion Planning
We address the problem of a robot performing complex tasks in changing environments. We propose two approaches to this problem: 1) defining task-specific cost functions for motion planning that represent path quality by learning from an expert's preferences, and 2) using a constraint-based representation of the task within the learning-from-demonstration paradigm. In the first approach, we generate a set of paths for a given task using a motion planner and collect data about their features (path length, distance from obstacles, etc.). We present these paths to an expert as a set of pairwise comparisons and form a ranking of the paths from the expert's answers. This ranking is used as training data for learning algorithms, which attempt to produce a cost function that maps path feature values to a cost consistent with the expert's ranking. We test our method on two simulated car-maintenance tasks with the PR2 robot: removing a tire and extracting an oil filter. We found that learning methods that produce non-linear combinations of the features capture expert preferences for these tasks better than methods that produce linear combinations. This result suggests that the linear combinations used in previous work on this topic may be too simple to capture the preferences of experts for complex tasks. In the second approach, we introduce a constraint-based description of the task that can be used together with the motion planner to produce trajectories. The description is created automatically from the demonstration by segmenting the motion and extracting constraints from it. The constraints are represented as Task Space Regions (TSRs), which are extracted from the demonstration and used to produce the desired motion. To account for parts of the motion where the constraints differ, the demonstrated motion is segmented using TSRs.
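To make the TSR idea concrete, the following is a minimal sketch of how bounds for a single TSR might be fitted to one segment of a demonstration and then used to check a new end-effector pose. It assumes the common TSR parameterization (a frame `T0_w`, an end-effector offset `Tw_e`, and a 6x2 bounds matrix `Bw` over x, y, z, roll, pitch, yaw). Taking the bounds as the min and max of the demonstrated displacements is an assumption made here for illustration, not necessarily the extraction procedure used in the thesis, and all function names are hypothetical.

```python
import numpy as np

def translation(x, y, z):
    """Homogeneous transform with identity rotation (helper for the example)."""
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T

def rpy_from_matrix(R):
    """Roll, pitch, yaw (ZYX convention) of a 3x3 rotation matrix."""
    roll = np.arctan2(R[2, 1], R[2, 2])
    pitch = np.arctan2(-R[2, 0], np.hypot(R[0, 0], R[1, 0]))
    yaw = np.arctan2(R[1, 0], R[0, 0])
    return np.array([roll, pitch, yaw])

def displacement(T0_w, Tw_e, T0_e):
    """6-vector (x, y, z, roll, pitch, yaw) of the pose expressed in the TSR frame."""
    Tw_s = np.linalg.inv(T0_w) @ T0_e @ np.linalg.inv(Tw_e)
    return np.concatenate([Tw_s[:3, 3], rpy_from_matrix(Tw_s[:3, :3])])

def fit_bounds(T0_w, Tw_e, demo_poses, margin=0.0):
    """Assumed extraction rule: bounds are the min/max displacement seen in the demo."""
    d = np.array([displacement(T0_w, Tw_e, T) for T in demo_poses])
    return np.stack([d.min(axis=0) - margin, d.max(axis=0) + margin], axis=1)

def satisfies(T0_w, Tw_e, Bw, T0_e, tol=1e-9):
    """True if the pose's displacement lies inside the bounds on all six axes."""
    d = displacement(T0_w, Tw_e, T0_e)
    return bool(np.all(d >= Bw[:, 0] - tol) and np.all(d <= Bw[:, 1] + tol))

# Toy demonstration segment: the end effector slides 0.5 m along x.
T0_w, Tw_e = np.eye(4), np.eye(4)
demo = [translation(x, 0.0, 0.0) for x in np.linspace(0.0, 0.5, 6)]
Bw = fit_bounds(T0_w, Tw_e, demo)

on_line = satisfies(T0_w, Tw_e, Bw, translation(0.25, 0.0, 0.0))
off_line = satisfies(T0_w, Tw_e, Bw, translation(0.25, 0.1, 0.0))
```

In this toy segment only the x axis receives a non-zero range, so a pose displaced in y violates the constraint. A change in which axes are free along the demonstration is the kind of signal a segmentation step could use.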
The proposed approach allows a robot to perform tasks learned from human demonstration in changing environments, where the obstacle distribution or the poses of objects may change between demonstration and execution. We evaluated the approach experimentally on two example motions to assess its ability to produce the desired motion and to recover a demonstrated trajectory.
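The first approach described in the abstract (pairwise comparisons, then a ranking, then a cost function) can be illustrated with a small sketch. Everything here is an assumption for illustration: the ranking is formed by counting wins, the cost function is a linear least-squares fit, and the feature values are made up. The abstract reports that non-linear combinations of features worked better, so the linear model below is only the simplest stand-in, not the method the thesis recommends.

```python
import numpy as np

def rank_from_comparisons(n_paths, comparisons):
    """Rank paths by number of pairwise wins; rank 0 is the most preferred path."""
    wins = np.zeros(n_paths)
    for better, _worse in comparisons:
        wins[better] += 1
    order = np.argsort(-wins, kind="stable")  # indices of paths, best first
    ranks = np.empty(n_paths)
    ranks[order] = np.arange(n_paths)
    return ranks

def fit_linear_cost(features, ranks):
    """Least-squares weights (plus bias) so that cost approximates rank position."""
    X = np.hstack([features, np.ones((len(features), 1))])
    w, *_ = np.linalg.lstsq(X, ranks, rcond=None)
    return w

def cost(w, features):
    """Evaluate the learned cost for one or more feature vectors."""
    features = np.atleast_2d(features)
    return features @ w[:-1] + w[-1]

# Hypothetical features per path: [path length, minimum distance from obstacles].
features = np.array([[1.0, 0.5],
                     [2.0, 0.4],
                     [3.0, 0.2],
                     [4.0, 0.1]])

# Hypothetical expert answers as (preferred, not preferred) index pairs.
comparisons = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

ranks = rank_from_comparisons(len(features), comparisons)
w = fit_linear_cost(features, ranks)
costs = cost(w, features)
```

A planner could then prefer candidate paths with lower `cost`. Swapping `fit_linear_cost` for a non-linear regressor would be the natural next step given the result reported above.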
Permanent link to this page: https://digital.wpi.edu/show/hm50tr794