This article presents a complete solution for autonomous mapping and inspection tasks, namely a lightweight multi-camera drone design coupled with computationally efficient planning algorithms and environment representations for enhanced autonomous navigation in exploration and mapping tasks. The proposed system utilizes state-of-the-art Next-Best-View (NBV) planning techniques, with geometric and semantic segmentation information computed with Deep Convolutional Neural Networks (DCNNs) to improve the environment map representation. The main contributions of this article are the following. First, we propose a novel efficient sensor observation model and a utility function that encodes the expected information gains from observations taken from specific viewpoints. Second, we propose a reward function that incorporates both geometric and semantic probabilistic information provided by a DCNN for semantic segmentation that operates in close to real-time. The incorporation of semantics in the environment representation enables biasing exploration towards specific object categories while disregarding task-irrelevant ones during path planning. Experiments in both a virtual and a real scenario demonstrate the benefits in reconstruction accuracy of using semantics to bias exploration towards task-relevant objects, when compared with purely geometric state-of-the-art methods. Finally, we present a unified approach for selecting the number of cameras on a UAV, to optimize the trade-offs between power consumption, flight-time duration, and exploration and mapping performance. Unlike previous design optimization approaches, our method is coupled with the sensing and planning algorithms.
The proposed system and general formulations can be applied to the mapping, exploration, and inspection of any type of environment, as long as environment-dependent semantic training data are available, with demonstrated successful applicability in the inspection of dry dock shipyard environments.
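The semantic biasing idea described above can be illustrated with a toy utility function: the expected information gain of a candidate viewpoint is the occupancy entropy of each visible voxel, weighted by the task relevance of its predicted semantic class. A minimal sketch follows; the function names, semantic classes, and weights are illustrative assumptions, not the paper's exact formulation.

```python
import math

def entropy(p):
    """Binary occupancy entropy of a voxel with occupancy probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def viewpoint_utility(visible_voxels, class_weights):
    """Toy expected information gain of a viewpoint (illustrative only).

    visible_voxels: list of (occupancy_prob, semantic_class) pairs for
    voxels predicted visible from the candidate viewpoint.
    class_weights: dict mapping semantic class -> task relevance weight;
    task-irrelevant classes get weight 0 and are effectively ignored.
    """
    return sum(class_weights.get(c, 0.0) * entropy(p)
               for p, c in visible_voxels)

# Hypothetical classes: bias exploration towards 'ship_hull', ignore 'sky'.
weights = {"ship_hull": 1.0, "scaffolding": 0.5, "sky": 0.0}
view_a = [(0.5, "ship_hull"), (0.5, "sky")]          # uncertain hull voxel
view_b = [(0.5, "scaffolding"), (0.9, "ship_hull")]  # mostly-known voxels
best = max([("A", view_a), ("B", view_b)],
           key=lambda v: viewpoint_utility(v[1], weights))[0]
print(best)  # "A": full entropy on a fully task-relevant voxel wins
```

An NBV planner would evaluate such a utility over sampled viewpoints and fly to the maximizer; with the "sky" weight at zero, uncertainty in irrelevant regions contributes nothing to the score.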
Temporal planning is a hard problem that requires good heuristic and memoization strategies to solve efficiently. Merge-and-shrink abstractions have been shown to serve as effective heuristics for classical planning, but they have not yet been applied to temporal planning. Currently, it is still unclear how to implement merge-and-shrink in the temporal domain and how effective the method is in this setting. In this paper we propose a method to compute merge-and-shrink abstractions for temporal planning, applicable to both partial- and total-order temporal planners. The method relies on precomputing heuristics as formulas of temporal variables that are evaluated at search time, and it allows the use of standard shrinking strategies and label reduction. Compared to state-of-the-art Relaxed Planning Graph heuristics, we show that the method leads to improvements in coverage, computation time, and number of explored nodes when solving optimal problems, as well as improvements in proving the unsolvability of problems with deadlines.
The trade-offs between different desirable plan properties - e.g. PDDL temporal plan preferences - are often difficult to understand. Recent work addresses this by iterative planning with explanations elucidating the dependencies between such plan properties. Users can ask questions of the form ’Why does the plan not satisfy property p?’, which are answered by ’Because then we would have to forego q’. It has been shown that such dependencies can be computed reasonably efficiently. But is this form of explanation actually useful for users? We run a large crowd-worker user study (N = 100 in each of 3 domains) evaluating that question. To enable such a study in the first place, we contribute a Web-based platform for iterative planning with explanations, running in standard browsers. Comparing users with vs. without access to the explanations, we find that the explanations enable users to identify better trade-offs between the plan properties, indicating an improved understanding of the planning task.
Multi-Agent Path Finding (MAPF) plans can be very complex to analyze and understand. Recent user studies have shown that explanations would be a welcome tool for MAPF practitioners and developers to better understand plans, as well as to tune map layouts and cost functions. In this paper we formulate two variants of an explanation problem in MAPF that we call contrastive "map-based explanation". The problem consists of answering the question "why don’t agents A follow paths P’ instead?"—by finding regions of the map that would have to be obstacles in order for the expected plan to be optimal. We propose three different methods to compute these explanations, and evaluate them quantitatively on a set of benchmark problems that we make publicly available. Motivations for generating this type of explanation are discussed in the paper and include both user understanding of MAPF problems, and designer-aids to guide the improvement of map layouts.
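A brute-force sketch of the simplest instance of this "map-based explanation" idea, restricted to a single agent and a single blocking cell: search for a free cell that, when turned into an obstacle, makes the user's expected path optimal. The grid encoding and helper names below are assumptions for illustration; the paper's methods address agent sets and map regions.

```python
from collections import deque
from itertools import product

def shortest_len(grid, start, goal):
    """BFS path length on a 4-connected grid (0 = free, 1 = obstacle);
    returns None if the goal is unreachable."""
    rows, cols = len(grid), len(grid[0])
    q, seen = deque([(start, 0)]), {start}
    while q:
        (r, c), d = q.popleft()
        if (r, c) == goal:
            return d
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols \
               and grid[nr][nc] == 0 and (nr, nc) not in seen:
                seen.add((nr, nc))
                q.append(((nr, nc), d + 1))
    return None

def explain(grid, start, goal, expected_path):
    """Return free cells whose blocking makes expected_path optimal."""
    target = len(expected_path) - 1  # cost of the user's expected path
    out = []
    for r, c in product(range(len(grid)), range(len(grid[0]))):
        if grid[r][c] == 1 or (r, c) in expected_path:
            continue
        grid[r][c] = 1               # hypothetically add an obstacle
        d = shortest_len(grid, start, goal)
        grid[r][c] = 0
        if d is not None and d >= target:
            out.append((r, c))
    return out
```

On an empty 3x3 grid with start (0,0) and goal (0,2), an expected detour through the middle row is explained by blocking the direct cell (0,1); the real explanation problem generalizes this single-cell probe to regions and multiple agents.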
Multi-Agent Path Finding (MAPF) and Multi-Robot Motion Planning (MRMP) are complex problems to solve, analyze and build algorithms for. Automatically-generated explanations of algorithm output, by improving human understanding of the underlying problems and algorithms, could thus lead to better user experience, developer knowledge, and MAPF/MRMP algorithm designs. Explanations are contextual, however, and thus developers need a good understanding of the questions that can be asked about algorithm output, the kinds of explanations that exist, and the potential users and uses of explanations in MAPF/MRMP applications. In this paper we provide a first step towards establishing a taxonomy of explanations, and a list of requirements for the development of explainable MAPF/MRMP planners. We use interviews and a questionnaire with expert developers and industry practitioners to identify the kinds of questions, explanations, users, uses, and requirements of explanations that should be considered in the design of such explainable planners. Our insights cover a diverse set of applications: warehouse automation, computer games, and mining.
In this work we propose a holistic framework for autonomous aerial inspection tasks, using semantically aware yet computationally efficient planning and mapping algorithms. The system leverages state-of-the-art receding horizon exploration techniques for next-best-view (NBV) planning with geometric and semantic segmentation information provided by state-of-the-art deep convolutional neural networks (DCNNs), with the goal of enriching environment representations. The contributions of this article are threefold. First, we propose an efficient sensor observation model, and a reward function that encodes the expected information gains from the observations taken from specific viewpoints. Second, we extend the reward function to incorporate not only geometric but also semantic probabilistic information, provided by a DCNN for semantic segmentation that operates in real-time. The incorporation of semantic information in the environment representation allows biasing exploration towards specific objects, while ignoring task-irrelevant ones during planning. Finally, we employ our approaches in an autonomous drone shipyard inspection task. A set of simulations in realistic scenarios demonstrates the efficacy and efficiency of the proposed framework when compared with the state-of-the-art.
Path planners are important components of various products from video games to robotics, but their output can be counter-intuitive due to problem complexity. As a step towards improving the understanding of path plans by various users, here we propose methods that generate explanations for the optimality of paths. Given the question "why is path A optimal, rather than B which I expected?", our methods generate an explanation based on the changes to the graph that make B the optimal path. We focus on the case of path planning on navigation meshes, which are heavily used in the computer game industry and robotics. We propose two methods - one based on a single inverse-shortest-paths optimization problem, the other incrementally solving complex optimization problems. We show that these methods offer computation time improvements of up to 3 orders of magnitude relative to domain-independent search-based methods, as well as scaling better with the length of explanations. Finally, we show through a user study that, when compared to baseline cost-based explanations, our explanations are more satisfactory and effective at increasing users’ understanding of problems.
Visions have an important role in guiding and legitimizing technical research, as well as contributing to expectations of the general public towards technologies. In this paper we analyze technical robotics papers published between 1998 and 2019 to identify themes, trends and issues with the visions and values promoted by robotics research. In particular, we identify the themes of robotics visions and implicitly normative visions; and we quantify the relative presence of a variety of values and applications within technical papers. We conclude with a discussion of the language of robotics visions, marginalized visions and values, and possible paths forward for the robotics community to better align practice with societal interest. We also discuss implications and future work suggestions for Responsible Robotics and HRI research.
Motion planning is a hard problem that can often overwhelm both users and designers, due to the difficulty of understanding the optimality of a solution, or the reasons for a planner failing to find any solution. Inspired by recent work in machine learning and task planning, in this paper we are guided by a vision of developing motion planners that can provide reasons for their output - thus potentially contributing to better user interfaces, debugging tools, and algorithm trustworthiness. Towards this end, we propose a preliminary taxonomy and a set of important considerations for the design of explainable motion planners, based on the analysis of a comprehensive user study of motion planning experts. We identify the kinds of things that need to be explained by motion planners ("explanation objects"), types of explanation, and several procedures required to arrive at explanations. We also elaborate on a set of qualifications and design considerations that should be taken into account when designing explainable methods. These insights contribute to bringing the vision of explainable motion planners closer to reality, and can serve as a resource for researchers and developers interested in designing such technology.
The trade-offs between different desirable plan properties - e.g. PDDL temporal plan preferences - are often difficult to understand. Recent work proposes to address this by iterative planning with explanations elucidating the dependencies between such plan properties. Users can ask questions of the form ’Why does the plan you suggest not satisfy property p?’, which are answered by ’Because then we would have to forego q’ where not-q is entailed by p in plan space. It has been shown that such plan-property dependencies can be computed reasonably efficiently. But is this form of explanation actually useful for users? We contribute a user study evaluating that question. We design use cases from three domains and run a large user study (N = 40 for each domain, ca. 40 minutes work time per user and domain) on the internet platform Prolific. Comparing users with vs. without access to the explanations, we find that the explanations tend to enable users to identify better trade-offs between the plan properties, indicating an improved understanding of the task.
Recent research in AI ethics has put forth explainability as an essential principle for AI algorithms. However, it is still unclear how this is to be implemented in practice for specific classes of algorithms - such as motion planners. In this paper we unpack the concept of explanation in the context of motion planning, introducing a new taxonomy of kinds and purposes of explanations in this context. We focus not only on explanations of failure (previously addressed in motion planning literature) but also on contrastive explanations - which explain why a trajectory A was returned by a planner, instead of a different trajectory B expected by the user. We develop two explainable motion planners, one based on optimization, the other on sampling, which are capable of answering failure and contrastive questions. We use simulation experiments and a user study to motivate a technical and social research agenda.
In this paper we investigate and characterize social fairness in the context of coverage path planning. Inspired by recent work on the fairness of goal-directed planning, and work characterizing the disparate impact of various AI algorithms, here we simulate the deployment of coverage robots to anticipate issues of fairness. We show that classical coverage algorithms, especially those that try to minimize average waiting times, will have biases related to the spatial segregation of social groups. We discuss implications in the context of disaster response, and provide a new coverage planning algorithm that minimizes cumulative unfairness at all points in time. We show that our algorithm is 200 times faster to compute than existing evolutionary algorithms - while obtaining overall-faster coverage and a fair response in terms of waiting-time and coverage-pace differences across multiple social groups.
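The fairness objective discussed above can be illustrated with a toy greedy coverage planner: at each step, visit the cell that minimizes the current gap in mean waiting time between social groups, with travel simplified to one time unit per visit. The cell and group names are hypothetical and this is only a sketch of the objective, not the paper's algorithm.

```python
import statistics

def fair_coverage_order(cells):
    """cells: dict mapping cell_id -> social group label.
    Greedily builds a visit order that minimizes, at every step, the gap
    in mean waiting time between groups (uncovered cells are charged the
    next time step as an optimistic pending wait)."""
    waiting = {}      # cell -> time step at which it was covered
    order = []
    t = 0
    remaining = set(cells)
    while remaining:
        t += 1

        def unfairness_if(cell):
            w = {**waiting, cell: t}
            per_group = {}
            for cid, g in cells.items():
                per_group.setdefault(g, []).append(w.get(cid, t + 1))
            means = [statistics.mean(v) for v in per_group.values()]
            return max(means) - min(means)

        best = min(sorted(remaining), key=unfairness_if)
        waiting[best] = t
        order.append(best)
        remaining.remove(best)
    return order
```

On two groups with two cells each, the greedy rule alternates between groups rather than finishing one group first, which is the qualitative behaviour a waiting-time-fair coverage plan should exhibit.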
Drowsiness and fatigue are important factors in driving safety and work performance. This has motivated academic research into detecting drowsiness, and sparked interest in the deployment of related products in the insurance and work-productivity sectors. In this paper we elaborate on the potential dangers of using such algorithms. We first report on an audit of performance bias across subject gender and ethnicity, identifying which groups would be disparately harmed by the deployment of a state-of-the-art drowsiness detection algorithm. We discuss some of the sources of the bias, such as the lack of robustness of facial analysis algorithms to face occlusions, facial hair, or skin tone. We then identify potential downstream harms of this performance bias, as well as potential misuses of drowsiness detection technology - focusing on driving safety and experience, insurance cream-skimming and coverage-avoidance, worker surveillance, and job precarity.
In this work we showcase the design and assessment of the performance of a multi-camera UAV, when coupled with state-of-the-art planning and mapping algorithms for autonomous navigation. The system leverages state-of-the-art receding horizon exploration techniques for Next-Best-View (NBV) planning with 3D and semantic information, provided by a reconfigurable multi-stereo-camera system. We employ our approaches in an autonomous drone-based inspection task and evaluate them in an autonomous exploration and mapping scenario. We discuss the advantages and limitations of using multi-stereo-camera flying systems, and the trade-off between the number of cameras and mapping performance.
In this paper we propose methods that provide explanations for path plans, in particular those that answer questions of the type "why is path A optimal, rather than path B which I expected?". In line with other work in eXplainable AI Planning (XAIP), such explanations could help users better understand the outputs of path planning methods, as well as help debug or iterate the design of planners and maps. By specializing the explanation methods to path planning, using optimization-based inverse-shortest-paths formulations, we obtain drastic computation time improvements relative to general XAIP methods, especially as the length of the explanations increases. One of the claims of this paper is that such specialization might be required for explanation methods to scale and therefore come closer to real-world usability. We propose and evaluate the methods on large-scale navigation meshes, which are representations for path planning heavily used in the computer game industry and robotics.
In this paper we investigate potential issues of fairness related to the motion of mobile robots. We focus on the particular use case of humanitarian mapping and disaster response. We start by showing that there is a fairness dimension to robot navigation, and use a walkthrough example to bring out design choices and issues that arise during the development of a fair system. We discuss indirect discrimination, fairness-efficiency trade-offs, the existence of counter-productive fairness definitions, privacy and other issues. Finally, we conclude with a discussion of the potential of our methodology as a concrete responsible innovation tool for eliciting ethical issues in the design of autonomous systems.
Vehicle insurance companies have started to offer usage-based policies which track users to estimate premiums. In this paper we argue that usage-based vehicle insurance can lead to indirect discrimination of sensitive personal characteristics of users, have a negative impact on multiple personal freedoms, and contribute to reinforcing existing socio-economic inequalities. We argue that there is an incentive for autonomous vehicles (AVs) to use similar insurance policies, and anticipate new sources of indirect and structural discrimination. We conclude by analyzing the advantages and disadvantages of alternative insurance policies for AVs: no-fault compensation schemes, technical explainability and fairness, and national funds.
The process of designing hierarchical motion planners typically involves problem-specific intuition and implementations. This process is sub-optimal both in terms of solution space (the number of possible search-space approximations, choices of planner parameters, etc.) and the amount of human labour. In this paper we show that the design of hierarchical motion planners does not have to be manual. We present a method for parameterizing and then optimizing sequences of problem approximations used in hierarchical motion planning. We define these as a specific kind of graph with intermediate state-spaces and solutions as nodes, and costs and planner parameters as edge properties. These properties become a continuous optimization variable that changes the sequence and parameters of sub-planners in the hierarchy. Using Pareto-front estimation, our method automatically discovers multiple designs of optimal computation-time/motion-cost trade-offs. We evaluate the method on a set of legged robot motion planning problems where hand-designed hierarchies are abundant. Our method discovers sequences of problem approximations which achieve performance similar to, and slightly higher than, the best human-designed hierarchies. The performance gain significantly increases on new problems, yielding 12x faster computation times and 10% higher success rates.
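The Pareto-front machinery used above rests on the standard non-domination filter, sketched here over hypothetical candidate hierarchy designs scored on computation time and motion cost (lower is better for both; the design names are illustrative):

```python
def pareto_front(designs):
    """designs: list of (name, comp_time, motion_cost) tuples.
    Returns the names of non-dominated designs: a design is dominated if
    some other design is no worse in both objectives and strictly better
    in at least one."""
    front = []
    for name, t, c in designs:
        dominated = any(t2 <= t and c2 <= c and (t2 < t or c2 < c)
                        for _, t2, c2 in designs)
        if not dominated:
            front.append(name)
    return front
```

In the paper's setting, each point would be one parameterization of the planner hierarchy evaluated empirically; the front then exposes the achievable computation-time/motion-cost trade-offs, from which a designer picks a point matching their application.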
Inspection and monitoring of assets are repetitive and expensive tasks and have higher risk when facilities are located offshore. Robotics holds the promise of improving the efficiency and safety of such platforms by allowing inspection and continuous monitoring remotely for difficult-to-access facilities. Legged robots, such as quadrupedal robots, are promising machines to achieve this goal: they have high maneuverability both indoors and outdoors, they are designed for accessing and navigating facilities that are built for humans (e.g. stairs, step-over piping, narrow passageways) and can carry a variety of sensors targeted at inspection and monitoring tasks. In this paper we introduce our approach for autonomous inspection of oil & gas platforms using legged robots. Our approach is being developed as part of the ORCA Hub (Offshore Robotics for Certification of Assets), a UK robotics research hub. We envision a highly autonomous robotic system that conducts inspections with minimal intervention by human operators. The robot can navigate through facilities, as shown in Figure 1, accomplishing crucial tasks such as 3D mapping, monitoring of thermal build-up using thermal cameras, pressure sensing and other sensors, and using color cameras to detect people and to carry out general visual inspection. We demonstrate and evaluate the system’s perception, locomotion and inspection capabilities on a training facility that realistically simulates an oil rig at the Fire Service College, Moreton-in-Marsh, UK and an industrial area in the Offshore Renewable Energy Catapult Facility, Blyth, UK. We show the result of both autonomous and real-time teleoperated missions, and analyze the accuracy and efficiency of the system.
This paper tackles the problem of designing 3D perception systems for robots with high visual requirements, such as versatile legged robots capable of different locomotion styles. In order to guarantee high visual coverage in varied conditions (e.g. biped walking, quadruped walking, ladder climbing), such robots need to be equipped with a large number of sensors, while at the same time managing the computational requirements that arise from such a system. We tackle this problem at both levels: sensor placement (how many sensors to install on the robot and where) and run time acquisition scheduling under computational constraints (not all sensors can be acquired and processed at the same time). Our first contribution is a methodology for designing perception systems with a large number of depth sensors scattered throughout the links of a robot, using multi-objective optimization for optimal trade-offs between visual coverage and the number of sensors. We estimate the Pareto-front of these objectives through evolutionary optimization, and implement a solution on a real legged robot. Our formulation includes constraints on task-specific coverage and design symmetry, which lead to reliable coverage and fast convergence of the optimization problem. Our second contribution is an algorithm for lowering the computational burden of mapping with such a high number of sensors, formulated as an information-maximization problem with several sampling techniques for speed. Our final system uses 20 depth sensors scattered throughout the robot, which can either be acquired simultaneously or optimally scheduled for low CPU usage while maximizing mapping quality. We show that, when compared to state-of-the-art robotic platforms, our system has higher coverage across a higher number of tasks, thus being suitable for challenging environments and versatile robots. 
We also demonstrate that our scheduling algorithm achieves higher mapping performance than naive and state-of-the-art methods by leveraging measures of information gain and self-occlusion at low computational cost.
Long-range locomotion planning is an important problem for the deployment of legged robots to real scenarios. Current methods used for legged locomotion planning often do not exploit the flexibility of legged robots, and do not scale well with environment size. In this paper we propose the use of navigation meshes for deployment in large-scale, potentially multi-floor sites. We leverage this representation to improve long-term locomotion plans in terms of success rates, path costs and reasoning about which gait-controller to use when. We show that NavMeshes have higher planning success rates than sampling-based planners, and are 400x faster to construct and at least 100x faster to plan with. The performance gap further increases when considering multi-floor environments. We present both a procedure for building controller-aware NavMeshes and a full navigation system that adapts to changes to the environment. We demonstrate the capabilities of the system in simulation experiments and in field trials at a real-world oil rig facility.
In recent years, the development and deployment of autonomous systems such as mobile robots have been increasingly common. Investigating and implementing ethical considerations such as fairness in autonomous systems is an important problem that is receiving increased attention, both because of recent findings of their potential undesired impacts and a related surge in ethical principles and guidelines. In this paper we take a new approach to considering fairness in the design of autonomous systems: we examine fairness by obtaining formal definitions, applying them to a system, and simulating system deployment in order to anticipate challenges. We undertake this analysis in the context of the particular technical problem of robot navigation. We start by showing that there is a fairness dimension to robot navigation, and we then collect and translate several formal definitions of distributive justice into the navigation planning domain. We use a walkthrough example of a rescue robot to bring out design choices and issues that arise during the development of a fair system. We discuss indirect discrimination, fairness-efficiency trade-offs, the existence of counter-productive fairness definitions, privacy and other issues. Finally, we elaborate on important aspects of a research agenda and reflect on the adequacy of our methodology in this paper as a general approach to responsible innovation in autonomous systems.
Different legged robot locomotion controllers offer different advantages; from speed of motion to energy, computational demand, safety and others. In this paper we propose a method for planning locomotion with multiple controllers and sub-planners, explicitly considering the multi-objective nature of the legged locomotion planning problem. The planner first obtains body paths extended with a choice of controller or sub-planner, and then fills the gaps by sub-planning. The method leads to paths with a mix of static and dynamic walking which only plan footsteps where necessary. We show that it is faster than pure footstep planning methods both in computation (2x) and mission time (1.4x), and safer than pure dynamic-walking methods. In addition, we propose two methods for aggregating the multiple objectives in search-based planning and reach desirable trade-offs without weight tuning. We show that they reach desirable Pareto-optimal solutions up to 8x faster than fairly-tuned traditional weighted-sum methods. Our conclusions are drawn from a combination of planning, physics simulation, and real robot experiments.
In this paper we evaluate the age and gender bias in state-of-the-art pedestrian detection algorithms. These algorithms are used by mobile robots such as autonomous vehicles for locomotion planning and control. Therefore, performance disparities could lead to disparate impact in the form of biased crash outcomes. Our analysis is based on the INRIA Person Dataset extended with child, adult, male and female labels. We show that all of the 24 top-performing methods of the Caltech Pedestrian Detection Benchmark have higher miss rates on children. The difference is significant and we analyse how it varies with the classifier, features and training data used by the methods. Algorithms were also gender-biased on average but the performance differences were not significant. We discuss the source of the bias, the ethical implications, possible technical solutions and barriers to "solving" the issue.
Motivated by experiments showing that humans’ localization performance changes with walking parameters, in this paper we explore the effects of walking gait on biped humanoid localization. We focus on walking style (normal and gallop) and gait symmetry (one side slower), and we assess the performance of visual odometry (VO) and kinematic odometry algorithms for the robot’s localization. Changing the walking style from normal to gallop slightly improved the performance of the visual localization, which was related to a reduction in torques on the feet. Changing the gait temporal symmetry worsened the performance of the visual algorithms, which according to an analysis of inertial data, is related to an increase of mechanical vibrations and camera rotations. Both changes of gait style and symmetry decreased the performance of the kinematic localization, caused by the increase of vertical ground reaction forces, to which kinematic odometry is very sensitive. These observations support our claim that gait and footstep planning could be used to improve the performance of localization algorithms in the future.
Motivated by experiments showing that humans regulate their walking speed in order to improve localization performance, in this paper we explore the effects of walking gait on biped humanoid localization. We focus on step length as a proxy for speed and because of its ready applicability to current footstep planners, and we compare the performance of three different sparse visual odometry (VO) algorithms as a function of step length: a direct, a semi-direct and an indirect algorithm. The direct algorithm’s performance decreased with longer step lengths, which, along with the analysis of inertial and force/torque data, points to a decrease in performance due to an increase of mechanical vibrations. The indirect algorithm’s performance showed the opposite trend, i.e., more errors with shorter step lengths, which we show to be due to the effects of drift over time. The semi-direct algorithm showed a performance in-between the previous two. These observations show that footstep planning could be used to improve the performance of VO algorithms in the future.
Modeling heat transfer is an important problem in high-power electrical robots as the increase of motor temperature leads to both lower energy efficiency and the risk of motor damage. Power consumption itself is a strong restriction in these robots, especially for battery-powered robots such as those used in disaster-response. In this paper, we propose to reduce power consumption and temperature for robots with high-power DC actuators without cooling systems only through motion planning. We first propose a parametric thermal model for brushless DC motors which accounts for the relationship between internal and external temperature and motor thermal resistances. Then, we introduce temperature variables and a thermal model constraint on a trajectory optimization problem, which allows for minimizing power consumption or enforcing temperature bounds during motion planning. We show that the approach leads to qualitatively different motion compared to typical cost function choices, as well as energy consumption gains of up to 40%.
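The kind of lumped-parameter thermal model described above can be sketched as a first-order system integrated with forward Euler: the winding heats with resistive losses proportional to the squared current, and dissipates to ambient through a thermal resistance. The constants below are illustrative placeholders, not identified motor parameters.

```python
def simulate_temperature(currents, dt=0.1, R=0.5, R_th=2.0, C_th=20.0,
                         T_env=25.0):
    """Forward-Euler integration of a toy lumped thermal model:
        C_th * dT/dt = R * I^2 - (T - T_env) / R_th
    where R is electrical resistance, R_th thermal resistance to ambient,
    and C_th thermal capacitance (all values illustrative)."""
    T = T_env
    history = []
    for I in currents:
        dT = (R * I**2 - (T - T_env) / R_th) / C_th
        T += dt * dT
        history.append(T)
    return history
```

In a trajectory optimization setting, the same recurrence can be added as an equality constraint linking per-step temperature variables to the motor currents, so that a bound such as T <= T_max becomes an ordinary inequality constraint on the trajectory.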
This paper addresses two issues with the development of ethical algorithms for autonomous vehicles. One is that of uncertainty in the choice of ethical theories and utility functions. Using notions of moral diversity, normative uncertainty, and autonomy, we argue that each vehicle user should be allowed to choose the ethical views by which the vehicle should act. We then deal with the issue of indirect discrimination in ethical algorithms. Here we argue that equality of opportunity is a helpful concept, which could be applied as an algorithm constraint to avoid discrimination on protected characteristics.
Complex robots such as legged and humanoid robots are often characterized by non-convex optimization landscapes with multiple local minima. Obtaining sets of these local minima has interesting applications in global optimization, as well as in smart teleoperation interfaces with automatic posture suggestions. In this paper we propose a new heuristic method to obtain sets of local minima, which is to run multiple minimization problems initialized around a local maximum. The method is simple, fast, and produces diverse postures from a single nominal posture. Results on the robot WAREC using a sum-of-squared-torques cost function show that our method quickly obtains lower-cost postures than typical random restart strategies. We further show that obtained postures are more diverse than when sampling around nominal postures, and that they are more likely to be feasible when compared to a uniform sampling strategy. We also show that lack of completeness leads to the method being most useful when computation has to be fast, but not on very large computation time budgets.
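On a toy 1D double-well cost f(x) = (x^2 - 1)^2, which has a local maximum at x = 0 separating minima at x = -1 and x = 1, the heuristic above reads: sample starting points around the maximum and run a local minimization from each, collecting the distinct minima reached. The step sizes, sample counts, and rounding below are illustrative choices, not the paper's settings.

```python
import random

def grad_descent(df, x0, lr=0.01, steps=2000):
    """Plain gradient descent on a 1D function with gradient df."""
    x = x0
    for _ in range(steps):
        x -= lr * df(x)
    return x

def minima_around_maximum(df, x_max, spread=0.5, n=20, seed=0):
    """Sample n starts uniformly around a local maximum x_max and
    descend from each; return the set of distinct minima found
    (rounded to merge numerically-identical solutions)."""
    rng = random.Random(seed)
    found = set()
    for _ in range(n):
        x0 = x_max + rng.uniform(-spread, spread)
        found.add(round(grad_descent(df, x0), 2))
    return sorted(found)
```

Because the local maximum sits on the boundary between basins of attraction, small perturbations around it fall into different basins, which is why this initialization yields a diverse set of minima from a single nominal point.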
Trajectory optimization and posture generation are hard problems in robot locomotion, which can be non-convex and have multiple local optima. Progress on these problems is further hindered by a lack of open benchmarks, since comparisons of different solutions are difficult to make. In this paper we introduce a new benchmark for trajectory optimization and posture generation of legged robots, using a pre-defined scenario, robot and constraints, as well as evaluation criteria. We evaluate state-of-the-art trajectory optimization algorithms based on sequential quadratic programming (SQP) on the benchmark, as well as new stochastic and incremental optimization methods borrowed from the large-scale machine learning literature. Interestingly we show that some of these stochastic and incremental methods, which are based on stochastic gradient descent (SGD), achieve higher success rates than SQP on tough initializations. Inspired by this observation we also propose a new incremental variant of SQP which updates only a random subset of the costs and constraints at each iteration. The algorithm is the best performing in both success rate and convergence speed, improving over SQP by up to 30% in both criteria. The benchmark’s resources and a solution evaluation script are made openly available.
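The incremental idea above (update using only a random subset of costs and constraints at each iteration) can be sketched on a toy least-squares problem. This is plain stochastic gradient descent over sampled residuals, not the paper's SQP variant itself; the problem and all parameters are invented:

```python
import random

def incremental_least_squares(residuals, x0, lr=0.05, batch=2, iters=2000, seed=0):
    """Minimize sum_i r_i(x)^2 by descending on a random subset of terms
    per iteration (finite-difference gradients for simplicity)."""
    rng = random.Random(seed)
    x, eps = x0, 1e-6
    for _ in range(iters):
        subset = rng.sample(residuals, batch)   # random subset of cost terms
        g = 0.0
        for r in subset:
            g += 2.0 * r(x) * (r(x + eps) - r(x)) / eps
        x -= lr * g / batch
    return x

# toy problem: residuals pulling x toward 1, 2 and 3; full-sum optimum is x = 2
rs = [lambda x: x - 1.0, lambda x: x - 2.0, lambda x: x - 3.0]
```

With a fixed step size the iterate hovers around the full-problem optimum, trading some accuracy for much cheaper per-iteration updates, which mirrors the motivation for updating only a subset of costs in large constrained problems.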
Friction estimation from vision is an important problem for robot locomotion through contact. The problem is challenging due to its dependence on many factors such as material, surface conditions and contact area. In this paper we 1) conduct an analysis of image features that correlate with humans’ friction judgments; and 2) compare algorithmic to human performance at the task of predicting the coefficient of friction between different surfaces and a robot’s foot. The analysis is based on two new datasets which we make publicly available. One is annotated with human judgments of friction, illumination, material and texture; the other is annotated with the static coefficient of friction (COF) of a robot’s foot and human judgments of friction. We propose and evaluate visual friction prediction methods based on image features, material class and text mining. Finally, we draw conclusions regarding the robustness to COF uncertainty required by control and planning algorithms; the low performance of humans at the task when compared to simple predictors based on material labels; and the promising use of text mining to estimate friction from vision.
In this paper we tackle the problem of visually predicting surface friction for environments with diverse surfaces, and integrating this knowledge into biped robot locomotion planning. The problem is essential for autonomous robot locomotion since diverse surfaces with varying friction abound in the real world, from wood to ceramic tiles, grass or ice, which may cause difficulties or huge energy costs for robot locomotion if not considered. We propose to estimate friction and its uncertainty from visual estimation of material classes using convolutional neural networks, together with probability distribution functions of friction associated with each material. We then robustly integrate the friction predictions into a hierarchical (footstep and full-body) planning method using chance constraints, and optimize the same trajectory costs at both levels of the planning method for consistency. Our solution achieves fully autonomous perception and locomotion on slippery terrain, which considers not only friction and its uncertainty, but also collision, stability and trajectory cost. We show promising friction prediction results in real pictures of outdoor scenarios, and planning experiments on a real robot facing surfaces with different friction.
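A minimal sketch of the idea of combining material-class probabilities with per-material friction distributions to get a friction estimate with uncertainty; the numeric priors, class names, and the z-sigma chance bound below are illustrative assumptions, not values from the paper:

```python
import math

# per-material COF priors: (mean, std); values are illustrative only
FRICTION_PRIOR = {
    "wood": (0.6, 0.10),
    "tile": (0.4, 0.08),
    "ice":  (0.1, 0.03),
}

def friction_estimate(class_probs):
    """Mixture mean/std of COF given material-class probabilities
    (e.g. softmax outputs of a material classifier)."""
    mean = sum(p * FRICTION_PRIOR[m][0] for m, p in class_probs.items())
    # law of total variance: E[var per class] + var[class means]
    var = sum(p * (FRICTION_PRIOR[m][1] ** 2 + (FRICTION_PRIOR[m][0] - mean) ** 2)
              for m, p in class_probs.items())
    return mean, math.sqrt(var)

def chance_constrained_cof(class_probs, z=2.0):
    """Conservative COF bound for planning: mean minus z standard deviations."""
    mean, std = friction_estimate(class_probs)
    return max(mean - z * std, 0.0)
```

A footstep planner can then constrain the required friction of each step against `chance_constrained_cof`, which is one simple way to realize a chance constraint on slippage.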
Energy efficiency and robustness of locomotion to different terrain conditions are important problems for humanoid robots deployed in the real world. In this paper, we propose a footstep-planning algorithm for humanoids that is applicable to flat, slanted, and slippery terrain, which uses simple principles and representations gathered from human gait literature. The planner optimizes a center-of-mass (COM) mechanical work model subject to motion feasibility and ground friction constraints using a hybrid A* search and optimization approach. Footstep placements and orientations are discrete states searched with an A* algorithm, while other relevant parameters are computed through continuous optimization on state transitions. These parameters are also inspired by human gait literature and include footstep timing (double-support and swing time) and parameterized COM motion using knee flexion angle keypoints. The planner relies on work, the required coefficient of friction (RCOF), and feasibility models that we estimate in a physics simulation. We show through simulation experiments that the proposed planner leads to both low electrical energy consumption and human-like motion on a variety of scenarios. Using the planner, the robot automatically opts between avoiding or (slowly) traversing slippery patches depending on their size and friction, and it chooses energy-optimal stairs and climbing angles in slopes. The obtained motion is also consistent with observations found in human gait literature, such as human-like changes in RCOF, step length and double-support time on slippery terrain, and human-like curved walking on steep slopes. Finally, we compare COM work minimization with other choices of the objective function.
Stereo confidence measures are important functions for global reconstruction methods and some applications of stereo. In this article we evaluate and compare several models of confidence which are defined over the whole disparity range. We propose a new stereo confidence measure, which we call the Histogram Sensor Model (HSM), and show that it is one of the best performing functions overall. We also introduce, for parametric models, a systematic method for estimating their parameters, which is shown to lead to better performance when compared to parameters as computed in previous literature. All models were evaluated when applied to two different cost functions at different window sizes and model parameters. Contrary to previous stereo confidence measure benchmark literature, we evaluate the models with criteria important not only to winner-take-all stereo, but also to global applications. To this end, we evaluate the models on a real-world application using a recent formulation of 3D reconstruction through occupancy grids which integrates stereo confidence at all disparities. We obtain and discuss our results on publicly available indoor and outdoor datasets.
In this paper we use an extended footstep planning algorithm to plan optimal humanoid locomotion trajectories subject to constraints on the maximum predicted Zero Moment Point (ZMP) tracking error. The approach can guarantee walking stability bounds with little extra computational burden, thus increasing safety of robots walking in challenging environments. This is done by estimating energy and stability models in simulation through Bayesian optimization, and smartly integrating the models into search-based planning.
Energy consumption and stability are two important problems for humanoid robots deployed in remote outdoor locations. In this paper we propose an extended footstep planning method to optimize energy consumption while considering motion feasibility and ground friction constraints. To do this we estimate models of energy, feasibility and slippage in physics simulation, and integrate them into a hybrid A* search and optimization-based planner. The graph search is done in footstep position space, while timing (leg swing and double support times) and COM motion (parameterized height trajectory) are obtained by solving an optimization problem at each node. We conducted experiments to validate the obtained energy model on the real robot, as well as planning experiments showing 9 to 19% energy savings. In example scenarios, the robot can correctly plan to optimally traverse slippery patches or avoid them depending on their size and friction; and uses stairs with the most beneficial dimensions in terms of energy consumption.
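The hybrid structure described above (discrete graph search over footsteps, with a continuous optimization solved at each transition) can be sketched in 1D. The timing cost, its closed-form solution, and the step options are invented stand-ins for the estimated energy/feasibility models, not the paper's own:

```python
import heapq
import math

def transition_cost(step_length, alpha=2.0):
    """Continuous sub-problem at each edge: choose swing time t minimizing
    cost(t) = alpha*t + step_length**2 / t; closed form t* = L / sqrt(alpha)."""
    t_star = step_length / math.sqrt(alpha)
    return alpha * t_star + step_length ** 2 / t_star, t_star

def plan_footsteps(start, goal, step_options=(0.2, 0.3, 0.4)):
    """Uniform-cost search over discrete 1D footstep positions; each edge
    cost comes from the continuous timing optimization above."""
    key = lambda x: round(x * 100)          # discretize positions to cm
    frontier = [(0.0, start)]
    best = {key(start): 0.0}
    while frontier:
        cost, x = heapq.heappop(frontier)
        if x >= goal - 1e-9:
            return cost
        if cost > best.get(key(x), float("inf")):
            continue
        for s in step_options:
            c_edge, _ = transition_cost(s)
            nx, nc = min(x + s, goal), cost + c_edge
            if nc < best.get(key(nx), float("inf")):
                best[key(nx)] = nc
                heapq.heappush(frontier, (nc, nx))
    return float("inf")
```

The point of the split is that footstep placement stays a discrete search state while timing is resolved optimally per transition, keeping the search space small.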
The Uncanny valley hypothesis, which tells us that almost-human characteristics in a robot or a device could cause uneasiness in human observers, is an important research theme in the Human Robot Interaction (HRI) field. Yet, the phenomenon is still not well understood. Many have investigated the external design of humanoid robot faces and bodies, but only a few studies have focused on the influence of robot movements on our perception and feelings of the Uncanny valley. Moreover, no research has investigated the possible relation between our feelings of uneasiness and whether or not we would accept robots having a job in an office, a hospital or elsewhere. To better understand the Uncanny valley, we explore several factors which might influence our perception of robots, whether related to the subjects, such as culture or attitude toward robots, or related to the robot, such as the emotions and emotional intensity displayed in its motion. We asked 69 subjects (N = 69) to rate the motions of a humanoid robot (Perceived Humanity, Eeriness, and Attractiveness) and state where they would rather see the robot performing a task. Our results suggest that, among the factors we chose to test, attitude toward robots is the main influence on the perception of the robot related to the Uncanny valley. Robot occupation acceptability was affected only by Attractiveness, mitigating any Uncanny valley effect. We discuss the implications of these findings for the Uncanny valley and the acceptability of a robotic worker in our society.
We propose a new biped locomotion planning method that optimizes locomotion speed subject to friction constraints. For this purpose we use approximate models of required coefficient of friction (RCOF) as a function of gait. The methodology is inspired by findings in human gait analysis, where subjects have been shown to adapt spatial and temporal variables of gait in order to reduce RCOF in slippery environments. Here we solve the friction problem similarly, by planning in gait parameter space: namely foot step placement, step swing time, double support time and height of the center of mass (COM). We first used simulations of a 48 degrees-of-freedom robot to estimate a model of how RCOF varies with these gait parameters. Then we developed a locomotion planning algorithm that minimizes the time the robot takes to reach a goal while keeping acceptable RCOF levels. Our physics simulation results show that RCOF-aware planning can drastically reduce slippage amount while still maximizing efficiency in terms of locomotion speed. Also, according to our experiments, human-like stretched-knees walking can reduce slippage amount more than bent-knees (i.e. crouch) walking at the same speed.
We present a grid-based 3D reconstruction method which integrates all costs given by stereo vision into what we call a Cost-Curve Occupancy Grid (CCOG). Occupancy probabilities of grid cells are estimated in a Bayesian formulation, from the likelihood of stereo cost measurements taken at all distance hypotheses. This is accomplished with only a small set of probabilistic assumptions which we discuss in the paper. We quantitatively characterize the method’s performance under different conditions of both image noise and number of used stereo pairs, compared also to traditional algorithms. We complement the study by giving insights on design choices of CCOGs such as likelihood model, window size of the cost function and use of a hole filling method. Experiments were made on a real-world outdoors dataset with ground-truth data.
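A hedged sketch of the core idea, integrating the whole stereo cost curve along a ray into cell occupancies rather than keeping only the best match. The softmax-style likelihood and the uniform false-match floor are assumptions for illustration, not the paper's exact CCOG likelihood model:

```python
import math

def cost_curve_to_likelihood(costs, beta=1.0):
    """Turn a stereo cost curve (one matching cost per distance hypothesis
    along a ray) into a normalized likelihood via a softmax-like model."""
    w = [math.exp(-beta * c) for c in costs]
    z = sum(w)
    return [v / z for v in w]

def update_cells(prior_occ, costs, p_false=0.05):
    """Bayesian per-cell posterior occupancy along one ray: mix the
    cost-curve likelihood with a uniform false-match floor."""
    like = cost_curve_to_likelihood(costs)
    n = len(costs)
    posts = []
    for p, l in zip(prior_occ, like):
        l_occ = (1.0 - p_false) * l + p_false / n   # likelihood if cell occupied
        l_free = 1.0 / n                            # uninformative if cell free
        odds = (p / (1.0 - p)) * (l_occ / l_free)   # posterior odds update
        posts.append(odds / (1.0 + odds))
    return posts
```

Cells near the cost minimum gain occupancy probability while the rest lose it, but ambiguous curves (several low-cost hypotheses) spread the evidence instead of committing to a possibly wrong winner-take-all match.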
Humanoid robots have the formidable advantage of possessing a body quite similar in shape to a human’s. This body grants them not only locomotion but also a medium to express emotions without even needing a face. In this paper we propose to study the effects of emotional gaits of our biped humanoid robot on the subjects’ perception of the robot (recognition rate of the emotions, reaction time, anthropomorphism, safety, likeness, etc.). We made the robot walk towards the subjects with different emotional gait patterns. We assessed positive (Happy) and negative (Sad) emotional gait patterns on 26 subjects divided into two groups (according to whether or not they were familiar with robots). We found that even though recognition of the different types of patterns does not differ between groups, reaction time does. We found that emotional gait patterns affect the perception of the robot. The implications of the current results for Human Robot Interaction (HRI) are discussed.
We describe our recent developments in probabilistic modeling of 3D reconstruction with stereo vision, applied to planning strategies for locomotion and gaze. We first overview the use of probabilistic occupancy grids for 3D reconstruction, and the sensor models of stereo best suited to the problem. These grids are then used for robot navigation, which is tackled at two levels: 1) At the locomotion level, trajectories are computed from the grid using an A* search algorithm that minimizes the total probability of occupancy over the trajectory. 2) At the grid level, we propose two task-relevant active strategies which redirect the sensor to "maximum visible entropy" and "maximum visible occupancy" points along the planned locomotion trajectories. Steps 1) and 2) are executed alternately until the locomotion trajectory converges to a high certainty, safe solution. Results of the proposed gaze and locomotion planning strategies were obtained on simulated scenarios and a real robot. Estimates of the uncertainty that occupancy grids are subjected to in real outdoor scenarios were computed for different stereo sensor models. These estimates were used in active gaze simulations for an extensive comparison of gaze strategies across 400 randomly generated environments. The results show that careful modeling of stereo vision sensor uncertainty and the proposed task-relevant planning strategies lead to more complete and consequently collision-free reconstructions of the environment along planned robot trajectories.
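Step 1) above, computing trajectories that minimize the total probability of occupancy over the grid, can be sketched as a uniform-cost search (A* with a zero heuristic) where each step costs -log(1 - P(occupied)), so additive path costs correspond to multiplying free-space probabilities. Grid layout and API are illustrative assumptions:

```python
import heapq
import math

def safest_path(occ, start, goal):
    """Search a 2D occupancy grid (occ[r][c] = P(occupied)) for the path
    from start to goal maximizing the probability of being collision-free."""
    rows, cols = len(occ), len(occ[0])
    frontier = [(0.0, start, [start])]
    best = {}
    while frontier:
        g, (r, c), path = heapq.heappop(frontier)
        if (r, c) == goal:
            return path, math.exp(-g)       # exp(-sum of -log(1-p)) = P(free)
        if best.get((r, c), float("inf")) <= g:
            continue
        best[(r, c)] = g
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and occ[nr][nc] < 1.0:
                ng = g - math.log(1.0 - occ[nr][nc])
                heapq.heappush(frontier, (ng, (nr, nc), path + [(nr, nc)]))
    return None, 0.0
```

Because the cost is a log-probability, the planner naturally detours around uncertain cells, and the active gaze strategies can then target the high-entropy cells remaining along the returned path.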
We describe a learning strategy that allows a humanoid robot to autonomously build a representation of its workspace: we call this representation Reachable Space Map. Interestingly, the robot can use this map to (i) estimate the Reachability of a visually detected object (i.e. judge whether the object can be reached for, and how well, according to some performance metric) and (ii) modify its body posture or its position with respect to the object to achieve better reaching. The robot learns this map incrementally during the execution of goal-directed reaching movements; reaching control employs kinematic models that are updated online as well. Our solution is innovative with respect to previous works in three aspects: the robot workspace is described using a gaze-centered motor representation, the map is built incrementally during the execution of goal-directed actions, learning is autonomous and online. We implement our strategy on the 48-DOFs humanoid robot Kobian and we show how the Reachable Space Map can support intelligent reaching behavior with the whole-body (i.e. head, eyes, arm, waist, legs).
Extensive literature has been written on occupancy grid mapping for different sensors. When stereo vision is applied to the occupancy grid framework it is common, however, to use sensor models that were originally conceived for other sensors such as sonar. Although sonar provides a distance to the nearest obstacle for several directions, stereo has confidence measures available for each distance along each direction. The common approach is to take the highest-confidence distance as the correct one, but such an approach disregards mismatch errors inherent to stereo. In this work, stereo confidence measures of the whole sensed space are explicitly integrated into 3D grids using a new occupancy grid formulation. Confidence measures themselves are used to model uncertainty, and their parameters are computed automatically in a maximum likelihood approach. The proposed methodology was evaluated in both simulation and a real-world outdoor dataset which is publicly available. Mapping performance of our approach was compared with a traditional approach and shown to achieve fewer errors in the reconstruction.
Robots depend on a world map representation in order to navigate in their environment. Only part of the space around the agent can be sensed at any given time, so measures must be taken to reduce the uncertainty of this map and the likelihood of collision. In this work we propose the use of a probabilistic occupancy grid to guide active gaze of the robot in the “walk to target” task. A map uncertainty measure is proposed, as is a method for choosing gaze points along the robot’s computed trajectory to anticipate the need for trajectory changes. Gaze points are chosen from the whole space volume the robot will traverse. Then, robot trajectories are computed directly on the probabilistic map in order to drive the robot towards free-space areas of high confidence. A preliminary evaluation of the approach is done in a real scenario using the humanoid robot KOBIAN for the preparatory gaze exploration task necessary for safe trajectory planning to a target.
We present a novel control architecture for the integration of visually guided walking and whole-body reaching in a humanoid robot. We propose to use robot gaze as a common reference frame for both locomotion and reaching, as suggested by behavioral neuroscience studies in humans. A gaze controller allows the robot to track and fixate a target object, and motor information related to gaze control is then used to i) estimate the reachability of the target, ii) steer locomotion, iii) control whole-body reaching. The reachability is a measure of how well the object can be reached for, depending on the position and posture of the robot with respect to the target, and it is obtained from the gaze motor information using a mapping that has been learned autonomously by the robot through motor experience: we call this mapping Reachable Space Map. In our approach, both locomotion and whole-body movements are seen as ways to maximize the reachability of a visually detected object, thus i) expanding the robot workspace to the entire visible space and ii) exploiting the robot redundancy to optimize reaching. We implement our method on a full 48-DOF humanoid robot and provide experimental results in the real world.
Humanoid robots are complex sensorimotor systems where the existence of internal models is of utmost importance, both for control purposes and for predicting the changes in the world arising from the system’s own actions. This so-called expected perception relies on the existence of accurate internal models of the robot’s sensorimotor chains. We assume that the kinematic model is known in advance but that the absolute offsets of the different axes cannot be directly retrieved from encoders. We propose a method to estimate such parameters, the zero position of the joints of a humanoid robotic head, by relying on proprioceptive sensors such as relative encoders, inertial sensing and visual input. We show that our method can estimate the correct offsets of the different joints (i.e. absolute positioning) in a continuous, online manner. Not only is the method robust to noise, but it can also cope with and adjust to abrupt changes in the parameters. Experiments with three different robotic heads are presented and illustrate the performance of the methodology as well as the advantages of using such an approach.
Tracking an object’s 3D position and orientation from a color image can be accomplished with particle filters if its color and shape properties are known. Unfortunately, initialization in particle filters is often manual or random, thus rendering the tracking recovery process slow or no longer autonomous. A method that uses image data to generate likely pose hypotheses for known objects is proposed. These generated pose hypotheses are then used to guide visual attention and computational resources in a “top-down” tracking system such as a particle filter, speeding up the tracking process and making it more robust to unpredictable movement.
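The initialization idea, seeding particles from generated pose hypotheses instead of uniformly at random, might look like the following for planar (x, y, theta) poses. The hypothesis format, weights, and jitter magnitude are assumptions for illustration:

```python
import random

def init_particles(hypotheses, n_particles=100, jitter=0.05, seed=0):
    """Seed a particle filter from weighted pose hypotheses: sample a
    hypothesis proportionally to its likelihood, then add small noise.
    hypotheses: list of ((x, y, theta), weight) pairs."""
    rng = random.Random(seed)
    weights = [w for _, w in hypotheses]
    total = sum(weights)
    particles = []
    for _ in range(n_particles):
        # weighted pick of one hypothesis
        u, acc, idx = rng.uniform(0.0, total), 0.0, 0
        for i, w in enumerate(weights):
            acc += w
            if u <= acc:
                idx = i
                break
        x, y, theta = hypotheses[idx][0]
        # jitter each particle around the chosen hypothesis
        particles.append((x + rng.gauss(0.0, jitter),
                          y + rng.gauss(0.0, jitter),
                          theta + rng.gauss(0.0, jitter)))
    return particles
```

Concentrating the initial particle set near likely poses lets the filter recover quickly after track loss instead of searching the whole pose space.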
Tracking an object’s 3D pose from a color image can be accomplished with particle filters if its color and shape properties are known a priori. Unfortunately, initialization in particle filters is often manual or random, thus rendering the tracking recovery process slow or no longer autonomous. A method that uses existing object information to better decide where to automatically start or recover the tracking process is proposed. Each 3D pose of an object is observed as a 2D shape, so training is performed to infer pose from image information. The object is first segmented by color, then shape description is made using geometric moments, and finally a learning stage maps 2D shapes to 3D poses with an associated likelihood measure.