Adam Jacoff, Ann Virts, Kam Saidi
Intelligent System Division, Engineering Laboratory
National Institute of Standards and Technology
RobotTestMethods[at]nist[dot]gov
Emergency responders literally risk life and limb interacting with known hazards to protect the public and rescue potential victims. They typically wear only conventional personal protective equipment while manually dealing with a variety of extreme hazards for which remotely operated robots should be well suited. Examples include disabling or dismantling improvised explosive devices (packages, personnel, and vehicles); establishing situational awareness during fires or police actions; assessing large-scale industrial or transportation accidents; investigating illicit border tunnels; or mitigating potential terrorist attacks involving chemical, biological, or radiological sources. Responders want to "start remote and stay remote" when dealing with such hazards and need capable robotic systems that can be remotely operated from safe standoff distances. Many responder organizations already own robots but have had difficulty deploying them effectively. New robots are promising advanced capabilities and more intuitive operator interfaces, but it is hard for responders to sift through the marketing. Responders need quantitative ways to measure whether any given robot is capable and reliable enough to perform specific missions. They also need to train with measures of operator proficiency to evaluate and improve very perishable operator skills and to identify deficiencies in equipment.
Since 2001, a series of Presidential Policy Directives and Homeland Security Directives on National Preparedness have prompted increased funding for new and better technologies for emergency responders, including purchasing response robots. In 2005, the Department of Homeland Security (DHS), Science and Technology Directorate, Office of Standards, engaged in a multi-year partnership with the National Institute of Standards and Technology (NIST) to develop a comprehensive suite of standard test methods to quantify key capabilities of robots for emergency response applications. A 2011 Presidential Policy Directive outlined the need for strengthening the security and resilience of the United States through systematic preparation for threats including acts of terrorism, pandemics, significant accidents, and catastrophic natural disasters. It emphasizes three national preparedness principles: 1) an all-hazards approach, 2) a focus on capabilities, and 3) outcomes with rigorous assessments to measure and track progress in building and sustaining capabilities over time. The following approach applies all three principles specifically for emergency response robots.
NIST has been developing the measurements and standards infrastructure necessary to evaluate robotic capabilities for public safety emergency responders, military organizations, and other critical national needs. NIST leads an international collaboration that has generated more than 50 test methods for robotic ground systems, aquatic systems, and micro aerial systems (FAA Group I under 2 kg (4.4 lbs)). These test methods measure robot maneuvering, mobility, dexterity, sensing, endurance, communication, durability, autonomy, logistics, and safety. They produce objective, quantitative results that facilitate comparisons of different robot configurations and highlight best-in-class implementations. A variety of civilian and military responder communities have used them to understand deployment capabilities and guide purchasing decisions. These test methods are now being applied toward remote operator training, to provide repeatable practice tasks and standard measures of proficiency.
Fifteen test methods have been standardized so far through the ASTM International Standards Committee on Homeland Security Applications; Operational Equipment; Robots (E54.08.01), which includes equal representation of robot developers, emergency responders, and civilian/military test administrators. Draft standard test methods are continually being validated in preparation for balloting, while new tests are being generated to address emerging requirements. The resulting suite is called the DHS-NIST-ASTM International Standard Test Methods for Response Robots.
An ongoing series of requirements workshops, robot competitions, and responder exercises have provided an effective mechanism for generating, validating, and disseminating these standard test methods. These standards are essentially just agreed-upon ways to measure capabilities no matter the particular robotic implementation. As such, they provide tangible challenges and quantitative results to facilitate communication between widely disparate international communities of responders, robot developers, program managers, procurement officials and others. There are now more than 10 collaborating facilities around the world hosting the entire suite. Several dozen more locations host subsets of the test methods to support specific objectives. One major benefit is they enable direct comparison of capabilities no matter where or when the tests are performed.
Examples of civilian organizations involved in helping to develop, validate, and/or use these test methods include: Department of Homeland Security; Department of Justice; Department of Commerce; Federal Emergency Management Agency; National Bomb Squad Commanders Advisory Board; National Capital Region Bomb Squad Working Group along with dozens of Federal, state and local bomb technicians from across the country; the Fire Department of New York City, and many more.
Examples of military organizations involved in helping to develop, validate, and/or use these test methods include: the Joint Improvised Explosive Device Defeat Organization (JIEDDO); the Army Research Laboratory (ARL); the Defense Advanced Research Projects Agency (DARPA); the Air Force and Air Force Reserves; Navy SPAWAR; and several others.
Since the inception of this project, these test methods have helped guide robot manufacturers toward technical innovations that solve key mission tasks; they have encouraged hardening of systems to endure statistically significant repetitions within the test apparatuses; and they have informed robot purchase and deployment decisions with comprehensive test data. To date, these standards have been used to specify more than $60M worth of response robot purchases for soldiers, bomb squads, and hazmat teams. The result has been quantifiably more capable and reliable robots in the hands of users.
This suite of standard test methods is now being used to focus operator training and provide standard measures of operator proficiency that can help evaluate very perishable skills. The objective is to improve the effectiveness of remote operators and ensure they can reliably perform hazardous operational tasks from safer standoff distances.
In 2015, the Joint Program Office for Counter-Improvised Explosive Devices (C-IED) hosted 4 Interoperability and Training Exercises, aka the Raven's Challenges. These exercises provided an opportunity to validate 30 standard and draft standard test methods for C-IED missions. More than 200 bomb technicians from civilian and military response organizations trained in the test methods with more than 100 robots. Most used their own organization's robots, but commercial robot manufacturers also participated by providing robots for domestic and international responders who could not bring their own. The exercises demonstrated how to use repeatable tasks and inherent measures of proficiency to form a "circuit training" model for robots and operators. They included 10 basic skills test methods for maneuvering, mobility, dexterity, and camera pointing; 15 C-IED test methods for tasks involving packages, personnel, and vehicles; and 5 building access test methods for doors, steps, stairs, entanglements, and hallway labyrinths.
Regional facilities were fabricated for all four exercises with exactly the same 7 training lanes containing 30 standard and draft standard test methods. Regional training facilities of this size accommodate up to 7 robots concurrently without reconfiguring. Overall it is roughly 15 m (50 ft) x 30 m (100 ft) with walkways as shown. Each lane is 3 m (9 ft) wide x 10 m (33 ft) long. Components within each lane are interchangeable, so any lane can be reconfigured into any other. B) Increasingly challenging mid-range obstacles for maneuvering, mobility, and situational awareness. C) The end zones contain basic skills and operationally-relevant test methods for manipulator dexterity and tool deployments. The entire assembly was packed and shipped in 8 boxes, including remote operator stations, and it was set up in approximately 6 hours.
Package-size IEDs are abstracted into repeatable, reproducible, embeddable tasks with clear measurements of success and failure. They are called the "embeddable" set of test methods because they can also be placed into operational training scenarios to help evaluate proficiency. A) A simple faceted inspection prop provides 5 visual acuity targets to identify from slightly different angles, similar to seeing all sides of an object or package. Several of these magnetically attached under a vehicle, for example, provide a comprehensive evaluation of underbody inspection with measures of visual acuity. Adding 10 cm (4 in) long pipes requires more arm dexterity to identify recessed visual acuity targets. B) The prop is attached to a base plate with 10 cm (4 in) diameter PVC caps that serve as locations around the object for placing mineral water bottles. Note the short wires emerging from the PVC inserts as optional grasp features. C) PVC pipe inserts made of smaller diameter pipes provide easy control of the precise dimensioning needed to replicate such standards widely. D) This basic prop supports several package-size IED test methods including (left to right) Inspect, Aim Disruptor or Touch, Extract, Mineral Water Placement, and X-Ray Panel Placement.
Training Lane 2 contains basic skills test methods. A) A basic maneuvering test to Align Edges emphasizes rotational control to align the robot perpendicular to the landing edges, and parallel to the rail edges to traverse. Two segments of parallel rails are set with their outer edges coincident with the outer width of each robot's ground contacts. So all robots have the same margin of error left to right, which is the width of the rail, before falling off and potentially high centering (a deterrent). This timed trial requires robots to balance going quickly downrange with precise control of steering angles in forward and reverse directions. Aligning edges is important especially when descending stairs, for example, where even 10 degrees out of perpendicular can result in catastrophic consequences. B) A basic situational awareness test for Camera Pointing emphasizes fine controls for pan-tilt-zoom-focus. There are 10 near field targets around the robot and 10 far field targets down range (C). This timed trial requires the remote operator to sequence between near field and far field targets in designated pairs to establish practical camera pointing skills with near and far field visual acuity.
Training Lane 3 contains basic skills test methods. A) A basic maneuvering test to traverse Crossing Ramps emphasizes stowing grasped objects to avoid rollover, and maintaining grip during inherent shaking. B) This apparatus consists of two segments of 15 degree pitch ramps offset from each other to create a more complex crossover section between two pylons 1.2 m (8 ft) apart at the center of the lane. This timed trial requires robots to traverse the crossing ramp obstacle driving forward and reverse with grasped objects (unweighted and weighted) that can occlude forward views. C) The down-range end zone contains a basic manipulator dexterity test for Weighted Grasp and Place tasks. This timed trial is performed on an inclined slope (in this case set to 15 degrees) to complicate tasks involving grasp-lift-stow-turn-carry-reach-place.
Training Lane 4 contains vehicle-borne IED (VBIED) tasks. A) A basic maneuvering test to traverse Angled Curbs provides the mid-range obstacle. This emphasizes using coordinated articulators to avoid rollover while carrying grasped objects or deploying tools. This timed trial is conducted in forward and reverse. B, C and D) The VBIED cab (B and C) and cargo bay (D). Both adjust elevation in 30 cm (12 in) increments to vary between tractor-trailers and smaller vehicles. The vehicle cab (C) contains a PBIED prop, along with similar tasks as a mock dashboard and seat. The vehicle cargo bay (D) contains an optional tail gate and a shelf with a Weighted Grasp and Place test. The standard weight apparatus can be set to the workable weight of more operationally significant objects such as propane and gas tanks. E) Underbody search tasks under both the cab and cargo bay provide multi-faceted props to inspect (5 targets each). They also provide targets under which to place bootbangers. F) The sides of the apparatus have other tasks for window breaking, boring through cargo panel walls, and cutting (not shown).
Training Lane 5 contains building access test methods. A) The Door apparatus with one push door and one pull door. B) The doors can slide toward each other and be fixed in place making an enclosed 2.4 m (8 ft) square room, or a 1.2 m (4 ft) wide hallway so that there are open and confined approaches to both push and pull doors. Other variables that can be inserted are different types of door handles and spring closures. C) A Center in Alley test method augments the Door test in this lane. The blue tarp shown is an Entanglement obstacle providing some complexity. This kind of obstacle provides repeatable distraction for the remote operator during the task. It can be embedded anywhere in the tests or in scenarios to practice the situational awareness needed to know when such entanglement is happening, and how to back out of it appropriately.
Training Lane 6 contains building access test methods. A) A Hurdle obstacle provides basic skills training in coordinating articulators to surmount obstacles such as stoops. The PVC pipes and pallets stack to provide increments of 10 cm (4 in) to any elevation. The pipes rotate freely to ensure that robots must change their geometry to surmount the obstacle. B) Robots learn to either work with their manipulators or stow them when surmounting the obstacle in confined areas. C) The Stair apparatus is set to 40 degrees with wood bull-nose treads and 20 cm (8 in) risers. This training variant is set to a modest challenge given that the standard test method for stairs includes up to 45 degrees, rounded steel treads, and wet surfaces. D) Note the rope belay routed through an eye-bolt on the floor of the upper landing. Any tug on the rope from the side of the apparatus and the robot is pulled back into contact with the stairs safely with no chance of falling.
Training Lane 7 contains building access test methods. A) An overview of the Hallway labyrinth shows a series of turns to negotiate, with optional flooring complexity in the form of 4x4 ribs (shown), entanglements, ramps, or even more complex terrains (not shown). B) All the various test methods should be conducted in darkened rooms at some point during training. This apparatus is easily darkened with a black-out tarp even at field exercises. D) 10 mapping fiducials with 5 targets on each provide tasks to complete while searching and mapping the interior of the hallway. The remote operator is given instructions to find certain hazmat labels within the environment and to report back the number of the cylinder on which they find them. Robots in competitions have for years been laser scanning the environment to create maps and searching automatically for standard hazmat labels. This test method uses half cylinders on both sides of walls so that, when mapped correctly, they appear as circles in a map viewed as a floor plan. Simply counting the number of fiducials clearly mapped in the environment provides a measure of search completeness. The relative proximity of two related half barrels on the map shows clearly where maps are correct, where they begin to break, and where they are obviously wrong. These are essential tests that have inspired an entirely new and useful capability for response robots, but this capability has yet to be deployed within the responder community.
Remote operator stations should be used for all trials, out of sight and preferably out of sound of the robot within the test apparatus. This forces the remote operator to rely on the system interface for all situational awareness. Between the two operator stations is the metered backdrop of the photo booth used to capture the as-tested robot configuration. B) A scoreboard can be posted to capture high scores in each test method, and to track progress for everyone over time. C) Quad-screen video should be used to capture simultaneous views during trials: two views of the robot within the test apparatus, typically an overview and a detail view, plus two views of the operator showing the interface and the hands on the controls so the operator's intent can be captured. The four simultaneous views provide excellent feedback when the trial goes well, and when it doesn't.
Packing and shipping each training lane in a single box allows the entire facility to fit on a single truck. It can be set up in roughly 8 hours with a team of 8-10 people. It tears down even faster.
When evaluating robot capabilities using standard test methods, typically 20-30 different test methods are used for any given robot configuration. No single test method result characterizes a particular robot configuration. It's the trade-offs in performance across many test methods that provide the key to understanding a robot's deployment capabilities. So each test method involves 10-30 repetitions to establish a level of statistical significance chosen by the standards committee to be roughly 80% reliability with 80% confidence that the robot can perform the next repetition at the end of the trial. This means that within the first 10 repetitions in any test method, no faults are allowed; 1 fault is allowed in 20 repetitions; and 3 faults are allowed in 30 repetitions. Successfully completed task repetitions are logged along with the elapsed time to produce an average rate (tasks/time or distance/time in various terrains). Each test typically takes less than 1 hour to perform. So, comprehensive testing can be conducted in roughly three days for a reliable robot.
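The fault allowances quoted above (0 in 10, 1 in 20, 3 in 30) are consistent with a simple binomial acceptance calculation. The following is a minimal sketch of one way to reproduce them, assuming each repetition is an independent pass/fail trial. The function names and the exact acceptance criterion are illustrative assumptions, not the standards committee's published procedure.

```python
from math import comb

def max_allowed_faults(repetitions, reliability=0.80, confidence=0.80):
    """Largest fault count that still demonstrates `reliability` at `confidence`.

    Assumption (not taken from the standard): repetitions are independent
    pass/fail trials, and a trial set is accepted when a robot whose true
    reliability is only `reliability` would show this few faults with
    probability at most 1 - confidence.
    Returns -1 when even zero faults is not enough evidence.
    """
    p_fault = 1.0 - reliability
    cumulative = 0.0
    allowed = -1
    for faults in range(repetitions + 1):
        # Binomial probability of exactly `faults` faults in `repetitions` trials.
        cumulative += (comb(repetitions, faults)
                       * p_fault ** faults
                       * reliability ** (repetitions - faults))
        if cumulative > 1.0 - confidence:
            break
        allowed = faults
    return allowed

def average_rate(successes, elapsed_minutes):
    """Successful repetitions per minute over a trial."""
    return successes / elapsed_minutes

for n in (10, 20, 30):
    print(n, "repetitions ->", max_allowed_faults(n), "faults allowed")
```

Under these assumptions, fewer than 8 repetitions cannot demonstrate 80% reliability with 80% confidence even with zero faults, which may help explain why trials start at 10 repetitions.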
A wide variety of ground robots used during C-IED training exercises show how different sizes and configurations of robots could be applicable. They included some new robots provided by manufacturers, showing how standard test methods can also provide good teaching aids to learn new robot interfaces and capabilities.
Robot capability testing is always conducted using so-called "expert" operators provided by the robot manufacturer. This aligns incentives and ensures the best possible robot capabilities are captured as a baseline for comparison of different robots. It also provides a notional 100% of operator proficiency for a particular robot configuration against which others can compare themselves.
Operator training begins with basic skills test methods that every robot and operator should be able to perform to some level, like calisthenics for sports as a warm-up before every practice and game. These basic skills test methods exercise cameras, drive controls, manipulator coordination, and interface features in a comprehensive way to help operators gain "muscle memory" for often-repeated tasks. Then the operator can move on to repeatable, reproducible test apparatuses (or props) representing mission essential tasks. Each test lane includes incremental increases in complexity to enable a step-wise approach toward more advanced capabilities and operational readiness.
Training trials are typically time limited to 10 minutes each so that novice and expert operators work for similar times (10 related tests in 100 minutes, for example). Some tests are longer if more complex. The objective is to test long enough to establish a rate across 10 repetitions or so. Operators can then compare their own rate of success to that of the "expert." Different user communities can set their own thresholds of proficiency if they want, for example: Novice: 0–39%, Proficient: 40–79%, Expert: 80–100%. Different user communities can also define their own mission task profiles using a combination of 5 or more test methods. For example, a building search mission could consist of a combination of tests for doors + stairs + hallways + dexterity + visual acuity.
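As a rough illustration of the comparison described above, the sketch below expresses an operator's rate as a percentage of the "expert" rate and maps it to the example thresholds. The function names, the cap at 100%, and the sample numbers are hypothetical and only meant to show the arithmetic.

```python
def proficiency_percent(operator_rate, expert_rate):
    """Operator rate as a percentage of the expert baseline.

    Capping at 100 is an assumption made here so that the expert
    baseline stays the notional 100% of proficiency.
    """
    if expert_rate <= 0:
        raise ValueError("expert_rate must be positive")
    return min(100.0, 100.0 * operator_rate / expert_rate)

def proficiency_level(percent):
    """Map a percentage to the example thresholds given in the text."""
    if percent >= 80:
        return "Expert"
    if percent >= 40:
        return "Proficient"
    return "Novice"

# Hypothetical numbers: both trials are limited to 10 minutes.
expert_rate = 10 / 10.0    # expert completes 10 repetitions in 10 minutes
operator_rate = 6 / 10.0   # trainee completes 6 repetitions in 10 minutes
pct = proficiency_percent(operator_rate, expert_rate)
print(f"{pct:.0f}% -> {proficiency_level(pct)}")
```

A user community that sets different thresholds would only need to change the two cut-off values in `proficiency_level`.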
To prepare for scenario training, different user communities can combine mid-range obstacles with end zone tasks to vary complexity, like dealing with a curb (shown colored blue) while inspecting the inside of a vehicle cab. Or multiple lanes can be performed sequentially to simulate a mission profile, like the series of building access tests with suspicious package props to deal with along the way. B) Leveraging sports analogies, these test methods can encourage comprehensive training outside comfort zones, leading to more effective scenario training, and eventual missions.
Complete circuit training sessions can be conducted in 3-4 hours, alone or in groups. The strictly timed trials facilitate synchronized rotations of operators from test to test. Individual trainees can simply drive robots from test to test. Squads training together can operate all their robots in concurrent lanes and simply rotate operators as the time limits expire.
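The synchronized rotation described above can be sketched as a simple round-robin schedule. This is an illustrative scheme only, assuming one operator per station and a fixed trial length. The function name, station labels, and operator names are hypothetical.

```python
def rotation_schedule(operators, stations, trial_minutes=10):
    """Yield (start_minute, {station: operator}) for a round-robin circuit.

    Assumes one operator per station and that every operator shifts one
    station forward when each time-limited trial expires (an illustrative
    scheme, not a prescribed procedure).
    """
    if len(operators) > len(stations):
        raise ValueError("more operators than stations")
    for round_index in range(len(stations)):
        assignment = {
            stations[(i + round_index) % len(stations)]: operator
            for i, operator in enumerate(operators)
        }
        yield round_index * trial_minutes, assignment

stations = [f"Lane {n}" for n in range(1, 8)]   # the 7 training lanes
operators = ["Op A", "Op B", "Op C"]            # hypothetical squad
for start, assignment in rotation_schedule(operators, stations):
    print(f"t+{start:3d} min:", assignment)
```

After as many rounds as there are stations, every operator has visited every station exactly once.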
These test methods are not intended to replace regular scenario training. Rather, they provide a rigorous method to prepare for and enhance scenario training. Each test method is distinct, and can be practiced and tested individually. The repeatable tasks include controlled and increasing complexity to refine techniques and measure progress. Many of the standard task props can even be embedded into training scenarios to provide quantitative measures of capabilities that augment the qualitative assessments typically associated with scenarios. Some examples include placing visual acuity targets (near-field and far-field versions) on 10 or more objects of interest throughout the scenario, in open areas and in confined, dark places. For example, several multi-faceted visual inspection props can be magnetically attached under actual vehicles to measure the completeness of underbody inspections with actual levels of visual acuity, so operators know what they can and cannot see. Manipulator dexterity props for Inspect, Aim Disruptor, Touch, Rotate, and Extract can be placed almost anywhere inside buildings or vehicles to provide repeatable, score-able tasks within even relatively uncontrolled and complex scenarios. Tasks on these standard apparatuses can be performed right next to more operationally-significant objects to help identify issues when unsuccessful.
Summary of rules for standardized training:
Two of these regional training facilities remain as year-round venues. They join a growing roster of robot standard test facilities around the world that are interested in supporting regional responder organizations.
Collaborating facilities helping to generate, validate, and standardize a growing suite of test methods:
Counter-Improvised Explosive Device (C-IED) Applications
Counter-Improvised Explosive Device (C-IED) Applications Assembly Guide (to receive a pointer to the guide, please e-mail RobotTestMethods[at]nist[dot]gov)
If you are having problems downloading a document, send an email to RobotTestMethods[at]nist[dot]gov with your name and organization. You will receive a pointer to download the document.
Visit http://nist.gov/el/isd/ms/robottestmethods.cfm
The list of bomb technicians who have diligently leaned in to help develop these test methods for C-IED applications is too long to enumerate. They all deserve thanks. But the list of those who actually helped set up and administer these regional test facilities is rather short, which means they expended an extraordinary amount of effort and were extremely effective. Thanks go to: