Due Diligence Recommendations for the Mobile, Autonomous and Driverless Industry
Program Engineering Manager-Customer Success - Systems Engineer
Autopilot Issues and Weaponizing of Vehicles
Autonomous/Driverless Systems
Several companies are currently creating driverless or autopilot vehicles, and they are using their customers, the public around them and, in some cases, professional drivers to gather the data they need and to test those vehicles. This makes those people beta testers or guinea pigs. It puts them at risk for no reason, and it wastes time. There is a better way. The companies doing this include Tesla and comma.ai. Waymo uses professional drivers. (Ford is stopping that practice because its professional drivers were falling asleep.)
Tesla
Tesla has convinced NHTSA, the insurance companies and others that its autonomous vehicles, though nowhere near complete, save lives by lowering human-caused front-end collisions by 40%. Tesla uses that statistic to justify using the public as data-gathering and test subjects. While the statistic may be true (though not of the recently degraded "drunk" software version I explain below), it is misleading. The problem is that these systems are nowhere near able to handle most of the other, less common accident scenarios, and Tesla actually needs its customers and the public to experience them in order to get the data needed to handle those situations properly later. Said differently, Tesla needs people to experience near-accidents or accidents, to be hurt or worse, so it can get the data it needs to improve the design. As I explain in more detail below, my issue is that they should be doing this another way, one that does not put people at risk, a way that Commercial IT does not understand well but that DoD, NASA, Boeing and the like do. Said differently: if I am curing a widespread disease like cancer but avoidably give people less common diseases that can kill them, am I doing my due diligence?
https://www.wired.com/2017/01/probing-teslas-deadly-crash-feds-say-yay-self-driving/
Tesla has stated that it needs its customers to drive 6 billion miles to gather all the data needed, and that only a fraction of those miles have been driven. That many miles are needed because the required data involves non-standard or exception scenarios like accidents or near-accidents; in order to stumble on all of the various scenarios, Tesla believes 6 billion miles must be driven. (Notice in this article that Tesla essentially says the lost lives are unavoidable and help them save more lives as a result.) I wonder how regression testing is handled when an update is made. Drive a couple billion miles again until you stumble on the test cases again?
In the most recent update Tesla's AP regressed so badly that owners classified it as "driving drunk". The system is still in use, was not recalled or replaced by Tesla, and is a significant regression and risk to the public. The car can barely stay in the lines on an average road, in daylight and in good weather. Why can't that be engineered on a test track, or with simulation or a manned simulator, before going into the public? How is it a good PR move to let people see that the system can't handle rudimentary driving? Why would any company or NHTSA let those cars on the road or allow the public to be beta testers? To make matters worse, an inside report was recently leaked from Uber showing that, in spite of years of "AI", "machine learning" and "deep learning", their vehicles can't average a mile of autonomous driving without disengaging. (Tesla was no better.) This is why an industry-wide Scenario Matrix is needed.
Why can't these folks use simulation and test tracks to get the most basic scenarios down before putting the public at risk? Data from traffic engineers, researchers, the insurance companies and the auto companies would give you the majority of the information you need to create the base Scenario Matrix. Using systems engineers and experts you could then add huge amounts of variation to those scenarios. Once you have that, you can design and test to it. The great majority of real-life scenarios would be covered by this process. While this is going on you can continue to gather data from drivers not in AP.
Lastly, most of these companies follow normal Commercial IT engineering and project management practices, which are traditionally nowhere near best practices. Most have no idea what CMMI is, what systems engineering best practices are, or how to design, build or test a system nearly as complicated as what is required here, especially regarding exception handling, negative testing, sensor integration or simulation.
Key Best Practices Not Being Used
Sensor Systems and Integration
These vehicles are not using a broad enough array of sensors, and in many cases they rely on only LIDAR or cameras. (Tesla is using one camera at this time.) That is extremely unwise, since every sensor has weaknesses. Aircraft manufacturers use multiple sensors as well as probability and priority filters to ensure the right data is being used at all times. That includes FLIR, several types of radar, GPS, cameras and inertial navigation. (Automobiles need to add sound detection to that list.) This is what needs to be done in vehicles. Sensors can provide incorrect data, and they can contradict each other. An example may be changed road patterns that contradict a map, or signs that for whatever reason cannot be read correctly. Sensors can also be broken or degraded by bad weather. (Many of the beta-test cars out there now can't handle driving in simple scenarios, or with minimal exception handling, without disengaging every mile or so. They aren't even scratching the surface yet.) None of this can be allowed to result in the vehicle doing the wrong thing. It is imperative to not just double-verify but triple-verify or more in many cases.
(Why am I not hearing about the use of inertial navigation?)
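To make the sensor-fusion point concrete, here is a minimal, purely illustrative sketch of the kind of cross-checking, priority and confidence filtering described above, with V2X receipts treated as just another sensor input. Every sensor name, weight and threshold is a hypothetical placeholder, not a real implementation:

```python
# Minimal illustrative sketch of multi-sensor cross-checking with
# confidence weights and a priority filter. All names, weights and
# thresholds are hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class Reading:
    sensor: str        # e.g. "lidar", "radar", "camera", "flir", "v2x"
    obstacle_ahead: bool
    confidence: float  # 0.0-1.0, degraded by weather, data age, health

# Hypothetical trust priorities per sensor type for this question.
PRIORITY = {"radar": 3, "lidar": 3, "flir": 2, "camera": 1, "v2x": 1}

def fuse_obstacle_votes(readings, quorum=2):
    """Triple-verify: require at least `quorum` confident sensor types
    to agree before trusting a 'no obstacle' answer; any high-priority
    sensor reporting an obstacle wins (fail safe, not fail silent)."""
    yes = [r for r in readings if r.obstacle_ahead and r.confidence > 0.5]
    no = [r for r in readings if not r.obstacle_ahead and r.confidence > 0.5]
    # Any confident high-priority detection is treated as real.
    if any(PRIORITY.get(r.sensor, 0) >= 3 for r in yes):
        return "BRAKE"
    # Contradictory or thin data: degrade gracefully instead of guessing.
    if len(no) < quorum or (yes and no):
        return "SLOW_AND_ALERT"
    return "PROCEED"

print(fuse_obstacle_votes([
    Reading("camera", False, 0.9),   # sun glare could fool this one
    Reading("radar", True, 0.8),     # radar contradicts the camera
    Reading("lidar", False, 0.3),    # degraded by rain, below threshold
]))  # -> BRAKE: the confident radar detection wins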
Exception Handling
Exception handling is where the system does something unplanned or unexpected. Accidents would be exception cases. NASA, DoD and the airline industry spend more time identifying these cases, designing in responses and testing them than they spend on the normal or expected path. Commercial IT, on the other hand, rarely identifies these cases, let alone handles them. Its processes don't support most of what is needed to find them, let alone ensure proper designs are implemented and tested. While many Commercial IT products don't require as much rigor as an aircraft, weapon system or spacecraft, driverless vehicles surely do.
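To illustrate what negative testing of an exception path looks like in practice, here is a minimal sketch. The function name and the raw-to-metres scaling are hypothetical; the point is that invalid input must raise an error rather than produce a guess, and that a test must deliberately exercise that path:

```python
# Illustrative negative test: deliberately feed the lane-detection
# function garbage and verify it refuses rather than guesses.
# Function and scaling are hypothetical.
import math

def lane_offset_m(sensor_value):
    """Return lateral offset, or raise rather than return a guess."""
    if sensor_value is None or math.isnan(sensor_value):
        raise ValueError("lane sensor returned invalid data")
    return sensor_value * 0.01  # hypothetical raw-to-metres scaling

def test_rejects_bad_input():
    for bad in (None, float("nan")):
        try:
            lane_offset_m(bad)
        except ValueError:
            continue  # correct: the exception path is exercised
        raise AssertionError(f"accepted invalid input {bad!r}")

test_rejects_bad_input()
print("exception paths verified")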
Other Key Areas
- Using text-based scope docs that do not build into a full system view. Use Cases and Stories are extremely poor ways to elicit and explain scope, especially exception handling. What is needed are diagrams. Diagrams facilitate visual flow, where exception handling points can be seen. This step is the most important: if you cannot see all of the combinations you cannot design or test for them. Skipping this one step alone will cripple these companies. They will have zero visibility into the entire system. They will get lost, make design and coding mistakes, break things that used to work and be unable to test complete threads. All they will literally have is a massive stack of text.
- Most Commercial IT companies have many products and separate teams. They rarely perform mass system integrations, and there is very little system design being accomplished, especially at this size or complexity. There is also very little object-oriented or UML design going on. This is caused by how many people choose to practice Agile: they purposefully ignore what they could know up front and rely on Use Cases and Stories, not diagrams, from that point forward. Most of Commercial IT's design process is not based on a full systems design approach. They build one step at a time, purposefully ignoring the whole system.
- They lack proper tools that facilitate scope decomposition through design, code and testing, something like DOORS. Commercial IT rarely has separate tools, let alone an integrated one. Most won't even use a proper Requirements Traceability Verification Matrix (RTVM) in Excel (see the RTVM sketch after this list). This will result in missing and incomplete scope, design and testing. Where this would show up most is in their inability to create, design to and test the massive Scenario Matrix that is needed to develop autonomous vehicles. They simply cannot handle all the variations.
- They rarely have chief architects that look across the whole system.
- Full system testing is rarely done. Especially when there are third party interfaces. Simulators are rarely built to replace those systems if they are not connected in the test environment. Exception handling or negative testing is rarely done.
- There are rarely any coding standards. Especially built from in depth testing and exception handling. Examples - http://caxapa.ru/thumbs/468328/misra-c-2004.pdf, http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf, http://www.stroustrup.com/JSF-AV-rules.pdf
- Commercial IT rarely creates a product-wide integrated software configuration management system. They have dozens or even hundreds of little teams, each with its own CM. This will result in the wrong software versions being used, which will lead to defects. It will also lead to laying patches on top of patches, which will result in defects.
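As a concrete illustration of the RTVM idea referenced above, here is a minimal sketch. The requirement and artifact IDs are invented; real programs hold this in a tool like DOORS, but even a structure this simple exposes traceability gaps:

```python
# A bare-bones Requirements Traceability Verification Matrix (RTVM)
# sketch. All IDs are hypothetical. The point: every requirement must
# trace forward to design, code and a verifying test, or it is a gap.
requirements = {
    "REQ-001": {"design": "SDD-4.2", "code": "lane_keep.c", "test": "TC-101"},
    "REQ-002": {"design": "SDD-5.1", "code": "sensor_fuse.c", "test": None},
    "REQ-003": {"design": None, "code": None, "test": None},
}

def untraced(rtvm):
    """Report every requirement with a gap anywhere in the trace chain."""
    return {req: [k for k, v in links.items() if v is None]
            for req, links in rtvm.items()
            if any(v is None for v in links.values())}

print(untraced(requirements))
# -> {'REQ-002': ['test'], 'REQ-003': ['design', 'code', 'test']}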
Miles Driven and Disengagement
Miles driven is virtually meaningless, as is data on disengagements. Both likely lead to false confidence. Exactly what scenarios were experienced in this driving? Most of it is repeated. I can do far more with 50 properly planned miles than with a million miles driven by drivers stumbling on scenarios. (Tesla says it needs 6 BILLION miles driven to get the data it needs, much of it accident data.) And since the data I really need is exception handling or near/actual accident data, I should be using simulation and simulators, not overly trusting and unwitting human guinea pigs. What do you do when software needs to be changed and there is a big regression test impact? Drive those 6 billion miles again? If the answer is simulation, then you could have gotten the data that way in the first place.
https://www.driverless.id/news/2016-disengagement-reports-show-waymo-absolutely-crushing-competition-every-single-metric-0176110/
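A minimal sketch of why simulation solves the regression problem raised above: a scenario captured as seeded data can be replayed identically against every new software build, which billions of road miles can never guarantee. All names and parameters here are illustrative:

```python
# Sketch of simulation-based regression testing: the same scenario is
# regenerated bit-for-bit from its seed against every candidate build.
# Parameters and names are hypothetical.
import random

def make_scenario(seed):
    """Deterministically generate one scenario's parameters from a seed,
    so the exact same situation can be re-run after every code change."""
    rng = random.Random(seed)
    return {
        "weather": rng.choice(["clear", "rain", "fog", "snow"]),
        "time_of_day": rng.choice(["day", "dusk", "night"]),
        "obstacle_speed_mph": rng.uniform(0, 45),
        "sensor_degraded": rng.random() < 0.2,
    }

def regression_suite(build_under_test, seeds):
    """Replay the identical scenario set against a new build; a road-test
    program cannot guarantee it will ever encounter these cases again."""
    return {s: build_under_test(make_scenario(s)) for s in seeds}

# Usage: rerun the same three scenarios against every candidate build.
results = regression_suite(lambda scenario: "PASS", seeds=[1, 2, 3])
print(results)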
NHTSA has not done their due diligence
NHTSA has allowed Tesla, and others, to determine what the new best practices are. The problem with that is that most of these engineers come from Commercial IT, where they rarely experience engineering on a scale and complexity anywhere close to this. Their processes literally don't support doing it, especially regarding exception handling. They are in way over their heads, and they cannot tell, because no one around them comes from places that have this experience and the proper tools, like NASA, DoD, Boeing etc.
NHTSA is allowing industry to determine what the Scenario Matrices look like, and there is no effort to create a single, minimum acceptable set for design and testing. This will result in massive differences, gaps and confusion. Cars will work differently from brand to brand. This is a mistake.
NHTSA determined that Joshua Brown should have been paying better attention when he was killed in an accident while using his Tesla autopilot. Tesla admitted the system did not have radar integrated well and the camera system mistook the trailer for the sky because the sun was shining on it. I believe this is another clear example of these cars not being ready to be on the road in autopilot, of why the public should not be guinea pigs, and of how the average person thinks "Autopilot" means the car can drive itself.
Regarding the term "autopilot": using the term "autopilot" rather than terms like "driver assist", well before the vehicles have fully functional driverless systems, is misleading, confusing, reckless and unnecessary. NHTSA stipulated that since Tesla states in its fine print that the system is not actually an autopilot and the user should keep their hands on or near the wheel, there is no issue. I contend that the term is misleading and that Tesla, through its own actions, like videos and press releases, has sent mixed signals to its users and the public, thereby creating a significant level of false confidence. The German government made the exact same points, and other watchdog groups, like Consumer Watchdog, have had issues with the process. (I believe the reason Tesla misleads people is so its customers and the public will be comfortable being its beta testers.)
Video of Elon Musk using his system in ways he told others not to do - https://www.youtube.com/watch?v=gDv9TEXtHzw&list=FLcDGGGtllzLmeV_UCebqHUw&index=8
Remote Control – Weaponized Vehicles
Many of these companies have already released, or are releasing, either remote-control versions of their vehicles or the source code so the system can be modified. Not only does this put these neophytes, and the public around them, at risk of accidents, these vehicles can be weaponized. The worst offender here is comma.ai. They not only use their customers as guinea pigs, they released the software source code to them for free. This allows users to modify the code and change the way the vehicles perform.
Also, the vast majority of companies and government organizations can be easily hacked due to poor cyber-security practices, especially around Privileged Account Security. This means the source code for remote control cannot be deemed safe. Given this, the potential for harm far outweighs the good. As such, I believe remote control should not be an option under any circumstance.
More on this here - https://www.linkedin.com/pulse/privileged-account-security-massive-hole-most-why-isnt-michael-dekort
Recommendations
Autonomous Vehicles
The term "autopilot" should not be used until vehicles meet Level 5 criteria. That should include demonstrating that the vehicle can pass an industry-wide Scenario Matrix: the minimum set of normal driving and exception or accident situations the vehicle should be able to handle properly. This would include variables like vehicle type and performance, weather, terrain, moving and stationary obstacles, sensor degradation or conflicts, driver-induced mistakes, driver takeover, time of day, signage, road changes/condition, handling of external data sources etc.
These systems should not be released to the public, including professional drivers, until a minimum set of scenarios has been tested. (Ford stopped the professional-driver practice because the drivers were falling asleep.) Simulation and simulators, in combination with inputs from the participants below, should be used for primary data gathering and testing.
Suggested Method for Data Capture, Design and Testing
Scenario Matrix
- The key to all of this is creating a complete and accurate industry-wide Scenario Matrix which would be used for design and testing. (That includes regression and repeat testing)
- That matrix should include any situation that a user of the system could reasonably be expected to experience, as well as as many variations of those scenarios as possible
- Design and Testing should include the boundaries of those combinations
- This matrix will help inform where AI, Machine learning is needed to help fill in the gaps
- Minimizes repeat data found by those gathering data via driving
- Helps avoid wasting time on repeat scenarios or missing intricacy of existing scenarios
- Provides a checklist to help ensure macro and even some micro scenarios are not missed or incomplete
- This will also ensure drivers who switch between brands of vehicle do not have to worry about changes in which scenarios are covered or how they are covered. What happens if someone gets into a different brand and a scenario that was covered in their former vehicle is not? Or if the scenario is handled differently? That driver could take control, or fail to take control, at the wrong time.
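As a rough illustration of what an object-oriented, enumerable Scenario Matrix could look like, here is a sketch. The dimensions and values below are hypothetical samples; a real matrix would be vastly larger and would also carry the variations within each combination:

```python
# Hypothetical sketch of a Scenario Matrix as an object-oriented,
# enumerable structure: define the dimensions once, then generate,
# tag and track every combination so none can be silently skipped.
from itertools import product
from dataclasses import dataclass

@dataclass(frozen=True)
class Scenario:
    weather: str
    time_of_day: str
    road: str
    sensor_state: str
    driver_action: str

DIMENSIONS = {
    "weather": ["clear", "rain", "fog", "snow", "glare"],
    "time_of_day": ["day", "dusk", "night"],
    "road": ["highway", "urban", "construction", "unmarked"],
    "sensor_state": ["nominal", "degraded", "conflicting"],
    "driver_action": ["none", "late_takeover", "erroneous_input"],
}

def scenario_matrix():
    """Enumerate every combination of the defined dimensions."""
    for combo in product(*DIMENSIONS.values()):
        yield Scenario(*combo)

matrix = list(scenario_matrix())
print(len(matrix))  # 5*3*4*3*3 = 540 macro scenarios before variations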
Data Sources
- Note on AI: AI for every aspect OTHER than that which needlessly puts people at risk in "autopilot" systems that are not reliable yet is obviously encouraged. For mapping etc. there is really no other way. (Mapping, for example, would have to be constant, something that will have to be crowd-sourced.)
- Drivers not in autonomous mode
- Automobile Companies
- Insurance Companies
- Researchers – to include Social Engineering – Expectations of other human drivers
- Traffic Engineering
- Government Agencies – NHTSA etc.
- Product Team Exception Handling Inputs – Folks trying to break the system
- Vehicle Performance, Configuration Changes and Issues
- Weather
- Road and Terrain – Time of Day - Changes especially for temporary work
- Signage – To include human gestures
- Sensors – System Device Capabilities - Handling of Conflicts and Missing or Flawed Data - Priority and Probability Filters - LIDAR, Radar, FLIR, GPS, Cameras, Sound etc. Every V2X receipt will have to be treated as a separate sensor input.
- Moving and Stationary Objects
- External Data Sources – Other Vehicles, Objects and Systems - V2X
- Route deviations based on interior and exterior changes – Include handling of latency
- User Error – When to ignore, correct or notify
- System Wide Conflicts, Missing Data or Errors
Use of Data, Design and Testing Approach
- Create the Scenario Matrix in an object-oriented software system that represents the combination of all the various data and system types. Once the data areas and exception boundaries are known, the various combinations of them can be created, tuned, changed and tested.
- "Business" Rules - The variations of rules the vehicles need to use is massive. Far, far more than in most Commercial IT systems. Those folks don't "what if" much. Unlike in NASA, DoD or Boeing for example. Virtually every rule will have to be broken in certain situations. And those themselves need rules.
- Utilize unmanned and manned simulations/simulators to run through the various scenarios, as well as variations of those scenarios. How do you do this driving around? How do you repeat scenarios, or regression test? Drive around billions of miles over and over until you stumble on them? That's simply lazy and reckless systems engineering.
- Utilize real-world testing from test tracks and controlled public driving to verify the simulations. The key is to not go into the public domain until the rudimentary scenarios are proved via test tracks, simulation and manned simulators; and when public driving is done, do it in a controlled environment first. So far most of the vehicles out there can't stay within the lines on the road. They need to get the basics right before they involve the public. (Most of these companies are saying they have to use the public domain and human beta testers to gather data, design and test. They are wrong, wrong to the point of being reckless.) Tesla: why wouldn't all of the sensors be integrated in simulations and on test tracks for basic operations before the cars go into the public domain? Imagine if NASA, DoD or Boeing did things like this.
- You cannot use Agile for a project like this. Bottom-up will not work. If people use Agile they will be constantly tripping over what they have already done, constantly breaking things that used to work, missing huge pieces only to tear things back apart later. They will miss a lot in regression testing. This whole thing is far, far too complicated for Commercial IT's engineering practices, tools and most of its engineers to handle. Using Stories or Use Cases alone, and not diagrams and a Scenario Matrix, will hold them back for years if not doom them overall. This is not a bottom-up Agile exercise. It is a massive top-down systems engineering effort built around a Scenario Matrix and object-oriented design. It's about defining what all the objects or variables are, then filling in the types, ranges and combinations. What is needed is an Agile-Waterfall hybrid with the use of actual best engineering practices. https://www.linkedin.com/pulse/software-development-one-best-approach-michael-dekort
- V2X - Every receipt of information a vehicle gets has to be treated like a separate sensor unless there is a regional system that transmits truth to every vehicle. And can do so in actual real-time.
- Real-time - There are very few systems in Commercial IT that operate in actual real-time. That term is usually defined by the user community and its most demanding scenario(s); most often it is determined by whether someone thinks something was fast enough. Data retransmissions occur all the time and have no impact. Folks from that industry do not have an understanding of the critical system timing that mass driverless vehicles and V2X systems will need. They think that since networks, CPUs and memory are all fast and have tons of capacity, all is well. (That includes folks who make OSes and games. Folks who make networked games surely have some insight, as do the VR folks; however, I would bet that a primary reason for people getting sick in VR is system lag and timing.) With driverless vehicles and V2X, the entire system has to be designed to accommodate the most dependent action of any one vehicle, and then a thread of many vehicles having the same need. If every data need were plotted on a sequence diagram you would see certain things happen at certain rates and in a certain order, possibly hundreds if not thousands of times per second. If a frame or window is missed something bad can happen, and with many vehicles all dependent on each other's actions there could easily be a catastrophic domino effect. Networked aircraft simulators, especially those that fly at high rates of speed in formation or even air refueling, have to deal with this. That industry has been using global memory and system architectures that most folks in Commercial IT are unaware of. The entire system, with V2X, is based on asynchronous data exchanges. In order for that system to work and meet very demanding timing needs that are synchronous, you have to transmit truth fast enough and often enough that the receiving systems see exactly what they need, exactly when they need it. While they can often dead reckon, there are plenty of times that will be a mistake. To avoid this, every vehicle will have to look at all data sources and sensors, treat V2X transmissions as sensor inputs, calculate what is truth from a wide array of inputs, and then take the right action, at the right time and in the right order. (For those of you who think satellites are an option, do the math comparing one hop to how far a car goes at 25 mph and 75 mph; see the sketch below. This also needs to include electro-mechanical delay in steering and braking.)
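The satellite arithmetic mentioned at the end of the item above, sketched out. The geostationary altitude and speed-of-light figures are standard physical constants; everything else is back-of-envelope and ignores processing and queuing delays, which only make it worse:

```python
# Back-of-envelope check of the satellite-latency point. A GEO
# satellite sits ~35,786 km up, so one up-and-down hop is ~0.24 s
# at the speed of light before any processing delay is added.
C = 299_792_458            # speed of light, m/s
GEO_ALT_M = 35_786_000     # geostationary altitude, m
one_hop_s = 2 * GEO_ALT_M / C          # up and back down: ~0.239 s

for mph in (25, 75):
    m_per_s = mph * 0.44704            # mph to metres per second
    print(f"{mph} mph: car travels {m_per_s * one_hop_s:.1f} m "
          f"during one {one_hop_s * 1000:.0f} ms GEO hop")
# 25 mph -> ~2.7 m; 75 mph -> ~8.0 m, before electro-mechanical
# steering/braking delay is even added.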
Remote Control
There is significant risk of these vehicles being weaponized. As such they should not be allowed to be remote controlled, at least not until the system is proven foolproof and only the right organizations can control them. This should include those organizations proving that they themselves cannot be hacked.
Who will actually be first?
I do not have insight into what every company is doing. If one of them is experienced enough, has patience and the right funding, it could be in the lead on this in 5 years or so. It will come out ahead because the others will have exceeded their actual experience or be locked up in civil or even criminal courts because of wrongful-death lawsuits. You could build a system with a Scenario Matrix so complete that governments and everyone else would have no choice but to defer to you. You could license it, or move forward with the only viable product and watch the other folks trying to play catch-up.
My Background – 15 Years - Systems Engineer, Program Manager and Engineering Manager for Lockheed Martin – Aircraft Simulation, NORAD and the Aegis Weapon System. Commercial IT Project Manager for 11 years. Post 9/11 DoD/DHS Whistleblower - IEEE Barus Ethics Award recipient - http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4468728
Update 3-17-2017
New reports on Uber show they are struggling to make improvements; the driver has to take over every MILE. This shows that AI, machine learning, deep learning etc. are only as good as the data they are fed and the plan to get them there. Driving around for billions of miles waiting to stumble on new scenarios, as your primary data gathering method, is not a good plan. Unless you are very lucky you will hit data gathering plateaus. As I have said, these folks are in over their heads. What they actually know can only get them so far, and apparently that isn't even far enough to stay between the lines of most roads in daylight and good weather. Almost nothing they have experience in to this point applies here. Twitter, Uber, PayPal, games etc. are not training grounds for this. These folks are re-plowing fields plowed long ago in other industries, and it looks like they are all finding this out. The problem is that, as a result, they are needlessly risking people's lives and delaying the final product, which will help people.
Update 3-18-2017 - 11 Billion Miles to prove AP is only 20% safer than human drivers
Here is an excellent paper by Nidhi Kalra at RAND. Notice she says autonomous vehicles would have to be driven more than 11 billion miles to demonstrate, with 95% confidence and 80% power, that their failure rate is 20% better than human drivers'. With a fleet of 100 autonomous vehicles being test-driven 24 hours a day, 365 days a year, at an average speed of 25 miles per hour, this would take 518 years, about half a millennium.
This is what we want to expose human beta testers in premature AP systems to?
http://www.rand.org/content/dam/rand/pubs/research_reports/RR1400/RR1478/RAND_RR1478.pdf
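A quick sanity check of the fleet arithmetic quoted above, using exactly the stated assumptions:

```python
# Sanity check on the RAND fleet arithmetic (assumptions as stated:
# 100 vehicles, 24 h/day, 365 days/yr, 25 mph average, 11 billion miles).
miles_needed = 11_000_000_000
fleet, mph = 100, 25
miles_per_year = fleet * mph * 24 * 365   # 21,900,000 miles per year
print(miles_needed / miles_per_year)      # ~502 years: roughly half a
# millennium, consistent with the paper's 518-year figure once its
# unrounded mileage estimate is used.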
Update 3-28-2017
There is an excellent Mobileye video on YouTube. In the video Mobileye makes many of the points I have tried to make here.
https://www.youtube.com/watch?feature=em-subs_digest&v=b_lBL2yhU5A
At 53:45 there is a discussion of how simulation and simulators should be used
At 56:40 they mention that for driverless systems at least 2 different sensor types should be used
At 1:12:00 they mention all the hype around folks driving around billions of miles to stumble on data
Update 4-13-2017
A very interesting article from MIT came out discussing how the folks who are using machine learning to create autonomous cars do not know why it works. The author posits that until they do, that approach cannot be considered safe enough. I agree. More here:
ADAS AI or Machine Learning - A Dark Art? Is it the best option? Is it safe?
Update 4-18-2017
Bloomberg article released today on the need to augment AI with simulation and the reasons for doing so.
https://www.bloomberg.com/news/articles/2017-04-17/don-t-worry-driverless-cars-are-learning-from-grand-theft-auto
The reasons cited include that scenarios encountered on the road cannot be repeated, greatly inhibiting data gathering, engineering, and primary and regression testing. In addition, far too much time is needed to gather all the data required, and using human beta testers, especially in difficult conditions, puts them at risk unnecessarily. This clearly backs up the information I presented earlier from RAND, MIT and Mobileye.
Update 4-20-2017
Tesla owners sue saying AP is dangerous
It's unfortunate that Elon Musk's ego, which drove him to do some amazing things, has now driven him to be a dangerous, unethical charlatan.
Regarding regulations: when NASA first reviewed SpaceX's code, it rejected it for not being properly tested, especially negative testing, and for not doing nearly enough exception handling. That happened because NASA, unlike Commercial IT, actually uses best engineering practices. They were the regulators. In this case NHTSA has punted. They are in over their heads just as much as the Commercial IT folks who are making AP. These folks drank way too much of their own bathwater. They are way too impressed with their skills at making apps, games and websites. Just look at the folks in charge across many of these AP companies. They came from PayPal, Twitter, Swift, travel websites etc. It's a massive echo chamber.
In addition, RAND, MIT and Mobileye have all come out recently and said AI is valuable but way overestimated. The folks who use it do not really know how it works, corner cases are not found, and it would take hundreds of billions of miles of driving to stumble on all the data they need. These engineers are using machine learning to be experts for them; since they have almost no background in this domain or in actual best practices, they have no choice. What really should be happening is AI mixed with proper systems engineering, other existing data sources and simulation to create a Scenario Matrix, a tool that would ensure due diligence is done, everyone is on the same page, and everyone handles scenarios the same way. What happens if a Ford AP owner buys a Tesla? Are all the same scenarios handled? Are they handled the same way? Does a difference entice the driver to do, or not do, something they shouldn't because of an expectation carried over from the previous car?
Just in case the echo chamber of folks who only know what they know from the press, and have no experience in any of this, chimes in that I am against AP: no, I am for it. So much for it that I want to see it happen ASAP. That ONLY happens if it is done the right way, and Tesla is not doing it the right way. It is putting people at risk needlessly, depending far too much on AI and not using actual best systems engineering practices.
Update 4-21-2017
Mercedes will no longer use the term "Autopilot" for a vehicle that is not fully autonomous, believing that using the term before then is misleading and leads to a false sense of confidence.
Interestingly, that was done right on the back of Tesla being sued for an autopilot bait and switch.
Final note - I want to explain the tone of my posts on this subject. It is direct, even critical, because I believe the only way to break through the overwhelming thought pattern is through an intervention. I realize that could turn off the folks I am trying to reach, but as I have tried softer paths, and this topic has life-and-death ramifications, I believe the approach is warranted. It is, unfortunately, the most likely approach to get folks to re-examine what they are doing, and hopefully to change their course. I am more than available to help in any way I can.
https://www.linkedin.com/pulse/due-diligence-recommendations-mobile-autonomous-industry-dekort