Lean Sensing - A Critical Enabler For Autonomous Vehicles - Jonathan Cartu Internet, Mobile & Application Software Corporation

Lean Sensing – A Critical Enabler For Autonomous Vehicles

Autonomous vehicles (AVs) are progressing at a rapid pace, notwithstanding COVID-19 constraints. The past two months have seen significant corporate events. These include Amazon’s acquisition of Zoox, Volkswagen’s investments in Argo, Yandex’s plans to spin out its AV joint venture with Uber, and LiDAR unicorns (LUnicorns) Velodyne and Luminar announcing plans to go public at multi-billion-dollar valuations (more than Zoox!).

On the deployment front, the pandemic has provided an opportunity for China-based AV companies to aggressively deploy ride-hailing, whereas trucking automation is making significant strides in the US (Waymo, Daimler, Ike, TuSimple, Aurora). Elon Musk has yet again announced that Tesla will have basic functionality for Level 5 AVs (no humans required in the car) by the end of 2020. Topping all this, General Motors is reorganizing its vaunted Corvette engineering team to support EVs (electric vehicles) and AVs. The focus is primarily on EVs, but the news article mentions plans to upgrade GM Cruise’s AV platform (Origin) to Corvette-like handling and comfort levels. Who says AVs are destined to become a boring utility?

Replacing a human driver requires systems based on Artificial Intelligence (AI). Continued innovation and testing of these systems have driven the need for richer sensor data, either through the use of many sensors per AV or through higher sensor capabilities – range, accuracy, speed, FoV (Field of View), resolution and data rates. Paradoxically, the increased sophistication of sensors raises barriers to deployment – higher sensor and compute costs, increased power consumption and thermal issues, reliability and durability concerns, longer time to decision (latency), and possibly more confusion and errors. It also increases requirements for data transmission bandwidth, memory and computing capabilities (all driving up power, heat, and $$$s).

Multiple directions can be pursued to thin down the sensor stack and focus on what matters in a driving environment (lean sensing). The rest of the article covers four approaches to realizing this: learning-based sensor design, event-based sensing, Region of Interest (ROI) scanning, and semantic sensing.

1. Optimizing Sensor Design Through AI-Based Learning

Using complex and sophisticated sensor “instruments” and compute fabrics is fine during the development phase, as the AI trains itself (learning) to replace the human driver. Successful machine learning should be able to identify the features that are important in the deployment phase. Analysis of neuron behavior in DNNs (Deep Neural Networks) can reveal which aspects of the sensor data are important and which are superfluous (similar to DNN neurons processing 2D vision information). This in turn can help thin down sensor and compute specifications for deployment. One of the goals of machine learning during the AV development phase should be to specify sensor suites that provide actionable data at the right time with the optimal level of complexity – to enable timely and efficient driving decisions.
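One simple way to probe which sensor inputs a trained model actually relies on is ablation: blank out one input at a time and measure how much the model output moves. The sketch below is only an illustration of that idea, not the analysis any AV developer is known to use. The `toy_model` function, its weights, and the feature names are hypothetical stand-ins for a trained network and its sensor inputs.

```python
from typing import Callable, Dict, List, Sequence


def ablation_importance(
    model: Callable[[Sequence[float]], float],
    samples: List[List[float]],
    feature_names: List[str],
    baseline: float = 0.0,
) -> Dict[str, float]:
    """Mean absolute change in model output when one input is replaced by `baseline`."""
    scores: Dict[str, float] = {}
    for i, name in enumerate(feature_names):
        total = 0.0
        for sample in samples:
            ablated = list(sample)
            ablated[i] = baseline  # blank out a single sensor input
            total += abs(model(sample) - model(ablated))
        scores[name] = total / len(samples)
    return scores


def toy_model(x: Sequence[float]) -> float:
    # Hypothetical stand-in for a trained network: the third input is ignored.
    return 0.8 * x[0] + 0.2 * x[1] + 0.0 * x[2]


# Hypothetical feature names and sample values, for illustration only.
names = ["lidar_range", "camera_contrast", "radar_doppler"]
samples = [[1.0, 1.0, 1.0], [2.0, 0.5, 3.0]]
scores = ablation_importance(toy_model, samples, names)
```

An input whose score stays near zero across representative driving data is a candidate for removal or for a cheaper sensor specification. A real study would need far more data and would have to account for interactions between inputs.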

A previous article argued that AV players (like Waymo, Uber, Aurora, Cruise, Argo, Yandex) chose to control and own LiDAR sensor technology to ensure tighter coupling with the AI software stack. This coupling can also help them understand which LiDAR performance features are critical for deployment. Working with multiple sensor modalities helps identify individual sensor features that are critical in different driving situations, eliminates duplicate and redundant information, and reduces unneeded sensor complexity. Tesla’s anti-LiDAR stance and Elon Musk’s “Lidar is a crutch” comment are an extreme case – where presumably the machine learning based on radar and camera data from over 0.5 million cars deployed in the field has convinced Tesla that LiDAR is not required in AVs.

Human drivers sense a tremendous amount of information through different modalities – visual, audio, smell, haptic, etc. An inexperienced driver absorbs all this data, initially assuming that all of it is relevant. With practice and training, expert drivers can filter out the irrelevant and focus on the relevant information, both in time and space. This enables them to react quickly in the short term (braking for a sudden obstacle on the road or safely navigating out of traffic in the event of vehicle malfunction) and in the longer term (changing lanes to avoid a slower-moving vehicle). Machines trying to simulate human intelligence should be able to follow a similar model – initially acquire large amounts of sensor data and train on it, but become more discriminating once the training achieves a certain level. Learning should allow a computer to select, sense, and act on the relevant data to ensure timely and efficient decision making.

Arriving at an optimally lean or thin sensor stack design is a function of AI and machine learning. Assuming this is done, the sensor system needs to decide what data to collect (event-based, ROI-based) and how to process this data (semantic sensing).

2. Event-Based Sensing

Event-based sensing has long existed in the military domain, where one sensor (say, a radar) detects an incoming threat and cues another sensor (a camera or LiDAR) to pay more attention and devote more resources to that region (to recognize whether it is a friend or foe, for example). However, other techniques rely solely on the individual sensor itself to identify the event.
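The cueing idea can be sketched as a mapping from a radar bearing to the band of camera pixel columns that deserves closer attention. This is a minimal illustration under stated assumptions: the `RadarDetection` type, the image width, the horizontal field of view, and the angular margin are all hypothetical, and the linear angle-to-pixel mapping ignores lens geometry that a real system would calibrate.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RadarDetection:
    # Hypothetical detection record: bearing of the target in degrees,
    # 0 = straight ahead, positive = to the right.
    azimuth_deg: float


def camera_roi_columns(
    det: RadarDetection,
    image_width: int = 1920,   # assumed camera resolution
    hfov_deg: float = 60.0,    # assumed horizontal field of view
    margin_deg: float = 5.0,   # assumed angular padding around the cue
) -> Tuple[int, int]:
    """Return (left, right) pixel columns covering the cued bearing."""
    half = hfov_deg / 2.0

    def to_column(angle_deg: float) -> int:
        # Clamp to the camera's field of view, then map linearly to columns.
        clamped = max(-half, min(half, angle_deg))
        return int(round((clamped + half) / hfov_deg * (image_width - 1)))

    return (
        to_column(det.azimuth_deg - margin_deg),
        to_column(det.azimuth_deg + margin_deg),
    )


# A target dead ahead cues a band of columns around the image centre.
left, right = camera_roi_columns(RadarDetection(azimuth_deg=0.0))
```

The downstream camera pipeline would then spend its processing budget on columns `left` through `right` rather than on the full frame.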

Prophesee (“predicting and seeing where the action is”) is a French company that specializes in developing event-based cameras. Its thrust is to emulate human or neuromorphic vision, where the receptors in the retina react to dynamic information and the brain focuses on processing changes in the scene (especially for dynamic tasks like driving).

The basic idea is to use camera and pixel architectures that detect changes in light intensity over a threshold (an event), and provide only this data to the compute stack for further processing. Relative…
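The thresholding idea described above can be sketched in software. This is a simplified illustration, not Prophesee's pixel design: real event cameras do this asynchronously in hardware, per pixel. The function name, the event tuple layout, and the threshold value are assumptions made for the example.

```python
from typing import List, Tuple

# An event is (x, y, polarity): +1 if the pixel got brighter, -1 if darker.
Event = Tuple[int, int, int]


def detect_events(
    reference: List[List[float]],
    frame: List[List[float]],
    threshold: float = 10.0,  # assumed intensity-change threshold
) -> List[Event]:
    """Emit events only for pixels whose intensity moved past the threshold.

    `reference` stores each pixel's intensity at its last event and is
    updated in place, so unchanged pixels produce no output at all.
    """
    events: List[Event] = []
    for y, row in enumerate(frame):
        for x, intensity in enumerate(row):
            delta = intensity - reference[y][x]
            if abs(delta) > threshold:
                events.append((x, y, 1 if delta > 0 else -1))
                reference[y][x] = intensity
    return events


# Two of the four pixels change enough to fire; the rest stay silent.
reference = [[100.0, 100.0], [100.0, 100.0]]
frame = [[100.0, 130.0], [95.0, 60.0]]
events = detect_events(reference, frame)
```

A static scene produces an empty event list, which is where the savings in bandwidth and compute come from.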

