Object detection probabilities
- Timofey Uvarov
- Jan 8
Updated: Jan 17
The presentation is about an 8MPX automotive autonomous camera system, with a focus on how to translate human-vision-based image quality metrics into trends of detection probability and detection cost for various objects at different distances, using cameras equipped with 8MPX and 2MPX image sensors. Some architectural aspects of back-end/infra-system design were also addressed in relation to the design of the image pre-processing pipeline.
Image Quality and Detection Trends
The essence of the first part can be summarized in the slide below:

Human Vision Chart
On the left side of the slide above, you can see the human vision chart captured with the same high-resolution lens at a distance of 10 feet with 2MPX and 8MPX sensors and processed with the same software ISP from Onsemi. On the 8MPX chart, we can confidently read 3 more lines than on the 2MPX chart.
Detection probabilities

In the graph above, you can see two solid lines representing the detection probability of a human mannequin at a given distance. The orange line is the vision trend for the 2MPX sensor and the blue line is the trend for the 8MPX sensor.
For instance, if we target a detection probability of 0.5, the ADV will be able to detect the human at 390m with an 8MPX sensor (blue line) and at 320m with a 2MPX sensor; these are the hypothetical maximum detection distances of the tested camera prototypes. With an 8MPX camera, the ADV gains a 70m distance advantage for detecting, predicting, and reacting. Slicing the trends at a 0.75 confidence level, we observe a difference in detection distance of around 60m, and at 1 the two trends converge at around 200m. Thus, below 200m, unless we are targeting gesture or face detection, there is not much benefit in using a computationally expensive 8MPX sensor.
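Slicing a trend at a chosen probability can also be done in code. Below is a minimal Python sketch, assuming the trend is available as a few (distance, probability) samples and that linear interpolation between samples is acceptable. The function name `distance_at_probability` and all sample values are hypothetical placeholders for illustration; they are not the measured curves from the graph.

```python
def distance_at_probability(distances_m, probabilities, target):
    """Slice a sampled detection trend at `target` probability.

    Assumes distances are ascending and probabilities never increase
    with distance. Uses linear interpolation between neighboring samples.
    """
    samples = list(zip(distances_m, probabilities))
    for (d0, p0), (d1, p1) in zip(samples, samples[1:]):
        # Skip flat segments; find the first segment that crosses the target.
        if p0 != p1 and p0 >= target >= p1:
            return d0 + (p0 - target) * (d1 - d0) / (p0 - p1)
    raise ValueError("trend never crosses the target probability")


# Placeholder samples for illustration only (NOT the measured data).
distances = [100, 200, 300, 400]
trend_low_res = [1.0, 1.0, 0.5, 0.1]
trend_high_res = [1.0, 1.0, 0.7, 0.3]

for target in (0.5, 0.75):
    low = distance_at_probability(distances, trend_low_res, target)
    high = distance_at_probability(distances, trend_high_res, target)
    print(f"p={target}: {low:.0f}m vs {high:.0f}m, gap {high - low:.0f}m")
```

With real measurements, the same slicing would be applied to the sampled 2MPX and 8MPX trends to read off the distance gap at each confidence level.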
You might be surprised to compare the numbers for human detection vs. the detection of a deer.
The above results were obtained on a clear, sunny day on Aug. 25, 2021, at Crow’s Landing airport: 84°F (29°C), 8 MPH wind, and 30% humidity.
Cost of detection
The dotted lines represent the cost, in pixels, of detecting the pedestrian at a given distance with each of the sensors.
For example, at 100m, we would need at least 20 pixels horizontally to detect a pedestrian with a 2MPX sensor and 40 pixels with an 8MPX sensor. The computational cost (number of transistors) grows as the square of the ratio of these pixel counts, so it would be 4x more computationally expensive to detect a pedestrian at 100m with an 8MPX sensor.
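The arithmetic behind the 4x figure can be written out as a short Python sketch. It assumes, as the text does, that cost scales with the square of the horizontal pixel count needed for the object; the function name `relative_compute_cost` is made up for this illustration.

```python
def relative_compute_cost(pixels_needed, pixels_baseline):
    """Cost ratio under the assumption that cost grows with the square
    of the horizontal pixel count required to detect the object."""
    return (pixels_needed / pixels_baseline) ** 2


# Pixel counts quoted in the text for a pedestrian at 100m.
ratio = relative_compute_cost(pixels_needed=40, pixels_baseline=20)
print(ratio)  # 4.0
```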
Conclusions
We conducted a field experiment with a target perception group made up of image labeling team members at Pony.AI. From it, we learned the vision trends of 2MPX and 8MPX sensors for the detection of different objects, and related the two trends through the cost and probability of detecting numerous objects over a wide range of distances.
The charts below can be used to decide which sensor resolution and pixel size make sense for a given application.


Notice how different results are obtained for different objects due to differences in size, shape, morphological structure, and material.

How to use the charts in your own computer vision project
From the charts above, we can determine that a deer is detected with 0.5 confidence at around 350m with a 2MPX camera and at around 400m with an 8MPX camera. At the same time, we know from the human vision chart metrics that with an 8MPX camera, we can read 3 more lines of text at 10ft.

Thus, 3 extra lines of readable text on the human vision chart translate into roughly a 15% increase in detection distance for a deer with an 8MPX sensor compared to a 2MPX sensor, given the difference in pixel count and size.
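The roughly 15% figure follows from the two deer distances quoted above. A minimal Python sketch of that calculation, with a hypothetical helper name `relative_distance_gain`:

```python
def relative_distance_gain(distance_low_res_m, distance_high_res_m):
    """Fractional increase in detection distance relative to the
    lower-resolution sensor."""
    return (distance_high_res_m - distance_low_res_m) / distance_low_res_m


# Deer at 0.5 confidence: about 350m (2MPX) vs about 400m (8MPX), as read from the charts.
gain = relative_distance_gain(350, 400)
print(f"{gain:.1%}")  # 14.3%
```

The same helper can be applied to any pair of distances read off the charts at a chosen confidence level.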
8MPX camera system presentation at Auto.AI 2022