3D Object Tracking in Driving Environment: a Short Review and a Benchmark Dataset
SLIDE 1


19th IEEE Intelligent Transportation Systems Conference (ITSC) (PPNIV WORKSHOP)

3D Object Tracking in Driving Environment:

a Short Review and a Benchmark Dataset

Rio de Janeiro, Brazil, November 2016

Institute of Systems and Robotics University of Coimbra

Pedro Girão, Alireza Asvadi, Paulo Peixoto and Urbano Nunes

09:50‐10:10, Paper TuA1‐T1.3, Tuesday November 1, 2016

SLIDE 2

Motivation

The paper provides:

1) An overview of 3D object tracking methods. Most previous literature focuses on the data association problem; here, the focus is on the assessment of object appearance modeling in object tracking methods.

2) A framework/dataset to allow the evaluation and comparison of object tracking methods in the autonomous driving context. Autonomous cars will be available in the near future, and object tracking is a crucial component.

Companion paper: 3D Object Tracking using RGB and Lidar Data, A. Asvadi, P. Girão, P. Peixoto, U. Nunes, ITSC 2016

SLIDE 3

Part 1:

3D Object Tracking in Driving Environment:

a Short Review

SLIDE 4

Taxonomy of object tracking using 3D sensors (using stereo and 3D‐LiDAR)

Object tracking algorithms can be divided into two categories:

  • Tracking‐by‐detection (or Discriminative) Approaches
  • Generative Approaches (without training)
SLIDE 5

Object detection mechanism in object tracking methods

D‐ Discriminative (tracking by detection)

D1‐ Supervised Object Detectors (with training): localize the object using a pre‐trained detector (e.g., DPM), and then link up the detected positions over time (mostly computer vision‐based approaches using RGB‐D images).

D2‐ Model‐based Approaches: detect and track a target by fitting a pre‐defined object shape.

G‐ Generative (without training)

G1‐ Segmentation‐based Approaches (clustering): partition the point cloud (PCD) into perceptually meaningful regions that can be used for object detection (remove ground → clustering → track each cluster).

G2‐ Motion‐based Approaches: moving object detection can be achieved by ‘background modeling and subtraction’ or ‘frame differencing’.
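The G1 pipeline (remove ground → clustering → track each cluster) can be sketched as follows. This is a minimal illustration under simplifying assumptions (a flat ground plane and a greedy Euclidean clustering), not the exact method of any surveyed tracker; thresholds are illustrative:

```python
import numpy as np

def remove_ground(points, ground_z=0.2):
    """Naive ground removal: keep points above an assumed flat-ground height."""
    return points[points[:, 2] > ground_z]

def euclidean_cluster(points, radius=0.5):
    """Greedy BFS clustering: points within `radius` of each other share a cluster.
    Returns one integer label per point."""
    n = len(points)
    labels = -np.ones(n, dtype=int)
    cluster = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        queue = [seed]
        labels[seed] = cluster
        while queue:
            i = queue.pop()
            d = np.linalg.norm(points - points[i], axis=1)
            for j in np.where((d < radius) & (labels == -1))[0]:
                labels[j] = cluster
                queue.append(j)
        cluster += 1
    return labels
```

Each resulting cluster is then a detection hypothesis that a tracker can follow over frames.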

SLIDE 6

Object appearance modeling in object tracking methods

  • (a) a scan of a vehicle, split up by an occlusion, seen from the top view
  • (b) the centroid (point‐model) representation of the target object
  • (c) 2D rectangular or 2.5D box shape‐based representations
  • (d) 2.5D grid, 3D voxel grid, or octree data structure‐based representation
  • (e) object delimiter‐based representation
  • (f) 3D reconstruction of the shape of the target object
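Several of these representations can be computed directly from an object's point cloud. A minimal sketch (the function names and the voxel size are illustrative, not from the paper):

```python
import numpy as np

def centroid(points):
    """(b) point model: represent the object by its centroid."""
    return points.mean(axis=0)

def bounding_box(points):
    """(c) box model: axis-aligned 3D bounding box as (min_corner, max_corner)."""
    return points.min(axis=0), points.max(axis=0)

def voxel_grid(points, voxel=0.5):
    """(d) voxel model: set of occupied voxel indices at the given resolution."""
    return set(map(tuple, np.floor(points / voxel).astype(int)))
```

Richer models such as (e) delimiters or (f) full 3D reconstruction trade more computation for robustness to occlusion, as in the split scan of (a).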

SLIDE 7

Some of the recent 3D object tracking methods for autonomous driving applications

Object tracking algorithms are composed of three main components: object representation, search mechanism, and model update.
SLIDE 8

Part 2:

3D Object Tracking in Driving Environment:

a Benchmark Dataset

SLIDE 9

3D Object Tracking in Driving Environments (3D‐OTD) Benchmark Dataset

  • A benchmark dataset was constructed out of the ‘KITTI Object Tracking Evaluation’, and the sequence attributes and challenging factors were extracted.
  • Two baseline object trackers were implemented.
  • Two evaluation criteria were considered for the performance analysis.
  • The evaluation scripts, source codes for the baseline object trackers, and the ground‐truth data corresponding to this work are available online.
SLIDE 10

3D‐OTD Dataset

Annotation/label data (KITTI Object Tracking Dataset vs. 3D‐OTD Dataset):

KITTI Object Tracking Dataset:
  • Focused on the evaluation of the data association problem
  • Object tracklets (object labels may change during the tracking)
  • Large dataset (21 seq. for training & 29 seq. for testing; each seq. contains multiple objects)

3D‐OTD Dataset:
  • Focus is on the assessment of object appearance modeling
  • 50 annotated sequences
  • Each sequence denotes the full track of only one target object (if one scenario includes two target objects, it is considered as two seq.)
  • Specifications/challenging factors of each seq. extracted

Sensory perception data (common to both):
  • 3D‐LiDAR point clouds (PCD)
  • Stereo vision data: right/left color images
  • GPS/IMU localization data

SLIDE 11

3D‐OTD Dataset

  • In the original KITTI dataset, objects are annotated with their tracklets, and generally the dataset is more focused on the evaluation of the data association problem in discriminative approaches.
  • Our goal is to provide a tool for the assessment of object appearance modeling in both the discriminative and generative methods. Therefore, instead of tracklets, the full track of each object is extracted.

SLIDE 12

3D‐OTD Dataset

The sequence attributes and challenging factors are extracted.

Sequence attributes:

  • Object type
  • Object status
  • Ego‐vehicle situations
  • Scene condition

Challenges:

  • Occlusion (OCC)
  • Object pose (POS)
  • Distance (DIS) variations to the Ego‐vehicle
  • Changes in the relative velocity (RVL) of the object to the Ego‐vehicle
SLIDE 13

Baseline 3D object tracking algorithms

Two baseline object trackers were implemented:

  • A. Baseline KF 3D Object Tracker (3D‐KF): a 3D Constant Acceleration (CA) KF with a Gating Data Association (DA) is used for the robust tracking of the object centroid in the PCDs.
  • B. Baseline MS 3D Object Tracker (3D‐MS): the Mean Shift (MS) iterative procedure is used to locate the object (0.5 | max iter. < 3).
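The two baselines can be sketched as follows. This is a minimal illustration, not the authors' implementation: the time step, noise covariances, chi‐square gate value, and mean‐shift bandwidth are all assumed, and "0.5 | max iter. < 3" is interpreted here as a 0.5 m shift threshold with at most 3 iterations:

```python
import numpy as np

def make_ca_kf(dt=0.1):
    """Constant-acceleration motion model in 3D: state = [pos, vel, acc] (9-dim)."""
    I3, Z3 = np.eye(3), np.zeros((3, 3))
    F = np.block([[I3, dt * I3, 0.5 * dt * dt * I3],
                  [Z3, I3, dt * I3],
                  [Z3, Z3, I3]])
    H = np.hstack([I3, np.zeros((3, 6))])  # only the centroid position is measured
    return F, H

def kf_step(x, P, z, F, H, Q, R, gate=7.815):
    """One KF predict/update with chi-square gating (7.815 = 95%, 3 dof).
    A measurement outside the gate is rejected; the predicted state is kept."""
    x = F @ x
    P = F @ P @ F.T + Q
    y = z - H @ x                      # innovation
    S = H @ P @ H.T + R
    if y @ np.linalg.solve(S, y) > gate:
        return x, P                    # gated out: no update
    K = P @ H.T @ np.linalg.inv(S)
    return x + K @ y, (np.eye(len(x)) - K @ H) @ P

def mean_shift(points, start, bandwidth=1.0, min_shift=0.5, max_iter=3):
    """Mean-shift localization: move to the mean of the points inside the
    bandwidth window; stop when the shift is below min_shift or after max_iter."""
    c = np.asarray(start, float)
    for _ in range(max_iter):
        near = points[np.linalg.norm(points - c, axis=1) < bandwidth]
        if len(near) == 0:
            break
        new_c = near.mean(axis=0)
        shift = np.linalg.norm(new_c - c)
        c = new_c
        if shift < min_shift:
            break
    return c
```

The gating step is what makes the 3D‐KF robust to spurious centroids, while the mean‐shift window explains why 3D‐MS can drift to a denser nearby object.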

SLIDE 14

Quantitative evaluation methodology

  • The overlap rate (the intersection‐over‐union metric) in 3D.
  • Orientation error: difference in the Yaw angle (the Yaw angle describes the heading of the object).

The percentage of frames with a successful occurrence (the overlap ratio exceeds 0.25 / the orientation error is less than 10 degrees) is used as a metric to measure tracking performance.
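These two metrics can be sketched as follows; for simplicity the overlap is computed here on axis‐aligned boxes, whereas the benchmark evaluates oriented 3D bounding boxes:

```python
import numpy as np

def iou_3d(box_a, box_b):
    """Overlap rate (intersection-over-union) of two 3D boxes, each given as
    a (min_corner, max_corner) pair. Axis-aligned simplification."""
    amin, amax = np.asarray(box_a[0], float), np.asarray(box_a[1], float)
    bmin, bmax = np.asarray(box_b[0], float), np.asarray(box_b[1], float)
    inter = np.prod(np.clip(np.minimum(amax, bmax) - np.maximum(amin, bmin), 0.0, None))
    union = np.prod(amax - amin) + np.prod(bmax - bmin) - inter
    return inter / union

def success_rates(overlaps, yaw_errors_deg, iou_thr=0.25, yaw_thr=10.0):
    """Fraction of frames with a successful occurrence under each criterion:
    overlap ratio > 0.25, and absolute yaw error < 10 degrees."""
    n = len(overlaps)
    overlap_rate = sum(o > iou_thr for o in overlaps) / n
    yaw_rate = sum(abs(e) < yaw_thr for e in yaw_errors_deg) / n
    return overlap_rate, yaw_rate
```

For example, two unit‐height boxes of footprint 2×2 that overlap over half their length have IoU 4/12 = 1/3, which counts as a success at the 0.25 threshold.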

SLIDE 15

Evaluation results and analysis of metric behaviors

(y‐axis: normalized cumulative sum of the successful cases; x‐axis: normalized nº of frames)

The metrics for the two baseline trackers (3D‐MS and 3D‐KF) are computed based on the OCC, POS, DIS and RVL challenges. The 3D‐KF achieves a higher success rate because the 3D‐MS tracker may diverge to a denser nearby object (a local minimum) instead of tracking the target object. However, the 3D‐MS tracker has a higher precision in orientation estimation.
SLIDE 16

A Comparison of Baseline Trackers with the State‐of‐the‐art Computer Vision based Object Trackers

[6] Y. Wu, J. Lim, and M.‐H. Yang, Object tracking benchmark, PAMI, vol. 37, no. 9, pp. 1834–1848, 2015.

The baseline trackers (3D‐MS and 3D‐KF), benefiting from highly reliable 3D‐LiDAR data, have superior performance over the state‐of‐the‐art approaches in the Computer Vision field (SCM [37] and ASLA [38]).

This is because, in autonomous driving scenarios, the ego‐vehicle and objects are often moving. Therefore, object size and pose undergo severe changes (in the RGB image), which can easily mislead visual object trackers.

SLIDE 17

Conclusion and future directions

We presented:

  • A brief survey of 3D object tracking in driving environments
  • A benchmark dataset based on the KITTI Object Tracking Evaluation
  • A quantitative evaluation methodology
  • Two baseline trackers
  • The evaluation scripts, source codes for the baseline object trackers, and the ground‐truth data are available online: https://sites.google.com/site/amshmi12/downloads

  • We encourage other authors to evaluate their 3D object tracking methods using the 3D‐OTD evaluation benchmark, and to make their results available!
  • An extension of the dataset and codes to include more sequences and trackers remains an area for future work.

SLIDE 18

Thank you for your attention

VIDEO: The video shows the first two sequences (Car and Cyclist) from the 3D‐OTD dataset: the top shows the 2D and 3D BBs in the image, and the bottom shows the 3D BB in the PCD. Each sequence represents one object, and for each sequence the full track of that object is extracted.

SLIDE 19

This work has been supported by the FCT project “AMSHMI2012 ‐ RECI/EEIAUT/0181/2012” and the project “ProjB ‐ Diagnosis and Assisted Mobility ‐ Centro‐07‐ST24‐FEDER‐002028”, with FEDER funding, programs QREN and COMPETE.