Collaborative Mapping with Street-level Images in the Wild

Jun 17, 2023

SLIDE 1

Collaborative Mapping with Street-level Images in the Wild

Yubin Kuang Co-founder and Computer Vision Lead

SLIDE 2

Mapillary

Mapillary is a street-level imagery platform, powered by collaboration and computer vision.

Images
Data

Collaboration - Image Capture

Computer Vision

Mapillary data
SfM/3D reconstruction
Sign recognition
Object recognition
Map features
Map updates
OEMs/Map providers

SLIDE 3

Any device combined with automation can scale infinitely

Collaborative mapping - Capture

Phones
Action cams
360
Dashcams
Cars
Professional rigs

Collaborative mapping generates fresh, diverse and global map data for HD Maps

SLIDE 4

Localization and Mapping

Structure from Motion (SfM)
Simultaneous Localization and Mapping (SLAM)

Positioning and scale estimation

Monocular Camera + GPS

Collaborative mapping - Computer Vision

Sensors: Monocular Camera, GPS
Redundancy: Accelerometers, Compass, IMU, LiDAR, Radar, Stereo Camera
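
A monocular reconstruction has no metric scale on its own, which is why the slide pairs the camera with GPS. The sketch below is one simple way to recover that scale, not necessarily the method used in the talk: it compares the spread of matched camera positions in the SfM frame and in a local metric GPS frame. The function names and the assumption that GPS fixes are already converted to metres in a local frame are illustrative.

```python
import math


def centroid(points):
    """Mean of a list of 3D points given as (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def rms_spread(points):
    """Root-mean-square distance of the points from their centroid."""
    c = centroid(points)
    squared = sum(sum((p[i] - c[i]) ** 2 for i in range(3)) for p in points)
    return math.sqrt(squared / len(points))


def estimate_scale(sfm_positions, gps_positions):
    """Estimate metres per SfM unit from matched camera positions.

    Assumption: gps_positions are already expressed in a local metric
    frame, and index i in both lists refers to the same image.
    The ratio of spreads is unaffected by rotation and translation
    between the two frames, so no alignment is needed for the scale.
    """
    if len(sfm_positions) != len(gps_positions) or len(sfm_positions) < 2:
        raise ValueError("need at least two matched positions")
    spread = rms_spread(sfm_positions)
    if spread == 0:
        raise ValueError("SfM positions are all identical")
    return rms_spread(gps_positions) / spread
```

With noisy GPS the estimate improves as the trajectory gets longer, since the noise is then small relative to the spread of the positions.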

Recognition

Object Recognition
Stationary objects
Moving objects
Semantic Scene Understanding
Semantic relations between the map objects

Sensors: Monocular Camera
Redundancy: LiDAR, Radar, Stereo Camera

SLIDE 5

Key Components

SfM
Recognition

Map Data

3D reconstruction
Object recognition
3D object extraction

Monocular Camera + GPS

SLIDE 6

Semantic Segmentation

3D Point cloud

Semantic Point Cloud

Traffic Sign Recognition

SLIDE 7

Traffic Signs

Poles

Map Data - Visualization and API

Map data from 200M images accessible worldwide through API

SLIDE 8

Challenges and Solutions

SLIDE 9

Moving Objects

Challenges:

Differentiate between the ego motion and distractor motions in the scene

Solutions:
Motion segmentation: Identify motion clusters in the scene and recover ego motion
Moving object removal: Semantically ignore moving objects in SfM
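
The moving object removal step can be sketched as a filter on feature keypoints before they enter SfM. This is a minimal illustration under assumptions: the class names in `DYNAMIC_CLASSES`, the `label_map` layout (a 2D list of class names indexed by row then column) and the function name are hypothetical, not taken from the talk.

```python
import math

# Hypothetical set of classes treated as potentially moving.
DYNAMIC_CLASSES = {"car", "bus", "person", "bicycle"}


def filter_static_keypoints(keypoints, label_map, dynamic_classes=DYNAMIC_CLASSES):
    """Keep only keypoints that fall on pixels with a static class label.

    keypoints: list of (x, y) pixel coordinates.
    label_map: per-pixel class names from semantic segmentation,
               indexed as label_map[row][col].
    Keypoints outside the image are dropped.
    """
    height = len(label_map)
    width = len(label_map[0]) if height else 0
    kept = []
    for x, y in keypoints:
        col, row = math.floor(x), math.floor(y)
        if not (0 <= row < height and 0 <= col < width):
            continue
        if label_map[row][col] not in dynamic_classes:
            kept.append((x, y))
    return kept
```

Filtering by class removes parked cars as well as moving ones, so this trades some static features for robustness against distractor motion.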

A moving bus in front of the camera

SLIDE 10

Moving Objects

Image
Segmentation
Static vs. Dynamic
Before / After: Removal of moving objects

SLIDE 11

Action Cameras
Fisheye
Equirectangular (360)

Database:

Build a database for camera intrinsics and projection models

Calibration:

Crowdsourced calibration
Self-calibration with multiple images
End-to-end self-calibration with CNN

Camera Calibration

SLIDE 12

Camera Calibration

Panorama to Perspective
Time Travel
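
A panorama-to-perspective conversion can be sketched as a per-pixel lookup from a virtual pinhole view into the equirectangular image. The code below is an assumption-laden illustration, not the pipeline from the talk: it assumes a pinhole model with the principal point at the view centre, a horizontal field of view, rotation about the vertical axis only, and a panorama whose centre column faces yaw 0. The function name and parameters are hypothetical.

```python
import math


def perspective_to_equirect(u, v, view_w, view_h, fov_deg, yaw_deg, pano_w, pano_h):
    """Map pixel (u, v) of a virtual perspective view to panorama coordinates.

    Returns (px, py) as floating-point pixel positions in an
    equirectangular image of size pano_w x pano_h.
    """
    # Focal length in pixels from the horizontal field of view.
    focal = (view_w / 2) / math.tan(math.radians(fov_deg) / 2)

    # Viewing ray in camera coordinates: x right, y down, z forward.
    x = u - view_w / 2
    y = v - view_h / 2
    z = focal

    # Rotate the ray about the vertical axis by the view's yaw.
    yaw = math.radians(yaw_deg)
    xr = x * math.cos(yaw) + z * math.sin(yaw)
    zr = -x * math.sin(yaw) + z * math.cos(yaw)

    # Ray direction as longitude and latitude on the unit sphere.
    lon = math.atan2(xr, zr)
    lat = math.atan2(-y, math.hypot(xr, zr))

    # Longitude maps linearly to columns, latitude to rows.
    px = ((lon / (2 * math.pi) + 0.5) * pano_w) % pano_w
    py = (0.5 - lat / math.pi) * pano_h
    return px, py
```

Rendering a full perspective image means evaluating this for every output pixel and sampling the panorama at the returned position, typically with bilinear interpolation.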

SLIDE 13

Map Updates

Challenges:
Traditional SfM pipelines are designed for static/batch processing
Map updates need to be scalable and consistent
Solutions:
Stream processing architecture over batch processing
Robust local reconstruction alignments under varying imaging conditions
Distributed map updates given GPS (straightforward)
Handling boundary conditions
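
One way to picture distributed updates keyed on GPS, with boundary handling, is to route each incoming image to a geographic tile and also to any neighbouring tile it lies close to. The sketch below is illustrative only: the tile size, the margin and the function names are assumptions, and the talk does not specify how boundaries are actually handled.

```python
import math
from collections import defaultdict


def tiles_for_position(lat, lon, tile_deg=0.01, margin_deg=0.001):
    """Return every (row, col) tile an image at (lat, lon) contributes to.

    An image within margin_deg of a tile edge is also assigned to the
    neighbouring tile, so reconstructions on both sides share images
    and can be aligned across the boundary.
    """
    tiles = set()
    for dlat in (-margin_deg, 0.0, margin_deg):
        for dlon in (-margin_deg, 0.0, margin_deg):
            row = math.floor((lat + dlat) / tile_deg)
            col = math.floor((lon + dlon) / tile_deg)
            tiles.add((row, col))
    return tiles


def route_stream(image_stream, tile_deg=0.01, margin_deg=0.001):
    """Group a stream of (image_id, lat, lon) records by tile.

    Each tile's list can then be processed independently as images
    arrive, instead of waiting for a global batch.
    """
    pending = defaultdict(list)
    for image_id, lat, lon in image_stream:
        for tile in tiles_for_position(lat, lon, tile_deg, margin_deg):
            pending[tile].append(image_id)
    return dict(pending)
```

This ignores wrap-around at the antimeridian and the shrinking width of longitude cells toward the poles, both of which a production system would need to address.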

SLIDE 14

Annotations - Recognition

Cityscapes Dataset

30 object classes
5K fine / 20K coarse annotations
European cities
Diverse weather/season
Instance labels

Mapillary Vistas Dataset (MVD)

100 object classes
25K fine annotations
6 continents
Diverse weather/season/cameras
Instance labels

Neuhold et al., ICCV 2017, Mapillary

SLIDE 15

Annotations - Recognition

Challenge:
Annotation is time-consuming across specification, labeling and QA.
Solutions:
Synthetic data
GAN for domain adaptation
Active learning
Semi-automatic annotation
Human in the loop

SLIDE 16

Annotations - Human in the loop

Machine
Data
Human

Challenges:
Turnaround time from annotations to improvement of algorithms
Quality control is generally difficult with a large crowd of people

Solutions:
Fully connected backend with automatic re-training
Work with the mapping community that understands and cares about the quality of map data

SLIDE 17

Annotations - Human in the loop

Machine detection to human verification
Tagging to machine detection

SLIDE 18

Rare Objects

Detecting rare objects (under-represented annotations) is key to safety and map updates

Long-tail distribution of general objects on the road, e.g. a koala on the road

Number of instances for each object class in Mapillary Vistas Dataset

>100K street lights
<10K mailboxes
<100 ramps

SLIDE 19

Rare Objects

Use adaptive weighting in loss functions to boost performance for rare objects

Loss Max-Pooling for Semantic Image Segmentation. Rota Bulò, Neuhold and Kontschieder, CVPR 2017, Mapillary
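
To make the idea of weighting the loss toward rare classes concrete, the sketch below uses inverse-frequency class weights in a cross-entropy loss. This is a simpler, static stand-in chosen for illustration and is not the loss max-pooling method of the cited paper, which adapts per-pixel weights during training. The function names and the class counts in the usage note are hypothetical.

```python
import math


def inverse_frequency_weights(class_counts):
    """Weight each class by total / (num_classes * count).

    Rare classes get weights above 1, frequent ones below 1.
    All counts must be positive.
    """
    total = sum(class_counts.values())
    num_classes = len(class_counts)
    return {name: total / (num_classes * count) for name, count in class_counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of per-pixel cross-entropy.

    probs: one dict per pixel mapping class name to predicted probability.
    labels: the true class name for each pixel.
    weights: class name to weight, e.g. from inverse_frequency_weights.
    """
    loss = 0.0
    weight_sum = 0.0
    for pixel_probs, label in zip(probs, labels):
        w = weights[label]
        # Clamp to avoid log(0) when a class gets zero probability.
        loss += -w * math.log(max(pixel_probs[label], 1e-12))
        weight_sum += w
    return loss / weight_sum
```

For example, with made-up counts of 900 street light pixels and 100 ramp pixels, a misclassified ramp pixel contributes nine times as much to the loss as a misclassified street light pixel.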

SLIDE 20

Scaling

200 million images
3.4 million km
15.6 billion objects
190 countries

Challenges:
Constant and parallel updates
Serve billions of map features via API
Low latency and cost-effective processing
Time-consuming training
Solutions:
Stream processing over batch processing
Geo-Index and full-text search for map features
Optimized GPU processing on AWS (~$5K per 100M images)
In-house Titan Xp cluster significantly reduces training time
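
A geo-index for map features can be pictured as a lookup from grid cells to the features inside them, so that a bounding-box query only touches nearby cells. The class below is a minimal in-memory sketch under assumptions: a fixed-size latitude/longitude grid, hypothetical names throughout, and no relation to the index actually used behind the API.

```python
import math
from collections import defaultdict


class GridIndex:
    """Fixed-size lat/lon grid mapping cells to the features they contain."""

    def __init__(self, cell_deg=0.01):
        self.cell_deg = cell_deg
        self.cells = defaultdict(list)

    def _cell(self, lat, lon):
        return (math.floor(lat / self.cell_deg), math.floor(lon / self.cell_deg))

    def insert(self, feature_id, lat, lon):
        """Store a feature under the cell that contains its position."""
        self.cells[self._cell(lat, lon)].append((feature_id, lat, lon))

    def query_bbox(self, min_lat, min_lon, max_lat, max_lon):
        """Return ids of features inside the box (edges inclusive).

        Only cells overlapping the box are scanned; each candidate is
        then checked exactly. Boxes crossing the antimeridian are not
        supported in this sketch.
        """
        row0, col0 = self._cell(min_lat, min_lon)
        row1, col1 = self._cell(max_lat, max_lon)
        hits = []
        for row in range(row0, row1 + 1):
            for col in range(col0, col1 + 1):
                for feature_id, lat, lon in self.cells.get((row, col), ()):
                    if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon:
                        hits.append(feature_id)
        return hits
```

Query cost then depends on the size of the box and the local feature density rather than on the total number of features stored.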

SLIDE 21

Map Data - Monocular Camera

SLIDE 22

Let’s map the world together!

To Date

200 million images
3.4 million km mapped
15.6 billion objects
190 countries