Kodifly, Hong Kong

Traffic Monitoring System

AI Engineer
May 2023 – Apr 2024

Developed an end-to-end computer vision pipeline for Hong Kong tunnel traffic monitoring, processing 100+ camera feeds in real time with 95%+ detection accuracy by fusing 2D camera imagery with 3D LiDAR point clouds.

  • 95%+ detection accuracy (mAP)
  • 100+ camera feeds
  • 30-60 FPS
  • <100 ms detection latency

The Challenge

Hong Kong's tunnel infrastructure handles millions of vehicles daily. Traditional monitoring relied on human operators watching dozens of camera feeds, an approach prone to fatigue, inconsistency, and delayed incident response. The client needed an automated system that could detect vehicles, classify them, track their movement, and identify incidents in real time.

Key Challenges

  • Processing 100+ simultaneous camera feeds with minimal latency
  • Varying lighting conditions inside tunnels (bright entrances, dark interiors, flickering lights)
  • Occlusion from large vehicles blocking smaller ones
  • Accurate vehicle classification (cars, trucks, motorcycles, emergency vehicles)
  • Real-time incident detection (stopped vehicles, wrong-way drivers, debris)

The Solution

I designed a multi-modal perception system that fuses 2D camera imagery with 3D LiDAR point clouds for robust vehicle detection and tracking. The system processes all feeds in real time, maintaining a unified world model of tunnel traffic that enables sophisticated incident detection and traffic analytics.

  1. Built a custom 2D detection pipeline using PaddlePaddle's PP-YOLOE with domain-specific training
  2. Implemented 3D LiDAR processing using the PointPillars architecture for depth-accurate vehicle localization
  3. Developed a sensor fusion algorithm combining 2D detections with 3D point clouds via calibrated projection
  4. Created a multi-object tracking system using DeepSORT with custom re-identification features for tunnel environments
  5. Deployed the pipeline on edge devices with TensorRT optimization for real-time performance
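The five steps above form a single per-frame loop on each edge unit. A minimal sketch of that orchestration, with stub functions standing in for the real models (all function names and values here are hypothetical placeholders, not the production code):

```python
import numpy as np

def detect_2d(frame):
    # Stand-in for PP-YOLOE inference: rows of [x1, y1, x2, y2, score].
    return np.array([[100, 120, 220, 260, 0.91]])

def detect_3d(points):
    # Stand-in for PointPillars: 3D box centers [x, y, z] in meters.
    return np.array([[14.2, -1.1, 0.8]])

def fuse(boxes_2d, boxes_3d):
    # Stand-in for calibrated projection + matching: attach LiDAR depth
    # to each 2D detection.
    return [(b2, b3[0]) for b2, b3 in zip(boxes_2d, boxes_3d)]

def update_tracks(tracks, fused):
    # Stand-in for DeepSORT-style association: assign incremental IDs.
    for det in fused:
        tracks.append((len(tracks), det))
    return tracks

tracks = []
frame = np.zeros((720, 1280, 3), dtype=np.uint8)   # dummy camera frame
points = np.random.rand(1000, 3) * 50              # dummy LiDAR sweep
fused = fuse(detect_2d(frame), detect_3d(points))
tracks = update_tracks(tracks, fused)
```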

System Architecture

The architecture follows a distributed edge-cloud pattern. Edge devices handle real-time inference while the cloud aggregates data for analytics and long-term storage.

Edge Processing Units

NVIDIA Jetson AGX Xavier devices deployed at each tunnel section. Each unit processes 8-12 camera feeds and 2-4 LiDAR sensors with on-device inference.

2D Detection Module

PP-YOLOE model fine-tuned on 50,000+ tunnel-specific images. Handles vehicle detection, classification, and attribute recognition (color, size, type).

3D Perception Module

PointPillars-based LiDAR processing for accurate 3D bounding boxes. Provides ground-truth depth information for fusion.
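PointPillars discretizes the point cloud into a sparse grid of vertical columns ("pillars") before any learned layers run. A numpy sketch of that pillarization step (the grid extents and resolution are illustrative, not the deployed values):

```python
import numpy as np

def pillarize(points, x_range=(0, 50), y_range=(-10, 10), resolution=0.5):
    """Group LiDAR points into vertical pillars on a 2D ground-plane grid."""
    xs = ((points[:, 0] - x_range[0]) / resolution).astype(int)
    ys = ((points[:, 1] - y_range[0]) / resolution).astype(int)
    nx = int((x_range[1] - x_range[0]) / resolution)
    ny = int((y_range[1] - y_range[0]) / resolution)
    valid = (xs >= 0) & (xs < nx) & (ys >= 0) & (ys < ny)
    pillars = {}
    for x, y, p in zip(xs[valid], ys[valid], points[valid]):
        pillars.setdefault((x, y), []).append(p)
    # Each non-empty pillar would feed the PointNet-style pillar encoder.
    return pillars

points = np.array([[10.0, 0.0, 0.5],   # two points in the same pillar
                   [10.1, 0.1, 1.2],
                   [30.0, 5.0, 0.9]])  # one point in a distant pillar
pillars = pillarize(points)
```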

Fusion Engine

Proprietary algorithm that projects 3D LiDAR detections into 2D camera space using extrinsic calibration. Resolves conflicts and improves detection confidence.
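The core of any such fusion step is the pinhole projection of LiDAR points into pixel coordinates using the extrinsic rotation R and translation t plus the camera intrinsics K. A minimal sketch (the matrices below are illustrative placeholders, not the deployed calibration):

```python
import numpy as np

def project_lidar_to_image(points_lidar, K, R, t):
    """Project Nx3 LiDAR points into pixel coordinates via extrinsics + intrinsics."""
    pts_cam = points_lidar @ R.T + t     # LiDAR frame -> camera frame
    in_front = pts_cam[:, 2] > 0         # keep only points in front of the camera
    pts_cam = pts_cam[in_front]
    uvw = pts_cam @ K.T                  # apply intrinsics
    uv = uvw[:, :2] / uvw[:, 2:3]        # perspective divide
    return uv

K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                            # identity extrinsics for the sketch
t = np.zeros(3)
uv = project_lidar_to_image(np.array([[0.0, 0.0, 10.0]]), K, R, t)
# a point on the optical axis lands at the principal point (640, 360)
```

Projected points falling inside a 2D detection box can then raise that detection's confidence and supply its depth.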

Tracking & Analytics

Multi-object tracker maintaining vehicle identities across camera handoffs. Feeds into traffic flow analytics and incident detection system.

Key Implementation Details

Handling Extreme Lighting Variations

Tunnel environments present unique lighting challenges—from bright daylight at entrances to near-darkness inside. I implemented adaptive histogram equalization preprocessing and trained the model with aggressive augmentation including synthetic lighting variations. This improved detection accuracy in low-light conditions by 23%.
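In production this used adaptive histogram equalization (e.g. OpenCV's CLAHE); the core idea can be sketched in plain numpy as tile-wise equalization (no contrast clipping or tile interpolation, so cruder than real CLAHE):

```python
import numpy as np

def tilewise_equalize(img, tiles=4):
    """Equalize each tile's histogram independently (simplified CLAHE idea)."""
    out = np.empty_like(img)
    h, w = img.shape
    th, tw = h // tiles, w // tiles
    for i in range(tiles):
        for j in range(tiles):
            tile = img[i*th:(i+1)*th, j*tw:(j+1)*tw]
            hist = np.bincount(tile.ravel(), minlength=256)
            cdf = np.cumsum(hist).astype(np.float64)
            # Stretch the tile's cumulative distribution over the full 0-255 range.
            cdf = (cdf - cdf.min()) / max(cdf.max() - cdf.min(), 1) * 255
            out[i*th:(i+1)*th, j*tw:(j+1)*tw] = cdf[tile].astype(np.uint8)
    return out

dark = (np.random.rand(64, 64) * 40).astype(np.uint8)   # simulated dim interior
boosted = tilewise_equalize(dark)
```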

Sensor Calibration Pipeline

Accurate fusion requires precise calibration between cameras and LiDAR sensors. I built an automated calibration pipeline using checkerboard patterns and developed a continuous calibration monitoring system that detects drift and triggers recalibration alerts.
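Drift monitoring can be reduced to tracking the reprojection error of known reference points over time: if the stored calibration's error against freshly observed landmarks exceeds a threshold, raise a recalibration alert. A numpy sketch with synthetic values (the threshold, landmarks, and matrices are illustrative):

```python
import numpy as np

def reprojection_rms(observed_uv, points_3d, K, R, t):
    """RMS pixel error between observed landmarks and reprojected 3D points."""
    pts_cam = points_3d @ R.T + t
    uvw = pts_cam @ K.T
    predicted = uvw[:, :2] / uvw[:, 2:3]
    return float(np.sqrt(np.mean(np.sum((predicted - observed_uv) ** 2, axis=1))))

def drift_alert(rms_px, threshold_px=2.0):
    return rms_px > threshold_px

K = np.array([[1000.0, 0, 640], [0, 1000.0, 360], [0, 0, 1.0]])
R, t = np.eye(3), np.zeros(3)
landmarks = np.array([[1.0, 0.5, 10.0], [-1.0, -0.5, 12.0]])
truth_uv = (landmarks @ K.T)[:, :2] / landmarks[:, 2:3]
rms_ok = reprojection_rms(truth_uv, landmarks, K, R, t)         # calibration intact
rms_bad = reprojection_rms(truth_uv + 5.0, landmarks, K, R, t)  # sensors have drifted
```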

Real-Time Multi-Camera Tracking

Vehicles must be tracked consistently as they move through the tunnel, appearing in multiple cameras. I designed a hierarchical tracking system: local trackers per camera, and a global tracker that maintains identity across camera handoffs using learned appearance embeddings.
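The global tracker's handoff step boils down to matching appearance embeddings from the outgoing and incoming cameras by similarity. A minimal greedy cosine-matching sketch (production used learned re-ID embeddings; the vectors and threshold here are illustrative):

```python
import numpy as np

def cosine_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_handoffs(exiting, entering, threshold=0.7):
    """Greedily match track embeddings across a camera handoff."""
    matches, used = [], set()
    for tid, emb in exiting.items():
        scores = [(cosine_sim(emb, e), nid) for nid, e in entering.items()
                  if nid not in used]
        if not scores:
            continue
        best_sim, best_id = max(scores)
        if best_sim >= threshold:       # below threshold: treat as a new vehicle
            matches.append((tid, best_id))
            used.add(best_id)
    return matches

exiting = {7: np.array([1.0, 0.0, 0.2])}
entering = {"a": np.array([0.9, 0.1, 0.25]),  # same vehicle, similar appearance
            "b": np.array([0.0, 1.0, 0.0])}   # different vehicle
matches = match_handoffs(exiting, entering)
```

A production matcher would use the Hungarian algorithm rather than greedy assignment, but the similarity gating is the same.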

Edge Deployment Optimization

To achieve 30-60 FPS on Jetson devices, I implemented INT8 quantization with calibration, layer fusion, and dynamic batching. Custom CUDA kernels for preprocessing further reduced latency by eliminating CPU-GPU transfers.
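The accuracy cost of INT8 comes from mapping float activations onto 127 levels with a per-tensor scale derived from calibration data. A numpy simulation of that symmetric quantize/dequantize round trip (TensorRT's actual calibrator is entropy-based; the max-calibration here is a simplification):

```python
import numpy as np

def int8_roundtrip(x, calib_batch):
    """Symmetric per-tensor INT8 quantize/dequantize with max-based calibration."""
    scale = np.abs(calib_batch).max() / 127.0   # scale chosen from calibration data
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q.astype(np.float32) * scale         # dequantized approximation

calib = np.array([-4.0, 1.5, 4.0], dtype=np.float32)    # calibration sample
x = np.array([0.0, 1.0, -2.0, 3.99], dtype=np.float32)  # activations to quantize
x_hat = int8_roundtrip(x, calib)
# in-range values are reproduced within half a quantization step (~0.016 here)
```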

Tech Stack

Deep Learning

PaddlePaddle, PyTorch, PP-YOLOE, PointPillars, DeepSORT

Computer Vision

OpenCV, Open3D, CUDA, TensorRT

Edge Computing

NVIDIA Jetson AGX Xavier, ROS2, Docker

Backend

Python, C++, Redis, PostgreSQL, Kafka

Key Learnings

Multi-modal fusion is only as good as your calibration—invest heavily in robust calibration pipelines

Edge deployment constraints force creative optimization that often improves overall system design

Domain-specific training data is more valuable than larger generic datasets

Real-time systems need extensive stress testing under degraded conditions
