## Overview
This master’s thesis presents a vision-based system for real-time 3D object coordinate estimation and tracking. The approach leverages the Hough Transform for robust object detection combined with a modular 3D estimation pipeline implemented entirely in ROS (Robot Operating System).
## Key Features
- Monocular 3D Estimation: Extracts 3D object coordinates from a single camera input without requiring depth sensors
- Hough Transform Detection: Uses the Generalized Hough Transform for robust shape-based object detection
- Modular ROS Architecture: Cleanly separated ROS nodes for detection, estimation, and tracking
- Real-Time Performance: Optimized pipeline capable of running at interactive frame rates
- Robotic Integration: Designed for direct integration with robotic manipulation and navigation stacks
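To illustrate the shape-based detection idea behind the feature list above, here is a minimal sketch of Hough voting for circle centers of a known radius, written in plain NumPy rather than the thesis's actual implementation (which uses OpenCV's Hough Transform routines). The function name, accumulator size, and test values are illustrative only.

```python
import numpy as np

def hough_circle_centers(edge_points, radius, shape, n_angles=64):
    """Accumulate Hough votes for circle centers of a known radius.

    edge_points: iterable of (x, y) edge pixel coordinates.
    radius: assumed circle radius in pixels.
    shape: (width, height) of the accumulator array.
    """
    acc = np.zeros(shape, dtype=np.int32)
    thetas = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    for x, y in edge_points:
        # Each edge point votes for every candidate center lying at
        # distance `radius` from it.
        cx = np.round(x - radius * np.cos(thetas)).astype(int)
        cy = np.round(y - radius * np.sin(thetas)).astype(int)
        ok = (cx >= 0) & (cx < shape[0]) & (cy >= 0) & (cy < shape[1])
        np.add.at(acc, (cx[ok], cy[ok]), 1)
    return acc

# Synthetic check: edge points sampled on a circle of radius 10
# centred at (30, 40); the accumulator peak recovers the center.
angles = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
pts = [(30 + 10 * np.cos(a), 40 + 10 * np.sin(a)) for a in angles]
acc = hough_circle_centers(pts, radius=10, shape=(64, 64))
best = np.unravel_index(np.argmax(acc), acc.shape)
print(best)  # peak at or adjacent to (30, 40)
```

The Generalized Hough Transform extends this same voting scheme to arbitrary shapes by replacing the fixed-radius circle with an R-table built from a template contour.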
## System Architecture
The system consists of three main modules:
- Detection Module: Processes camera frames using Hough Transform to identify and localize objects in 2D
- 3D Estimation Module: Converts 2D detections to 3D world coordinates using camera calibration and geometric reasoning
- Tracking Module: Maintains object identity and trajectory across frames using state estimation
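The core geometric step of the 3D Estimation Module can be sketched with the standard pinhole camera model: a calibrated intrinsic matrix back-projects a pixel to a normalized ray, which is scaled to a metric point once depth is known (e.g., from known object size or a ground-plane constraint). The intrinsic values below are placeholders, not the calibration from the thesis.

```python
import numpy as np

# Hypothetical intrinsics from camera calibration:
# focal lengths fx = fy = 700 px, principal point (320, 240).
K = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])

def pixel_to_3d(u, v, depth, K):
    """Back-project pixel (u, v) at a known depth Z (camera frame).

    Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, Z = depth.
    """
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # normalized ray
    return ray * depth                               # scale to metric depth

# A detection at pixel (400, 300), 2 m in front of the camera.
point = pixel_to_3d(400.0, 300.0, depth=2.0, K=K)
```

In the full pipeline, the resulting camera-frame point would then be transformed into the robot or world frame via the ROS tf tree.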
## Technologies Used
- ROS (Robot Operating System)
- OpenCV for image processing
- Python / C++
- Hough Transform algorithms
- Camera calibration tools
- Point cloud processing
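The state estimation used by the Tracking Module to maintain object trajectories across frames can be sketched as a constant-velocity Kalman filter over 3D position. This is a generic textbook formulation, not the thesis's exact filter; the noise parameters and frame rate below are assumptions.

```python
import numpy as np

class ConstantVelocityKF:
    """Minimal Kalman filter over state (x, y, z, vx, vy, vz)."""

    def __init__(self, dt=1.0 / 30.0, q=1e-2, r=1e-1):
        self.x = np.zeros(6)                 # state estimate
        self.P = np.eye(6)                   # state covariance
        self.F = np.eye(6)                   # constant-velocity transition
        self.F[:3, 3:] = dt * np.eye(3)
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])  # observe position only
        self.Q = q * np.eye(6)               # process noise
        self.R = r * np.eye(3)               # measurement noise

    def step(self, z):
        # Predict with the constant-velocity model.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Update with a 3D position measurement z.
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (z - self.H @ self.x)
        self.P = (np.eye(6) - K @ self.H) @ self.P
        return self.x[:3]

# Track an object drifting along x at 0.3 m/s, 1 m from the camera.
kf = ConstantVelocityKF()
for t in range(30):
    est = kf.step(np.array([0.01 * t, 0.0, 1.0]))
```

In the real system, each `step` would be driven by a 3D detection message from the estimation node, and unmatched tracks would be aged out over time.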
## Publication
This work was published as a preprint on TechRxiv:
## Contributors
- Oussama Errouji
- Imad-Eddine NACIRI
- Jade Bousliman