Overview
Why Rule-Based Instead of a Neural Network?
- Which fingers are extended or folded
- Whether the thumb is pointing sideways, crossing over, or tucked under
- The exact spatial relationship between the thumb and each finger
The Core Detection Logic
Python
The Hardest Problem: Disambiguating Similar Fist Gestures
Python
Python
Stabilization System
Python
Technologies Used
- Python: Core language for all gesture logic and application orchestration.
- MediaPipe: Google's real-time hand tracking solution, providing 21 3D landmarks per hand at high FPS with no GPU required.
- OpenCV: Handles webcam capture, frame flipping, landmark visualization, and the real-time UI overlay rendering.
- NumPy: Powers all coordinate math — distance calculations, positional comparisons, and geometric reasoning between landmarks.
Challenges and Learnings
Project Repository
Bash