RT-DETR system hit 99.3% mAP50 on 505 clean frames
TruPath Labs Research published findings on a real-time computer vision system for bounded-court projectile scoring. The central finding is that data quality affected model performance more than dataset size did. The report also identifies commercial licensing constraints and sensor-level encoding parameters as critical production considerations. The system, built on an RT-DETRv2-S transformer backbone and deployed on Apple Silicon, achieved 99.3% mAP50 on a custom 3-class domain.
Key Takeaways
- 505 carefully curated frames outperformed 4,398 noisy frames, with mAP50 rising from 72.5% in v11 to 99.3% in v15.
- End-to-end latency measured 42ms at Phase 0, versus a 35ms production target.
- Detection stability moved from 66% at 4K CBR encoding (bppf 0.055) to 100% at 2304×1296 @ 15fps with I-frame interval 1x (bppf 0.183).
- The production model is RT-DETRv2-S deployed as a CoreML package on Apple Silicon, with Apache 2.0 cited as the licensing filter for network-service deployment.
- First live game testing finished at 11/11 correct throws, and total GPU training spend stayed under $5.
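The encoding figures above can be made concrete with a short calculation. The sketch below assumes that "bppf" means bits per pixel per frame, computed as encoder bitrate divided by (width × height × frame rate). The summary does not spell out that definition, and the function names are illustrative, not taken from the TruPath report.

```python
def bits_per_pixel_per_frame(bitrate_bps: float, width: int, height: int, fps: float) -> float:
    """Average encoded bits available to each pixel of each frame.

    Assumed definition: bppf = bitrate / (width * height * fps).
    """
    if bitrate_bps <= 0 or width <= 0 or height <= 0 or fps <= 0:
        raise ValueError("bitrate, width, height and fps must all be positive")
    return bitrate_bps / (width * height * fps)


def required_bitrate_bps(target_bppf: float, width: int, height: int, fps: float) -> float:
    """Invert the formula: bitrate needed to reach a target bppf."""
    if target_bppf <= 0 or width <= 0 or height <= 0 or fps <= 0:
        raise ValueError("target_bppf, width, height and fps must all be positive")
    return target_bppf * width * height * fps


if __name__ == "__main__":
    # Resolution, frame rate and bppf are the values quoted for the stable configuration.
    bitrate = required_bitrate_bps(0.183, 2304, 1296, 15)
    print(f"implied bitrate: {bitrate / 1e6:.1f} Mbit/s")
    print(f"round trip bppf: {bits_per_pixel_per_frame(bitrate, 2304, 1296, 15):.3f}")
```

Under that assumed definition, the stable configuration would correspond to roughly 8.2 Mbit/s. That figure is derived here and is not reported in the summary. The same function shows why lowering resolution or frame rate at a fixed bitrate raises bppf: fewer pixels per second share the same bit budget.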
Why It Matters
The immediate takeaway is that this bounded-court CV stack is already accurate enough for live scoring, but its 42ms end-to-end latency is still 7ms above the 35ms production target. The broader lesson is operational: TruPath shows that licensing, sensor encoding, and annotation quality can matter more than raw model choice, which is why the Apache 2.0-licensed RT-DETRv2-S stayed in the production set while AGPL-3.0 model families did not. The key question to watch next is whether float16 CoreML quantization or removal of the remaining ONNX serialization boundary can bring latency from 42ms to under 35ms.
Read full article at trupathventures.net
