The problem
Training ML models for physical-world tasks requires enormous labeled datasets. Collecting real-world data is slow, expensive, and limited in scenario diversity.
The approach
Built configurable simulation environments that procedurally generate diverse training scenarios, each with pixel-perfect ground-truth labels. Domain randomization was applied to improve sim-to-real transfer.
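As a rough illustration of the idea (not the project's actual code), the sketch below generates toy grayscale scenes and their segmentation masks in pure Python. Every name in it (SceneConfig, generate_sample, generate_dataset) and every randomization range is an assumption made for the example. Rectangles stand in for rendered objects, and lighting and sensor noise stand in for the randomized domain parameters.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneConfig:
    """Hypothetical randomization ranges; none of these names come from the project."""
    width: int = 32
    height: int = 24
    max_objects: int = 3
    brightness_range: tuple = (0.5, 1.5)  # global lighting multiplier
    noise_std_range: tuple = (0.0, 0.05)  # sensor noise level


def generate_sample(config, rng):
    """Return (image, mask): a grayscale image and its per-pixel class labels."""
    # Sample the nuisance parameters once per scene (domain randomization).
    brightness = rng.uniform(*config.brightness_range)
    noise_std = rng.uniform(*config.noise_std_range)
    background = rng.uniform(0.1, 0.4)

    image = [[background] * config.width for _ in range(config.height)]
    mask = [[0] * config.width for _ in range(config.height)]  # 0 = background

    object_count = rng.randint(1, config.max_objects)
    for object_id in range(1, object_count + 1):
        # Random rectangle size and placement, kept inside the frame.
        w = rng.randint(3, config.width // 2)
        h = rng.randint(3, config.height // 2)
        x0 = rng.randint(0, config.width - w)
        y0 = rng.randint(0, config.height - h)
        shade = rng.uniform(0.5, 1.0)
        for y in range(y0, y0 + h):
            for x in range(x0, x0 + w):
                image[y][x] = shade
                # The label is written together with the pixel, so it is
                # exact by construction, including where objects overlap.
                mask[y][x] = object_id

    # Lighting and noise change the image only; the mask is left untouched.
    for y in range(config.height):
        for x in range(config.width):
            value = image[y][x] * brightness + rng.gauss(0.0, noise_std)
            image[y][x] = min(1.0, max(0.0, value))

    return image, mask


def generate_dataset(n, seed=0, config=None):
    """Generate n labeled samples; the seed makes the dataset reproducible."""
    config = config or SceneConfig()
    rng = random.Random(seed)
    return [generate_sample(config, rng) for _ in range(n)]
```

Calling generate_dataset(1000, seed=42) returns 1000 (image, mask) pairs with no annotation step, because the generator already knows which object owns each pixel. Widening the ranges in SceneConfig is the knob that trades realism for diversity in this toy setup.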
Outcomes
- Millions of labeled samples generated without manual annotation
- Domain randomization improved real-world generalization
- Reusable environment framework shared across multiple projects