PREreview of Gen-Drive: Enhancing Diffusion Generative Driving Policies with Reward Modeling and Reinforcement Learning Fine-tuning

Published
DOI: 10.5281/zenodo.15565288
License: CC BY 4.0

Gen-Drive is a promising, if compute-hungry, step toward fully learned planning. The framework shows compelling closed-loop gains, but real-time feasibility remains out of reach.

Summary

Gen-Drive combines a diffusion generator with reinforcement learning to train autonomous-driving policies end to end. The generator proposes diverse trajectories; a reward model, trained on pairwise preferences collected from both humans and a VLM, scores them. RL fine-tuning then steers the generation model toward high-reward, human-aligned behavior. The method outperforms imitation-only and other learning-based planners on the nuPlan closed-loop benchmark.
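
To make the reward-modeling step concrete, below is a minimal PyTorch sketch of the Bradley-Terry pairwise preference objective that this kind of reward model is typically trained with. The network architecture and the flattened trajectory encoding are illustrative assumptions, not the paper's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TrajectoryRewardModel(nn.Module):
    """Illustrative reward model: maps a flattened trajectory to a scalar score."""
    def __init__(self, traj_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(traj_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, traj: torch.Tensor) -> torch.Tensor:
        return self.net(traj).squeeze(-1)

def preference_loss(model: TrajectoryRewardModel,
                    preferred: torch.Tensor,
                    rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry loss: -log sigmoid(r(preferred) - r(rejected))."""
    return -F.logsigmoid(model(preferred) - model(rejected)).mean()
```

The same loss applies whether a pair was labeled by a human or by the VLM, which is what makes mixing the two label sources straightforward.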

Strengths

  • Closed-loop boost: the overall nuPlan score rises by about 16 points and the collision rate drops by roughly 50% relative to the imitation baseline.

  • Scalable supervision: VLM-assisted preference collection sharply reduces the human annotation effort needed to build the preference dataset; a sketch of the idea follows this list.
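
As a rough illustration of how VLM-assisted labeling can work, consider the sketch below. Everything in it, including the query_vlm helper, is hypothetical and not taken from the paper.

```python
def query_vlm(scene_image, prompt: str) -> str:
    """Hypothetical stand-in for a vision-language-model API call."""
    raise NotImplementedError("wire this up to a real VLM endpoint")

def label_pair(scene_image, traj_a, traj_b, human_labeler=None) -> str:
    """Return 'a' or 'b' for the preferred trajectory in one scene.

    A human labeler is consulted only when provided, so the bulk of the
    pairs can be labeled by the VLM and human effort reserved for audits.
    """
    if human_labeler is not None:
        return human_labeler(scene_image, traj_a, traj_b)
    prompt = ("Two candidate ego trajectories, 'a' and 'b', are overlaid "
              "on this driving scene. Which is safer and more natural? "
              "Answer with a single letter.")
    return query_vlm(scene_image, prompt)
```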

Limitations and open questions

  • Latency: single-sample planning averages 282 ms and multi-sample planning (32 samples) averages 484 ms on an RTX 4090 GPU. Such delays risk missing sudden incursions by vulnerable road users (VRUs) and are unrealistic for embedded automotive hardware; the back-of-the-envelope check after this list shows why.

  • Safety: the learned reward model imposes no hard constraints, so the risk from rare but catastrophic edge cases remains unbounded.
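
To put the latency numbers in perspective, here is a quick calculation of how far the ego vehicle travels during a single planning cycle, assuming constant speed and no reuse of the previous plan:

```python
# Open-loop travel during one planning cycle at the latencies quoted above.
for label, latency_s in (("1 sample", 0.282), ("32 samples", 0.484)):
    for speed_kmh in (30, 50):
        speed_ms = speed_kmh / 3.6        # km/h -> m/s
        blind_m = speed_ms * latency_s    # distance covered before the next plan
        print(f"{label:>10} @ {speed_kmh} km/h: {blind_m:.1f} m")
```

At 50 km/h the 32-sample configuration leaves roughly 6.7 m of travel per replan, which is why sudden VRU incursions are the critical risk.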

Future scope

The authors plan to integrate raw LiDAR and radar perception. How sensor noise propagates through the diffusion and RL fine-tuning loop is a natural subject for future study.

Overall, Gen-Drive advances generative planning, but its heavy compute budget and unproven reactive-safety margin call for caution before deployment in real vehicles.

Competing interests

The author declares that they have no competing interests.
