37 changes: 18 additions & 19 deletions _pages/Schedule_WACV_2026.md
@@ -7,24 +7,23 @@ nav: true
nav_order: 4
---

<!-- TBD -->
# LLVM-AD @ WACV 2026

TBD



<!-- | 13:40 -- 13:50 | Paper Presentation: Language-Driven Active Learning for Diverse Open-Set 3D Object Detection |
| 13:50 -- 14:00 | Paper Presentation: Scenario understanding of traffic scenes through Large Visual Language Models |
| 14:00 -- 14:10 | Paper Presentation: Enhancing Weakly-Supervised Object Detection on Static Images through (Hallucinated) Motion |
| 14:10 -- 14:20 | Paper Presentation: Query3D: LLM-Powered Open-Vocabulary Scene Segmentation with Language Embedded 3D Gaussians |
| 14:20 -- 14:50 | **Keynote: Towards Safe Open-World Autonomy (Dr. Manmohan Chandraker)** |
| 14:50 -- 15:10 | Coffee Break |
| 15:10 -- 15:20 | Paper Presentation: ScVLM: Enhancing Vision-Language Model for Safety-Critical Event Understanding |
| 15:20 -- 15:30 | Paper Presentation: VLMine: Long-Tail Data Mining with Vision Language Models |
| 15:30 -- 15:40 | Paper Presentation: SenseRAG: Constructing Environmental Knowledge Bases with Proactive Querying for LLM-Based Autonomous Driving |
| 15:40 -- 16:10 | **Keynote: Fast-Slow Dual Autonomous Driving Systems (Dr. Hang Zhao)** |
| 16:10 -- 16:20 | Paper Presentation: Glimpse of MCQ based VQA in Road & Traffic Scenarios |
| 16:20 -- 16:30 | Paper Presentation: OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving |
| 16:30 -- 16:40 | Paper Presentation: Evaluating Multimodal Vision-Language Model Prompting Strategies for Visual Question Answering in Road Scene Understanding |
| 16:40 -- 17:00 | Summary & Closing Remarks | -->
| Time | Schedule |
| --- | --- |
| **13:00-13:10** | **Opening Remarks** |
| **13:10-13:40** | **Keynote Presentation** <br> **Lars Hammarstrand** |
| **13:40-13:50** | **Paper 5:** *Lightweight Multi-Scale Fusion for Real-Time Autonomous Driving Segmentation* |
| **13:50-14:00** | **Paper 7:** *FROST-Drive: Scalable and Efficient End-to-End Driving with a Frozen Vision Encoder* |
| **14:00-14:10** | **Paper 8:** *Efficient Visual Question Answering Pipeline for Autonomous Driving via Scene Region Compression* |
| **14:10-14:20** | **Paper 10:** *Benchmarking Vision-Language Models for Traffic Scene Understanding in Inclement Winter Weather: The AWDB Benchmark* |
| **14:20-14:30** | **Paper 11:** *Role of Language-Guidance in Knowledge Distillation for Semantic Segmentation Under Limited Field-Of-View Autonomous Driving* |
| **14:30-15:00** | **Keynote Presentation** <br> **Tong Shen** |
| **15:00-15:20** | **Coffee Break** |
| **15:20-15:30** | **Paper 12:** *Less Is More: Agentic Prompt Design for Safe VLM Action Selection* |
| **15:30-15:40** | **Paper 13:** *Trust-Guided Multimodal LLM Integration with Reinforcement Learning for Autonomous Driving* |
| **15:40-16:10** | **Keynote: Towards robust and efficient VLA for end-to-end autonomous driving** <br> **Litian Liu** |
| **16:10-16:20** | **Paper 14:** *VLA4CoDrive: Vision–Language–Action Dataset for Cooperative Autonomous Driving* |
| **16:20-16:30** | **Paper 18:** *2COOOL: An Evaluation Benchmark for Generating Incident Reports on Out-of-Distribution Hazards in Autonomous Driving* |
| **16:30-16:40** | **Paper 21:** *GATEPose: A Graph Attention Transformer Enhanced with Pose and Orientation Angles for Pedestrian Crossing Intention Prediction* |
| **16:40-17:00** | **Summary & Closing Remarks** |
10 changes: 10 additions & 0 deletions _pages/WACV_2026.md
@@ -80,6 +80,16 @@ TBD
{% include people.html name="Litian Liu" affiliation="Qualcomm" url="https://litianliu.github.io/" img="https://litianliu.github.io/assets/img/prof_pic.jpg?0f86e7255c77a6dfd89f051d80803a8d" %}
</div>
</div>

### Keynote Talks

#### Towards robust and efficient VLA for end-to-end autonomous driving (Litian Liu)

**Abstract:**
Vision-Language-Action (VLA) models are emerging as powerful planners for end-to-end autonomous driving, but challenges remain in robustness and efficiency. In this talk, we first show that leveraging the generative capabilities of VLA can enhance planning through Generative Scenario Rollouts (GeRo), a plug-and-play framework that generates language-grounded future traffic scenes via autoregressive rollouts while performing planning, achieving state-of-the-art performance. In addition, to improve efficiency, we introduce a multitask distillation policy that transfers knowledge from large VLA models to lightweight runtime models. We also demonstrate how uncertainty quantification can further improve domain adaptation and scene generation. Together, these methods provide practical strategies for building robust, efficient, and adaptable VLA systems.

**Bio:**
Litian Liu is a research scientist at Qualcomm AI Research, where she focuses on uncertainty, robustness, and efficiency in machine learning for safety-critical applications, particularly developing Vision-Language-Action (VLA) planning for end-to-end autonomous driving systems. She received her Ph.D. in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology.

----------

### Program Committee