Hai Pham

I like ... 🧠📐🚗


09/2025 –
MVA
MSc in the Mathématiques, Vision, Apprentissage (MVA) program at École Normale Supérieure Paris–Saclay. I am also working with Shizhe Chen and Cordelia Schmid on improving spatial reasoning in VLMs.
04/2025 –
08/2025
Qualcomm AI Research
I was a research intern at Qualcomm AI Research, working on efficient 3D generation with diffusion transformers — including SharpDepth (CVPR 2025), which sharpened metric depth boundaries and gave reviewers one more reason to care about thin structures. Our work was also featured in a Qualcomm blog post.
09/2023 –
08/2024
VinAI
I was at VinAI Research, supervised by Rang Nguyen, trying to make 3D scene understanding more affordable (less supervision) and more practical (more robust to messy real-world data). I had the pleasure of collaborating with Binh-Son Hua, Phong Nguyen, and Khoi Nguyen. This era produced semantic scene completion ideas that later escaped into AAAI 2025, and a multi-view occupancy challenge entry that briefly made Southeast Asia look good on a leaderboard.
09/2020 –
06/2024
HCMUT
B.Eng. in Computer Engineering at Ho Chi Minh City University of Technology (Dai hoc Bach Khoa). This is where I learned that computers are mostly linear algebra with good branding, and that sleep is a hyperparameter you can tune poorly.
bio

Greetings, humans. I am currently a master's student doing something (website in progress).

publications
SharpDepth
SharpDepth: Sharpening Metric Depth Predictions Using Diffusion Distillation
CVPR 2025
Duc-Hai Pham*, Tung Do*, Phong Nguyen, Binh-Son Hua, Khoi Nguyen, Rang Nguyen
Semi-supervised SSC
Semi-supervised 3D Semantic Scene Completion with 2D Vision Foundation Model Guidance
AAAI 2025
Duc-Hai Pham, Tuan Ho, Duc Dung Nguyen, Anh Pham, Phong Nguyen, Khoi Nguyen, Rang Nguyen

See also Google Scholar for the canonical, auto-updated list.

pet projects

Trying to do more stuff on the side. Thanks, Cursor, for helping me.

SwiftPolicy
SwiftPolicy is a small playground for testing score distillation on diffusion policies for visuomotor control — built for the MVA robotics course with Ianis Hammani. It is embarrassingly simple and somehow still works, which is the best kind of research software.
Occupancy
3D Semantic Occupancy from Surrounding View Images: a CVPR 2023 3D Occupancy Prediction Challenge entry with Tuan Ngo, Tuan Ho, Khoi Nguyen, and Rang Nguyen. We predicted voxel semantics from multi-view images and ranked 16th out of ~200 teams, at the time the highest-ranked team from Southeast Asia on the leaderboard. See the challenge site.
blogging

Amateur blogging at the moment. The key audience is myself and potential recruiters.

misc unsorted