Chuong Huynh

Chuong Huynh (Ryan)

I am a machine learning engineer at ATG Pinterest, working on improving visual search. I obtained my Ph.D. in Computer Science from the University of Maryland, College Park, advised by Prof. Abhinav Shrivastava.

Before that, I was a research resident and research engineer at VinAI Research, Vietnam (now Qualcomm AI Research), under the mentorship of Prof. Minh Hoai. I obtained my B.S. in Computer Science from the University of Science, Vietnam National University, Ho Chi Minh City in 2018 under the supervision of Prof. Minh-Triet Tran.

My research passion lies at the intersection of computer vision and user interaction. I leverage Large Language Models to advance image-text alignment and have previously developed cutting-edge techniques for highly detailed interactive segmentation and matting.

Experience

Machine Learning Engineer II, Advanced Technologies Group at Pinterest Jan 2026 - present
Work on Visual Search Improvements.

Research Intern, Vision Intelligence at Samsung Research America May 2025 - Aug 2025
Work on Image and Text Alignment.
Mentors: Deen Dayal Mohan, Hossein Souri, and Vitali Petsiuk

Applied Scientist Intern, Rufus Multimodal at Amazon.com May 2024 - Dec 2024
Work on Composed Image Retrieval.
Mentors: Jinyu Yang and Son Tran

Research Intern, Digital Media at Adobe Research May 2023 - Dec 2023
Work on Image and Video Matting.
Mentors: Joon-Young Lee and Seoung Wug Oh

Research Intern, Photoshop ART at Adobe Research May 2022 - May 2023
Work on Interactive Segmentation.
Mentors: Yuqian Zhou, Zhe Lin, Connelly Barnes, Eli Shechtman, and Sohrab Amirghodsi

Research Resident and Engineer, VinAI Research (now Qualcomm AI Research) July 2019 - July 2021
Work on High-Resolution Segmentation.
Mentors: Prof. Minh Hoai and Anh Tran

First-author papers are highlighted.

	Efficient and High-Fidelity Omni Modality Retrieval Chuong Huynh, Manh Luong, Abhinav Shrivastava CVPR, 2026 project page / arXiv The first universal retrieval model for text, vision, and audio with attention-based resampling and Attention Sliced Wasserstein Pooling for efficient, high-fidelity omni-modal representations.
	ARGENT: Adaptive Hierarchical Image-Text Representations Chuong Huynh, Hossein Souri, Abhinav Kumar, Vitali Petsiuk, Deen Dayal Mohan, Suren Kumar ECCV, 2026 project page / arXiv A stronger hyperbolic VLM baseline with adaptive entailment loss and probabilistic entailment evaluation for hierarchical image-text representations.
	All-in-One Conditioning for Text-to-Image Synthesis Hirunima Jayasekara, Chuong Huynh, Yixuan Ren, Christabel Acquaye, Abhinav Shrivastava ICPR, 2026 arXiv A zero-shot, scene graph-based conditioning mechanism for compositional text-to-image generation with soft visual guidance.
	VeriGraph: Scene Graphs for Execution Verifiable Robot Planning Daniel Ekpo, Mara Levy, Saksham Suri, Chuong Huynh, Archana Swaminathan, Abhinav Shrivastava ICRA, 2026 project page / arXiv / code An action planning model that leverages visual scene graphs.