An Vo
Research Engineer @ Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)
I am now a Research Engineer at MBZUAI, working with Thamar Solorio. I obtained my MS from KAIST with Anh Totti Nguyen and Daeyoung Kim. During that time, I also worked closely with Mohammad Reza Taesiri. My MS program was fully funded by the Hyundai CMK Global Scholarship (a prestigious scholarship for graduate students). Prior to joining KAIST, I obtained my BS degree as the valedictorian at Vietnam National University - Ho Chi Minh City (VNU-HCM) in 2023 with Ngoc Hoang Luong.
research
I am broadly interested in Large Language Models (LLMs) and Vision Language Models (VLMs), especially in making them more trustworthy and explainable on edge cases and hard cases. In my previous life, I worked at the intersection of Evolutionary Computation, Multi-objective Optimization, and AutoML.
Most recently, I was the lead author of VLMs are Biased, a paper that was featured on Hacker News (front page; top-5) and LinkedIn, in Gary Marcus's article (GPT-5: Overdue, overhyped and underwhelming), and in a tweet by Lucas Beyer. In this work, we introduce VLMBias, a benchmark for evaluating visual counting in VLMs. We show that state-of-the-art models (e.g., o3, o4-mini, Gemini 2.5 Pro, Claude 3.7 Sonnet) achieve 100% accuracy when counting in images of popular subjects (e.g., knowing that the Adidas logo has 3 stripes and a dog has 4 legs) but are only ~17% accurate when counting in counterfactual images (e.g., counting stripes in a 4-striped Adidas-like logo or counting legs in a 5-legged dog). My work has been used by Google DeepMind and ByteDance and has been accepted at top venues, including ICML, AAAI, and GECCO.