GazePrompt: Enhancing Low Vision People’s Reading Experience with Gaze-Aware Augmentations (CHI 2024)

GazePrompt is a gaze-aware reading aid that provides timely and targeted visual and audio augmentations for people with low vision based on their gaze behaviors (Wang et al. 2024).

Ru Wang, Zach Potter, Yun Ho, Daniel Killough, Linda Zeng, Sanbrita Mondal, and Yuhang Zhao

ACM DL | Direct Download PDF


Reading is a challenging task for low vision people. While conventional low vision aids (e.g., magnification) offer certain support, they cannot fully address the difficulties faced by low vision users, such as locating the next line and distinguishing similar words. To fill this gap, we present GazePrompt, a gaze-aware reading aid that provides timely and targeted visual and audio augmentations based on users’ gaze behaviors. GazePrompt includes two key features: (1) a Line-Switching support that highlights the line a reader intends to read; and (2) a Difficult-Word support that magnifies or reads aloud a word that the reader hesitates with. Through a study with 13 low vision participants who performed well-controlled reading-aloud tasks with and without GazePrompt, we found that GazePrompt significantly reduced participants’ line switching time, reduced word recognition errors, and improved their subjective reading experiences. A follow-up silent-reading study showed that GazePrompt can enhance users’ concentration and perceived comprehension of the reading contents. We further derive design considerations for future gaze-based low vision aids.
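The core idea is that the system watches where the reader's gaze dwells and reacts only when the behavior suggests difficulty. The sketch below illustrates one plausible dwell-based trigger for the Difficult-Word support; it is a minimal illustration under assumed names and thresholds (e.g., WordBox, dwell_threshold_ms), not GazePrompt's actual implementation.

```python
# Minimal sketch of a dwell-based Difficult-Word trigger; not GazePrompt's code.
# `WordBox`, `dwell_threshold_ms`, and the 800 ms value are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class WordBox:
    text: str
    x0: float
    y0: float
    x1: float
    y1: float  # screen-space bounding box of one word


def word_at(x: float, y: float, words: list[WordBox]) -> WordBox | None:
    """Return the word whose bounding box contains the gaze point, if any."""
    for w in words:
        if w.x0 <= x <= w.x1 and w.y0 <= y <= w.y1:
            return w
    return None


def difficult_words(gaze_samples, words, dwell_threshold_ms: float = 800):
    """Yield words to magnify or read aloud when gaze dwells on them too long.

    `gaze_samples` is an iterable of (timestamp_ms, x, y) tuples.
    """
    current, dwell_start = None, None
    for t, x, y in gaze_samples:
        w = word_at(x, y, words)
        if w is not current:                 # gaze moved to a different word
            current, dwell_start = w, t
        elif w is not None and t - dwell_start >= dwell_threshold_ms:
            yield w                          # hesitation detected: augment this word
            dwell_start = t                  # avoid re-triggering on every sample
```

In a full system, the same gaze stream would also drive the Line-Switching support, for example by highlighting the line whose bounding box the gaze enters after a return sweep.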

Publication accepted to CHI 2024 and presented in Honolulu, Hawaii.

Inclusive Avatar Guidelines for People with Disabilities: Supporting Disability Representation in Social Virtual Reality (CHI 2025)

Our work aims to advance avatar design practices by delivering a set of centralized, comprehensive, and validated design guidelines that are easy to adopt, disseminate, and update. Through a systematic literature review and interviews with 60 participants with various disabilities, we derived 20 initial design guidelines that cover diverse disability expression methods across five aspects: avatar appearance, body dynamics, assistive technology design, peripherals around avatars, and customization control. We further evaluated the guidelines via a heuristic evaluation study with 10 VR practitioners, validating their coverage, applicability, and actionability. Our evaluation resulted in a final set of 17 design guidelines with recommendation levels.

Kexin Zhang, Edward Glenn Scott Spencer, Andric Li, Ang Li, Yaxing Yao, and Yuhang Zhao

ACM DL | Direct Download PDF | Open Source

Publication accepted to CHI 2025 and presented in Yokohama, Japan.

Practices and Barriers of Cooking Training for Blind and Low Vision People (ASSETS 2023)

We interviewed six professionals to explore their training strategies and technology recommendations for blind and low vision clients in cooking activities (Wang et al. 2023).

Ru Wang, Nihan Zhou, Tam Nguyen, Sanbrita Mondal, Bilge Mutlu, Yuhang Zhao

ACM DL | Direct Download PDF


Cooking is a vital yet challenging activity for blind and low vision (BLV) people, as it involves many visual tasks that can be difficult and dangerous. BLV training services, such as vision rehabilitation, can effectively improve BLV people’s independence and quality of life in daily tasks such as cooking. However, there is a lack of understanding of the practices employed by training professionals and the barriers faced by BLV people in such training. To fill this gap, we interviewed six professionals to explore their training strategies and technology recommendations for BLV clients in cooking activities. Our findings revealed the fundamental principles, practices, and barriers in current BLV training services, identifying the gaps between training and reality.

Publication accepted to ASSETS 2023 and presented in New York City, New York.

Springboard, Roadblock or “Crutch”?: How Transgender Users Leverage Voice Changers for Gender Presentation in Social Virtual Reality (IEEE VR 2024)

We interviewed 13 transgender and gender-nonconforming users of social VR platforms, focusing on their experiences with and without voice changers to explore the connection between avatar embodiment and voice representation (Povinelli and Zhao 2024).

Kassie C Povinelli, Yuhang Zhao

IEEE Xplore | Direct Download PDF

Social virtual reality (VR) serves as a vital platform for transgender individuals to explore their identities through avatars and foster personal connections within online communities. However, it presents a challenge: the disconnect between avatar embodiment and voice representation, which often leads to misgendering and harassment. Prior research acknowledges this issue but overlooks the potential solution of voice changers. We interviewed 13 transgender and gender-nonconforming users of social VR platforms, focusing on their experiences with and without voice changers. We found that using a voice changer not only reduces voice-related harassment but also allows users to experience gender euphoria, both from hearing their modified voice and from others’ reactions to it, motivating them to pursue voice training and medication to achieve their desired voices. Furthermore, we identified the technical barriers of current voice changer technology and potential improvements to alleviate the problems that transgender and gender-nonconforming users face.

Publication accepted to IEEE VR 2024 and presented in Orlando, Florida.

Understanding How Low Vision People Read Using Eye Tracking (CHI 2023)

We collected the gaze data of 20 low vision participants and 20 sighted controls who performed reading tasks on a computer screen to thoroughly explore their challenges in reading based on their gaze behaviors and compare gaze data quality between low vision and sighted people (Wang et al. 2023).

Ru Wang, Linxiu Zeng, Xinyong Zhang, Sanbrita Mondal, Yuhang Zhao

ACM DL | Direct Download PDF


Although low vision people can read with screen magnifiers, their reading experience is often slow and unpleasant. Eye tracking has the potential to improve this experience by recognizing fine-grained gaze behaviors and providing more targeted enhancements. To inspire gaze-based low vision technology, we investigated suitable methods for collecting low vision users’ gaze data with commercial eye trackers and thoroughly explored their reading challenges based on their gaze behaviors. With an improved calibration interface, we collected gaze data from 20 low vision participants and 20 sighted controls who performed reading tasks on a computer screen; low vision participants were also asked to read with different screen magnifiers. We found that, with an accessible calibration interface and data collection method, commercial eye trackers can collect gaze data of comparable quality from low vision and sighted people. Our study identified low vision people’s unique gaze patterns during reading, based on which we propose design implications for gaze-based low vision technology.
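For readers unfamiliar with how gaze data quality is typically quantified, the sketch below computes two standard metrics, accuracy (mean angular offset from a known validation target) and precision (RMS sample-to-sample dispersion), in degrees of visual angle. This is a generic illustration of the concepts, not the study's analysis code, and the parameter names are assumptions.

```python
# Generic sketch of two standard gaze data quality metrics, not the study's
# analysis code: accuracy (mean offset from a known validation target) and
# precision (RMS sample-to-sample dispersion), both in degrees of visual angle.
import math


def offset_deg(dx_px: float, dy_px: float, px_per_cm: float, distance_cm: float) -> float:
    """Convert an on-screen offset in pixels to visual angle in degrees."""
    offset_cm = math.hypot(dx_px, dy_px) / px_per_cm
    return math.degrees(math.atan2(offset_cm, distance_cm))


def accuracy_deg(samples, target, px_per_cm, distance_cm):
    """Mean angular offset of gaze samples (x, y) from a known target point."""
    offsets = [offset_deg(x - target[0], y - target[1], px_per_cm, distance_cm)
               for x, y in samples]
    return sum(offsets) / len(offsets)


def precision_rms_deg(samples, px_per_cm, distance_cm):
    """Root-mean-square angular distance between consecutive gaze samples."""
    steps = [offset_deg(x2 - x1, y2 - y1, px_per_cm, distance_cm) ** 2
             for (x1, y1), (x2, y2) in zip(samples, samples[1:])]
    return math.sqrt(sum(steps) / len(steps))
```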

Publication accepted to CHI 2023 and presented in Hamburg, Germany.

VisiMark: Characterizing and Augmenting Landmarks for People with Low Vision in Augmented Reality to Support Indoor Navigation (CHI 2025)

We designed VisiMark, an AR interface that supports landmark perception for people with low vision (PLV) by providing both overviews of space structures and in-situ landmark augmentations.

Ruijia Chen, Junru Jiang, Pragati Maheshwary, Brianna R. Cochran, and Yuhang Zhao

ACM DL | Direct Download PDF

Landmarks are critical in navigation, supporting self-orientation and mental model development. Similar to sighted people, people with low vision (PLV) frequently look for landmarks via visual cues but face difficulties identifying some important landmarks due to vision loss. We first conducted a formative study with six PLV to characterize their challenges and strategies in landmark selection, identifying their unique landmark categories (e.g., area silhouettes, accessibility-related objects) and preferred landmark augmentations. We then designed VisiMark, an AR interface that supports landmark perception for PLV by providing both overviews of space structures and in-situ landmark augmentations. We evaluated VisiMark with 16 PLV and found that VisiMark enabled PLV to perceive landmarks they preferred but could not easily perceive before, and changed PLV’s landmark selection from only visually-salient objects to cognitive landmarks that are more important and meaningful. We further derive design considerations for AR-based landmark augmentation systems for PLV.

VRBubble: Enhancing Peripheral Awareness of Avatars for People with Visual Impairments in Social Virtual Reality (ASSETS 2022)

We designed VRBubble, an audio-based VR technique that provides surrounding avatar information based on social distances. Based on Hall’s proxemic theory, VRBubble divides the social space with three Bubbles—Intimate, Conversation, and Social Bubble—generating spatial audio feedback to distinguish avatars in different bubbles and provide suitable avatar information (Ji, Cochran, and Zhao 2022).

Tiger F Ji, Brianna Cochran, Yuhang Zhao

ACM DL | Direct Download PDF


Social Virtual Reality (VR) is increasingly used for remote socialization and collaboration. However, current social VR applications are not accessible to people with visual impairments (PVI) due to their focus on visual experiences. We aim to facilitate social VR accessibility by enhancing PVI’s peripheral awareness of surrounding avatar dynamics. We designed VRBubble, an audio-based VR technique that provides surrounding avatar information based on social distances. Based on Hall’s proxemic theory, VRBubble divides the social space into three Bubbles—Intimate, Conversation, and Social Bubble—generating spatial audio feedback to distinguish avatars in different bubbles and provide suitable avatar information. We provide three audio alternatives: earcons, verbal notifications, and real-world sound effects. PVI can select and combine their preferred feedback alternatives for different avatars, bubbles, and social contexts. We evaluated VRBubble and an audio beacon baseline with 12 PVI in a navigation and a conversation context. We found that VRBubble significantly enhanced participants’ avatar awareness during navigation and enabled avatar identification in both contexts. However, VRBubble was shown to be more distracting in crowded environments.
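The proxemic mechanism is straightforward to express in code. The sketch below classifies surrounding avatars into bubbles by distance and maps each bubble to one of the three audio alternatives; the radii and the feedback mapping are placeholder assumptions, not VRBubble's actual parameters.

```python
# Illustrative sketch of the bubble mechanism, not VRBubble's code: classify
# avatars into Intimate/Conversation/Social bubbles by distance and pick an
# audio alternative per bubble. Radii and the mapping are placeholder assumptions.
from enum import Enum


class Bubble(Enum):
    INTIMATE = "intimate"
    CONVERSATION = "conversation"
    SOCIAL = "social"
    OUTSIDE = "outside"


# Placeholder radii in meters; a real system would tune these per user and context.
BUBBLE_RADII_M = [(1.0, Bubble.INTIMATE), (3.0, Bubble.CONVERSATION), (6.0, Bubble.SOCIAL)]

# One of the three audio alternatives per bubble; users could remap these.
FEEDBACK = {
    Bubble.INTIMATE: "real-world sound effect",
    Bubble.CONVERSATION: "verbal notification",
    Bubble.SOCIAL: "earcon",
}


def classify(distance_m: float) -> Bubble:
    """Return the bubble an avatar falls into, given its distance from the user."""
    for radius_m, bubble in BUBBLE_RADII_M:
        if distance_m <= radius_m:
            return bubble
    return Bubble.OUTSIDE


def feedback_for(avatar_distances_m: dict[str, float]) -> dict[str, str | None]:
    """Map each avatar name to a feedback type given its distance in meters."""
    return {name: FEEDBACK.get(classify(d)) for name, d in avatar_distances_m.items()}
```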

Publication accepted to ASSETS 2022 and presented in Athens, Greece.

VRSight: An AI-driven Scene Description System to Improve Virtual Reality Accessibility for Blind People (UIST 2025)

We present VRSight, an end-to-end system that recognizes VR scenes post hoc through a set of AI models (e.g., object detection, depth estimation, LLM-based atmosphere interpretation) and generates tone-based, spatial audio feedback, empowering blind users to interact in VR without developer intervention.

Daniel Killough, Justin Feng, Zheng Xue Ching, Daniel Wang, Rithvik Dyava, Yapeng Tian, Yuhang Zhao

ACM DL | Direct Download PDF

Virtual Reality (VR) is inaccessible to blind people. While research has investigated many techniques to enhance VR accessibility, these techniques require additional developer effort to integrate. As such, most mainstream VR apps remain inaccessible as the industry de-prioritizes accessibility. We present VRSight, an end-to-end system that recognizes VR scenes post hoc through a set of AI models (e.g., object detection, depth estimation, LLM-based atmosphere interpretation) and generates tone-based, spatial audio feedback, empowering blind users to interact in VR without developer intervention. To enable virtual element detection, we further contribute DISCOVR, a VR dataset consisting of 30 virtual object classes from 17 social VR apps, substituting for real-world datasets that are not applicable to VR contexts. Nine participants used VRSight to explore an off-the-shelf VR app (Rec Room), demonstrating its effectiveness in facilitating social tasks like avatar awareness and available seat identification.
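At a high level, the pipeline captures a VR frame, detects virtual objects, estimates their depth, and converts each detection into a spatialized audio cue. The sketch below outlines such a loop; the callables (detect_objects, estimate_depth, play_spatial_tone) are placeholders for illustration, not VRSight's actual APIs.

```python
# High-level sketch of a post-hoc scene-description loop; the callables
# (detect_objects, estimate_depth, play_spatial_tone) are placeholders, not
# VRSight's actual APIs.
from dataclasses import dataclass


@dataclass
class Detection:
    label: str   # e.g., "avatar", "seat"
    cx: float    # horizontal center, normalized to [0, 1] of the captured frame
    cy: float    # vertical center, normalized to [0, 1]


def describe_frame(frame, detect_objects, estimate_depth, play_spatial_tone):
    """Detect virtual objects in a captured VR frame and sonify them spatially."""
    detections: list[Detection] = detect_objects(frame)  # e.g., a detector trained on DISCOVR-style data
    depth_map = estimate_depth(frame)                     # 2D grid of relative depths
    rows, cols = len(depth_map), len(depth_map[0])
    for det in detections:
        azimuth_deg = (det.cx - 0.5) * 180.0              # map image position to roughly -90..+90 degrees
        distance = depth_map[int(det.cy * (rows - 1))][int(det.cx * (cols - 1))]
        play_spatial_tone(label=det.label, azimuth_deg=azimuth_deg, distance=distance)
```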

Publication accepted to UIST 2025 and presented in Busan, Korea.