CookAR: Affordance Augmentations in Wearable AR to Support Kitchen Tool Interactions for People with Low Vision

Cooking is a central activity of daily living, supporting independence as well as mental and physical health. However, prior work has highlighted key barriers for people with low vision (LV) to cook, particularly around safely interacting with tools, such as sharp knives or hot pans. Drawing on recent advancements in computer vision (CV), we present CookAR, a head-mounted AR system with real-time object affordance augmentations to support safe and efficient interactions with kitchen tools. To design and implement CookAR, we collected and annotated the first egocentric dataset of kitchen tool affordances, fine-tuned an affordance segmentation model, and developed an AR system with a stereo camera to generate visual augmentations. To validate CookAR, we conducted a technical evaluation of our fine-tuned model as well as a qualitative lab study with 10 LV participants to identify suitable augmentation designs. Our technical evaluation demonstrates that our model outperforms the baseline on our tool affordance dataset, while our user study indicates a preference for affordance augmentations over traditional whole-object augmentations.

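To make the augmentation step above concrete, the sketch below tints affordance regions of a camera frame (e.g., a graspable handle versus a hazardous blade) before the frame is rendered in AR. It is a minimal illustration assuming generic NumPy/OpenCV tooling; the colors, opacity, and helper names are placeholders rather than CookAR's actual implementation, and the masks would come from the fine-tuned affordance segmentation model.

```python
import cv2
import numpy as np

# Placeholder palette (BGR) and opacity; CookAR's actual augmentation designs
# were chosen with low vision participants, so treat these as assumptions.
AFFORDANCE_COLORS = {
    "graspable": (0, 255, 0),   # e.g., a knife handle
    "hazardous": (0, 0, 255),   # e.g., a knife blade or hot pan body
}

def overlay_affordances(frame_bgr, masks, alpha=0.5):
    """Tint each affordance region of the frame with a translucent color.

    `masks` maps an affordance label to a boolean H x W array, as an
    affordance segmentation model might produce for each frame.
    """
    out = frame_bgr.copy()
    for label, mask in masks.items():
        tint = np.zeros_like(out)
        tint[:, :] = AFFORDANCE_COLORS[label]
        blended = cv2.addWeighted(out, 1.0 - alpha, tint, alpha, 0.0)
        out[mask] = blended[mask]
    return out

# Toy example with a blank frame and hand-made masks; in a live pipeline the
# masks would come from the model running on each stereo camera frame.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
blade = np.zeros((480, 640), dtype=bool)
blade[200:240, 100:400] = True
handle = np.zeros((480, 640), dtype=bool)
handle[200:240, 400:500] = True
augmented = overlay_affordances(frame, {"hazardous": blade, "graspable": handle})
```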

VRBubble: Enhancing Peripheral Awareness of Avatars for People with Visual Impairments in Social Virtual Reality (ASSETS 2022)

We designed VRBubble, an audio-based VR technique that provides surrounding avatar information based on social distances. Based on Hall’s proxemic theory, VRBubble divides the social space into three bubbles (Intimate, Conversation, and Social), generating spatial audio feedback to distinguish avatars in different bubbles and provide suitable avatar information (Ji, Cochran, and Zhao 2022).

Tiger F Ji, Brianna Cochran, Yuhang Zhao

ACM DL | Direct Download PDF

Social Virtual Reality (VR) is increasingly used for remote socialization and collaboration. However, current social VR applications are not accessible to people with visual impairments (PVI) due to their focus on visual experiences. We aim to facilitate social VR accessibility by enhancing PVI’s peripheral awareness of surrounding avatar dynamics. We designed VRBubble, an audio-based VR technique that provides surrounding avatar information based on social distances. Based on Hall’s proxemic theory, VRBubble divides the social space into three bubbles (Intimate, Conversation, and Social), generating spatial audio feedback to distinguish avatars in different bubbles and provide suitable avatar information. We provide three audio alternatives: earcons, verbal notifications, and real-world sound effects. PVI can select and combine their preferred feedback alternatives for different avatars, bubbles, and social contexts. We evaluated VRBubble and an audio beacon baseline with 12 PVI in a navigation and a conversation context. We found that VRBubble significantly enhanced participants’ avatar awareness during navigation and enabled avatar identification in both contexts. However, VRBubble proved more distracting in crowded environments.
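
As a rough illustration of the bubble logic described above, the sketch below classifies surrounding avatars by distance and looks up a per-bubble feedback channel. The radii are assumptions loosely based on Hall's proxemic distances, and the feedback mapping is a placeholder, since VRBubble lets users configure feedback per avatar, bubble, and context.

```python
from dataclasses import dataclass
from enum import Enum
import math

class Bubble(Enum):
    INTIMATE = "intimate"
    CONVERSATION = "conversation"
    SOCIAL = "social"
    OUTSIDE = "outside"

# Assumed radii in meters, loosely following Hall's proxemic distances;
# VRBubble's actual thresholds may differ.
BUBBLE_RADII = [
    (Bubble.INTIMATE, 0.45),
    (Bubble.CONVERSATION, 1.2),
    (Bubble.SOCIAL, 3.6),
]

@dataclass
class Avatar:
    name: str
    x: float  # horizontal position relative to the user (meters)
    z: float

def classify(avatar: Avatar) -> Bubble:
    """Return the innermost bubble containing the avatar."""
    dist = math.hypot(avatar.x, avatar.z)
    for bubble, radius in BUBBLE_RADII:
        if dist <= radius:
            return bubble
    return Bubble.OUTSIDE

# Placeholder mapping of bubbles to the paper's three audio alternatives;
# in VRBubble this mapping is user-configurable.
FEEDBACK = {
    Bubble.INTIMATE: "verbal notification",
    Bubble.CONVERSATION: "real-world sound effect",
    Bubble.SOCIAL: "earcon",
    Bubble.OUTSIDE: None,
}

for avatar in [Avatar("Alex", 0.3, 0.2), Avatar("Sam", 2.5, 1.0)]:
    bubble = classify(avatar)
    print(avatar.name, bubble.value, FEEDBACK[bubble])
```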

Publication accepted to ASSETS 2022 and presented in Athens, Greece.

“It’s Just Part of Me:” Understanding Avatar Diversity and Self-presentation of People with Disabilities in Social Virtual Reality (ASSETS 2022)

We explored people with disabilities’ avatar perception and disability disclosure preferences in social VR by (1) conducting a systematic review of fifteen popular social VR applications to evaluate their avatar diversity and accessibility support and (2) interviewing 19 participants with different disabilities to understand their avatar experiences (Zhang et al. 2022).

Kexin Zhang, Elmira Deldari, Zhicong Lu, Yaxing Yao, Yuhang Zhao

ACM DL | Direct Download PDF

In social Virtual Reality (VR), users are embodied in avatars and interact with other users face-to-face, using avatars as the medium. With the advent of social VR, people with disabilities (PWD) have shown an increasing presence on this new social medium. Given their unique disability identities, it is not clear how PWD perceive their avatars and whether and how they prefer to disclose their disabilities when presenting themselves in social VR. We fill this gap by exploring PWD’s avatar perception and disability disclosure preferences in social VR. Our study involved two steps. We first conducted a systematic review of fifteen popular social VR applications to evaluate their avatar diversity and accessibility support. We then conducted an in-depth interview study with 19 participants who had different disabilities to understand their avatar experiences. Our research revealed a number of disability disclosure preferences and strategies adopted by PWD (e.g., reflecting selected disabilities, presenting a capable self). We also identified several challenges PWD faced during the avatar customization process. We discuss design implications to promote avatar accessibility and diversity on future social VR platforms.

Publication accepted to ASSETS 2022 and presented in Athens, Greece.

A Preliminary Interview: Understanding XR Developers’ Needs towards Open-Source Accessibility Support (IEEE VRW 2023)

We investigated XR developers' practices, challenges, and needs when integrating accessibility in their projects (Ji et al. 2023).

Tiger F Ji, Yaxin Hu, Yu Huang, Ruofei Du, Yuhang Zhao

IEEE Xplore | Direct Download PDF

While extended reality (XR) technology is seeing increasing mainstream adoption, it is not accessible to users with disabilities and lacks support for XR developers to create accessibility features. In this study, we investigated XR developers’ practices, challenges, and needs when integrating accessibility into their projects. Our findings revealed developers’ needs for open-source accessibility support, such as code examples of particular accessibility features alongside accessibility guidelines.

Publication accepted to the IEEE VR 2023 Workshops (VRW) and presented in Shanghai, China.

Understanding How Low Vision People Read Using Eye Tracking (CHI 2023)

We collected gaze data from 20 low vision participants and 20 sighted controls who performed reading tasks on a computer screen, thoroughly exploring low vision readers’ challenges based on their gaze behaviors and comparing gaze data quality between low vision and sighted people (Wang et al. 2023).

Ru Wang, Linxiu Zeng, Xinyong Zhang, Sanbrita Mondal, Yuhang Zhao

ACM DL | Direct Download PDF

While screen magnifiers enable low vision people to read, their reading experiences remain slow and unpleasant. Eye tracking has the potential to improve their experience by recognizing fine-grained gaze behaviors and providing more targeted enhancements. To inspire gaze-based low vision technology, we investigate suitable methods to collect low vision users’ gaze data via commercial eye trackers and thoroughly explore their challenges in reading based on their gaze behaviors. With an improved calibration interface, we collected the gaze data of 20 low vision participants and 20 sighted controls who performed reading tasks on a computer screen; low vision participants were also asked to read with different screen magnifiers. We found that, with an accessible calibration interface and data collection method, commercial eye trackers can collect gaze data of comparable quality from low vision and sighted people. Our study identified low vision people’s unique gaze patterns during reading, and building upon these patterns, we propose design implications for gaze-based low vision technology.
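
For readers unfamiliar with gaze analysis, the sketch below shows a simplified dispersion-threshold (I-DT style) fixation detector of the kind commonly used to turn raw eye tracker samples into analyzable gaze behaviors. The thresholds are generic defaults and the study's actual analysis pipeline is not reproduced here.

```python
def detect_fixations(samples, dispersion_px=35.0, min_duration_ms=100.0):
    """Detect fixations with a simplified dispersion-threshold approach.

    `samples` is a list of (timestamp_ms, x_px, y_px) gaze points.
    Returns a list of (start_ms, end_ms, centroid_x, centroid_y).
    """
    fixations = []
    window = []
    for t, x, y in samples:
        window.append((t, x, y))
        xs = [p[1] for p in window]
        ys = [p[2] for p in window]
        dispersion = (max(xs) - min(xs)) + (max(ys) - min(ys))
        if dispersion > dispersion_px:
            # The window stopped being a fixation candidate when this sample
            # arrived; emit the previous window if it lasted long enough,
            # then restart from the new sample.
            done = window[:-1]
            if done[-1][0] - done[0][0] >= min_duration_ms:
                fixations.append((done[0][0], done[-1][0],
                                  sum(p[1] for p in done) / len(done),
                                  sum(p[2] for p in done) / len(done)))
            window = [(t, x, y)]
    if window and window[-1][0] - window[0][0] >= min_duration_ms:
        fixations.append((window[0][0], window[-1][0],
                          sum(p[1] for p in window) / len(window),
                          sum(p[2] for p in window) / len(window)))
    return fixations
```

Downstream metrics such as fixation duration, regressions, and line-switching patterns can then be computed from the returned fixations.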

Publication accepted to CHI 2023 and presented in Hamburg, Germany.

Practices and Barriers of Cooking Training for Blind and Low Vision People (ASSETS 2023)

We interviewed six professionals to explore their training strategies and technology recommendations for blind and low vision clients in cooking activities (Wang et al. 2023).

Ru Wang, Nihan Zhou, Tam Nguyen, Sanbrita Mondal, Bilge Mutlu, Yuhang Zhao

ACM DL | Direct Download PDF

Cooking is a vital yet challenging activity for blind and low vision (BLV) people, involving many visual tasks that can be difficult and dangerous. BLV training services, such as vision rehabilitation, can effectively improve BLV people’s independence and quality of life in daily tasks such as cooking. However, there is a lack of understanding of the practices employed by training professionals and the barriers faced by BLV people in such training. To fill this gap, we interviewed six professionals to explore their training strategies and technology recommendations for BLV clients in cooking activities. Our findings revealed the fundamental principles, practices, and barriers in current BLV training services, identifying the gaps between training and reality.

Publication accepted to ASSETS 2023 and presented in New York City, New York.

A Diary Study in Social Virtual Reality: Impact of Avatars with Disability Signifiers on the Social Experiences of People with Disabilities (ASSETS 2023)

We conducted a diary study with 10 people with disabilities who freely explored VRChat for two weeks, comparing their experiences between using regular avatars and avatars with disability signifiers (i.e., avatar features that indicate the user’s disability in real life) (Zhang et al. 2023).

Kexin Zhang, Elmira Deldari, Yaxing Yao, Yuhang Zhao

ACM DL | Direct Download PDF

People with disabilities (PWD) have shown a growing presence in emerging social virtual reality (VR). To support disability representation, some social VR platforms have started to incorporate disability features into avatar design. However, it is unclear how disability disclosure via avatars (and the ways of presenting it) would affect PWD’s social experiences and interaction dynamics with others. To fill this gap, we conducted a diary study with 10 PWD who freely explored VRChat (a popular commercial social VR platform) for two weeks, comparing their experiences between using regular avatars and avatars with disability signifiers (i.e., avatar features that indicate the user’s disability in real life). We found that PWD preferred using avatars with disability signifiers and wanted to further enhance their aesthetics and interactivity. However, such avatars also caused embodied, explicit harassment targeting PWD. We revealed the unique factors that led to such harassment and derived design implications and protection mechanisms to inspire safer and more inclusive social VR.

Publication accepted to ASSETS 2023 and presented in New York City, New York.

Springboard, Roadblock or “Crutch”?: How Transgender Users Leverage Voice Changers for Gender Presentation in Social Virtual Reality (IEEE VR 2024)

We interviewed 13 transgender and gender-nonconforming users of social VR platforms, focusing on their experiences with and without voice changers to explore the connection between avatar embodiment and voice representation (Povinelli and Zhao 2024).

Kassie C Povinelli, Yuhang Zhao

IEEE Xplore | Direct Download PDF

Social virtual reality (VR) serves as a vital platform for transgender individuals to explore their identities through avatars and foster personal connections within online communities. However, it presents a challenge: the disconnect between avatar embodiment and voice representation, which often leads to misgendering and harassment. Prior research acknowledges this issue but overlooks the potential solution of voice changers. We interviewed 13 transgender and gender-nonconforming users of social VR platforms, focusing on their experiences with and without voice changers. We found that using a voice changer not only reduces voice-related harassment but also allows users to experience gender euphoria, both from hearing their modified voice and from others’ reactions to it, motivating them to pursue voice training and medication to achieve their desired voices. Furthermore, we identified the technical barriers in current voice changer technology and potential improvements to alleviate the problems that transgender and gender-nonconforming users face.

Publication accepted to IEEE VR 2024 and presented in Orlando, Florida.

Exploring the Design Space of Optical See-through AR Head-Mounted Displays to Support First Responders in the Field (CHI 2024)

We interviewed 26 first responders in the field who experienced a state-of-the-art optical see-through AR HMD, soliciting their first-hand experiences, design ideas, and concerns about its interaction techniques and four types of AR cues (Zhang et al. 2024).

Kexin Zhang, Brianna R Cochran, Ruijia Chen, Lance Hartung, Bryce Sprecher, Ross Tredinnick, Kevin Ponto, Suman Banerjee, Yuhang Zhao

ACM DL | Direct Download PDF

First responders (FRs) navigate hazardous, unfamiliar environments in the field (e.g., mass-casualty incidents), making life-changing decisions in a split second. AR head-mounted displays (HMDs) have shown promise in supporting them due to their capability to recognize and augment challenging environments in a hands-free manner. However, this design space has not been thoroughly explored with the various FRs who serve different roles (e.g., firefighters, law enforcement) yet collaborate closely in the field. We interviewed 26 first responders in the field who experienced a state-of-the-art optical see-through AR HMD, as well as its interaction techniques and four types of AR cues (i.e., overview cues, directional cues, highlighting cues, and labeling cues), soliciting their first-hand experiences, design ideas, and concerns. Our study revealed both generic and role-specific preferences and needs for AR hardware, interactions, and feedback, as well as desired AR designs tailored to urgent, risky scenarios (e.g., affordance augmentation to facilitate fast and safe action). While participants acknowledged the value of AR HMDs, they also raised concerns around trust, privacy, and proper integration with other equipment. Finally, we derived comprehensive and actionable design guidelines to inform future AR systems for in-field FRs.
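
As one concrete example of the directional cues mentioned above, the sketch below computes the signed horizontal angle from the wearer's heading to a point of interest, which an HMD could map to an arrow rotation or an edge indicator. The coordinate and yaw conventions are assumptions, and the code is not tied to the system the participants used.

```python
import math

def direction_cue(head_pos, head_yaw_deg, target_pos):
    """Signed angle (degrees) from the wearer's heading to a target.

    Positions are (x, z) on the ground plane; yaw is measured from the +z
    axis, increasing clockwise (a common game-engine convention, assumed
    here). Negative result means the target is to the wearer's left,
    positive means it is to the right.
    """
    dx = target_pos[0] - head_pos[0]
    dz = target_pos[1] - head_pos[1]
    bearing = math.degrees(math.atan2(dx, dz))                 # angle of target from +z
    return (bearing - head_yaw_deg + 180.0) % 360.0 - 180.0    # wrap to [-180, 180)

# A cue renderer might rotate an arrow by this angle, or show an off-screen
# indicator when the target lies outside the field of view.
print(direction_cue((0.0, 0.0), 90.0, (10.0, 0.0)))  # 0.0: target dead ahead
```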

Publication accepted to CHI 2024 and presented in Honolulu, Hawaii.

GazePrompt: Enhancing Low Vision People’s Reading Experience with Gaze-Aware Augmentations (CHI 2024)

GazePrompt is a gaze-aware reading aid that provides timely and targeted visual and audio augmentations for people with low vision based on users’ gaze behaviors (Wang et al. 2024).

Ru Wang, Zach Potter, Yun Ho, Daniel Killough, Linda Zeng, Sanbrita Mondal, Yuhang Zhao

ACM DL | Direct Download PDF

Reading is a challenging task for low vision people. While conventional low vision aids (e.g., magnification) offer some support, they cannot fully address the difficulties faced by low vision users, such as locating the next line and distinguishing similar words. To fill this gap, we present GazePrompt, a gaze-aware reading aid that provides timely and targeted visual and audio augmentations based on users’ gaze behaviors. GazePrompt includes two key features: (1) a Line-Switching support that highlights the line a reader intends to read; and (2) a Difficult-Word support that magnifies or reads aloud a word the reader hesitates on. Through a study with 13 low vision participants who performed well-controlled reading-aloud tasks with and without GazePrompt, we found that GazePrompt significantly reduced participants’ line-switching time, reduced word recognition errors, and improved their subjective reading experiences. A follow-up silent-reading study showed that GazePrompt can enhance users’ concentration and perceived comprehension of the reading content. We further derive design considerations for future gaze-based low vision aids.
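
To illustrate the kind of trigger logic behind a Difficult-Word support, here is a minimal sketch that accumulates fixation time per word and flags words the reader dwells on past a threshold. The threshold value, hit-testing, and function names are assumptions for illustration, not GazePrompt's actual criteria.

```python
from collections import defaultdict

HESITATION_MS = 800.0  # assumed threshold; GazePrompt's trigger criteria differ in detail

def hit_test(word_boxes, x, y):
    """Return the word whose on-screen box (x0, y0, x1, y1) contains the gaze point."""
    for word, (x0, y0, x1, y1) in word_boxes.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return word
    return None

def difficult_words(fixations, word_boxes, threshold_ms=HESITATION_MS):
    """Accumulate fixation time per word and flag words read hesitantly.

    `fixations` is a list of (start_ms, end_ms, x, y) tuples, e.g., from a
    fixation detector; `word_boxes` maps each rendered word to its box.
    """
    dwell = defaultdict(float)
    flagged = []
    for start, end, x, y in fixations:
        word = hit_test(word_boxes, x, y)
        if word is None:
            continue
        dwell[word] += end - start
        if dwell[word] >= threshold_ms and word not in flagged:
            flagged.append(word)  # candidate for magnification or read-aloud
    return flagged
```

In a full system, the flagged words would then drive the magnification or read-aloud augmentation described above.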

Publication accepted to CHI 2024 and presented in Honolulu, Hawaii.