
AiSee: From Accessibility Tool to Intelligent Companion

AiSee is an AI-powered assistive system designed to help visually impaired individuals interpret their surroundings and access information independently.

About Use Case

AiSee is an AI-driven accessibility solution designed to help people with visual impairments navigate the world more independently. Everyday activities such as identifying public transport, reading labels, locating facilities, or interpreting surroundings can present major barriers for individuals without sight. Traditional assistive technologies often require complex interactions or specialized hardware, which can limit usability and adoption. AiSee was developed to address these challenges by providing a more intuitive, conversational interface that integrates environmental awareness with intelligent assistance.

The concept behind AiSee emerged from discussions with visually impaired students who faced difficulties accessing lecture materials and navigating daily environments. Early insights revealed that many users preferred wearable solutions that did not require additional visual interfaces. Rather than taking the form of smart glasses, the system was therefore designed as a smart headphone-based platform, allowing users to interact naturally through voice commands and audio responses. This approach makes the technology comfortable to use and compatible with devices that visually impaired individuals already use in their daily lives.

AiSee combines several advanced artificial intelligence capabilities to interpret the user’s surroundings and deliver contextual assistance. One of its core functions is visual intelligence, which allows the system to analyze images and live video captured by the user’s device. Using computer vision models, the platform can identify objects, recognize text, and describe scenes. Users can ask questions about what is in front of them, and the system responds conversationally with relevant information.
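To make this concrete, here is a minimal sketch of what such a visual question-answering loop could look like. It assumes an off-the-shelf open-source model (dandelin/vilt-b32-finetuned-vqa via Hugging Face transformers), OpenCV for camera capture, and pyttsx3 for spoken output; AiSee's actual models and interfaces are not publicly documented, so every component here is a stand-in.

```python
# Illustrative sketch only: AiSee's real pipeline is proprietary.
# Assumes opencv-python, transformers, pillow, and pyttsx3 are installed.
import cv2
import pyttsx3
from PIL import Image
from transformers import pipeline

# Off-the-shelf visual question-answering model, standing in for
# whatever vision model the real system uses.
vqa = pipeline("visual-question-answering",
               model="dandelin/vilt-b32-finetuned-vqa")
tts = pyttsx3.init()

def describe_scene(question: str) -> str:
    """Grab one frame from the device camera and answer a question about it."""
    camera = cv2.VideoCapture(0)
    ok, frame = camera.read()
    camera.release()
    if not ok:
        return "I could not capture an image."
    # OpenCV returns BGR channel order; convert to RGB for the model.
    image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    answers = vqa(image=image, question=question)
    return answers[0]["answer"]

answer = describe_scene("What objects are in front of me?")
tts.say(answer)      # respond through the headphone, audio only
tts.runAndWait()
```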

Another key feature is smart assistance, which extends beyond immediate visual interpretation. The system can perform web searches, remember user preferences, and provide contextual recommendations. For example, if a user is waiting at a bus stop, the system can identify incoming buses and inform the user about routes and arrival times. This contextual awareness helps visually impaired individuals navigate public spaces more confidently.
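A hypothetical sketch of this preference-aware assistance layer is shown below. The class, method names, and the bus-route scenario are invented for illustration, since AiSee's internal design is not public; in the real system the detected route and arrival time would come from the vision model and a transit data source.

```python
# Hypothetical "smart assistance" layer: remembered preferences are
# combined with live context to produce a spoken recommendation.
from dataclasses import dataclass, field

@dataclass
class SmartAssistant:
    # Remembered user preferences, e.g. a frequently used bus route.
    preferences: dict = field(default_factory=dict)

    def remember(self, key: str, value: str) -> None:
        """Persist a user preference for later contextual use."""
        self.preferences[key] = value

    def on_bus_detected(self, route: str, eta_minutes: int) -> str:
        """Turn a detection ('bus 95 approaching') into contextual advice."""
        if route == self.preferences.get("usual_route"):
            return f"Your usual bus, route {route}, arrives in {eta_minutes} minutes."
        return f"Bus {route} is approaching; it is not your usual route."

assistant = SmartAssistant()
assistant.remember("usual_route", "95")
# Route and ETA are hard-coded here purely for illustration.
print(assistant.on_bus_detected("95", eta_minutes=3))
```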

AiSee also includes automation capabilities, allowing users to perform actions on third-party applications through voice commands. For instance, users can request the system to book a ride, check schedules, or access digital services without manually interacting with a smartphone. This reduces friction when performing routine tasks.
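The sketch below illustrates one simple way such a voice-command automation layer could be structured, using keyword-based intent routing. The command phrases and action handlers are assumptions; a production system would use a proper speech-recognition and natural-language-understanding stack and real third-party APIs.

```python
# Minimal intent-routing sketch: mapping transcribed voice commands to
# actions on third-party services. All names here are invented.
from typing import Callable

def book_ride(destination: str) -> str:
    # A real implementation would call a ride-hailing API here.
    return f"Booking a ride to {destination}."

def check_schedule(stop: str) -> str:
    # A real implementation would query a transit API here.
    return f"Fetching bus timings for {stop}."

# Keyword -> handler table; simple substring matching stands in for
# a trained intent classifier.
INTENTS: dict[str, Callable[[str], str]] = {
    "book a ride": book_ride,
    "bus timings": check_schedule,
}

def dispatch(utterance: str, argument: str) -> str:
    """Route a spoken command to the matching action handler."""
    for phrase, handler in INTENTS.items():
        if phrase in utterance.lower():
            return handler(argument)
    return "Sorry, I did not understand that command."

print(dispatch("Please book a ride for me", "Singapore Botanic Gardens"))
```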

The platform further provides navigation assistance through personalized routing and obstacle alerts. Within partner locations such as public parks or tourist sites, AiSee can guide users with step-by-step directions while also offering contextual information about the environment. In pilot deployments, the system has been used in locations such as the Singapore Botanic Gardens, where it provides navigation and informational guidance to visually impaired visitors.
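As an illustration of the routing component, the following sketch walks a user through an invented waypoint graph using breadth-first search, with each hop corresponding to a spoken instruction. A real deployment such as the Botanic Gardens pilot would rely on surveyed maps, positioning, and live obstacle detection rather than this toy graph.

```python
# Sketch of park navigation over a waypoint graph using plain
# breadth-first search. The waypoints are invented for illustration.
from collections import deque

# Undirected waypoint graph: node -> neighbouring nodes.
PARK_MAP = {
    "entrance": ["fountain"],
    "fountain": ["entrance", "orchid_garden", "restroom"],
    "orchid_garden": ["fountain", "lake"],
    "restroom": ["fountain"],
    "lake": ["orchid_garden"],
}

def shortest_route(start: str, goal: str) -> list[str]:
    """Breadth-first search returning the fewest-hop waypoint path."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in PARK_MAP[path[-1]]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return []

# Each hop would become a spoken turn-by-turn instruction.
for waypoint in shortest_route("entrance", "lake"):
    print(f"Proceed to: {waypoint}")
```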

AiSee has been developed using a strong user-centered design approach. The system was refined through extensive conversations and testing with visually impaired users. Real-world testing included headset and mobile application trials where participants used the technology in everyday environments. Feedback from these users helped improve navigation accuracy, response clarity, and interaction design.

Beyond supporting visually impaired individuals, AiSee is also expanding into eldercare applications. As populations age, there is increasing demand for technologies that support independence, memory assistance, and daily task management. AiSee’s conversational AI and contextual intelligence make it well suited for these broader accessibility needs.

Overall, AiSee represents a shift from simple assistive tools to intelligent companions that integrate perception, reasoning, and interaction. By combining computer vision, conversational AI, and wearable interfaces, the platform demonstrates how AI can significantly improve accessibility, independence, and quality of life for visually impaired individuals. 

For additional context and detailed documentation of this use case, please refer to pages 27-29 in the attached Casebook.

Source Organization

IndiaAI

Tags

  • Accessibility

Sector

Social


Related Datasets

VAANI: Multi-modal, Multi-lingual Dataset
VAANI is a multi-modal, multi-lingual dataset designed to represent the rich linguistic diversity of India. It currently includes data from two phases—Phase 1 (80 districts) and Phase 2 (40 districts)—spanning a total of ~21,500 hours of spontaneous, image-prompted speech collected from more than 110K speakers across 120 districts, describing 210K images in 86 languages. From this, 835 hours of transcribed audio data is available, distributed nearly evenly across all 120 districts.
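As a quick check on the quoted figures, a "nearly even" distribution of the transcribed portion works out to roughly seven hours per district:

```python
# Simple arithmetic on the figures quoted above (the source gives
# approximate totals; exact values are assumed for illustration).
transcribed_hours = 835
districts = 120
print(f"{transcribed_hours / districts:.1f} hours per district")  # ~7.0
```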
Tags:

  • Languages: Agariya, Angami, Angika, Ao, Assamese, Awadhi, Bagheli, Bagri, Bajjika, Beary bashe, Bengali, Bhatri, Bhili, Bhojpuri, Bihari, Bundeli, Chakhesang, Chakma, Chhattisgarhi, Dorli, Duruwa, English, Galo, Garhwali, Garo, Gondi, Gujarati, Hajong, Halbi, Harauti, Hindi, Jaipuri, Kannada, Khandeshi, Khariboli, Khortha/Khotta, Kokborok, Konkani, Kumaoni, Kurmali, Kurukh, Lambani, Lotha, Magadhi/Magahi, Maithili, Malayalam, Malvani, Malvi, Marathi, Marwadi/Marwari, Meitei, Mewari, Mewati, Nagamese, Nepali, Nimadi, Nissi/Dafla, Nyishi, Odia/Oriya, Punjabi, Rajasthani, Rajbanshi, Rengma, Sadri, Sangtam, Santali, Shekhawati, Sindhi, Sumi, Surgujia, Surjapuri, Tagin, Tamil, Telugu, Tenyidie, Thethi, Tulu, Urdu, Wagdi, Wancho

  • Topics: 86 Indian languages, 120+ districts, 22 Indian states, ASR training, audio-visual dataset, benchmarking dataset, data for conversational AI, dialect diversity, diverse demographics, geo-centric data collection, image-prompted data, language identification, large-scale speech corpus, linguistic diversity, LLM speech integration, low-resource languages, manual transcription, multi-modal language resources, multilingual corpus, multimodal dataset, quality evaluated, real-life recording environments, regional dialects, speaker diversity, speaker identification, speech + image + text, speech enhancement, speech transcription, spontaneous speech, telemedicine AI applications, TTS training

Source: Indian Institute of Science (IISc), Bangalore