Project Overview
Project Summary:
Sori Album is a gallery application designed to enhance visual accessibility for individuals
with visual impairments. Through Sori Album, users can upload or capture photos and
receive concise image descriptions generated by Google’s Gemini AI. These descriptions
are automatically saved with the corresponding photos, enabling screen readers to read
each caption as users browse their gallery.
For deeper insights, users can tap the "Detailed Image Descriptions" button, which
activates GPT-4o to provide a richer interpretation—covering not only the visual content but
also the mood, expressions, and contextual elements of the image. Additionally, by
selecting the "Scan Text" feature, users can extract and access printed text within images.
Once a photo is saved to the album, users can easily search for it, edit its description
themselves, and share the image with others or across different platforms.
Sori Album empowers visually impaired users to explore, organize, and take full ownership
of their visual content in a more meaningful and independent way.
Identifying the Challenge
The Social Problem:
Visually impaired individuals access information in digital environments using screen
readers, which convert text into speech. However, most images still lack appropriate
alternative text, leaving visually impaired users unable to perceive their content.
Although various legal regulations and accessibility guidelines have been introduced to
address this issue, they remain largely ineffective in practice.
This gap in accessibility was clearly revealed through interviews conducted by the Sigongan
team with over 150 visually impaired individuals. These interviews showed that most
image-based information in digital settings—such as photos uploaded to social media or
shared via messaging platforms like KakaoTalk—is inaccessible to visually impaired users.
Beyond the inability to perceive image content, they face additional challenges in
situations requiring image-based authentication, storing and retrieving important image
information, or sharing personal photos online. In such cases, visually impaired individuals
experience barriers due to the limitations of the digital environment.
Innovation and Uniqueness
Why Our Project Stands Out:
Sori Album is the first AI-powered gallery app designed exclusively for the visually impaired, enabling
users to independently access and manage their photos. Unlike existing services that provide brief,
one-time descriptions that disappear after viewing, Sori Album stores images alongside detailed
descriptions, allowing users to revisit and organize their visual memories effortlessly.
Leveraging advanced technologies—including Google's Gemini AI for initial captioning, GPT-4o for
in-depth contextual explanations, and NAVER's HyperCLOVA OCR for precise text extraction—Sori
Album offers comprehensive and meaningful access to visual information. Developed through
extensive interviews with over 200 blind individuals and adherence to accessibility guidelines, the
app's user-centered design addresses real-world needs and behaviors.
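The three-model stack above (Gemini for the default caption, GPT-4o for the detailed description, HyperCLOVA OCR for text extraction) amounts to routing each user request to the model suited for it. A minimal sketch of that routing, with the real model calls replaced by stubs and all names (DescriptionMode, describe) hypothetical:

```python
from enum import Enum
from typing import Callable, Dict

class DescriptionMode(Enum):
    CAPTION = "caption"  # Gemini: concise default caption
    DETAIL = "detail"    # GPT-4o: mood, expressions, context
    OCR = "ocr"          # HyperCLOVA OCR: printed text in the image

# Stubs standing in for the real model API calls.
def _gemini_caption(image: bytes) -> str:
    return "Concise caption (stub)"

def _gpt4o_detail(image: bytes) -> str:
    return "Detailed contextual description (stub)"

def _clova_ocr(image: bytes) -> str:
    return "Extracted printed text (stub)"

_BACKENDS: Dict[DescriptionMode, Callable[[bytes], str]] = {
    DescriptionMode.CAPTION: _gemini_caption,
    DescriptionMode.DETAIL: _gpt4o_detail,
    DescriptionMode.OCR: _clova_ocr,
}

def describe(image: bytes, mode: DescriptionMode = DescriptionMode.CAPTION) -> str:
    """Route an image to the model matching the user's request."""
    return _BACKENDS[mode](image)
```

In the app, the default caption runs automatically on save, while the detailed description and OCR correspond to the "Detailed Image Descriptions" and "Scan Text" buttons.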
By transforming passive image consumption into an active, engaging experience, Sori Album bridges
a critical gap in digital inclusivity, empowering visually impaired users to fully own and interact with
their visual content.
Insights and Development
Learning Journey:
Throughout the development of Sori Album, our team gained a deep understanding of how digital
exclusion impacts the visually impaired—especially regarding photo accessibility. Interviews with
over 200 blind users revealed a key insight: they don’t just want to “know” what’s in a photo—they
want to organize, revisit, and share it like sighted users. This shifted our focus from generating simple
descriptions to building a fully navigable gallery.
To achieve this, we optimized every interface for screen reader compatibility, studying how blind
individuals use smartphones in real-life contexts. We also applied image description guidelines to
craft captions that are context-rich and truly helpful. These insights led us to build not just an
accessible app, but a truly usable, user-centered one.
Development Process:
Through more than 150 interviews with visually impaired individuals, as well as feedback
from visually impaired developers, we created a straightforward, intuitive UI/UX design.
The screen layout was designed in Figma, and the app is built with Flutter. The backend
runs on Firebase, and the AI functionality is implemented in Python. After each
deployment, team members systematically walk through the app with screen readers to
review the user flow, checking that focus moves correctly through each button and that
every control is properly labeled. These details, while not critical for sighted users,
directly shape the experience of visually impaired users. Furthermore, we are continuing
to improve the AI-based alternative text generation model through prompt engineering,
using methodologies such as few-shot learning.
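A few-shot prompt of the kind mentioned above prepends example input/output pairs so the model imitates the desired caption style. The sketch below only shows prompt assembly; the example pairs and function name are invented for illustration, and the actual prompts used in Sori Album may differ.

```python
# Hypothetical few-shot examples pairing an image summary with the
# style of alternative text we want the model to produce.
FEW_SHOT_EXAMPLES = [
    ("a dog running on grass",
     "A golden retriever runs across a sunny lawn, ears flying."),
    ("a bowl of noodles on a table",
     "A steaming bowl of noodles with chopsticks resting on the rim."),
]

def build_alt_text_prompt(examples, instruction: str) -> str:
    """Assemble a few-shot prompt: instruction first, then the example
    pairs, then an open slot for the new image."""
    lines = [instruction, ""]
    for summary, alt_text in examples:
        lines.append(f"Image: {summary}")
        lines.append(f"Alt text: {alt_text}")
        lines.append("")
    lines.append("Image: <attached photo>")
    lines.append("Alt text:")
    return "\n".join(lines)

prompt = build_alt_text_prompt(
    FEW_SHOT_EXAMPLES,
    "Write one concise, concrete sentence of alternative text for the image.",
)
```

The examples anchor the model's tone and level of detail, which matters here because captions read aloud by a screen reader should be short, concrete, and free of filler.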