April 6, 2017

Clarifai Featured Hack: Describe the World to the Vision-Impaired with See


See is an accessibility app that uses your webcam to describe what’s happening in the room around you, helping vision-impaired users better understand their surroundings.

See is an app that allows the user to ask for descriptive information about their surroundings using Clarifai’s API and the user’s computer webcam. This functionality can be useful to give additional context to the vision-impaired – just ask See if something is in the room, or get a description of the top five tags related to the room.



WHY WE ❤ IT
Accessibility for the vision-impaired is a common use of our technology. See is particularly interesting in the way it pairs language microservices with Clarifai’s visual recognition API. On a side note, we also happen to love apps that make it hard for people to sneak up on us! Read more about See and try it out on Devpost!

HOW YOU DO IT
We caught up with Aran Long, a Comp Sci student from Birmingham, to talk about his inspiration for See.

Clarifai: What inspired your idea for See?
Aran: Lack of sleep can often cause you to talk to inanimate objects – I tried to spin it into an accessibility hack. In the future I would love to explore the natural language processing aspect further, potentially generating full sentences describing the room and being able to infer the meaning of more advanced inputs.

How did you build the app?
There are several core microservices behind See:

LANGUAGE
Language is the microservice that takes in sentences, tokenizes them, and offers several services:

Similarity – This scores the similarity of two words based on their shared synonyms.
Nouns – This returns all of the nouns in a given sentence.
Tag – This tags words with their correct word classes.
Language is hosted using Amazon AWS EC2.
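
The post describes Similarity only as being "based on their shared synonyms" and gives no formula. As a rough sketch of one way this could work, the Python below scores two words by the overlap (Jaccard index) of their synonym sets. The synonym table, the function names, the 0.5 threshold, and the choice of Jaccard are all assumptions made for illustration, not details of See's actual service.

```python
# Hypothetical synonym table for illustration only; a real service would
# more likely draw on a thesaurus or lexical database.
SYNONYMS = {
    "couch": {"sofa", "settee", "lounge"},
    "sofa": {"couch", "settee", "lounge"},
    "chair": {"seat", "stool"},
    "laptop": {"computer", "notebook"},
}


def synonym_set(word):
    """Return the word itself plus any synonyms known for it."""
    word = word.lower()
    return SYNONYMS.get(word, set()) | {word}


def similarity(a, b):
    """Jaccard overlap of the two words' synonym sets, from 0.0 to 1.0."""
    set_a, set_b = synonym_set(a), synonym_set(b)
    return len(set_a & set_b) / len(set_a | set_b)


def best_match(noun, tags, threshold=0.5):
    """Return the tag most similar to `noun`, or None if none is close enough."""
    scored = [(similarity(noun, tag), tag) for tag in tags]
    score, tag = max(scored, default=(0.0, None))
    return tag if score >= threshold else None
```

Under these assumptions, `best_match("couch", ["chair", "sofa", "laptop"])` returns `"sofa"`, which is the kind of lookup that would let a spoken question about a couch match an image tag that reads "sofa".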

TAGGING
Tagging is a microservice that is always connected to the client. Using Socket.IO and base64-encoded image streams, I am able to run a real-time tagging service using the Clarifai API.

Tagging is also hosted using Amazon AWS EC2. It statically serves its images, which are named using UUID generation, at images.aran.site using Caddy TLS. These are also served over HTTPS.
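
To make the base64 and UUID steps concrete, here is a minimal sketch of how a received frame might be decoded and written to disk under a UUID-based filename so it can be served statically. It uses only the Python standard library; the function name `save_frame`, the `jpg` extension, and the optional data-URL prefix handling are assumptions, and the original service was not necessarily written in Python or structured this way.

```python
import base64
import uuid
from pathlib import Path


def save_frame(b64_frame: str, out_dir: Path, ext: str = "jpg") -> Path:
    """Decode a base64-encoded image and store it under a random UUID name."""
    # Browsers often send frames as data URLs ("data:image/jpeg;base64,...");
    # keep only the payload after the first comma in that case.
    if b64_frame.startswith("data:"):
        _, _, b64_frame = b64_frame.partition(",")

    # validate=True rejects characters outside the base64 alphabet.
    data = base64.b64decode(b64_frame, validate=True)

    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{uuid.uuid4()}.{ext}"
    path.write_bytes(data)
    return path
```

Random (version 4) UUIDs make filename collisions between frames extremely unlikely without any coordination between clients, which suits a service receiving many frames in real time.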

What was the best part about working with the Clarifai API?
Clarifai has great docs, and I would love to see more of this in the industry. I had originally spent a lot of time using the Microsoft Cognitive Services Computer Vision API. However, I found a flaw in the API regarding its Image URL parameter.


Thanks for sharing, Aran!


To learn more, check out our documentation and sign up for a free Clarifai account to start using our API – all it takes is three lines of code to get up and running! We’re super excited to share all the cool things built by our developer community, so don’t forget to tweet @Clarifai to show us your apps.


And give Aran some props in the comments below. Until next time!