Object detection w/ audio notifs - RPi4

To give a brief overview of my project: it is about object detection on a Raspberry Pi 4 Model B using the YOLOv4 algorithm, the OpenCV library, and the COCO dataset. The prototype is meant to help blind people, so I want it to work like this: when the external webcam detects an object in front of it in real time, an audio notification says what the object is through earphones plugged into the audio jack.

Currently, I am working on downloading the essentials like YOLOv4 and OpenCV. Since I’m just a beginner, I don’t know exactly what steps I should take to make this work, especially for the audio notification part. I would greatly appreciate a simple step-by-step procedure for what I should do. (P.S. This is for my school research project.) Thank you in advance!

Hi @Ahron268938

Welcome. Good to have you with us. :slight_smile:
Cool ideas. Lots to do but it’s all manageable.

Tools

Your tech chain seems similar to the one in this popular guide by Tim, who I think used to work at Core Electronics. Maybe have a look at it for inspiration. :point_down:

This GitHub repository also uses a similar tech stack.
This repository is NOT under an open-source license, which means you cannot clone it and use the code freely. However, it is there for you to read and learn from :).
If you do learn anything from this GitHub repository, don’t forget to reference HimanchalChandra (author) as a source. :point_down:

Audio

I’m going to assume you’re working in Python here, based on the rest of your tool-chain.
There are two paths here, and which one fits depends on your project requirements.

  1. Do you want your final project to be able to identify ANY OBJECT in general? This would mean you need some kind of “text to speech engine.”
  2. Do you want your final project to be able to identify a pre-planned list of objects? The way I would accomplish this is by asking my most American friend to record samples of him/her reading aloud from your list, then playing back those recordings on demand with some audio library.
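
To make the two paths a bit more concrete, here is a rough Python sketch, not a finished implementation. It assumes the `espeak` command-line speech tool and the ALSA `aplay` player are installed on the Pi, and that recordings live in a `sounds/` folder named after the labels (for example `sounds/person.wav`). The folder layout, the `Announcer` class, the "ahead" wording, and the 5-second cooldown are all made up for illustration. The cooldown is there because a detector reports the same object on every frame, and you probably don’t want “person” repeated several times a second.

```python
import subprocess
import time
from pathlib import Path

# Hypothetical folder of pre-recorded clips, one per label (e.g. sounds/person.wav).
SOUND_DIR = Path("sounds")


def tts_command(label):
    """Path 1: command that speaks ANY label with espeak (assumed installed)."""
    return ["espeak", f"{label} ahead"]


def clip_command(label, sound_dir=SOUND_DIR):
    """Path 2: command that plays a pre-recorded clip, or None if no clip exists."""
    clip = Path(sound_dir) / f"{label.replace(' ', '_')}.wav"
    return ["aplay", "-q", str(clip)] if clip.exists() else None


class Announcer:
    """Announces a label, but at most once per `cooldown` seconds per label."""

    def __init__(self, cooldown=5.0, sound_dir=SOUND_DIR,
                 run=subprocess.run, clock=time.monotonic):
        self.cooldown = cooldown
        self.sound_dir = sound_dir
        self.run = run          # swapped out for a fake when testing without audio
        self.clock = clock
        self.last_spoken = {}   # label -> time it was last announced

    def announce(self, label):
        """Return True if the label was announced, False if it was suppressed."""
        now = self.clock()
        last = self.last_spoken.get(label)
        if last is not None and now - last < self.cooldown:
            return False
        self.last_spoken[label] = now
        # Prefer a recording if one exists, otherwise fall back to text to speech.
        command = clip_command(label, self.sound_dir) or tts_command(label)
        self.run(command)
        return True
```

In the detection loop you would create one `Announcer()` before the loop and call `announcer.announce(class_name)` for each object the detector reports. Note that `subprocess.run` waits until the audio finishes, which pauses detection for that moment; if that turns out to be a problem, passing `run=subprocess.Popen` is one way to avoid the wait. The sound should come out of the audio jack as long as that is the selected output device on the Pi.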

Which one of the above sounds more like the direction you want to head?

Pix :heavy_heart_exclamation:
