Object and Animal Recognition With Raspberry Pi and OpenCV

You are looking very close indeed! Have you got the entire contents of the folder | Object_Detection_Files | on your desktop?

Also, type and enter the following terminal command. If it fixes the problem I will update the guide with it; it just so happened to work straight away for me without it.

sudo apt-get install python-opencv python3-opencv opencv-data


Hello brother,

Can you provide a solution? I only want to detect waste material.

Can you explain how I can do that, step by step?

I would like to use this to identify specific cats (cat facial recognition). Do you have any recommendations for me using this?


Hey Rob,

To solve your problem, come take a look at this - Edge Impulse. This is free for a maker and will let you customise an already built Machine Learned system (like the Coco library used in this guide).

Hopefully, the specific cats are different colours, as identifying them that way will be more reliable and easier than trying to get them to look straight into the camera :stuck_out_tongue:. You can then further teach the above Coco library by using a whole bunch of photos of the pussy cats in combination with Edge Impulse. This training is hardware intensive and something you will definitely want a full-sized desktop computer for.

There are usually lots of different ways to solve the same problem, and I’m sure there are other ways you could go about it but above is what I would do.


Hey mate, this is an easy fix and you just need to type and enter the below line into your terminal. I reckon you just missed it going through the steps. By not having that | cmake | library installed you will run into that | bash: cmake: command not found | error.

sudo apt-get install build-essential cmake pkg-config

Below is an image of what it looks like on my system when I run the step you're up to.

If that doesn’t fix your problem I’ll work through it with you until we get it running properly on your setup :slight_smile:


You can definitely overclock the system, which will make it operate faster. Just make sure you provide some active and passive cooling for the Raspberry Pi’s CPU.

If you want an instant speed boost using this hardware consider checking out the Coral USB Accelerator.

This video shows a nice comparison between using it and not using it for different machine-learned Computer Vision systems on a Raspberry Pi 4 Model B. The part you’d want to check out is at around 6:30.



I am following this tutorial to recognise objects and people, but I see that you have many other tutorials with OpenCV.

I am also interested in adding the functionality to recognise people’s faces and control servos, just like your tutorial before this one.

My question is: I noticed that the face recognition tutorial has different CMake commands than the tutorial for object recognition.

For example:

In the object recognition tutorial we have the command:


whereas in the face recognition tutorial we have the command:


Is there any documentation or guidance you can give me on how to accomplish both tutorials at the same time on my Raspberry Pi 4?

Thank you


Hello! I was able to finish the tutorial successfully on a Raspberry Pi 4B with 4 GB RAM, using the Pi Camera Module 2. Unfortunately I’m getting a very slow response time on the display: there’s about a 10 second lag. I’ve tried changing the threshold and NMS numbers, setting my overall Pi resolution lower (1024x768), adjusting the size of the viewport in the Python code (I tried 480x360), using the activity monitor to increase the priority of Python by +10, and changing the GPU Memory under Raspberry Pi Configuration->Performance. Does anyone know what can speed things up for me? Any help is greatly appreciated, thank you!


Hi Tim. I’ve fully implemented this project and even added speaker output for what is detected, but I’ve been experimenting with how to run it from my iPhone 8, without a monitor. I’ve been using this tutorial: RaspController - Use Your Phone to Control Your Raspberry Pi - Tutorial Australia (it’s your tutorial on how to access the Raspberry Pi system from a smartphone using the Simple Pi App). I can run certain commands from Simple Pi just fine, specifically ones that don’t use the camera, such as | espeak "hello" |. However, when I tried to run the object-ident.py file, I ran into some issues because the app is not able to display what the camera sees (which is included in the default code). I turned off the camera window output by commenting out the line in object-ident.py that reads | cv2.imshow("Output", img) |, and I also set the “draw” boolean in the getObjects function to False. I thought this would be enough for the program to work fully, just without displaying the camera view.

And in fact everything works when I run this command directly in my Raspberry Pi terminal using keyboard/monitor: | sudo python /home/pi/Desktop/Objection_Detection_Files/object-ident.py |. The speaker correctly says what is detected, without any external window popping up to display the camera’s view. But when I run that exact same command in the Simple Pi app, I get an error that keeps repeating on the terminal screen of the app. The only way to stop it is to disconnect my phone app from the Raspberry Pi, wait a few seconds, and reconnect. The error I get is shown in the picture. Can you please help me resolve this, so that my phone will be able to run the object-ident.py file without any monitor/keyboard/mouse? It might have something to do with network/SSH but I don’t really know. Thanks!


Hi Eduardo,

BUILD_EXAMPLES, to my knowledge, just sets up some extra stuff in case you’re running specific CV2 examples. I’d try building with it ON and seeing if that works for both. My inkling is that it will, but I struggled to find clear documentation.



Heyya mate.

Good eyes realising there is a difference between the Open-CV build processes. Facial Recognition uses a slightly different build than my Open-CV guides about Object Identification, Speed Camera, Face Controlled Pan-Tilt HAT, Hand Tracker, and Pose Identification (which all use exactly the same version of Open-CV).

My suspicion is that Facial Recognition will work using the same Open-CV version that Object Recognition uses. So I would recommend using that version of Open-CV to get both systems up and running on the same OS. I have not tested it yet, so if you do get it running please tell us here on the forum :slight_smile:


A 10 second lag is waaaay too long! Perhaps you have underclocked your Raspberry Pi. Check out this guide on how to do the opposite and overclock your Raspberry Pi. This will make it run faster (and hotter, so keep that in mind).

When altering the config.txt file use the below values for your particular Raspberry Pi.


To see what your current clock speed is, type and enter the below into your terminal.

watch -n 1 vcgencmd measure_clock arm
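
If you would rather log the clock speed from a Python script than watch it in the terminal, a minimal sketch might look like the below. It assumes the command prints a single line of the form frequency(48)=1500000000 (a value in Hz). The function names are my own and are not part of the guide.

```python
import subprocess


def parse_clock_hz(line):
    """Parse a line such as 'frequency(48)=1500000000' and return the value in Hz."""
    _, _, value = line.strip().partition("=")
    return int(value)


def read_arm_clock_mhz():
    """Run vcgencmd on the Pi and return the current ARM clock in MHz."""
    out = subprocess.run(
        ["vcgencmd", "measure_clock", "arm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_clock_hz(out) / 1_000_000
```

Calling read_arm_clock_mhz() only works on the Raspberry Pi itself, since that is where vcgencmd lives.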


Hey mate, proper interesting project! If you have access to an Android phone I’d love to see what would happen if you used the RaspController App instead of Simple Pi. If you run into the error using that, copy it through so I can see.

Turning off the preview window is definitely a good idea but potentially not necessary. The errors (or potentially just error messages) all have to do with your audio output. Check out these forum posts, as they have run into the same scenario as you, just for slightly different projects. First one, Second one.

We’ll get this running, and I’d love to see the end result. When you get it running please show us :slight_smile:


Hi James, how can I make the system detect just one object in the image with a delay of 2-5 seconds?


Hey, thanks for this brilliant guide!

I only had 1 error where a video resize function broke which I solved by simply reseating the camera!
The build took over 24 hours on my ancient pi 1 and I get about -10 fps, but it is still exciting to see it in action.

I am hoping to use this to create a simple project where I detect either a cat or dog and send an output to a server so I am wondering how I can use this with just scripts and no output viewer?
(I’m guessing just remove the *cv2.imshow(...)* line? )

I’m also hoping that this would speed things up a little. Could you explain a little about what the getObjects() parameters do?

As a side note - do not forget to set your swap file size back to 100 for the sake of speed and your SD card’s life!

Thanks again!


Hey mate, you are able to focus the script to look only for certain objects/animals. Check this section of the guide.

This will also increase the speed at which the system will operate.


Hi Tim, not a certain object, but one object at a time with a 2-5 second delay, so that the text-to-speech will speak 1 word every 2-5 seconds.


Hey mate,

It was my pleasure :slight_smile: so stunned to know you have object detection running on a Raspberry Pi 1! (please do send through some pictures I am mighty impressed)

Removing the | cv2.imshow | line is definitely a good start to achieving your project. Part of the example script also involves taking each frame and adding overlaid squares on the live feed for the location of the objects. You won’t need to do this processing for the stripped-back version of your project, which should save on computing power.
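
As a rough sketch of the "cat or dog to a server" idea without any viewer, something like the below could work. It assumes getObjects() returns the frame plus a list whose entries look like [box, class_name], and that it takes a draw argument; the server address, the PETS set and both function names are made up for illustration.

```python
import json
import urllib.request

SERVER_URL = "http://example.local:8000/pets"  # hypothetical endpoint, replace with your own
PETS = {"cat", "dog"}


def pets_seen(object_info):
    """Return the sorted pet names found in one frame's detections.

    Assumes each entry looks like [box, class_name].
    """
    return sorted({name for _box, name in object_info if name in PETS})


def build_request(pets, url=SERVER_URL):
    """Build (but do not send) a JSON POST request describing the pets seen."""
    body = json.dumps({"pets": pets}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


# In the main loop, with no cv2.imshow() call and drawing switched off (assumed usage):
#     _, object_info = getObjects(img, 0.45, 0.2, draw=False)
#     pets = pets_seen(object_info)
#     if pets:
#         urllib.request.urlopen(build_request(pets), timeout=5)
```

Keeping the request building separate from the sending means you can check the payload on your desktop before the Pi ever talks to the server.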

If you want to dive into the deep end, a lot can be learned about getObjects() and other classes here - OpenCV: cv::MultiTracker Class Reference. Altering the NMS number and the threshold number, like I talked about in the comments for | object-ident.py |, is all the control I needed from it.
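
To give a rough feel for what those two numbers control: the threshold is a minimum confidence a detection needs, and the NMS number is how much two boxes may overlap before the weaker one is dropped. The below is a plain-Python illustration of that idea, not OpenCV's own implementation. Boxes are assumed to be (x, y, w, h) and the function names are made up.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def filter_detections(detections, threshold, nms):
    """Keep confident detections, dropping boxes that overlap a stronger one.

    detections is a list of (box, confidence) pairs.
    """
    # Step 1: the threshold removes low-confidence detections.
    confident = [d for d in detections if d[1] >= threshold]
    # Step 2: visit the strongest first and keep a box only if it does not
    # overlap an already kept box by more than the nms value.
    confident.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, conf in confident:
        if all(iou(box, k[0]) <= nms for k in kept):
            kept.append((box, conf))
    return kept
```

So in this sketch a higher threshold means fewer, surer detections, and a lower NMS number means fewer overlapping boxes around the same object.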

And a very good note to add at the end too about reverting the swap file back to the right size.


Hi Harold,

If you’d just like it to slowly read out the objects it did find, that’d be pretty easy.
In the main loop of Tim’s code, you’d just have something like:

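import cv2  # cap and getObjects() come from Tim's object-ident.py

while True:
    success, img = cap.read()
    result, objectInfo = getObjects(img, 0.45, 0.2)
    # Your code here:
    for box, name in objectInfo:
        # The above iterates over all the found objects (each entry is assumed to be [box, name])
        print(name)
        # The above should read an object name; print is a placeholder, the text to speech method is up to you
        cv2.waitKey(2000)
        # The above waits 2 seconds, or until a key is pressed, whichever comes sooner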

This would iterate over the objects found, reading them one at a time, with delay depending on your preference.
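
If pausing the whole loop is not what you want (no new frames are read while it waits), one alternative is to keep the loop running and only let a word through every few seconds. The below is a minimal sketch of that idea. The class name is made up, and speak() in the usage comment stands in for whatever text-to-speech call you use.

```python
import time


class SpeechThrottle:
    """Let at most one word through every `interval` seconds."""

    def __init__(self, interval=2.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock          # injectable so the timing can be tested
        self.last_spoken = None     # time of the last word that was let through

    def ready(self):
        """Return True (and restart the timer) if enough time has passed."""
        now = self.clock()
        if self.last_spoken is None or now - self.last_spoken >= self.interval:
            self.last_spoken = now
            return True
        return False


# Assumed usage inside the main loop:
#     throttle = SpeechThrottle(interval=3.0)   # create once, before the loop
#     if objectInfo and throttle.ready():
#         speak(objectInfo[0][1])               # hypothetical text-to-speech helper
```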

If you want it to “focus” on one object alone, you’d need to categorise the objects you were looking for in order of importance (as it is, the code has no idea contextually what it’s looking at and how important it is). This could be the size of the object (how much of the field of view it occupies, which would be easy to calculate with Tim’s code), or perhaps a list that you compare found objects to. Something like IndexOf could perhaps be used.
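
Both ideas (a priority list and the size of the object) can be combined in a few lines. This is only a sketch under assumptions: each found object is taken to be [box, class_name] with box as (x, y, w, h), and the priority order and function names are examples, not anything from Tim's code. Python's list.index() plays the role of IndexOf here.

```python
PRIORITY = ["person", "dog", "cat", "cup"]  # example order, most important first


def box_area(box):
    """Area of an (x, y, w, h) box, as a stand-in for how much of the view it fills."""
    x, y, w, h = box
    return w * h


def most_important(object_info, priority=PRIORITY):
    """Pick one detection: best-ranked name first, larger box as tie-breaker.

    Names missing from the priority list rank last.
    Returns None when nothing was detected.
    """
    def rank(entry):
        box, name = entry
        position = priority.index(name) if name in priority else len(priority)
        return (position, -box_area(box))

    return min(object_info, key=rank, default=None)
```

With this, a small dog would win over a large cup, and between two dogs the one filling more of the frame would be chosen.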

Let me know if this spurs more questions, and good luck!


Ohhh got it. So how can I integrate the model I created in Edge Impulse into this code?
