Hey peeps, got another one for you. We have just updated our computer vision guides to run on the Pi 5. This time around we are giving the YOLO vision models a go and are getting some very respectable performance on the Pi 5. We also go through setting up YOLO World - an open-vocabulary model that can be prompted to identify objects instead of relying on its own pre-trained classes - very cool and very futuristic: “Getting Started with YOLO Object and Animal Recognition on the Raspberry Pi”
Have you ever wanted to dive into computer vision? How about on a low-power and portable piece of hardware like a Raspberry Pi?
Well, in this guide we will be setting up exactly that with the YOLO vision model family, OpenCV and the COCO object library. We…
Fantastic article, thanks for the intro to YOLO! Just wanted to add a small improvement I found: under the “lowering the resolution” section it says you can either convert the model OR lower the resolution. I had success doing both, but you have to convert the model to expect the lower resolution. I had to do this:
model.export(format="ncnn", imgsz=320)
Then in the main vision program I could init the camera with a lower resolution, although this isn’t strictly necessary:
picam2.preview_configuration.main.size = (320,320)
and then could run the model on a frame with:
results = model(frame, imgsz=320)
If you skip any step then the model fails to run OR you get garbage results.
With this on a Pi 5 4GB model I was able to get FPS as high as 50, with an average of around 30.
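In case it helps anyone copy this, here is the whole flow in one place. A minimal sketch only: the "yolov8n.pt" weights are just an example, and the exported folder name follows the Ultralytics default of <stem>_ncnn_model.
from ultralytics import YOLO
from picamera2 import Picamera2
# one-off: export the model to NCNN at the lower resolution
model = YOLO("yolov8n.pt")  # example weights, use whichever model you run
model.export(format="ncnn", imgsz=320)
# load the exported NCNN model (saved as a yolov8n_ncnn_model folder)
ncnn_model = YOLO("yolov8n_ncnn_model")
# camera at the matching resolution (optional, but keeps everything consistent)
picam2 = Picamera2()
picam2.preview_configuration.main.size = (320, 320)
picam2.preview_configuration.main.format = "RGB888"
picam2.configure("preview")
picam2.start()
# run inference on a frame at the same size the model was exported with
frame = picam2.capture_array()
results = ncnn_model(frame, imgsz=320)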
Hey @PhilipCodes,
That’s a great addition. Thanks for letting us know!
Hey Philip,
I previously tried this and was running into detection issues where it was identifying everything in the COCO library at once. I’ll give it another go as we have some more YOLO-based guides on the way and this would be amazing!
Yep, that’s exactly what I noticed if I only specified the image size in conversion or in inference. If I specified both it worked well. For reference I was testing with the YOLO11 model, just in case that makes a difference for your testing.
Also I very briefly experimented with the int8 quantization option when converting and it actually seemed to reduce FPS which was unexpected. So I didn’t explore that too much.
Just booted up the Pi and had no issues with this, managed to get 2 FPS at imgsz=320 on the XL model which is insanity! Don’t know what silly mistake I was making before; I think I wasn’t using a multiple of 32.
Regardless thank you very much for finding this, will update the guide and include it in future ones!
Hello everyone,
Is it possible that the model yolov8s-world.pt cannot detect plants? I used keywords like “plant,” “green object,” and “flowers.” Do you have any other ideas?
Hello!
The project is excellently described and presented. I followed it step by step, and everything works.
Thank you very much for your pedagogical effort.
I have one request.
I am very inexperienced with these applications and library installations.
I would like to add a library for forklift detection to the existing application:
- keremberke/yolov8s-forklift-detection · Hugging Face
- or this one: GitHub - ToanNguyenKhanh/Forklift-Object-detection: This project utilizes the YOLOv5 object detection model to identify forklifts within images or video frames. (this one is for YOLOv5),
- or some other option.
I want to create an application that, in addition to people, also detects forklifts. It doesn’t need to detect other objects.
How and where should I install the additional library and finally integrate it into the program?
Thank you very much for your advice.
Hi @Loki269894,
From what I can find, YOLOv8 uses the COCO dataset by default.
You can find the GitHub for this here:
From what I can see in this file, your best bet may be to try “potted plant” as this is a term inside of this dataset. Have a read-through; it contains a list of all the terms it can recognise!
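You can also print the labels straight from a loaded model if you want to double-check them. Quick sketch; any COCO-pretrained YOLOv8 weight will do:
from ultralytics import YOLO
model = YOLO("yolov8n.pt")  # any COCO-pretrained weight works here
print(model.names)  # dict of class id -> label, includes "potted plant"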
Hope this helps!
Hi @Milan282068, welcome to the forums!
From what I can see, keremberke/yolov8s-forklift-detection · Hugging Face should be easy enough to add to this project.
First, make sure you have this installed correctly on your Pi using:
pip install ultralyticsplus==0.0.23 ultralytics==8.0.21
Then make sure the library is imported correctly in your code by adding:
from ultralyticsplus import YOLO, render_result
And replace the standard model import line with:
model = YOLO('keremberke/yolov8s-forklift-detection')
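Putting that together, a minimal detection script would look something like the sketch below. The threshold values and the 'test.jpg' image path are placeholders to adjust for your setup:
from ultralyticsplus import YOLO, render_result
model = YOLO('keremberke/yolov8s-forklift-detection')
model.overrides['conf'] = 0.25  # confidence threshold (placeholder value)
model.overrides['iou'] = 0.45   # NMS IoU threshold (placeholder value)
results = model.predict('test.jpg')  # placeholder image path
print(results[0].boxes)  # detected people/forklifts as bounding boxes
render = render_result(model=model, image='test.jpg', result=results[0])
render.show()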
If you run into issues let us know but this should cover the basics!
Hey @Loki269894,
YOLO World can be a bit hit-and-miss; some objects do really well and some unfortunately don’t, and the only thing I can recommend with it is trying out different prompts.
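One thing worth knowing is that the World models let you set the prompt list in code before inference, so you can experiment quickly. A small sketch; the class strings and "frame.jpg" are just examples:
from ultralytics import YOLO
model = YOLO("yolov8s-world.pt")
# prompt the open-vocabulary model with only the things you care about
model.set_classes(["potted plant", "flower", "leaf"])
results = model("frame.jpg")  # placeholder image, use your camera frame instead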
Another good option might be to try a dedicated model trained for plants. If you head on over to this community-trained model, you can find the model called “best.pt” under Files and versions. Simply download and copy that into the same folder as your scripts and other models, and then modify your Python code to tell it to use the model with something like:
# Load YOLOv8
model = YOLO("best.pt")
If that works you can treat it exactly like a normal model and can optimise it by converting to NCNN and lowering the resolution (by the way we have a new updated method in the written article).
Hope this helps you out!
Hey @Milan282068,
The guide sets you up for success in that you can find any YOLO model, drop it into the project folder with the other models, and then change the model name in the code to use it. That first Hugging Face link looks like it will do fine, and the sample images show it detects only humans and forklifts.
You can find the model called “best.pt” under Files and versions (after training a custom model, the output weights are called “best.pt” by default). Simply download and copy that into the same folder as your scripts and other models, and then modify your Python code to tell it to use the model with something like:
# Load YOLOv8
model = YOLO("best.pt")
If that works you can treat it exactly like a normal model and can optimise it by converting to NCNN and lowering the resolution (by the way we have a new updated method in the written article).
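For reference, that conversion is the same export call from earlier in the thread, just pointed at the custom weights (a quick sketch; imgsz=320 is only an example size):
from ultralytics import YOLO
model = YOLO("best.pt")
model.export(format="ncnn", imgsz=320)  # creates a best_ncnn_model folder you can load instead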
Best of luck!
Hi! I am working on object detection with a Raspberry Pi 4. I am following this article: Getting Started with YOLO Object and Animal Recognition on the Raspberry Pi.
Unfortunately, there is an error (ModuleNotFoundError: No module named ‘picamera2’) and no matter what solutions I find online, nothing works. Have you had this problem, and do you know if there is a solution?
Hey Sarah, welcome to the forums!
When are you getting this error? When you run the detection code for the first time?
If so, I got this error once when I didn’t create the venv correctly. Picamera2 comes pre-installed on Raspberry Pi OS, and when you create the venv you need to include the existing system packages for it to be recognised.
You may be able to simply enter your venv and install Picamera2 with:
source yolo_object/bin/activate
pip install picamera2
If that doesn’t work, you may need to recreate the virtual environment and reinstall the packages. You can delete the old environment by entering this into a new terminal window:
rm -rf yolo_object
And then ensure you create the Venv again with system packages:
python3 -m venv --system-site-packages yolo_object
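Once the new venv is active, a quick sanity check is to try the import directly. A tiny test script; if it runs without the ModuleNotFoundError, you are good to go:
# quick test: run this inside the activated venv
from picamera2 import Picamera2
print("picamera2 imported OK")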
Hope this helps!
This is great! Using all of these and running at 160 by 160, I was able to get 150 FPS on my Pi 5 with 8GB of RAM!
However, I have one question, how do I train my own model and use that instead of the yolo model?
If any of you guys could tell me how I could train my own model and upload it, that would be greatly appreciated!
Thanks so much!
Hey Jey, welcome to the forums!
Congrats on getting some nice FPS, glad to hear people are pushing their RPIs! If you get your hands on another model it is as simple as putting the model in the project folder with the other models and changing the model line in the code to:
# Load YOLOv8
model = YOLO("best.pt")
(the reason that we have best.pt here is that it is usually the output of custom training)
In terms of training your own model, that is where it gets quite a bit more involved. The first hurdle is processing power; even with a high-end GPU like an RTX 4080 it can take several hours (we also tried it on a Pi 5 as a joke and it took 5 days to get to 1.5%), so the easiest solution might be to use a service like Roboflow or Ultralytics HUB, which lets you rent this high-end hardware (sometimes for free). Both of these have extensive documentation and community help, which is fantastic!
You will also need to annotate your data, and this can be quite a task depending on how much training data you want to use. Both HUB and Roboflow have tools to help with this though. There is also LabelImg, which I hear is a good tool for the job as well.
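Once your data is annotated, the training call itself is short. As a rough sketch (assuming a YOLO-format dataset described by a data.yaml file with your train/val paths and class names):
from ultralytics import YOLO
# start from a pretrained checkpoint and fine-tune it on your own data
model = YOLO("yolov8n.pt")  # example starting weights
model.train(data="data.yaml", epochs=100, imgsz=640)
# the best-performing weights are saved as runs/detect/train/weights/best.pt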
I hope some of this can get you started, and I would definitely recommend checking out Hugging Face first because someone may have already trained a similar model that’s ready to download!
Best of luck!
That sounds good. I was thinking about using Google Colab and their T4s to train my model, but I will definitely be taking a look at the sources you provided me.
One last thing just to confirm: in the example you showed me with best.pt, best.pt is a file which contains the model I trained, correct? So if I train my model using one of these sources of images provided and download it, it should be in the .pt format, which I can upload to my Raspberry Pi?
Once again, thanks a lot! (I am looking for a model that detects roads and speed bumps, and was sadly not able to find any on Hugging Face; I’ll probably use the links you provided me to train my model.)
Yes, the training process should spit out a .pt PyTorch file (and “best” is just the default output name). If it is a YOLO PyTorch model you should be able to happily run it on the Pi and also convert it to NCNN and whatever else you need with the steps from the video.
Best of luck!
Alright, thanks so much! I will keep you posted if I have any questions!
In Getting Started with YOLO Object and Animal Recognition on the Raspberry Pi regarding increasing processing speed you refer to a file named “ncnn conversion.py”. Can you tell me where this is? Not sure how to use this.