Hello forum goers, I've got a short and sweet guide on getting started with the Raspberry Pi AI Camera - a nice little collaboration project between Sony and Raspberry Pi. It's a camera module WITH an IMX500-based accelerator chip for computer vision tasks. This thing is really fun to get going, and it keeps computer vision workloads off your Pi. There's also some Python code at the bottom of the guide to get you started with implementing it in your own scripts (which can be a bit of a challenge given how fresh the software for all of this is, but we put together a nice little library for it): “Raspberry Pi AI Camera Quick-Start Guide”
In this guide, we will be getting the Raspberry Pi AI camera up and running as quickly as possible, as well as taking a look at how to get started with it in your own projects.
The Raspberry Pi AI camera is a unique and interesting piece of hardwar…
Credit to @Jaryd for packaging this all together into a simple process.
That camera is wild. Import a library and all of a sudden you're labeling objects in a while loop.
Thank you very much @Pixmusix, but a lot of the hard work was done by RPI themselves, I just piggybacked off it and adapted their 200-line Python demo script into a simple library.
It is amazing though that you can plug in a single piece of hardware and run object recognition in a while loop!
What are its advantages over the RPi Camera Module V2?
Hey lia!
In terms of the actual camera sensor itself, it's 12.3 vs 8 megapixels and can record at a slightly higher resolution. But the main advantage is the IMX500 AI accelerator chip, which is optimised for computer vision tasks like object detection and pose estimation.
While you can get similar performance on a Pi 5 without the AI chip, it frees up the Pi to do other tasks since it handles all the heavy processing. It's also a bit easier to get going!
-Jaryd
Hey Jaryd - a great introduction and your tutorial and code are understandable and work seamlessly.
I’m trying to use the AI Camera to log when pesky cats are around, the cat ID works great and I would like to save a 10s video when detected. Trying this I get a conflict (camera already in use) - how do I work around this without getting too complex?
Thanks
Dennis
Hi Dennis,
Welcome to the forum! How does your code currently look? It should be some small tweaks to get it recording.
Hi Jack
Thanks for the offer to help
I have it working with some probably janky code using picamera2's H264Encoder and CircularOutput:
from ai_camera import IMX500Detector
import time
from picamera2.encoders import H264Encoder
from picamera2.outputs import CircularOutput

camera = IMX500Detector()

# Start the detector with preview window
camera.start(show_preview=True)

# Main loop
while True:
    # Get the latest detections
    detections = camera.get_detections()
    # Get the labels for reference
    labels = camera.get_labels()

    # Process each detection
    for detection in detections:
        label = labels[int(detection.category)]
        confidence = detection.conf

        # Example: Print when a cat is detected with high confidence
        if label == "cat" and confidence > 0.4:
            print(f"cat detected with {confidence:.2f} confidence!")

            # Capture and save the video
            encoder = H264Encoder()
            output = CircularOutput()
            timestamp = int(time.time())
            filename = f"cat_detected_{timestamp}.h264"
            camera.picam2.start_recording(encoder, filename)
            output.start()
            time.sleep(5)
            output.stop()
            print(f"Video saved as {filename}")

    # Small delay to prevent overwhelming the system
    time.sleep(0.1)
Next step is to identify other things (birds, people etc) and record, ultimately to maybe work towards an animal camera trap for safaris!
Cheers
Dennis
Hey @Dennis283326,
I don’t have much direct hands-on experience with this camera, but I think I know what the problem is.
The documentation for the AI Camera is really useful. This module is effectively the same as the old Pi Cam module, with an additional AI processor on board (the IMX500). The topology differences can be seen below: a typical camera module is pictured on the left, the IMX500 module on the right.
As seen in this image, the IMX500 module has two outputs, while a typical module has only one. When using the picamera2 module, you would assign a variable to the image data output from the module, which you could then record, modulate, etc.
In your code above, you are assigning the module to a variable, but your “camera” variable is specifically wrapping the output tensor stream.
Thus, by re-using “camera” for the encoder, you are effectively trying to record output tensors with the picamera2 encoder.
Maybe try creating a new object for the raw image data, similar to what's shown in the Picamera2 documentation.
Like I said previously, I haven’t really gotten hands-on with this module yet, so it’s possible this isn’t the case, I might be able to ask @Jaryd to confirm when he gets the chance.
Hope this helps!
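For what it's worth, here is a rough sketch of the one-camera-object approach, assuming (as in Dennis's code above) that the wrapper exposes the underlying Picamera2 instance as camera.picam2. The idea, following the Picamera2 manual's circular-buffer pattern, is to start the encoder once into a CircularOutput and only open/close a file per detection, so the camera is never claimed twice. The buffersize value and the clip_filename helper are my own additions, not from the guide.

```python
import time


def clip_filename(prefix="cat", t=None):
    """Build a timestamped .h264 filename for a saved clip."""
    t = int(time.time() if t is None else t)
    return f"{prefix}_detected_{t}.h264"


def record_cats():
    """Detect cats and dump ~10 s clips, using a single camera object."""
    # Hardware-dependent part: only runs on a Pi with the AI camera attached.
    from ai_camera import IMX500Detector
    from picamera2.encoders import H264Encoder
    from picamera2.outputs import CircularOutput

    camera = IMX500Detector()
    camera.start(show_preview=True)

    # Start the encoder once; it writes continuously into a circular buffer.
    encoder = H264Encoder()
    circular = CircularOutput(buffersize=300)  # roughly 10 s at 30 fps
    camera.picam2.start_recording(encoder, circular)

    while True:
        labels = camera.get_labels()
        for detection in camera.get_detections():
            if labels[int(detection.category)] == "cat" and detection.conf > 0.4:
                # Point the circular buffer at a file, let it drain, close it.
                circular.fileoutput = clip_filename()
                circular.start()
                time.sleep(10)
                circular.stop()
        time.sleep(0.1)
```

You would call record_cats() on the Pi itself; the helper at the top is separate so the filename logic can be tried anywhere.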
Trying this on a Pi 5 8GB in January '25.
The console commands work perfectly, but when I run the demo code I get some errors:
Traceback (most recent call last):
  File "/home/nephi/Desktop/demo.py", line 4, in <module>
    camera = IMX500Detector()
             ^^^^^^^^^^^^^^^^
  File "/home/nephi/Desktop/ai_camera.py", line 17, in __init__
    self.imx500 = IMX500(model_path)
                  ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/picamera2/devices/imx500/imx500.py", line 321, in __init__
    self.__set_network_firmware(os.path.abspath(self.config['network_file']))
  File "/usr/lib/python3/dist-packages/picamera2/devices/imx500/imx500.py", line 594, in __set_network_firmware
    raise RuntimeError(f'Firmware file {network_filename} does not exist.')
RuntimeError: Firmware file /usr/share/imx500-models/imx500_network_yolov8n_pp.rpk does not exist.
Hey Nephi, welcome to the forums!
It seems like the model the code is trying to use is missing. When we do the console command version we use the MobileNet model, but the demo script uses YOLOv8n.
I did a fresh install and it is missing there as well. A bit of digging suggests that Raspberry Pi has recently stopped distributing the model due to a licensing issue (I think the YOLO models are in a weird semi-open-source but non-commercial state). We can change the library code to use MobileNet, but YOLO is a far more powerful and capable model.
I grabbed the Yolo model from a previous installation and you can download it here.
Unzip the model from the file and paste it into your /usr/share/imx500-models/ folder, where you will find all the other .rpk models. Run the demo script again and everything should work fine now!
Thank you for spotting this - we will update the guide!
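If you want to sanity-check the install before running the demo again, a few lines of Python can confirm the .rpk file landed in the right place. This is just a convenience sketch; the directory path is the one from the error message above, and the helper names are my own.

```python
from pathlib import Path

# Default location where the Raspberry Pi OS packages put the .rpk models.
MODEL_DIR = Path("/usr/share/imx500-models")


def installed_models(model_dir=MODEL_DIR):
    """List the .rpk network files currently installed."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        return []
    return sorted(p.name for p in model_dir.glob("*.rpk"))


def has_model(name, model_dir=MODEL_DIR):
    """Check whether a specific .rpk model is present."""
    return (Path(model_dir) / name).is_file()
```

For example, has_model("imx500_network_yolov8n_pp.rpk") should return True once you have copied the file across.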
Cheers,
Jaryd
Thanks Jaryd, appreciate the response, I too now have it working.
I am looking at pulling the x, y, w, h of the detection boxes on the screen. I can see in the library file that they are used to draw the box - where would I find documentation on exporting that info?
thanks in advance
I am not too familiar with extracting positions from the AI camera, but it should be similar to how other YOLO model setups do it. In the demo script it should be in the detections variable that we get in this line here:
# Main loop
while True:
    # Get the latest detections
    detections = camera.get_detections()
When we want to get a detection we use:
detection.category
And the confidence with:
detection.conf
I would whack a print line in the beginning there because you will likely find the coordinates you are looking for in that structure:
# Main loop
while True:
    # Get the latest detections
    detections = camera.get_detections()
    print(detections)
A dig around in there and you should find what you are after! I found dealing with the coordinates to be a tad confusing, but an LLM like ChatGPT helped out a lot!
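If the library mirrors Raspberry Pi's picamera2 IMX500 demo script (which it was adapted from), each detection should also carry a box attribute holding (x, y, w, h) alongside category and conf. Here's a small sketch of pulling those out - the attribute name is an assumption, so do check the print(detections) output above to confirm it on your install.

```python
from types import SimpleNamespace


def boxes_above(detections, min_conf=0.5):
    """Pull (x, y, w, h) tuples from detections above a confidence threshold.

    Assumes each detection has a box attribute of (x, y, w, h), as in the
    picamera2 IMX500 object detection demo this library was adapted from.
    """
    return [tuple(d.box) for d in detections if d.conf >= min_conf]


# Quick offline check with stand-in detection objects:
fake = [SimpleNamespace(box=(10, 20, 50, 80), conf=0.9),
        SimpleNamespace(box=(0, 0, 5, 5), conf=0.2)]
print(boxes_above(fake))  # → [(10, 20, 50, 80)]
```

On the camera you would pass camera.get_detections() in place of the stand-ins.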
Cheers,
Jaryd
Hello there, thanks for giving me the tip to check this out. I've been trying to get the pose model to track my face, as one of you suggested on YouTube, with the help of an LLM and reading documentation. After 6 hours I give in and ask again. When I ran your demo a couple of days ago it worked, but now it doesn't. When I run it I get this error:
I'm starting to think this is way beyond my level, but I need it working. Thanks to anyone willing to chip in.
[quote]
[0:45:42.649745857] [11401] INFO Camera camera_manager.cpp:327 libcamera v0.4.0+53-29156679
[0:45:42.812103527] [11405] WARN RPiSdn sdn.cpp:40 Using legacy SDN tuning - please consider moving SDN inside rpi.denoise
[0:45:42.816903007] [11405] INFO RPI vc4.cpp:447 Registered camera /base/soc/i2c0mux/i2c@1/imx500@1a to Unicam device /dev/media3 and ISP device /dev/media0
[0:45:42.817103215] [11405] INFO RPI pipeline_base.cpp:1121 Using configuration file '/usr/share/libcamera/pipeline/rpi/vc4/rpi_apps.yaml'
NOTE: Loading network firmware onto the IMX500 can take several minutes, please do not close down the application.
[0:45:43.073464331] [11401] INFO Camera camera.cpp:1202 configuring streams: (0) 640x480-XBGR8888 (1) 2028x1520-SRGGB10_CSI2P
[0:45:43.074753081] [11405] INFO RPI vc4.cpp:622 Sensor: /base/soc/i2c0mux/i2c@1/imx500@1a - Selected sensor format: 2028x1520-SRGGB10_1X10 - Selected unicam format: 2028x1520-pRAA
Network Firmware Upload: 100%|████████████████████████████████████| 1.59M/1.59M [00:01<00:00, 1.09Mbytes/s]
Traceback (most recent call last):
  File "/home/bebop/demo.py", line 15, in <module>
    labels = camera.get_labels()
             ^^^^^^^^^^^^^^^^^
AttributeError: 'IMX500Detector' object has no attribute 'get_labels'
[/quote]
Hi @Anti290567
Welcome to the forum!
Looking at the error you've posted, it would appear that get_labels isn't being found. Make sure that the ai_camera.py file is in the same working directory that you're running the script from.
Hey @Anti290567, sorry for the late reply!
To get the demo working again, I would back up what you have been experimenting with and redownload the original. If that doesn't work, back up and remove all the AI camera folders that were downloaded, then run the steps in the video again to reinstall it all. It might be easier to start again than to try to figure out where the issue is.
In terms of getting pose estimation to work, I took a look at the library again and there probably isn't enough in there for an LLM to rewrite it for you. You might want to check out this GitHub repo, which has some other Python examples in it. These are the examples we used to figure out how to write the library from the demo, and the pose estimation demo should run out of the box.
There is also no need to create a library; we did so because it let the 150 lines of Python code operate hidden away from the user, so we could have a simple demo script - originally it was all in one file. Do what you think your project needs, but there should hopefully be some helpful code there to get you going.
Happy making!
I’m just getting into electronics and have been using an ESP32 kit. I have a project where I would like to have a camera to be able to identify one of a dozen specific parts that would be programmed into the library database. Think of these parts coming down a conveyor belt one at a time, the camera will identify which part it is and there would be logic to direct the part physically using actuators to sort the parts. Parts would be coming by slow, one at a time, every 60 seconds. I’m curious what hardware and setup you would recommend for this project. Raspberry Pi with AI camera that is constantly monitoring the live video feed and waiting to detect a part? Thanks
Hey Scott,
The AI camera would definitely be my first instinct for this sort of thing, though there are some cheaper ways to do it if you're interested.
Another option that might work better with the ESP32 is the Grove vision AI module.
If your items are drastically different weights or sizes, I’m sure you could use another sensor to detect these traits instead. What are the specific parts you’re sorting?
Good question. Some of the parts are roughly the same weight. For example, the weights of some pieces are: 10, 12, 14, 16, 18, 24, 24 & 26 grams. They may have some oils or machining chips on them that could slightly affect their weight, and I'm thinking that if the method used for weighing starts to build up with oils, it may affect the measurements. I know just barely enough to connect the ESP32 to a laptop and load programs onto it with a breadboard and sensors/output devices. I've used AI to help write code with ease, but getting into cameras or storing data is totally new and seems like a few steps ahead of my skill level. Do you have any resources on where I can learn more about these different camera setups?
Hey @Scott294214,
We have a fair few guides for working with the Pi Cameras in particular. You can see them all at this link if you sort by ‘camera’ using the box on the right:
From the sounds of it, you don’t need to process the parts particularly fast, so a Pi AI camera setup should be perfect for this.
I would have a think about which features are most distinctly different between your parts - weight, size, colour, etc. - in case you can sort them by something simpler than image recognition.
Do you have the ability to uniquely mark these parts in any way before they get to this step? In our Core Electronics warehouse we use little QR codes for identification on a ton of products, as they are super quick to scan, easy to generate, and it's unlikely a QR code would be scanned and show up as something it is not.
To summarise: an AI camera setup would absolutely work and would be a super entertaining project, but it may not be the simplest solution if another characteristic of these parts could be used for comparison instead.
Hope this helps!