Hello forum goers, I've got a short and sweet guide on getting started with the Raspberry Pi AI Camera, a nice little collaboration project between Sony and Raspberry Pi. It is a camera module with an IMX500-based accelerator chip for computer vision tasks. This thing is really fun to get going, and it keeps computer vision tasks off your Pi. We also have some Python code at the bottom to help you implement it in your own scripts (which can be a bit of a challenge with how fresh the software for all of this is, but we put together a nice little library for it):
“Raspberry Pi AI Camera Quick-Start Guide”
In this guide, we will be getting the Raspberry Pi AI camera up and running as quickly as possible, as well as taking a look at how to get started with it in your own projects.
The Raspberry Pi AI camera is a unique and interesting piece of hardwar…
Credit to @Jaryd for packaging this all together into a simple process.
That camera is wild. Import a library and all of a sudden you're labeling objects in a while loop.
Thank you very much @Pixmusix, but a lot of the hard work was done by RPI themselves, I just piggybacked off it and adapted their 200-line Python demo script into a simple library.
It is amazing though that you can plug in a single piece of hardware and run object recognition in a while loop!
What are its advantages over RPI camera V2?
Hey lia!
In terms of the actual camera sensor itself, it’s 12.3 vs 8 megapixels and can record at a slightly higher resolution. But the main advantage of this is that it has the IMX500 AI accelerator chip which is optimised for computer vision tasks like object detection and pose estimation.
While you can get similar performance on a Pi 5 without the AI chip, the chip handles all the heavy processing, which frees up the Pi to do other tasks. It's also a bit easier to get going!
-Jaryd
Hey Jaryd - a great introduction and your tutorial and code are understandable and work seamlessly.
I’m trying to use the AI Camera to log when pesky cats are around. The cat ID works great, and I would like to save a 10s video when one is detected. When I try this I get a conflict (camera already in use). How do I work around this without getting too complex?
Thanks
Dennis
Hi Dennis,
Welcome to the forum! How does your code currently look? It should be some small tweaks to get it recording.
Hi Jack
Thanks for the offer to help
I have it working with some probably janky code using the picamera2 H264 encoder and CircularOutput:
import time

from ai_camera import IMX500Detector
from picamera2.encoders import H264Encoder
from picamera2.outputs import CircularOutput

camera = IMX500Detector()

# Start the detector with preview window
camera.start(show_preview=True)

# Main loop
while True:
    # Get the latest detections
    detections = camera.get_detections()

    # Get the labels for reference
    labels = camera.get_labels()

    # Process each detection
    for detection in detections:
        label = labels[int(detection.category)]
        confidence = detection.conf

        # Example: Print when a cat is detected with high confidence
        if label == "cat" and confidence > 0.4:
            print(f"cat detected with {confidence:.2f} confidence!")

            # Start recording video to a timestamped file
            encoder = H264Encoder()
            output = CircularOutput()
            timestamp = int(time.time())
            filename = f"cat_detected_{timestamp}.H264"
            print(f"Recording to {filename}")
            camera.picam2.start_recording(encoder, filename)
            output.start()
            time.sleep(5)
            output.stop()  # was "output.stop", which never called the method

    # Small delay to prevent overwhelming the system
    time.sleep(0.1)
Next step is to identify other things (birds, people etc) and record, ultimately to maybe work towards an animal camera trap for safaris!
Cheers
Dennis
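For the "other things" step (birds, people etc), here is a minimal sketch of one way to check for several labels at once and avoid triggering a new recording on every loop pass. It relies only on the `.category` and `.conf` attributes and the label list used in the code above. The names `matching_labels`, `Cooldown` and `FakeDetection` are made up for illustration and are not part of the ai_camera library, and the exact label strings depend on the model's label file.

```python
import time
from dataclasses import dataclass


@dataclass
class FakeDetection:
    """Hypothetical stand-in for a detection; only .category and .conf are used."""
    category: int
    conf: float


def matching_labels(detections, labels, targets, min_conf=0.4):
    """Return (label, confidence) pairs for detections whose label is in targets."""
    hits = []
    for det in detections:
        label = labels[int(det.category)]
        if label in targets and det.conf > min_conf:
            hits.append((label, det.conf))
    return hits


class Cooldown:
    """Allow an action at most once every `seconds` seconds."""

    def __init__(self, seconds, clock=time.monotonic):
        self._seconds = seconds
        self._clock = clock  # injectable so the logic can be tested off the Pi
        self._last = None    # time of the last allowed action

    def ready(self):
        """Return True (and restart the timer) if enough time has passed."""
        now = self._clock()
        if self._last is None or now - self._last >= self._seconds:
            self._last = now
            return True
        return False
```

In the main loop this could be used as `hits = matching_labels(camera.get_detections(), camera.get_labels(), {"cat", "bird", "person"})`, followed by `if hits and cooldown.ready():` before starting a recording, so one animal sitting in frame gives one clip per cooldown period rather than one per loop pass.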
Hey @Dennis283326,
I don’t have much direct hands-on experience with this camera, but I think I know what the problem is.
The documentation for the AI Camera is really useful. This module is effectively the same as the old Pi Cam module, with an additional AI processor on-board (the IMX500). The topology differences can be seen below: a typical camera module is pictured left, the IMX500 module pictured right.
As seen in this image, the AI camera module has two outputs, while a typical module has only one. When using the picam2 module, you would assign a variable to the image data output from the module, which you could then record, modify, etc.
In your code above, you are assigning the module to a variable, but you are specifically assigning the Output Tensor to your variable “camera”.
Thus, by re-using “camera” for the encoder, you are effectively trying to record output tensors with the picam2 encoder.
Maybe try creating a new object for the raw image data, similar to what’s shown in the Picam2 module documentation.
Like I said previously, I haven’t really gotten hands-on with this module yet, so it’s possible this isn’t the case. I might be able to ask @Jaryd to confirm when he gets the chance.
Hope this helps!
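If the "camera already in use" error comes from opening a second Picamera2 object while the detector already holds the camera (an assumption, not confirmed in this thread), one thing to try is attaching an encoder to the Picamera2 instance the detector already exposes as `camera.picam2`. Below is a rough sketch of that idea. `record_clip` and its `make_encoder`, `make_output` and `sleep` parameters are hypothetical names added for illustration. `start_encoder`, `stop_encoder`, `H264Encoder` and `FileOutput` are picamera2 names. Whether the detector's camera configuration suits H264 encoding is untested here.

```python
import time


def record_clip(picam2, filename, seconds=10.0,
                make_encoder=None, make_output=None, sleep=time.sleep):
    """Record `seconds` of video from an already-running Picamera2 instance.

    The camera itself is neither started nor stopped, so the detector
    keeps running. Only an encoder is attached and then detached.
    """
    # Import picamera2 lazily so this file can be loaded on a non-Pi machine.
    if make_encoder is None:
        from picamera2.encoders import H264Encoder
        make_encoder = H264Encoder
    if make_output is None:
        from picamera2.outputs import FileOutput
        make_output = FileOutput

    encoder = make_encoder()
    output = make_output(filename)

    picam2.start_encoder(encoder, output)
    try:
        # Blocking wait: the detection loop pauses while the clip records.
        sleep(seconds)
    finally:
        # Always detach the encoder, even if the wait is interrupted.
        picam2.stop_encoder()
    return filename
```

Inside the detection loop this would be called as something like `record_clip(camera.picam2, f"cat_detected_{int(time.time())}.h264", seconds=10)`. Because the encoder is stopped after each clip, a later detection can start a fresh recording.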