Raspberry Pi AI Camera Quick-Start Guide

Hello forum goers, we have a short and sweet guide on getting started with the Raspberry Pi AI Camera, a nice little collaboration project between Sony and Raspberry Pi. It is a camera module with an IMX500-based accelerator chip for computer vision tasks. It is really fun to get going and keeps the computer vision processing off your Pi. The guide also has some Python code at the bottom to help you use the camera in your own scripts (which can be a bit of a challenge given how fresh the software for all of this is, so we put together a nice little library for it): “Raspberry Pi AI Camera Quick-Start Guide”



In this guide, we will be getting the Raspberry Pi AI camera up and running as quickly as possible, as well as taking a look at how to get started with it in your own projects.
The Raspberry Pi AI camera is a unique and interesting piece of hardwar…



Credit to @Jaryd for packaging this all together into a simple process.

That camera is wild. Import a library and all of a sudden you're labeling objects in a while loop. :exploding_head:


Thank you very much @Pixmusix, but a lot of the hard work was done by RPI themselves, I just piggybacked off it and adapted their 200-line Python demo script into a simple library.

It is amazing though that you can plug in a single piece of hardware and run object recognition in a while loop!


What are its advantages over RPI camera V2?

Hey lia!

In terms of the actual camera sensor itself, it’s 12.3 vs 8 megapixels and can record at a slightly higher resolution. But the main advantage of this is that it has the IMX500 AI accelerator chip which is optimised for computer vision tasks like object detection and pose estimation.

While you can get similar performance on a Pi 5 without the AI chip, the AI Camera handles all the heavy processing itself, which frees up the Pi to do other tasks. It's also a bit easier to get going!

-Jaryd


Hey Jaryd - a great introduction, and your tutorial and code are understandable and work seamlessly.
I’m trying to use the AI Camera to log when pesky cats are around. The cat ID works great and I would like to save a 10s video when one is detected. When I try this I get a conflict (camera already in use). How do I work around this without getting too complex?

Thanks
Dennis

Hi Dennis,

Welcome to the forum! How does your code currently look? It should be some small tweaks to get it recording.

Hi Jack

Thanks for the offer to help

I have it working with some probably janky code using the picamera2 H264 encoder and CircularOutput

from ai_camera import IMX500Detector
import time
from picamera2.encoders import H264Encoder
from picamera2.outputs import CircularOutput

camera = IMX500Detector()

# Start the detector with preview window
camera.start(show_preview=True)

# Main loop
while True:
    # Get the latest detections
    detections = camera.get_detections()
    
    # Get the labels for reference
    labels = camera.get_labels()
    
    # Process each detection
    for detection in detections:
        label = labels[int(detection.category)]
        confidence = detection.conf
        
        # Example: Print when a cat is detected with high confidence
        if label == "cat" and confidence > 0.4:
            print(f"cat detected with {confidence:.2f} confidence!")
            
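            # Record a short video clip of the detection
            encoder = H264Encoder(repeat=True)
            output = CircularOutput()
            encoder.output = output
            timestamp = int(time.time())
            filename = f"cat_detected_{timestamp}.H264"
            # The camera is already running, so attach an encoder to it
            # instead of calling start_recording(), which starts the camera again
            camera.picam2.start_encoder(encoder)
            output.fileoutput = filename
            output.start()
            time.sleep(5)
            output.stop()  # close the file
            camera.picam2.stop_encoder()
            print(f"Video saved as {filename}")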

    
    # Small delay to prevent overwhelming the system
    time.sleep(0.1)

Next step is to identify other things (birds, people etc) and record, ultimately to maybe work towards an animal camera trap for safaris!
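
One way that next step could be sketched: keep a small table of the labels worth recording, each with its own confidence threshold, plus a cooldown so the same animal does not trigger a new clip on every pass of the loop. The label strings, thresholds and the `should_record` helper below are all hypothetical; the actual label names depend on what `camera.get_labels()` returns for your model.

```python
import time

# Hypothetical settings: label -> minimum confidence needed to trigger a recording.
TARGETS = {"cat": 0.4, "bird": 0.5, "person": 0.6}
COOLDOWN_S = 30  # minimum gap in seconds between recordings of the same label


def should_record(label, confidence, last_recorded, now=None,
                  targets=TARGETS, cooldown=COOLDOWN_S):
    """Decide whether this detection should start a new recording.

    last_recorded maps label -> time of the last recording and is
    updated in place whenever this function returns True.
    """
    now = time.time() if now is None else now
    threshold = targets.get(label)
    # Ignore labels we are not tracking and detections below the threshold
    if threshold is None or confidence <= threshold:
        return False
    # Ignore detections that arrive too soon after the last recording
    if now - last_recorded.get(label, float("-inf")) < cooldown:
        return False
    last_recorded[label] = now
    return True
```

In the main loop this would replace the `if label == "cat" and confidence > 0.4:` check, with one `last_recorded = {}` dictionary created before the loop starts.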

Cheers
Dennis

Hey @Dennis283326,

I don’t have much direct hands-on experience with this camera, but I think I know what the problem is.

The documentation for the AI Camera is really useful. This module is effectively the same as the old Pi Cam module, with an additional AI processor on board (the IMX500). The topology differences can be seen below: a typical camera module is pictured left, the IMX500 module pictured right.

As seen in this image, the AI Camera module has two outputs, while a typical module has only one. When using the picamera2 module, you would assign a variable to the image data output from the module, which you could then record, modify, etc.

In your code above, you are assigning the module to a variable, but specifically assigning the Output Tensor to your variable “camera”.

Thus, by re-using “camera” for the encoder, you are effectively trying to record output tensors with the picamera2 encoder.

Maybe try creating a new object for the raw image data, similarly to what’s seen in the Picam2 module documentation.

Like I said previously, I haven’t really gotten hands-on with this module yet, so it’s possible this isn’t the case. I might be able to ask @Jaryd to confirm when he gets the chance.

Hope this helps!


Trying this on a Pi 5 8 GB in January 25.

The console commands work perfectly, but when I run the demo code I get this error:

Traceback (most recent call last):
  File "/home/nephi/Desktop/demo.py", line 4, in <module>
    camera = IMX500Detector()
             ^^^^^^^^^^^^^^^^
  File "/home/nephi/Desktop/ai_camera.py", line 17, in __init__
    self.imx500 = IMX500(model_path)
                  ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/picamera2/devices/imx500/imx500.py", line 321, in __init__
    self.__set_network_firmware(os.path.abspath(self.config['network_file']))
  File "/usr/lib/python3/dist-packages/picamera2/devices/imx500/imx500.py", line 594, in __set_network_firmware
    raise RuntimeError(f'Firmware file {network_filename} does not exist.')
RuntimeError: Firmware file /usr/share/imx500-models/imx500_network_yolov8n_pp.rpk does not exist.

Hey Nephi, welcome to the forums!

It seems like the model the code is trying to use is missing. When we do the console command version we use the MobileNet model, but the demo script uses YOLOv8n.

I did a fresh install and it is missing there as well. After a bit of digging, it seems that Raspberry Pi has recently stopped distributing the model due to a licensing issue (I think the YOLO models are in a weird semi-open-source, not-for-commercial-use state). We can change the library code to use MobileNet, but YOLO is a far more powerful and capable model.

I grabbed the Yolo model from a previous installation and you can download it here.

Unzip the model from the file and paste it into your /usr/share/imx500-models/ folder, where you will find all the other .rpk models. Run the demo script again and everything should work fine now!
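
To confirm the model landed in the right place before running the demo, a quick check along these lines may help. It is only a sketch: the folder and file name are taken from the error message above, and the helper names `list_models` and `model_installed` are made up for illustration.

```python
from pathlib import Path

# Folder and file name taken from the RuntimeError in the traceback above.
MODEL_DIR = Path("/usr/share/imx500-models")
YOLO_MODEL = "imx500_network_yolov8n_pp.rpk"


def list_models(model_dir):
    """Return the sorted names of all .rpk model files in model_dir."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        return []
    return sorted(p.name for p in model_dir.glob("*.rpk"))


def model_installed(model_name, model_dir=MODEL_DIR):
    """Return True if the named model file is present in model_dir."""
    return (Path(model_dir) / model_name).is_file()


if __name__ == "__main__":
    if model_installed(YOLO_MODEL):
        print(f"{YOLO_MODEL} found, the demo script should be able to load it.")
    else:
        print(f"{YOLO_MODEL} is missing. Models currently installed:")
        for name in list_models(MODEL_DIR):
            print(" ", name)
```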

Thank you for spotting this, we will update the guide!

Cheers,
Jaryd

Thanks Jaryd, appreciate the response, I too now have it working.

I am looking at pulling the x, y, h, w of the boxes on the screen. I can see that the library file uses them to draw each box. Where would I find the documentation with details about exporting that info?

thanks in advance

I am not too familiar with extracting the position with the AI camera, but it should be similar to how other YOLO model setups do it. In the demo script it should be in the detections variable that we get in this line here:

# Main loop
while True:
    # Get the latest detections
    detections = camera.get_detections()

When we want to get the category of a detection we use:

detection.category

And the confidence with:

detection.conf

I would whack a print line in at the beginning there, because you will likely find the coordinates you are looking for in that structure:

# Main loop
while True:
    # Get the latest detections
    detections = camera.get_detections()

    print(detections)

Have a dig around in there and you should find what you are after! I found dealing with the coordinates to be a tad confusing, but an LLM like ChatGPT helped out a lot!
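
If `print(detections)` only shows object addresses, `print(vars(detection))` on a single detection lists its attributes instead. As a rough sketch of pulling the numbers out once you know where they live: the code below assumes each detection has a `box` attribute holding (x, y, width, height) in pixels, which is how Raspberry Pi's own demo script stores it. That attribute name has not been confirmed for this library, so check it against what the print shows. `box_of` and `centre_of` are made-up helper names, and the fake detection is only there so the example runs without a camera.

```python
from types import SimpleNamespace


def box_of(detection):
    """Return (x, y, w, h) for a detection, or None if it has no box.

    ASSUMPTION: the detection exposes a `box` attribute holding
    (x, y, width, height) in pixels. Print a real detection to confirm.
    """
    box = getattr(detection, "box", None)
    if box is None:
        return None
    x, y, w, h = (int(v) for v in box)
    return x, y, w, h


def centre_of(box):
    """Return the centre point of an (x, y, w, h) box."""
    x, y, w, h = box
    return x + w / 2, y + h / 2


if __name__ == "__main__":
    # Stand-in for one item from camera.get_detections()
    fake = SimpleNamespace(category=15, conf=0.82, box=(100, 50, 200, 120))
    print(box_of(fake))             # (100, 50, 200, 120)
    print(centre_of(box_of(fake)))  # (200.0, 110.0)
```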

Cheers,
Jaryd