Camera Question

Does the RPi camera allow access by the RPi4 to the pixel data or does it only allow a video screen to be connected to the RPi4?

If so, in what form is the RGB pixel data?


Hey Tom,

The camera is only a sensor, so you would want to use a few libraries (likely in Python on a Pi) to pull the pixel data into NumPy arrays, for example, so you can use them in your programs. There are a few pages and discussions about this online, such as this StackExchange post below:

Please let us know if you have any other questions
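
To give a rough idea of what "pulling the pixel data into an array" can look like, here is a minimal sketch. It assumes the `picamera2` library (not mentioned above, just one common option on recent Raspberry Pi OS) and assumes each frame comes back as a height x width x 3 block of 8-bit values. Channel order depends on the format you request, so check the library's manual rather than trusting the format name. The two helper functions are hypothetical names for illustration and only show how a tightly packed buffer is laid out.

```python
def capture_frame(size=(640, 480)):
    """Grab one frame from the Pi camera as a NumPy array.

    Assumption: runs on a Pi with a camera attached and picamera2 installed.
    """
    # Imported inside the function so the helpers below work without a camera.
    from picamera2 import Picamera2

    picam2 = Picamera2()
    config = picam2.create_still_configuration(
        main={"format": "RGB888", "size": size}
    )
    picam2.configure(config)
    picam2.start()
    try:
        # Expected shape: (height, width, 3), one 8-bit value per channel.
        return picam2.capture_array()
    finally:
        picam2.stop()


def pixel_offset(x, y, width, channels=3):
    """Byte offset of pixel (x, y) in a row-major, tightly packed 8-bit buffer."""
    return (y * width + x) * channels


def get_pixel(buffer, x, y, width, channels=3):
    """Return the channel values of one pixel as a tuple of ints (0 to 255)."""
    start = pixel_offset(x, y, width, channels)
    return tuple(buffer[start:start + channels])
```

With a NumPy array you would not need the helpers at all: `frame[y, x]` gives the three channel values for that pixel directly.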


Thanks Bryce. I never would have found that


No worries Tom,

If you run into any issues with your code, feel free to throw it onto the forum here between some tildes and we’ll see what ideas the community can come up with to help out :grinning_face_with_smiling_eyes:




# That way your code ends up looking like this

def helloWorld():
    print("Hello World!")


All the best with your project! Out of curiosity, what are you using the pixel values from the camera for? Is it a computer-vision project for segmentation masks, bounding boxes, etc.?


I want to run code conditional on detected image elements, i.e. detect object X, run code Y. The object can be a physical thing or text.

Photos seem readily available in RPi in PDF form. Can RPi do PDF to text conversion as an alternative to pixel data? Can it crop a PDF image?

I need to find that RPi can do the necessary things before embarking on the project of a lifetime.


Hi Tom,

I’d start with the OpenCV library. Tim has a guide on using it for facial recognition, but you should be able to adapt it or use it to learn before jumping into object recognition:

As for text recognition, this is usually called OCR, and Tesseract is an OCR engine, often used alongside OpenCV, that can do it well from what I hear:

You should show us what you plan to make, I’m sure it’s interesting!
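
As a rough illustration of the "detect X, run code Y" idea with text, here is a minimal sketch. It assumes the `opencv-python` and `pytesseract` packages plus the Tesseract program itself are installed. `read_text` and `run_matching_actions` are hypothetical helper names, and the keyword matching is just a plain substring check, not anything clever.

```python
def read_text(image_path):
    """OCR an image file and return whatever text Tesseract finds.

    Assumption: opencv-python, pytesseract and the Tesseract binary are installed.
    """
    # Imported here so the dispatch helper below works without these packages.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    # Tesseract often does better on a greyscale image.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return pytesseract.image_to_string(gray)


def run_matching_actions(text, actions):
    """Call every action whose keyword appears in the text.

    `actions` maps a keyword to a function taking no arguments.
    Returns the list of keywords that matched, in dictionary order.
    """
    lowered = text.lower()
    matched = []
    for keyword, action in actions.items():
        if keyword.lower() in lowered:
            action()
            matched.append(keyword)
    return matched
```

On a Pi this could be used as `run_matching_actions(read_text("photo.jpg"), {"stop": halt_motor})`, where `photo.jpg` and `halt_motor` are placeholders for your own file and function. The same dictionary approach works for object labels coming out of a recognition model.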


Hey Tom,

It’ll certainly be a bit of a challenge, and I’m personally not sure that a Pi is the best hardware to run this on. You could rig it up with a camera and run a script which takes advantage of pre-trained models for image recognition and text conversion, and it should all be possible on a Pi, but it will be majorly sub-optimal.

That being said, to ensure that you get decent performance, and if you do need to train any models for image recognition of specific objects, you’re much better off using a machine which includes a GPU that TensorFlow or similar libraries can take advantage of. You want to avoid running processes like this on a relatively slow, low-power CPU such as the one on a Pi. You may be able to get away with using a Compute Module with an external GPU, but that begins to introduce other problems too.

Personally, I think the Coral boards and their tutorials will be much more suitable for your image recognition projects, although as far as I am aware Core Electronics doesn’t have stock of these boards at the moment. I’ve linked some info to get started with it for you below:

Let us know how you go with it! I’m curious to see how you solve it! :slightly_smiling_face:


Hey Bryce,

Do you think one of the LattePanda or Nvidia Jetson boards would work for something like this?
It wouldn’t be as tuned for the learning portion of the algorithm, but it would be good for use in any following projects that need more general computing power.


Hey Liam,

Yes, the Jetson Nano v3 may do the trick, and I’m sure there are other dev boards you could use too. Once the model is trained on another machine, it shouldn’t be too big a problem for the Pi, Nano, Delta, etc. to run the script and identify objects, and you likely won’t notice too big a difference in performance if it’s running without having to handle a GUI for the user as well.

That being said, you’ll notice a steep drop-off in frames per second on a machine without a ‘decent’ GPU, depending on where the source images are coming from and how quickly they need to be identified. If you’re only using the board for that one specific application, something built with the end goal in mind, such as a Coral board, would be a much better option.

The tutorial above elaborates on this further, but in terms of getting optimal performance and the best value out of your hardware, I’d personally say the Coral would be the best way to go.
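
If you want to put a number on the frames-per-second difference between boards, a minimal sketch like the one below could help. `measure_fps` is a hypothetical helper, and `process_frame` stands in for whatever recognition function you end up using. It only times the processing step, not the camera capture.

```python
import time


def measure_fps(process_frame, frames):
    """Return how many frames per second `process_frame` gets through.

    `frames` is any iterable of frames; `process_frame` is called once per frame.
    """
    start = time.perf_counter()
    count = 0
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = time.perf_counter() - start
    # Guard against a zero-length timing window on very fast runs.
    return count / elapsed if elapsed > 0 else float("inf")
```

Running the same script and the same saved frames on each board gives a like-for-like comparison.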


I am taking your suggestions and shall follow through.

I am struck by how the human eye operates as an advanced pattern recognition device, much different from horizontal line scanning. The eye focusses on a portion and tries to pattern match. It detects object limits then determines what is in the object. It is more of a circular or radial scanner with variable radius. Text is characters surrounded by space. It’s as though the eye has a library of quick recall patterns in various sizes. If a pattern is matched to, say, 90%, it’s good enough.


You’re welcome Tom, all the best with it!

Hey Tom,

If you were after some resources for how it works I would have a look at this 3B1B playlist: