Furthering my quest for complete knowledge of artificial intelligence on the Raspberry Pi, the natural next step was to investigate Pose Recognition (Human Keypoint Detection) and Face Masking (Facial Landmark Recognition) with the formidable Raspberry Pi single-board computer. Machine and deep learning have never been more accessible.
Face Masking is a computer vision method that precisely identifies and maps the geometry of your face, which can then be represented as dots and segments across all your features. This means the system knows exactly where your eyes are in relation to your eyebrows, or your nose in relation to your lips. Using very similar geometry-mapping principles, Pose Estimation expands on this by identifying the location of every key part of your body. I demonstrate how to set these systems up and how to edit the code so you can pull location data from them.
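As a rough illustration of that geometry mapping: landmark detectors such as MediaPipe report each keypoint as normalised (x, y) coordinates, which you scale by the frame size to get pixels. A minimal sketch, with made-up landmark positions (the coordinates and feature names here are illustrative, not real MediaPipe output):

```python
import math

def to_pixels(landmark, width, height):
    """Convert a normalised (x, y) landmark to pixel coordinates."""
    x, y = landmark
    return (int(x * width), int(y * height))

def feature_distance(lm_a, lm_b, width, height):
    """Euclidean distance in pixels between two normalised landmarks."""
    ax, ay = to_pixels(lm_a, width, height)
    bx, by = to_pixels(lm_b, width, height)
    return math.hypot(bx - ax, by - ay)

# Hypothetical normalised landmarks on a 640x480 frame
left_eye = (0.40, 0.35)
left_brow = (0.40, 0.30)
gap = feature_distance(left_eye, left_brow, 640, 480)  # vertical gap in pixels
```

That distance is exactly the kind of "eyes in relation to eyebrows" measurement the system gives you once landmarks are mapped.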
Hey all - I’ve looked through all the forums and posts and am just going around in circles. I have been able to get other things working, such as facial recognition using OpenCV.
I went through the tutorial (pose / hand tracking with OpenCV and MediaPipe).
But no matter what I do, I always get the following when trying to run the scripts (which are listed in the link above):
Traceback (most recent call last):
File "/home/pi/posetrack.py", line 3, in <module>
import mediapipe as mp
File "/usr/local/lib/python3.9/dist-packages/mediapipe/__init__.py", line 16, in <module>
from mediapipe.python import *
File "/usr/local/lib/python3.9/dist-packages/mediapipe/python/__init__.py", line 17, in <module>
from mediapipe.python._framework_bindings import resource_util
ModuleNotFoundError: No module named 'mediapipe.python._framework_bindings'
I’ve found other people who have had this error and tried every suggestion. I’ve reinstalled ALL the things.
Thank you aaaaaand nope. I’d say I am trying to do too much with one board.
In the meantime, I have been able to get examples of TensorFlow Lite pose estimation going, even if I can’t trigger them via Node-RED, but I really think it’s a case of dividing what I’m trying to do between Pi boards. I’ll try a fresh install on a clean one so as not to have any chance of conflicting dependencies.
Thank you again for the post! A shame it didn’t work, but I think it’s more of a ‘me jamming too much in there’ than an actual ‘Pi not being able to do it’ style of situation…
I had the same issue regarding the mediapipe import error. I was able to work around it by uninstalling mediapipe-rpi3 and mediapipe-rpi4, then using pip to just install mediapipe generally like this:
“pip install mediapipe”
I am sure this will come back to get me eventually, but for these examples it worked fine!
I tried your workaround with no success. I got the following (after completing the first stage: uninstalling both mediapipe-rpi3 and mediapipe-rpi4):
pip install mediapipe
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
ERROR: Could not find a version that satisfies the requirement mediapipe
ERROR: No matching distribution found for mediapipe
So I’m wondering how you managed to make these examples work on an RPi? They do work on my PC (Linux Mint 20.3), but I wish to make a small project with them on the RPi…
My RPi hardware is a Pi Model 3B V1.2, running Raspbian GNU/Linux 11 (bullseye). UPDATE: I’ve tried running this on the same machine with a different OS version (Raspbian GNU/Linux 10 (buster)) with SUCCESS (it’s slow, but OK), as was mentioned at the beginning of the article. But Buster is getting old - does anyone know how to make it run on Bullseye?
Glad to hear you have had success with Buster OS. Teams of people are no doubt working furiously on getting machine learnt systems to work and be stable on Bullseye OS and I’m sure they are getting very close. The Buster OS Pose Estimation and Face Landmark tracking versions that I have shown above are quite stable. I have more machine learning guides on the way which will ease the transition between Bullseye and Buster OS.
Hey Tim, brilliant code~
Do you know how to extract ROIs (regions of interest) like eye ROIs, a nose ROI, cheek ROIs, and a mouth ROI (normalised rectangular rendered images) using OpenCV & MediaPipe? Just a tip, not the entire code.
A lot can be learned directly from the Google MediaPipe website in regards to their machine learnt system. Their Face Mesh script identifies 468 3D face landmarks, so you can definitely unpack that and turn a group of close-knit landmark points into ROIs.
If you are interested in pinpointing those regions and nothing else (for instance, the eyes), check out the eye-tracking variant of their machine learnt system on the same site.
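To hint at how that unpacking might look: take the landmark indices that belong to one feature, find their bounding box in pixel space, and pad it into a rectangle. The indices and coordinates below are placeholders for illustration, not MediaPipe's real eye indices - a minimal sketch:

```python
def roi_from_landmarks(landmarks, indices, width, height, pad=5):
    """Bounding rectangle (x, y, w, h) in pixels around a subset of
    normalised (x, y) landmarks. `indices` selects the feature's points;
    `pad` adds a small margin, clamped to the frame edges."""
    xs = [landmarks[i][0] * width for i in indices]
    ys = [landmarks[i][1] * height for i in indices]
    x0 = max(int(min(xs)) - pad, 0)
    y0 = max(int(min(ys)) - pad, 0)
    x1 = min(int(max(xs)) + pad, width)
    y1 = min(int(max(ys)) + pad, height)
    return (x0, y0, x1 - x0, y1 - y0)

# Hypothetical: four landmark points around an eye on a 640x480 frame
mesh = [(0.30, 0.40), (0.35, 0.39), (0.40, 0.40), (0.35, 0.42)]
eye_roi = roi_from_landmarks(mesh, [0, 1, 2, 3], 640, 480)
```

With the full 468-point mesh you would pass the real index list for each feature instead of `[0, 1, 2, 3]`.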
Glad you responded~ I’ve actually searched among the MediaPipe pages these days. Although I don’t entirely understand what each line of code means, I gradually figured out that they use certain anchor points to define the ROIs, and I successfully drew my face segmentation ROIs with the same pipeline.
Now that I’m doing a performance art walk with a wearable device yet to be invented, my next step is to distribute these ROIs to the corresponding display screens; how to do that, and with what kind of Python code, is what I’m pondering now. So, if you have any tips to help me, please do share. Thank you!
Again, your tutorial has lent tremendous inspiration to my work. Thank you!
I love this idea; you could really get some wild, Dutch-angle, close-up videos of your target face output to the monitors. Turning a target ROI into a new preview window is definitely the next challenging hoop to jump through.
The closest code I’ve seen that does something similar is this script for an OAK-D Lite and Raspberry Pi system - Lossless Zoom by Luxonis. That Python script evaluates the footage from a 4K camera, identifies a human target as the ROI, and then outputs the ROI to a preview window. There are a lot of translatable ideas between your project and that system; however, you’re going to have to dive in and figure out how to modify it appropriately.
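The core move in that kind of script, lifting the ROI out of each frame into its own preview image, is just an array slice. A minimal sketch using plain nested lists so the idea is visible (with NumPy/OpenCV the equivalent would be `frame[y:y+h, x:x+w]`; the dummy frame values are made up):

```python
def crop_roi(frame, roi):
    """Return a copy of the (x, y, w, h) region of a row-major frame."""
    x, y, w, h = roi
    return [row[x:x + w] for row in frame[y:y + h]]

# Dummy 4x6 frame of brightness values (value = row * 10 + column)
frame = [[r * 10 + c for c in range(6)] for r in range(4)]

# Crop a 3-wide, 2-tall patch starting at column 2, row 1,
# as if it were a detected eye ROI destined for its own screen
eye_patch = crop_roi(frame, (2, 1, 3, 2))
```

Each cropped patch could then be sent to its own display or preview window.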
Hey Tim, sorry for the late response. I’ve been digesting your last note recently. I’m just a beginner in the Python and computer vision fields, so I definitely need some time to catch up with the new revelations here.
Let me elaborate on my ongoing work. In short, I intend to make an ‘amorphous’ mask to disguise myself from the surveillance systems around us, and I thought: why not use pedestrians’ faces as my transformation material? I can switch my facial features entirely or partially to someone else’s, disguising myself from CCTV by confusing it. So I need a camera to film while I’m walking, code to capture passers-by’s faces as my material, and a way to output the amorphous features to separate screens, like a cyber protestor.
After comparing the OAK-D Lite camera to the 6mm and 16mm cameras, I may choose the 16mm. The OAK-D absolutely has the largest resolution and least distortion, while the 6mm is just the opposite, but appearance also matters (I don’t want the style to be too modern; I want to keep it in a junky style), thus the 16mm ‘single-eye look’ camera seems to fit. The code from the Luxonis script and the one the artist Liam recommended seems to solve the transmitting part. And for screens, I have a question: how many micro-monitors can function simultaneously when connected to one Pi 4B?
I’ll be very glad to share it with you when I finish. Thank you for your warm help all along.
I have an impression of him; his work is interesting, and he uses Processing to make these animations. What you are talking about should be the one he often uses as his avatar, but unfortunately I didn’t find it either.
Thank you for your accurate recommendation~
These approaches are very COOL~~~ I like them, very inspiring and encouraging! Though I want to be more adversarial in appearance; you know, strange enough for a natural face but seemingly okay to CCTVs. I want to use the faces of passers-by directly as the material, but I want to add some machine learning algorithms to keep the faces in constant, slow deformation. I imagine that the protection devices of the future must be light and convenient. They may be lasers carrying anti-algorithms, or a subcutaneous camouflage tissue. I hope my work can arouse enough alertness; I hope to use existing techniques within their limitations and to inspire future works. Thanks!