Hi Jaryd, Of all the show and tell videos out there, those created by Core are among the best. So having whetted our appetite for the AI HAT +2 running a VLM... will you guys be doing a show and tell on setting one up any time soon? I have a small group of final-year degree students doing a group project to make an aid for a visually impaired person, and I was going to point them in the direction of the AI HAT +2. Cheers from the UK!
The software support for VLMs has been a bit rough at launch. The original AI HATs were a bit rough like this as well, and it took a few months before they became really polished. Using object detection and LLMs on the AI HAT 2 is already super easy and straightforward; we're just waiting for VLM support to catch up.
Because of that, it might be a little while before we make any VLM guides for it, as the current method of getting them going is a bit janky and will likely break when an official method is released soon.
Until then, we do have a guide on using Moondream - a very lightweight VLM that can run on the Pi's CPU. It is nowhere near as fast as the AI HAT running more powerful VLMs, but there is an option to use cloud processing on a very generous free plan if that works for you. If not, for simple yes or no answers, you can get the local processing time down to about 8 seconds.
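If it helps your students get a feel for it, here's a rough sketch of what querying Moondream from Python looks like - treat the exact calls, model filename, and API key setup as assumptions on my part and follow the guide for the tested steps:

```python
# Rough sketch only - see our Moondream guide for the exact, tested code.
# Assumes: pip install moondream pillow, plus a quantised model file
# downloaded locally (the filename below is just an example).
import moondream as md
from PIL import Image

# Local CPU inference (slow on a Pi, several seconds per answer)...
model = md.vl(model="./moondream-2b-int8.mf")
# ...or swap to the cloud free tier instead:
# model = md.vl(api_key="YOUR_API_KEY")

image = Image.open("frame.jpg")  # e.g. a frame grabbed from the Pi camera
result = model.query(image, "Is there a door directly ahead? Answer yes or no.")
print(result["answer"])
```

Keeping the questions to short yes/no prompts like that is how you get the response time down to the ~8 seconds mentioned above.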