Show HN: Real-time AI (audio/video in, voice out) on an M3 Pro with Gemma E2B

(github.com)

78 points | by karimf 15 hours ago

5 comments

zerop 4 minutes ago
I have been looking forward to building something like this using open models: a voice assistant I can talk to while I am driving, since I have a long commute. I do use ChatGPT voice mode, and it works great for querying information or having discussions. But I want it to do tasks like browsing the web, acting as a social media manager for my business, etc.
dvt 3 hours ago
Solid work and great showcase. I've done a bunch of stuff with Kokoro and the latency is incredible. So crazy how badly Apple dropped the ball... feels like your demo should be a Siri demo (I mean that in the most complimentary way possible).
- karimf 3 hours ago
  Thank you. This reminds me of a paragraph from the LatentSpace newsletter [0]
  > The excellent on device capabilities makes one wonder if these are the basis for the models that will be deployed in New Siri under the deal with Apple….
  [0] https://www.latent.space/p/ainews-gemma-4-the-best-small-mul...
k-almuraee 53 minutes ago
Amazing, love your work.
techpulse_x 39 minutes ago
[dead]