I have been looking forward to building something like this using open models: a voice assistant I can talk to while I am driving, as I have a long commute. I do use ChatGPT voice mode, and it works great for looking up information or having discussions. But I want it to do tasks like browsing the web, acting as a social media manager for my business, etc.
Solid work and great showcase, I've done a bunch of stuff with Kokoro and the latency is incredible. So crazy how badly Apple dropped the ball... feels like your demo should be a Siri demo (I mean that in the most complimentary way possible).
Thank you. This reminds me of a paragraph from the LatentSpace newsletter [0]
> The excellent on device capabilities makes one wonder if these are the basis for the models that will be deployed in New Siri under the deal with Apple….
[0] https://www.latent.space/p/ainews-gemma-4-the-best-small-mul...