Sesame believes in a future where computers are lifelike, with the ability to see, hear, and collaborate with us in ways that feel natural and human. With this vision, we're designing a new kind of computer, focused on making personal voice agents part of our daily lives. More details from Sequoia: https://www.sequoiacap.com/article/partnering-with-sesame-a-...
Our team brings together founders from Oculus and Ubiquity6, alongside proven leaders from Meta, Google, and Apple, with deep expertise spanning hardware and software.
Open Roles: https://jobs.ashbyhq.com/sesame
- ML Engineers
- Product Designers
- Product Managers
- iOS & Android Engineers
- ML Model Serving Engineer
- Embedded OS Architect
- Mechanical Engineer, Product Design
- Embedded Engineers
- Electrical Engineer
- Audio Systems Engineer
Human voices don't take 30 seconds to think, retrieve, research, and summarize a high-quality answer. Humans are calibrated in their knowledge: they know what they understand and what they don't, and they can converse in real time without bullshitting.
Frontier real-time-ish, LLM-generated voice systems are still plagued by 2024-era LLM nonsense, like the inability to count the Rs in "strawberry." [1]
I'd personally love a voice interface that, within the constraints of today's technology, takes the latency hit to deliver quality.
[1] https://www.instagram.com/reel/DTYBpa7AHSJ/?igsh=MzRlODBiNWF...