Dispatches from the Future: Meta Connect 2024 Highlights from Karthik Kannan, CTO at Envision
Introduction
The past week has been eventful, a bit like stepping out of a time machine.
I just wrapped up my first-ever Meta Connect in sunny San Francisco. It's a defining moment for Envision: we're partnering with AI at Meta to bring AI to over 300 million blind and low-vision people across the world.
That's not all: there was a slew of announcements, from open-source AI to mixed-reality glasses, that felt like the beginning of a new era.
It's hard to capture the excitement around the event, but here are my main highlights!
My Agenda
Day 0 (24/09):
Meta kicked it off with an exclusive reception, gathering the Llama team and cutting-edge AI developers. The room buzzed with innovation, from AI for Indian farmers to AWS-challengers slashing intelligence costs.
The unifying theme at the dinner was how much innovation open-source AI fosters. Open-source models can be fine-tuned for niche use cases, often at a tenth of the cost of deploying proprietary AI. This wasn't just Meta's vision: the Amazon, Salesforce, and NVIDIA executives I met at the dinner echoed the same commitment to open source.
Big thanks to Prince Gupta and Helen Suk, our partners on the Project Aria and Llama teams, for the invite. What exactly are our partnership plans? Stay tuned!
Day 1 (25/09):
Sunrise saw us joining the eager queue for the 10 AM keynote. We had front row seats to the future unfolding.
At Envision, we've been working on smart glasses for people who are blind or have low vision for over seven years. We've always said that the best way to experience AI is through smart glasses, and that reality is starting to unfold right now.
A highlight: AI at Meta spotlighted our work during the main Llama keynote. They showcased how Envision harnesses multimodal Llama to push accessibility boundaries, featuring us in a detailed blog post.
On the evening of Day 1, I presented a technical paper outlining our work with Llama.
It was a personal stretch: learning to craft a robust academic poster in just a week. We got a tremendous response from the Connect attendees who dropped by, including folks from AWS and Citibank interested in bringing Envision into the workplace.
Day 2 (26/09):
Day 2 brought a more relaxed vibe. I spent time connecting with new acquaintances from the conference and exploring the VR demos. Despite being firmly in the 'in-person board games' camp, I was impressed by the quality of VR entertainment. The immersive experiences on display highlighted the technology's rapid advancement, blending realism with imagination in compelling ways.
The Big Takeaways
🕶️ Project Orion
Even though we've been tracking the evolution of these glasses for a while, I still find myself amazed. Orion is truly a marvel that we'll someday take for granted. These glasses are fantastic for accessibility because they offer an unobtrusive, hands-free way to experience independence.
One of the biggest challenges is perfecting the display while keeping the glasses lightweight.
Imagine this: a blind person able to interact with surfaces like touchscreens that were once inaccessible, guided by audio beacons overlaid through AR glasses.
Even if these glasses are still years away, the journey to create them is already delivering so many useful advances for the blind and low vision community. Take Envision Glasses, for example—they're a fantastic piece of hardware that's both easy to use and powerful enough to make a real difference for users today. It's only onwards and upwards from here.
🦙 Llama 3.2 1B & 3B
Meta's latest Llama release introduces multimodal capabilities, combining both vision and text components, a significant leap over the text-only models.
Being open source (open weights) with a permissive license means developers can fine-tune it however they want. For me, though, the standout feature is the smaller 1B and 3B models, which perform incredibly well and can be deployed directly on devices like smartphones and laptops.
A powerful trend in AI is that the cost of deploying models is dropping to near zero. Sam Altman's proclamation that "intelligence will be too cheap to meter" is quickly becoming reality.
The 1B and 3B Llama models can be deployed at essentially no cost, yet they enhance the entire AI pipeline by performing small, specialized tasks swiftly. And let's not forget: compact models like Meta's Llama and Google's Gemma are huge wins for privacy, since inference can run entirely on-device.
Envision plans to tap into Llama 3.2's vision capabilities to build specialized models that can, for example, better understand document layouts, including the charts, figures, images, and tables they contain.
What's also exciting is that Llama 3.2 seems to be the first series of models trained on images from smart glasses, making it a perfect fit for Envision.
One of the most compelling visuals about AI you'll see this year is a graph showing how the cost of intelligence, through a combination of hardware advances and open-source development, has dropped by a factor of 240.
The Random Bits
It's not all work at tech conferences—I enjoy the random bits just as much.
🍪 This year, I indulged in Llama-themed and AI-generated cookies, and we even had a standup comedy show entirely orchestrated by Llama and two NYC comics!
🖥️ Meta's shuttles are coder-friendly, with monitors and HDMI cables lying around. Want to keep working on your way to work? Just plug in, connect to the local Wi-Fi, and start coding... like me 🤣
✨ I absolutely loved the Meta swag this year. Attendees could ask Meta AI to generate an image based on any prompt and have it printed on a tote bag. My prompt was two toddler AIs, one representing the East and one the West, hiking up a Californian landscape together. It wasn't any different from what I've seen with other image generation tools, honestly, but it's a nice conversation starter 🤷
Looking forward
2025 will be the year of AI on glasses: the culmination of the past seven years we've dedicated at Envision to championing AI on glasses as the ultimate accessibility platform. Meta Connect 2024 marks the beginning of a tectonic shift, with Project Orion and Llama 3.2 standing out as game-changers for me.
Over the week, I connected with many more folks in the Bay Area. There's an electrifying energy in the air, a sense of great possibilities. As we stand on the brink of another revolution, it feels like we're not just witnessing the future—we're crafting it.