I've been using Substack for a few months now and this is BY FAR the most interesting article I've read. It does a great job explaining overlooked points and links to resources that dive deeper into small language models (I think this opens a new world). Seeing all the Mac mini photos floating around Twitter it felt like people were just going numb and following the crowd (which could be true). But this piece made me realize Apple's moat in AI.
There’s a lot to be said for “good enough” models. A company might get a lot of utility out of a low-cost LLM that can answer questions about the HR manual or summarize engineering diagrams. Especially if attention is paid to training and preparing the model for those specific tasks. It doesn’t need to quote bridge specifications in the voice of Shakespeare to be useful.
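As a rough illustration of what "preparing the model" for a specific task can look like, here is a minimal sketch of grounding a small local model on a document excerpt before asking it questions. It assumes an Ollama server running locally on its default port; the model name and the manual text are placeholders, not a recommendation.

```ts
// Minimal sketch: grounding a small local model on a document excerpt.
// Assumes an Ollama server on its default port (http://localhost:11434);
// the model name and the manual text below are hypothetical placeholders.
const hrManual = "Full-time employees accrue 1.5 vacation days per month ..."; // excerpt

async function askHrManual(question: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3.2:1b", // any small, cheap local model
      stream: false,        // request one complete JSON response
      messages: [
        // Constrain the model to the supplied text instead of its own priors.
        { role: "system", content: `Answer using only this HR manual:\n${hrManual}` },
        { role: "user", content: question },
      ],
    }),
  });
  const data = await res.json();
  return data.message.content; // non-streamed chat responses carry one message
}

askHrManual("How many vacation days do I accrue each month?").then(console.log);
```

No Shakespeare required: a small model plus the right context is often enough for this kind of lookup.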
Yeah I think this is basically right. My guess is that most tasks don't require frontier-level LLMs.
I like this take. I’ve been so swept up in SF and the next new thing that I hadn’t thought much about edge or on-device inference, let alone this angle.
Appreciate the compliment, thanks!
I am betting heavily on this thesis with my own application, which uses AI on-device, and I’m going to link to your work on my marketing site - thank you
A dedicated page centered on your article is already live at https://www.almostrealism.com/philosophy/ - I think this really helps my marketing story. Much appreciated
Glad you found the piece to be useful!
I’ve been recommending your work a lot, actually: I find that one of the biggest gaps I encounter when talking with people is that it’s so rare for someone to understand both finance and technology well enough to see what’s going on right now with AI spending.
Appreciate that! I agree, most people do not seem to understand both the financial and technical aspects of AI.
Downloading your app right now
Great article, I enjoyed it.
I think edge inference is gonna be real. There’s massive room for efficiency improvements within all these models, and people are starting to look at model efficiency. That trend helps Apple, I think.
Thanks. Agree that edge inference will be real. You will end up with edge-based inference for most tasks and cloud-based for the rest. But I do not think inference is a zero-sum game. In other words, edge-based inference won't displace or reduce cloud-based inference. Rather, frontier-level inference will remain in the cloud and continue to grow, at the same time that edge-based inference *also* grows. So the conclusion here isn't that Apple's edge-based inference steals demand from, say, Google's cloud-based inference. Rather, it's that *overall* inference demand grows both at the edge and in the cloud.
Regarding the topic of the article, your point about Apple's on-device inference strategy is so thought-provoking. What if this fundamentally changes how we interact with personal AI, prioritising privacy and efficiency above all?
Excellent point of view!!!
The "AI" we have today is still just to LLM. An interesting toy. Mere words. Bereft of real world experience, sensory inputs, et al.
Great article, thanks for sharing!
I am betting my farm on local-first platforms and applications. I see real opportunity in converting many outdated native Windows/Mac apps (Excel included) into zero-latency local-first apps with cloud collaboration and on-device AI.
My bet is on local-first browser apps with local AI running on WebGPU.
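For what it's worth, that stack is already workable today. Here is a minimal sketch using the @mlc-ai/web-llm library, which runs models on WebGPU entirely in the browser; the model id and options shown are assumptions, so check the library's prebuilt model list before relying on them.

```ts
// Minimal sketch: an in-browser, on-device chat completion over WebGPU
// via @mlc-ai/web-llm. The model id is an assumption; pick any id from
// the library's prebuilt model list.
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Weights are fetched once and cached; inference then runs locally on the GPU.
  const engine = await CreateMLCEngine("Llama-3.2-1B-Instruct-q4f16_1-MLC", {
    initProgressCallback: (report) => console.log(report.text), // surface download progress
  });

  // The API mirrors the OpenAI chat-completions shape.
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Summarize this page for me." }],
  });
  console.log(reply.choices[0].message.content);
}

main();
```

The appeal for local-first apps is that nothing leaves the device after the initial model download, which lines up neatly with the privacy point made upthread.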