A Chatbot on your Laptop: Phi-2 on Intel Meteor Lake
The impressive abilities of large language models (LLMs) come at the cost of significant computing power, which is seldom available on personal computers. Consequently, we have little choice but to deploy them on powerful bespoke AI servers, hosted on-premises or in the cloud.

Why local LLM inference is desirable
What if we could run state-of-the-art open-source LLMs on a typical personal computer? Wouldn't we enjoy benefits like:

Increased privacy: our data would not be sent to an external API for inference.
Lower latency: we would save network round trips.
Offline work: we could work without network connectivity (a frequent flyer's dream!).
Lower cost: we wouldn't spend any money on API calls or model hosting.
Customizability: each user could find the models that best fit the tasks they work on daily, and they could even fine-tune them or use local Retrieval-Augmented Generation (RAG) to increase relevance.
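To make the idea concrete, here is a minimal sketch of local inference with the Hugging Face transformers library, using the microsoft/phi-2 checkpoint from the Hugging Face hub. It assumes a recent transformers release with built-in Phi support, a local PyTorch install, and enough RAM for the 2.7B-parameter model; it is a plain-CPU illustration of the concept, not the optimized Meteor Lake setup discussed later in this post.

```python
# Minimal local inference sketch (assumes transformers >= 4.37 and torch are installed).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Default fp32 weights take roughly 11 GB of RAM;
# pass torch_dtype=torch.float16 to roughly halve that.
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain why running an LLM locally preserves privacy."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Everything here runs on the laptop itself: no prompt or completion ever leaves the machine, which is exactly the privacy and offline benefit listed above.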