A rival lab releases a model small enough to run on a USB stick
Halcyon Labs just shipped a 700-megabyte language model that boots straight off a flash drive — no install, no cloud, no trace left behind.

Halcyon Labs just shipped a 700-megabyte language model that boots straight off a flash drive — no install, no cloud, no trace left behind.

The race in AI has spent years pointed in one direction: bigger. Bigger models, bigger clusters, bigger bills. Halcyon Labs just released something pointed the other way, and it’s the more interesting move. The lab’s new model, called Pocket, weighs in at around 700 megabytes — small enough to fit on a cheap USB stick with room to spare, and built to run directly off it without ever touching the host machine’s storage.
Plug the drive into a laptop, launch the included runner, and you get a working AI assistant in seconds. Pull the drive out and nothing remains behind: no install, no files, no history. For a category of users — journalists, travelers, anyone working on a borrowed or locked-down computer — that “leaves no trace” quality is the whole pitch.
Pocket is roughly a 1-billion-parameter model, heavily quantized, and Halcyon is refreshingly honest about what that buys you. This is not a model that will reason through a hard problem or write a long, coherent essay. It’s a model for the small, frequent tasks: summarizing a document, cleaning up a paragraph, answering a quick factual question, drafting a reply. In the lab’s own demos it runs at a usable clip on an ordinary laptop with no dedicated AI hardware, leaning on the CPU alone.

That CPU-only design is the clever part. Most on-device models still want a capable chip or a neural accelerator to feel responsive. Pocket was built to assume neither, which is why it can run off a flash drive on basically any modern computer. The trade is obvious — it’s slower and less capable than the on-device models that demand better hardware — but it works everywhere, and “works everywhere” is rare in this space.
Halcyon released Pocket under an open license, with the model weights and the portable runner both freely downloadable, and the company has been clear that it sees small, local, private models as a deliberate counterweight to the cloud-scale arms race rather than a consolation prize. It’s a pointed contrast with the labs spending billions on data centers: while they chase the frontier, Halcyon is betting that a model you can carry in your pocket and run anywhere, with nothing phoning home, is its own kind of frontier. On the evidence of Pocket, that bet looks smarter than its file size suggests.