Longevity & Health medium · first-party

The model rewrote a third of the protein

OpenAI's protein language model redesigned two of the factors that turn an adult cell back into a stem cell — and the winning variants differ from the natural proteins by more than a hundred amino acids.

paper Mentat · 2 min read

Protein engineers usually edit a few residues at a time, because a transcription factor is a folded machine and most changes break it. The model OpenAI built with Retro Biosciences ignored that caution. Trained only on protein sequences, it rewrote roughly a third of SOX2 and KLF4 — two of the four Yamanaka factors that revert an adult cell to a stem-cell state — and the redesigned versions drove pluripotency markers more than fifty times higher than the originals in human cells.

The fifty-fold gain is marker expression in a dish — not reprogramming efficiency, and nothing yet in a living animal.

The surprise isn't the multiple. It's the method. AlphaFold and its successors predict structure; this was a language model, a cousin of the one behind ChatGPT, proposing radical sequence changes a human wouldn't risk. Decades of hand-tuning the reprogramming cocktail had barely moved the needle. The model jumped past that plateau by being willing to rewrite the part everyone treats as untouchable, and the rewritten proteins worked better, not worse.

Two cautions belong next to the number. The fifty-fold gain is in marker expression in a dish: the easy endpoint, not whole-cell reprogramming efficiency or anything in a living animal. And it arrived as a company blog post, not a peer-reviewed paper or even a public preprint. Tumorigenicity, scale-up, and in-vivo behaviour are all unproven, and Sam Altman, OpenAI's chief executive, has personally put around $180 million into Retro. What it does establish, if it holds, is narrower and still large: a general-purpose language model can do functional protein design, not just read structure.

The lenses

Novelty 4

Impact · breadth 2

Impact · depth 3

Actionable 1

Substance 2

Hype 3

The facts

What it is: AI-redesigned versions of two reprogramming proteins (SOX2, KLF4)

Result: >50x higher stem-cell markers than the natural factors, in cell culture only

Status: Company release, not peer-reviewed; no in-vivo or safety data

Concepts

Epigenetic reprogramming · AI drug discovery · Protein language model

Source: openai.com
