We built Living Models because six seasons is too long to wait.
Founded in Lyon in 2023, we started with a question: if language models can predict protein structure from sequence, what can foundation models predict from plant genomes?
Lyon is not the obvious place to start an AI company. But it is one of the most logical places to start a plant genomics company. The INRAE Auvergne cluster is twenty minutes away. The computational biology community at Université Claude Bernard Lyon 1 is active and accessible. When Cyril completed his postdoc on genomic selection for wheat adaptation, Lyon's agricultural research ecosystem was the natural place to stay.
The original question was simple and frustrating: why do plant breeders still need six seasons to characterize drought tolerance in a new variety? The molecular data has been there for decades. Genotyping costs have dropped 1000-fold. Sequence databases contain hundreds of millions of plant reads. The information to predict phenotype from genotype exists — it just hasn't been organized into a model that can do the inference.
The insight that changed our direction came from watching AlphaFold land in 2021. Protein structure prediction from sequence had been a hard problem for fifty years. A model trained at scale solved it, not by learning the rules of folding explicitly, but by learning to represent protein sequence in a way that made structure derivable. The question became: what is the plant genomics equivalent of that training corpus? And what can you predict from it?
Sofia joined from a clinical genomics infrastructure role. Pradeep came from the breeding side — fifteen years of field trials and selection cycles. We spent most of 2023 building the data curation pipeline and running experiments on subsets of the corpus. The 2024 results were convincing enough to commit full-time. We've been building, validating, and writing ever since.
Three things we won't do, and why
We don't wait for journal acceptance to share results
We publish methodology notes, validation benchmarks, and known limitations on bioRxiv before journal submission. A breeder evaluating whether to change their selection pipeline should not have to wait 18 months for a journal to confirm what the preprint already shows. Early disclosure is a deliberate choice, not an oversight.
We don't optimize for benchmark competitions
We build for breeding companies and university plant science programs — not for genomics database aggregators or leaderboard performance. Every API design decision starts from how a breeder actually runs a candidate evaluation cycle: what file formats they work with, when in the season they make selection decisions, what they do with a confidence interval.
We don't return point estimates without confidence bounds
Every API response includes a confidence interval and a declared limitation flag. If the input falls outside our training distribution, the response says so explicitly via the out_of_distribution field. If prediction accuracy is reduced for heterozygous materials or rare alleles, we document the reduction and flag it per-accession. Breeding decisions have consequences — a number without uncertainty quantification is not actionable.
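As a rough illustration of how a breeder's script might act on those fields, here is a minimal Python sketch. Only the out_of_distribution field name comes from the description above; the confidence_interval and limitation_flags keys, the triage function, and the width threshold are hypothetical names chosen for the example and may not match the actual API.

```python
from typing import Any


def triage_prediction(response: dict[str, Any], max_ci_width: float) -> str:
    """Decide what to do with one per-accession prediction.

    Assumed (hypothetical) response shape:
      out_of_distribution: bool           -- field named in the text above
      confidence_interval: [lower, upper] -- hypothetical key name
      limitation_flags: list of str       -- hypothetical key name
    """
    # Inputs outside the training distribution are never acted on automatically.
    if response.get("out_of_distribution", False):
        return "reject: input outside training distribution"

    # A wide interval means the estimate is too uncertain to rank on.
    lower, upper = response["confidence_interval"]
    if upper - lower > max_ci_width:
        return "review: confidence interval too wide"

    # Declared limitations (e.g. heterozygous material) go to a human.
    flags = response.get("limitation_flags", [])
    if flags:
        return "review: " + ", ".join(flags)

    return "accept"


# Hypothetical response for a single accession.
example = {
    "out_of_distribution": False,
    "confidence_interval": [0.48, 0.62],
    "limitation_flags": [],
}
print(triage_prediction(example, max_ci_width=0.3))
```

The point of the sketch is the order of the checks: the distribution flag is tested before the interval is even read, so an out-of-range input cannot be accepted on the strength of a narrow-looking interval.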