Genotype-to-phenotype prediction for crop breeders

Foundation models for crop genomics.

Living Models trains large-scale AI on plant sequence data to predict phenotype from genotype — cutting years off trait selection cycles for seed companies and breeding programmes.

In collaboration with research programs at:
Institut Pasteur de Lyon, Computational Genomics Division
CropNet Europe
INRAE Auvergne Unit
6
field seasons required to characterise drought tolerance in traditional breeding programmes, reduced to a single prediction call
200M+
plant sequences in our pre-training corpus, spanning 47 crop species and 120 wild relative taxa
~91%
correlation between our embeddings and curated GRIN phenotype records across 12,000 maize accessions

From sequence to prediction in three steps

Step 1

Submit your genotype data

Upload VCF, FASTA, or SNP array CSV via our REST API or Python SDK. We accept standard plant genomics formats with no preprocessing required.
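Although no preprocessing is required, a quick local look at what a VCF contains can catch a wrong file before upload. The sketch below is not part of the Living Models SDK. It uses only the Python standard library, and the helper name `summarize_vcf` is hypothetical. It assumes an uncompressed, tab-delimited VCF with a standard `#CHROM` header line.

```python
import io


def summarize_vcf(handle):
    """Return (sample_ids, n_variants) for an uncompressed VCF text stream."""
    samples = None
    n_variants = 0
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # blank line or meta-information line
        if line.startswith("#CHROM"):
            columns = line.split("\t")
            # Eight fixed columns (CHROM..INFO), then FORMAT, then one column per sample.
            samples = columns[9:]
            continue
        if samples is None:
            raise ValueError("data line found before the #CHROM header line")
        n_variants += 1
    if samples is None:
        raise ValueError("no #CHROM header line found")
    return samples, n_variants


# Toy two-sample, two-variant file; the sample names are made up for illustration.
toy_vcf = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tline_A\tline_B\n"
    "1\t1042\t.\tA\tG\t50\tPASS\t.\tGT\t0/0\t0/1\n"
    "1\t2210\t.\tC\tT\t60\tPASS\t.\tGT\t1/1\t0/1\n"
)
samples, n_variants = summarize_vcf(io.StringIO(toy_vcf))
print(samples, n_variants)  # ['line_A', 'line_B'] 2
```

For a real file, pass an open file object instead of the `io.StringIO` wrapper.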

Step 2

Our model computes genomic embeddings

The foundation model generates a 512-dimensional representation encoding population history, evolutionary context, and variant-level signal.

Step 3

Receive trait predictions

Get drought tolerance index, yield stability score, and pathogen resistance flags — each with a confidence interval and known limitations declared.

predict.py
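import livingmodels as lm

# Predict every supported trait for the genotypes in sample.vcf.
result = lm.predict(vcf_path='sample.vcf', traits='all')

# Each trait comes back as an estimate with a confidence interval.
print(result.drought_tolerance_index)  # 0.847 (CI: 0.81–0.88)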
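Because each prediction carries a confidence interval, one way to build a shortlist is to rank candidates by the lower bound of the interval rather than by the point estimate. The sketch below does not call the SDK, since how the result object exposes its interval is not documented here. The `TraitPrediction` container and `shortlist` helper are hypothetical, and every value except the first row (taken from the snippet above) is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraitPrediction:
    """Hypothetical container for one predicted trait value."""
    line_id: str
    estimate: float
    ci_low: float
    ci_high: float


def shortlist(predictions, top_n):
    """Rank by the lower confidence bound, highest first, and keep top_n."""
    ranked = sorted(predictions, key=lambda p: p.ci_low, reverse=True)
    return ranked[:top_n]


candidates = [
    TraitPrediction("line_001", 0.847, 0.81, 0.88),
    TraitPrediction("line_002", 0.860, 0.70, 0.95),  # higher estimate, wider interval
    TraitPrediction("line_003", 0.790, 0.77, 0.81),
]
picked = shortlist(candidates, top_n=2)
print([p.line_id for p in picked])  # ['line_001', 'line_003']
```

Ranking on the lower bound is a conservative choice: `line_002` has the highest point estimate but drops out because its interval is wide. A programme that prefers to explore uncertain candidates could rank on `estimate` or `ci_high` instead.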
Pre-trained foundation

Foundation model pre-trained on plant biology

Our model was trained from the ground up for plant genomics, with intron-aware tokenization, multi-species joint training, and variant-level embeddings validated against GWAS hits across five crops.

  • 47 crop species + 120 wild relative taxa in training corpus
  • Ensembl Plants + NCBI SRA + internal curation pipeline
  • Wheat, maize, tomato, soybean, sorghum coverage
  • k-mer tokenizer tuned for transposons and tandem repeats
Adaptation layer

Fine-tuning for your breeding program

Few-shot fine-tuning on as few as 200 phenotyped lines. Adapter layers preserve pre-training signal while capturing your program's specific genetic background and environment.

  • Works with as few as 200 phenotyped accessions
  • Adapter architecture preserves pre-trained representations
  • Runs on your proprietary data with confidentiality guarantees
  • Outputs come with confidence bounds, not bare point estimates
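To illustrate how an adapter can preserve pre-trained representations, here is a minimal sketch of a residual bottleneck adapter, a common pattern in the fine-tuning literature. It is an assumption for illustration, not a description of the Living Models architecture. The class name, bottleneck size, and initialisation are hypothetical. Because the up-projection starts at zero, the adapter initially passes embeddings through unchanged, and fine-tuning only adds a learned correction on top.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ResidualAdapter:
    """output = embedding + up(relu(down(embedding))), with up zero-initialised."""

    def __init__(self, dim, bottleneck, seed=0):
        rng = random.Random(seed)
        # Down-projection: small random weights, shape (bottleneck, dim).
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(dim)]
                     for _ in range(bottleneck)]
        # Up-projection: zeros, shape (dim, bottleneck), so the correction starts at 0.
        self.up = [[0.0] * bottleneck for _ in range(dim)]

    def __call__(self, embedding):
        hidden = [max(0.0, h) for h in matvec(self.down, embedding)]  # ReLU
        delta = matvec(self.up, hidden)
        return [x + d for x, d in zip(embedding, delta)]


adapter = ResidualAdapter(dim=512, bottleneck=16)
embedding = [0.01 * i for i in range(512)]  # stand-in for a 512-dimensional embedding
print(adapter(embedding) == embedding)  # True: untrained adapter is the identity
```

In a real setup the adapter weights would be trained on the phenotyped lines while the foundation model stays frozen. The training loop is omitted here.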

Predict what breeders actually measure

Six validated prediction targets covering the traits that matter most in modern crop development programmes.

Drought Tolerance Index

Probability-weighted index for survival under water deficit, scored against a panel of 3,400 characterised maize and wheat accessions. Returns a point estimate with confidence interval.

Yield Stability Score

Cross-environment consistency score derived from embedding-space proximity to varieties with documented multi-location trial stability. Validated across 3 climate zones.

Disease Resistance Probability

Resistance probabilities for six common pathogens and diseases, including Fusarium, late blight, and powdery mildew.

Flowering Time Variance

Predicted days-to-heading distribution under standard and extended photoperiod regimes. Useful for matching variety flowering time to target environment growing season.

Nitrogen Use Efficiency

Genotypic nitrogen use efficiency signal estimated from embedding-space proximity to phenotyped high-NUE accessions. Particularly relevant for low-input breeding programs.

Heat Stress Tolerance

Predicted performance under high-temperature episodes during critical growth windows.
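The yield stability and nitrogen use efficiency targets above are both described in terms of embedding-space proximity to reference accessions. One simple way to picture such a score is the mean cosine similarity between a candidate embedding and a set of reference embeddings. The sketch below makes that assumption explicit. The actual scoring method is not documented here, the function names are hypothetical, and toy 3-dimensional vectors stand in for the 512-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def proximity_score(candidate, references):
    """Mean cosine similarity between a candidate and the reference embeddings."""
    if not references:
        raise ValueError("at least one reference embedding is required")
    total = sum(cosine_similarity(candidate, ref) for ref in references)
    return total / len(references)


# Illustrative vectors only; these are not real embeddings.
references = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
near = [1.0, 1.0, 0.0]
far = [0.0, 0.0, 1.0]
print(proximity_score(near, references) > proximity_score(far, references))  # True
```

Averaging over references treats every reference accession equally. A nearest-neighbour variant would take the maximum similarity instead of the mean.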

From the breeding programs using it

We submitted 800 candidate lines and received ranked drought tolerance predictions within hours. We brought 40 into the first field entry season — the shortlist quality was better than anything we'd achieved with marker-assisted selection alone.

Dr. Stefan Brouwers
Head of Genomics
GrainForge Benelux

We pre-screened 1,200 tomato lines for Fusarium and late blight resistance — without a single inoculation trial. Living Models predicted the resistance profile from sequence alone. The correlation with our inoculation data was compelling enough to change our pipeline.

Dr. Lucia Ferrandini
Research Scientist — Disease Resistance
MedSeed Research

Ready to compress six seasons into one?

We onboard research teams on a case-by-case basis. Bring your VCF file to a 30-minute technical call.