PUBLICATION

OpenSpliceAI provides an efficient modular implementation of SpliceAI enabling easy retraining across nonhuman species

Authors
Chao, K.H., Mao, A., Liu, A., Salzberg, S.L., Pertea, M.
ID
ZDB-PUB-251031-3
Date
2025
Source
eLife 14: (Other)
Keywords
A. thaliana, PyTorch, Splice site prediction, SpliceAI, Transfer learning, arabidopsis thaliana, computational biology, deep learning, honeybee, human, mouse, splice junctions, systems biology, zebrafish
MeSH Terms
  • Animals
  • Computational Biology*/methods
  • Deep Learning*
  • Humans
  • RNA Splicing*
  • Sequence Analysis, DNA*/methods
  • Software*
PubMed
41165728
Abstract
The SpliceAI deep learning system is currently one of the most accurate methods for identifying splicing signals directly from DNA sequences. However, its utility is limited by its reliance on older software frameworks and human-centric training data. Here, we introduce OpenSpliceAI, a trainable, open-source version of SpliceAI implemented in PyTorch to address these challenges. OpenSpliceAI supports both training from scratch and transfer learning, enabling seamless retraining on species-specific datasets and mitigating human-centric biases. Our experiments show that it achieves faster processing speeds and lower memory usage than the original SpliceAI code, allowing large-scale analyses of extensive genomic regions on a single GPU. Additionally, OpenSpliceAI's flexible architecture makes for easier integration with established machine learning ecosystems, simplifying the development of custom splicing models for different species and applications. We demonstrate that OpenSpliceAI's output is highly concordant with SpliceAI. In silico mutagenesis analyses confirm that both models rely on similar sequence features, and calibration experiments demonstrate similar score probability estimates.