Protein drug discovery today is expensive and time-consuming, and few candidates reach clinical trials. Existing AI solutions face three main challenges: they cannot optimize multiple protein functions concurrently, they rely heavily on protein structure, and pharmaceutical wet-labs lack scalable AI infrastructure, which limits the development of innovative protein modalities. The DeepSeq.ai platform offers three key features: 1) Scalable AI, which combines synthetic biology, single-cell sorting, and AI active learning to create diverse protein libraries for efficient learning and optimization of sequence patterns across different protein types; 2) Generalized AI, achieved through well-normalized training datasets and proprietary sample preparation techniques, which enables a versatile protein language model capable of developing safe protein drugs; and 3) Interpretable AI, with patented methods for deciphering sequence patterns tied to specific protein functions.
What is the problem?
Current protein drug discovery is costly and time-consuming, taking years and potentially millions of dollars, and few candidates reach clinical trials. While some AI solutions have been developed, three fundamental problems exist with current technologies: 1) No current solution can effectively optimize multiple protein functions simultaneously, such as binding affinity, immunogenicity, cell expression, and aggregation. This limitation forces a time-consuming process of trial and error. 2) Existing AI models rely heavily on protein structure, which may not accurately correlate with various protein functions and developability. Additionally, structure-based AI models are very difficult to improve. 3) The lack of scalable and generalizable AI infrastructure in pharmaceutical wet-labs impedes efficient AI training and results in biased models that focus primarily on antibody CDR regions. This hampers the development of truly innovative protein modalities.
What is their solution?
The DeepSeq.ai platform includes: 1) Scalable AI: They integrate synthetic biology, single-cell sorting, and AI active learning to create diverse protein libraries, enabling efficient learning and optimization of sequence patterns and multiple biological functions across various protein modalities at scale. 2) Generalized AI: Their proprietary sample preparation and DNA extraction techniques generate well-normalized training datasets, allowing their AI platform to rapidly learn biological functions from diverse protein modalities (peptide, enzyme, antibody, etc.). This results in a highly generalized protein language model capable of creating novel, developable, and safe protein drugs. 3) Interpretable AI: Their patented methodology accurately deciphers sequence patterns associated with specific protein functions. This not only provides valuable scientific insights into biology but also facilitates human feedback to enhance the model's accuracy and generalizability.
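To make "optimizing multiple protein functions simultaneously" concrete, here is a minimal sketch of one common way to compare candidates across several properties at once: keeping only the Pareto-optimal variants. This is a generic illustration, not a description of DeepSeq.ai's method. The variant names, the property names, and the scores are all hypothetical, and every score is assumed to be scaled so that higher is better.

```python
from typing import Dict, List

# Hypothetical per-variant scores; higher is assumed better for every property.
Scores = Dict[str, float]


def dominates(a: Scores, b: Scores) -> bool:
    """Return True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)


def pareto_front(variants: Dict[str, Scores]) -> List[str]:
    """Return the sorted names of variants that no other variant dominates."""
    return sorted(
        name
        for name, scores in variants.items()
        if not any(
            dominates(other, scores)
            for other_name, other in variants.items()
            if other_name != name
        )
    )


# Illustrative data only: three made-up variants scored on three made-up properties.
variants = {
    "v1": {"affinity": 0.9, "expression": 0.4, "low_aggregation": 0.7},
    "v2": {"affinity": 0.6, "expression": 0.8, "low_aggregation": 0.6},
    "v3": {"affinity": 0.5, "expression": 0.3, "low_aggregation": 0.5},
}

front = pareto_front(variants)
print(front)
```

In this toy data, v3 is worse than v1 on every property and is dropped, while v1 and v2 each win on a different property, so both remain as candidates. A single-objective screen on affinity alone would have discarded v2 despite its better expression.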