Publications

Bayesian optimization and machine learning for vaccine formulation development

Led by the team at Sanofi, this study used Bayesian optimization to improve the accuracy of a machine learning model predicting critical quality attributes of viral vaccine formulations.

The project makes use of SmartChemistry® Optimiser, licensed from ChemAI, as the core decision support engine, with participation from our CTO, Thomas.

The study is a strong example of how industry collaboration can advance trustworthy AI in life sciences.

A Novel Concept for the Search and Retrieval of the Derwent Markush Resource Database

In the article “A Novel Concept for the Search and Retrieval of the Derwent Markush Resource Database,” ChemAI researchers present a method for integrating the Derwent Markush Resource into the STN platform. The integration makes the resource compatible with the other structure and Markush databases on STN while still letting users apply the specific features and functions of the Derwent approach. The study demonstrates that combining different Markush languages into a single general description enables unified structure queries, and that this description could serve as a foundation for a common generalized description of Markush structures.

Algorithm for Reaction Classification

In the article “Algorithm for Reaction Classification,” ChemAI researchers introduce a novel method for classifying chemical reactions. The core of this algorithm involves testing all maximum common substructures (MCS) between reactant and product molecules to identify structural changes during reactions. This approach enhances the accuracy of reaction classification, providing a more detailed understanding of chemical processes. The algorithm’s effectiveness is demonstrated through various examples, showcasing its potential applications in cheminformatics and related fields.
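The idea of testing every best common substructure can be illustrated with a small toy sketch. It is not the published algorithm: it assumes a balanced reaction with the same heavy atoms on both sides, brute-forces all element-preserving atom mappings, keeps every mapping that preserves the most bonds (a simplified maximum common edge subgraph), and reports the bonds broken and formed. The graph encoding and the function names `best_mappings` and `bond_changes` are hypothetical and chosen only for illustration.

```python
from itertools import permutations

def key(a, b):
    """Canonical (low, high) index pair used as a bond key."""
    return (a, b) if a < b else (b, a)

def label(atoms, a, b, order):
    """Describe a bond by its sorted element pair and bond order."""
    return (tuple(sorted((atoms[a], atoms[b]))), order)

def best_mappings(r_atoms, r_bonds, p_atoms, p_bonds):
    """Return (score, mappings): all element-preserving atom bijections
    that keep the largest number of bonds unchanged."""
    best, winners = -1, []
    for perm in permutations(range(len(p_atoms))):
        # perm[i] is the product atom assigned to reactant atom i
        if any(r_atoms[i] != p_atoms[j] for i, j in enumerate(perm)):
            continue
        kept = sum(
            1 for (a, b), order in r_bonds.items()
            if p_bonds.get(key(perm[a], perm[b])) == order
        )
        if kept > best:
            best, winners = kept, [perm]
        elif kept == best:
            winners.append(perm)
    return best, winners

def bond_changes(r_atoms, r_bonds, p_atoms, p_bonds, perm):
    """Bonds broken and formed under one atom mapping."""
    kept_images, broken = set(), []
    for (a, b), order in r_bonds.items():
        image = key(perm[a], perm[b])
        if p_bonds.get(image) == order:
            kept_images.add(image)
        else:
            broken.append(label(r_atoms, a, b, order))
    formed = [
        label(p_atoms, a, b, order)
        for (a, b), order in p_bonds.items()
        if (a, b) not in kept_images
    ]
    return tuple(sorted(broken)), tuple(sorted(formed))

# Toy example: ethanol -> acetaldehyde, heavy atoms only (C-C-O to C-C=O).
r_atoms, r_bonds = ["C", "C", "O"], {(0, 1): 1, (1, 2): 1}
p_atoms, p_bonds = ["C", "C", "O"], {(0, 1): 1, (1, 2): 2}

score, mappings = best_mappings(r_atoms, r_bonds, p_atoms, p_bonds)
# Test every tied mapping and collect the distinct change signatures.
signatures = {
    bond_changes(r_atoms, r_bonds, p_atoms, p_bonds, m) for m in mappings
}
print(score, len(mappings), signatures)
```

In this toy case two mappings tie, and both lead to the same signature (a C-O single bond replaced by a C=O double bond), so the reaction would fall into a single class. The brute-force enumeration is only practical for very small molecules.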

Categorical-Continuous Bayesian Optimization Applied to Chemical Reactions

This study improves Bayesian optimization (BO) for chemical reactions with categorical and continuous variables. Traditional methods like design of experiments (DoE) require many experiments, making machine learning approaches attractive.
The authors enhance BO with a specialized covariance function (COCABO) and compare acquisition function optimizers. Brute-force works best for few categorical variables, while ant colony optimization (ACO) is better for larger spaces.
Their method outperforms state-of-the-art approaches like Gryffin and SMAC in simulations, finding optimal reaction conditions faster. Future work includes lab validation.
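To make the two ingredients above more concrete, here is a minimal sketch of a mixed categorical-continuous covariance function and a brute-force search over categorical choices. The summary does not give the kernel's form, so the sketch assumes the sum-and-product mixture of an overlap kernel and an RBF kernel commonly described for this family of methods. The names `mixed_kernel` and `brute_force_categorical`, the `mix` parameter, the example reagents, and the similarity-based score standing in for a real acquisition function are all hypothetical.

```python
import math
from itertools import product

def overlap_kernel(h1, h2):
    """Fraction of categorical variables that take the same value."""
    return sum(a == b for a, b in zip(h1, h2)) / len(h1)

def rbf_kernel(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel on the continuous variables."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-0.5 * sq_dist / lengthscale ** 2)

def mixed_kernel(p, q, mix=0.5):
    """Assumed mixture form: (1 - mix) * (k_h + k_x) + mix * k_h * k_x."""
    (h1, x1), (h2, x2) = p, q
    k_h = overlap_kernel(h1, h2)
    k_x = rbf_kernel(x1, x2)
    return (1 - mix) * (k_h + k_x) + mix * k_h * k_x

def brute_force_categorical(choices, x, score):
    """Enumerate every categorical combination at fixed continuous values x
    and return the one with the highest score."""
    return max(product(*choices), key=lambda h: score((h, x)))

# Hypothetical search space: catalyst and solvent, plus one scaled
# continuous variable (for example a normalized temperature).
choices = [("Pd", "Ni"), ("THF", "DMF", "MeCN")]
incumbent = (("Ni", "DMF"), (0.5,))

# Placeholder score: similarity to the incumbent. A real BO loop would
# use an acquisition function computed from a fitted surrogate model.
best = brute_force_categorical(
    choices, (0.5,), lambda z: mixed_kernel(z, incumbent)
)
self_similarity = mixed_kernel(incumbent, incumbent)
print(best, self_similarity)
```

Enumeration grows with the product of the category counts (six combinations here), which is consistent with the summary's point that brute force suits few categorical variables while a heuristic such as ant colony optimization suits larger spaces.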

Analysing a billion reactions with the RInChI

In the article “Analysing a billion reactions with the RInChI,” ChemAI researchers explore the effectiveness of the Reaction-InChI (RInChI) as a canonical identifier for chemical reactions. They demonstrate its utility in managing extensive reaction databases, such as the Synthetically Accessible Virtual Inventory (SAVI), which contains over a billion reactions. The study highlights how the RInChI facilitates the analysis of large molecular datasets, effectively addresses issues like NH tautomerism and stereochemistry, and provides a unique and canonical representation of reactions. The authors recommend incorporating the RInChI into reaction data models to enhance data integration and analysis.

Route Design in the 21st Century: The ICSYNTH Software Tool as an Idea Generator for Synthesis Prediction

In the article “Route Design in the 21st Century: The ICSYNTH Software Tool as an Idea Generator for Synthesis Prediction,” ChemAI researchers evaluate the performance of ICSYNTH, a computer-aided synthesis design tool. The study compares ICSYNTH’s ability to predict innovative synthetic routes against traditional methods, demonstrating its potential to enhance efficiency and creativity in chemical synthesis planning.