On October 27th 2022, CatSci Ltd, in collaboration with LabLinks, launched their Digital Webinar Series, which commenced with a free event on Computer Aided Retrosynthesis (CAR). The webinar consisted of four presentations, led by industry experts:
- Dr Jun Li – Scientific Director at Bristol Myers Squibb
- Dr Ilja Burdman – Customer Success Specialist at CAS (ACS division)
- Dr Quentin Perron – Co-Founder & CSO at Iktos
- Dr Julie Gai – Product Manager at deepmatter
The webinar was chaired by Dr Sam Whitmarsh, Director of Digital Transformation at CatSci and Co-Founder of LabLinks. The event focused on disruptive retrosynthesis: the idea that artificial intelligence (AI) and machine learning (ML) approaches can bring new ideas into the lab. ML and AI have revolutionised our ability to predict scientific outcomes. They support chemists in choosing the direction of planned synthetic routes, making their processes and decision-making more efficient.
To begin, Dr Jun Li of Bristol Myers Squibb opened the webinar with a presentation entitled “Retrosynthetic AI: Disruptive Innovation Hinges on the Interplay Between Human and Machine Intelligence.” He reflected on the progress of CAR over the past 15 years and on how Bristol Myers Squibb’s process development teams apply these tools. Jun highlighted the progress from relatively simple models that used one learning parameter (typically yield) as a measure of reaction viability to the systems in use today. The current digital tools allow control over learning parameters (yield, reaction type, scalability, and greenness, measured as process mass intensity, PMI) and over database sources and subsources, and different products use different algorithms. Jun described the use of these tools in a practical pharmaceutical industrial context as aids for the chemist. This not only increases speed and efficiency but also ensures that the chemist considers a wide retrosynthetic space, reducing bias towards personal favourite reactions or processes that have worked well in the past.
Next, Dr Ilja Burdman from CAS gave a presentation entitled “CAS SciFinder-n: Optimise Your Retrosynthetic Planning,” which described the rule-matching approach used in the SciFinder-n Retrosynthesis Tool. In this tool, a machine learning algorithm correlates reaction classes with structure and substructure across a large database to derive ‘rules,’ which are then applied predictively to the problem the chemist presents. The user controls which bonds, substructures or whole structures are used in the prediction, along with the commonality of the rules applied (a factor linked to the frequency of reactions found in the database, from common to rare). The tool then produces a large number of potential routes. This output can be weighted according to a range of scoring profiles, such as convergence, cost and atom efficiency, to fine-tune the results to what matters to the chemist in their particular challenge. To reach reactions of interest quickly, the output can also be filtered by specific components, such as functional group, solvent or catalyst. In addition, integration into the wider SciFinder-n platform allows quick literature searches that bring up relevant experimental procedures. Ilja concluded that the tool is an effective way of helping chemists evaluate different route options. It is not designed to replace a chemist by predicting the ‘correct’ synthetic pathway, but rather to give quick access to the chemical space and help expert scientists think through the possible ways of delivering a target molecule.
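The weighting and filtering described above can be illustrated with a small Python sketch. It is not the SciFinder-n API or its scoring method: the `Route` class, the metric values (scaled 0 to 1, higher is better) and the two weighting profiles are all hypothetical, chosen only to show how changing a scoring profile reorders the same set of routes.

```python
from dataclasses import dataclass


@dataclass
class Route:
    # All fields are illustrative; metrics are scaled 0..1, higher is better.
    name: str
    convergence: float
    cost: float             # higher means cheaper
    atom_efficiency: float
    solvents: frozenset = frozenset()


def score(route, weights):
    """Weighted average of the route's metrics under one scoring profile."""
    total = sum(weights.values())
    return sum(getattr(route, k) * w for k, w in weights.items()) / total


def rank(routes, weights, exclude_solvents=()):
    """Drop routes that use an excluded solvent, then sort best-first."""
    banned = set(exclude_solvents)
    kept = [r for r in routes if not (r.solvents & banned)]
    return sorted(kept, key=lambda r: score(r, weights), reverse=True)


routes = [
    Route("A", convergence=0.9, cost=0.3, atom_efficiency=0.6,
          solvents=frozenset({"DMF"})),
    Route("B", convergence=0.5, cost=0.8, atom_efficiency=0.7,
          solvents=frozenset({"EtOH"})),
    Route("C", convergence=0.7, cost=0.6, atom_efficiency=0.9,
          solvents=frozenset({"water"})),
]

# Two hypothetical scoring profiles over the same routes.
cost_first = {"convergence": 1, "cost": 3, "atom_efficiency": 1}
convergence_first = {"convergence": 3, "cost": 1, "atom_efficiency": 1}

print([r.name for r in rank(routes, cost_first)])
print([r.name for r in rank(routes, convergence_first, exclude_solvents=["DMF"])])
```

With the cost-weighted profile the cheap route B comes first; switching to the convergence-weighted profile and filtering out DMF promotes route C instead.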
Thirdly, Dr Quentin Perron from Iktos delivered his presentation, “AI Driven Retrosynthesis: Essential Tool of the Lab of the Future”, in which he explored data-driven retrosynthesis using the Spaya tool. Spaya is a fully data-driven tool trained on the Pistachio database from NextMove; the model can be retrained on specific subsets of databases, or on an in-house database if required. This allows the model to put the customer’s historical expertise in different scientific areas to best use. Spaya can act across three different scales, from millions to billions of compound options down to a few hundred. For the larger numbers of compounds, it uses a predictive approach that models the output of the more detailed methods, taking about 2 ms per compound. Once down to a few thousand options, a true retrosynthesis can be carried out using RScore. Quentin described a comparison of RScore with various other scoring approaches, demonstrating improvements in recognising ‘impossible’ structures as intermediates. Once down to a few options, the tool allows significant tuning, since the chemist will need to steer the search according to in-house constraints and tailor it to what is most important to them. This could be elements such as delivery time, chemical suppliers, or the price of starting materials. The search can also be tailored to avoid certain reaction types and to avoid or impose certain intermediates. The result is a powerful workflow that allows a very wide screening of options followed by fine-tuning of the AI output to a chemist’s particular focus. Quentin concluded with an insight into the future of these retrosynthesis tools. He described the coupling of Spaya with robotics to produce an automated end-to-end solution, from target molecule retrosynthesis through to physical synthesis.
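The staged workflow above (a fast estimate over a very large set, a detailed score on the survivors, then chemist-defined constraints) can be sketched in a few lines of Python. This is a generic screening funnel, not Spaya’s API and not the RScore calculation: the `funnel` function, the hard-coded `estimate` and `detailed` values and the lead-time constraint are all assumptions made for illustration.

```python
def funnel(candidates, fast_score, full_score, constraints,
           keep_fast=3, keep_full=2):
    """Three-stage screen: cheap estimate, costly score, user constraints."""
    # Stage 1: rank everything with the cheap surrogate and keep the top slice.
    stage1 = sorted(candidates, key=fast_score, reverse=True)[:keep_fast]
    # Stage 2: run the expensive scorer only on the survivors.
    stage2 = sorted(stage1, key=full_score, reverse=True)[:keep_full]
    # Stage 3: keep only candidates that satisfy every constraint.
    return [c for c in stage2 if all(ok(c) for ok in constraints)]


# Hypothetical candidates; the scores stand in for model outputs.
candidates = [
    {"id": "m1", "estimate": 0.9, "detailed": 0.40, "lead_time_days": 5},
    {"id": "m2", "estimate": 0.8, "detailed": 0.90, "lead_time_days": 30},
    {"id": "m3", "estimate": 0.7, "detailed": 0.80, "lead_time_days": 7},
    {"id": "m4", "estimate": 0.2, "detailed": 0.95, "lead_time_days": 3},
    {"id": "m5", "estimate": 0.1, "detailed": 0.10, "lead_time_days": 2},
]

selected = funnel(
    candidates,
    fast_score=lambda c: c["estimate"],
    full_score=lambda c: c["detailed"],
    constraints=[lambda c: c["lead_time_days"] <= 14],  # in-house constraint
)
print([c["id"] for c in selected])
```

In this toy data, m4 has the best detailed score but is discarded at the first stage because its fast estimate is poor. That is the trade-off of any such funnel: the cheap model must track the detailed one closely enough that good candidates are not lost early.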
The concluding presentation, from Dr Julie Gai of deepmatter, was entitled “IC Synth and ICFRP: Computer Aided Synthesis Design Based on Chemical Rules.” She discussed IC Synth, software that has been in development since 2005 and, since its acquisition in 2018, has been developed by deepmatter. The software learns from large training sets of chemical databases, extracts the chemical rules, and then applies these rules to the retrosynthetic challenge set by the user. This approach allows the algorithm to be shaped by the training data it is built upon and to build specific expertise, for example from in-house ELN databases or customer proprietary databases. IC Synth allows the user to interrogate proposed routes and uses the extracted chemical rules to explain why the algorithm suggested each option, so the user can understand the AI’s decision-making process behind the proposed synthetic routes. Deepmatter understands that chemists want control over the variables involved in the search and provides an “Expert Mode”, which allows the user to choose exactly which variables are used in the search and how they are weighted. For faster, less tailored solutions, or for specific workflows, the software also includes pre-set templates to generate ideas quickly.
Sustainable chemistry features prominently in the IC Synth feature set, which covers all 12 principles of green chemistry. Each proposed route can be quickly evaluated and compared using a traffic light system that summarises waste, energy, and resource use. This enables the user to choose the most efficient and eco-conscious route for their desired outcome.
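A traffic light summary of this kind can be sketched as a simple threshold mapping. The Python below is an illustration only and does not reflect how IC Synth computes its ratings: the choice of metrics (process mass intensity for waste, energy per kilogram, and step count for resource use) and every threshold in `THRESHOLDS` are assumptions.

```python
def pmi(total_input_kg, product_kg):
    """Process mass intensity: total mass of materials used per mass of product."""
    if product_kg <= 0:
        raise ValueError("product mass must be positive")
    return total_input_kg / product_kg


def traffic_light(value, green_max, amber_max):
    """Lower is better: green up to green_max, amber up to amber_max, else red."""
    if value <= green_max:
        return "green"
    if value <= amber_max:
        return "amber"
    return "red"


# Hypothetical (green_max, amber_max) thresholds for each metric.
THRESHOLDS = {
    "pmi": (50, 150),
    "energy_kwh_per_kg": (20, 60),
    "steps": (4, 8),
}


def summarise(route_metrics):
    """Map each metric of a route to a traffic light colour."""
    return {k: traffic_light(v, *THRESHOLDS[k]) for k, v in route_metrics.items()}


# 120 kg of materials in for 2 kg of product gives a PMI of 60.
route = {"pmi": pmi(120.0, 2.0), "energy_kwh_per_kg": 15.0, "steps": 9}
print(summarise(route))
```

For this made-up route the summary is amber for waste, green for energy and red for step count, which is the kind of at-a-glance comparison the traffic light view is meant to give.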
The automated rule extraction goes beyond retrosynthesis: the rules work in both a forward and a backward sense, allowing for retrosynthesis as well as product prediction from a known starting material. This is particularly useful in medicinal chemistry when considering core structures and library synthesis.
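The idea of one rule set serving both directions can be shown with a toy Python example. It is a deliberate simplification and not deepmatter’s rule format: real tools match rules against molecular structures, whereas the hypothetical `RULES` table here works only on functional group names.

```python
# Each rule: (name, required reactant groups, product group). Illustrative only.
RULES = [
    ("amide coupling", frozenset({"carboxylic acid", "amine"}), "amide"),
    ("esterification", frozenset({"carboxylic acid", "alcohol"}), "ester"),
    ("Suzuki coupling", frozenset({"aryl halide", "boronic acid"}), "biaryl"),
]


def backward(target):
    """Retrosynthesis direction: which rules make the target, and from what?"""
    return [(name, sorted(reactants))
            for name, reactants, product in RULES if product == target]


def forward(available):
    """Product prediction: which rules can fire with the groups on hand?"""
    have = set(available)
    return [(name, product)
            for name, reactants, product in RULES if reactants <= have]


print(backward("amide"))
print(forward(["carboxylic acid", "amine", "alcohol"]))
```

Read backwards, the amide target maps to a carboxylic acid and an amine; read forwards, a starting material set containing an acid, an amine and an alcohol yields both the amide and the ester as predicted products.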
Following the presentations, there was a Q&A session, and participants were invited to network with the speakers. The inaugural Digital Series Webinar was insightful, educational, and informative; we are sure attendees left feeling inspired and excited about the possibilities of the lab of the future. The CatSci and LabLinks teams would like to thank all our expert speakers for sharing their invaluable knowledge, as well as all the attendees for making the event such a success. Artificial intelligence and machine learning approaches in retrosynthesis have huge disruptive potential, both now and in the future, supporting expert chemists in making highly informed choices over the direction of planned synthetic routes. Used as an input to a full end-to-end robotics solution, these retrosynthesis tools could be a game-changer for chemistry and for laboratories all over the world.
If you want to catch up on some of the webinar’s recordings, head here.