Chemistry, Manufacturing and Controls (CMC) operations for synthetic small molecules have improved in recent years through computational modeling and automation. High-throughput solvent screening is central to several such operations and is critical for designing reactions, crystallizations and liquid-liquid extractions. However, manual screening by scientists is not only labor-intensive but also prone to significant measurement errors from untraceable sources. Further, a full-factorial experiment across a panel of twenty commonly used process solvents and multiple temperatures is expensive at early stages of development, when a limited supply of the substrate must be shared among chemical synthesis, analytical characterization and property measurement. Recently proposed automated workflows for solubility measurement have shown promise, but they rely on indirect quantification of the dissolved solids' concentration [1]. Previous studies have shown that liquid chromatography (LC) has the lowest detection limit among commonly used assays for measuring thermodynamic solubility; the challenge lies in simplifying LC assays for high-throughput applications [2].
We propose a high-throughput, material-sparing workflow to measure the thermodynamic solubility of small-molecule drug candidates and process intermediates in organic solvents. We quantified the measurement variance originating from the different steps of a manual LC workflow for solubility assays. This variance is minimized by selecting appropriate consumables, aspiration and dispense techniques, and filtration and sample dilution methods. The gain in process efficiency is driven by two main factors. First, we reduce total time by automating the optimized workflow on a collaborative benchtop robot that performs gravimetric solid and liquid dosing, slurry filtration and vial transfer to an LC-ready tray. Second, we reduce material consumption by designing experiments with an ensemble of decision-tree and graph-based machine learning (ML) models built for solvent ranking and regression tasks. The trained ML models select an initial set of solvents for the automated platform by maximizing solvent functional group coverage across a solvent-ranking chart, and then iteratively propose the next experiment based on observed changes in model uncertainty.
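The experiment-design loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the solvent-to-functional-group table, the function names, and the ensemble predictions are hypothetical placeholders. It also simplifies two steps. The initial set is chosen by a greedy maximum-coverage heuristic over functional groups alone (no solvent-ranking chart), and the next experiment is taken to be the untested solvent on which the ensemble members disagree most, as a stand-in for tracking changes in model uncertainty.

```python
from statistics import pstdev

# Hypothetical functional-group annotations for a few common process solvents.
SOLVENT_GROUPS = {
    "methanol": {"hydroxyl"},
    "ethanol": {"hydroxyl"},
    "acetone": {"ketone"},
    "ethyl acetate": {"ester"},
    "acetonitrile": {"nitrile"},
    "toluene": {"aromatic"},
    "anisole": {"aromatic", "ether"},
    "tetrahydrofuran": {"ether"},
}


def pick_initial_solvents(solvent_groups, n_initial):
    """Greedily pick up to n_initial solvents that add the most uncovered groups."""
    chosen, covered = [], set()
    remaining = dict(solvent_groups)
    while remaining and len(chosen) < n_initial:
        # Sorting the names makes tie-breaking deterministic (alphabetical).
        best = max(sorted(remaining), key=lambda s: len(remaining[s] - covered))
        if not remaining[best] - covered:
            break  # no remaining solvent adds a new functional group
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered


def propose_next(ensemble_predictions, tested):
    """Return the untested solvent with the largest spread across ensemble members.

    ensemble_predictions maps solvent name -> list of predicted solubilities,
    one value per ensemble member (assumed format).
    """
    candidates = {s: p for s, p in ensemble_predictions.items() if s not in tested}
    if not candidates:
        return None
    return max(sorted(candidates), key=lambda s: pstdev(candidates[s]))


if __name__ == "__main__":
    initial, covered = pick_initial_solvents(SOLVENT_GROUPS, n_initial=3)
    print("initial solvents:", initial)
    print("groups covered:", sorted(covered))

    # Made-up ensemble outputs (arbitrary units) for illustration only.
    predictions = {
        "methanol": [1.0, 1.1, 0.9],
        "toluene": [0.2, 1.5, 0.8],
        "ethyl acetate": [0.5, 0.6, 0.55],
    }
    print("next experiment:", propose_next(predictions, tested=set(initial)))
```

In a real campaign the loop would retrain the models after each measured solubility and call the proposal step again until the uncertainty stops changing appreciably; the stopping rule is left out here because the abstract does not specify one.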
Author disclosure: All authors are Sanofi employees, and this work is funded by Sanofi.
References:
[1] Shiri, Parisa, et al. "Automated solubility screening platform using computer vision." iScience 24.3 (2021).
[2] Hoelke, Bettina, et al. "Comparison of nephelometric, UV-spectroscopic, and HPLC methods for high-throughput determination of aqueous drug solubility in microtiter plates." Analytical Chemistry 81.8 (2009): 3165-3172.