Structured and analysis-ready data for AI/ML-based drug discovery

Employing AI/ML techniques to identify small molecules for therapeutic development.

Client’s requirement:

The client required high-quality, harmonized and structured small-molecule datasets covering comprehensive chemical, biological and pharmacological data. The end goal was to integrate these standardized datasets into the client’s internal AI/ML platform to train algorithms for virtual hit identification.

Our approach:

Excelra’s Global Online Structure Activity Relationship Database (GOSTAR®) provides a 360-degree view of millions of compounds, linking their chemical structures to biological, pharmacological and therapeutic information. In GOSTAR®, heterogeneous and unstructured data captured from various sources is transformed into a structured relational database format. All content is captured manually and passes through a 3-step quality control process. The resulting normalized and structured datasets, covering structure-activity relationships (SAR), physicochemical properties and ADMET parameters, were integrated into the client’s internal platform. There they were used to train AI/ML algorithms for model building and activity/property prediction in support of hit identification and lead optimization.
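To make the idea of harmonizing heterogeneous activity data concrete, the following is a minimal illustrative sketch in Python. It is not GOSTAR®’s actual schema or pipeline: the field names (compound_id, smiles, target, activity_type, value, unit), the unit table and the harmonize function are all hypothetical, chosen only to show how mixed-unit potency values might be converted to a single negative-log molar scale before model training.

```python
import math

# Hypothetical conversion factors from a reported unit to mol/L.
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def to_p_activity(value, unit):
    """Return -log10 of a concentration after converting it to mol/L."""
    if value <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(value * UNIT_TO_MOLAR[unit])


def harmonize(raw_rows):
    """Split raw rows into harmonized records and rejected rows.

    Each raw row is a dict with the hypothetical keys described above.
    Rows with missing fields, non-numeric values or unknown units are
    set aside for manual review instead of being silently dropped.
    """
    records, rejected = [], []
    for row in raw_rows:
        try:
            # Map both micro sign and Greek mu to a plain "u".
            unit = str(row["unit"]).strip()
            unit = unit.replace("\u00b5", "u").replace("\u03bc", "u")
            record = {
                "compound_id": row["compound_id"],
                "smiles": row["smiles"],
                "target": row["target"],
                "activity_type": str(row["activity_type"]).strip().upper(),
                "p_activity": to_p_activity(float(row["value"]), unit),
            }
        except (KeyError, ValueError):
            rejected.append(row)
            continue
        records.append(record)
    return records, rejected
```

Under these assumptions, a value reported as 10 nM and another reported as 1.5 µM end up on the same scale (roughly 8.0 and 5.82), while a row reported in an unsupported unit such as a percentage is routed to the rejected list for review.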