
Client’s requirement:
The client required high-quality, harmonized, and structured small-molecule datasets covering comprehensive chemical, biological, and pharmacological data. The final objective was to integrate these standardized datasets into the client’s internal AI/ML platform to train algorithms for virtual hit identification.
Our approach:
Excelra’s Global Online Structure Activity Relationship Database (GOSTAR) provides a 360-degree view of millions of compounds, linking their chemical structures to biological, pharmacological, and therapeutic information. GOSTAR transforms heterogeneous, unstructured data captured from various sources into a structured relational database format.
All content in GOSTAR is captured manually and passes through a three-step quality control process. The resulting normalized, structured datasets cover structure-activity relationships (SAR), physicochemical properties, and ADMET parameters. These datasets were integrated into the client’s internal platform to train AI/ML algorithms for model building and activity/property prediction, supporting hit identification and lead optimization.