Client’s requirement:
The client required high-quality, harmonized, and structured small-molecule datasets covering chemical, biological, and pharmacological data. The end goal was to integrate these standardized datasets into the client’s internal AI/ML platform to train algorithms for virtual hit identification.
Our approach:
Excelra’s Global Online Structure Activity Relationship Database (GOSTAR®) provides a 360-degree view of millions of compounds, linking their chemical structures to biological, pharmacological, and therapeutic information. In GOSTAR®, heterogeneous and unstructured data captured from various sources are transformed into a structured relational database format. All content in GOSTAR® is captured manually and passes through a three-step quality control process. These normalized and structured datasets, covering structure-activity relationships (SAR), physicochemical properties, and ADMET parameters, were integrated into the client’s internal platform to train AI/ML algorithms for model building and activity/property prediction in support of hit identification and lead optimization.
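To illustrate what harmonizing heterogeneous activity data can involve, here is a minimal sketch in Python. It is not GOSTAR®'s actual schema or pipeline: the `ActivityRecord` fields, the `TO_NM` unit table, and the `harmonize` function are hypothetical names chosen for this example. The sketch assumes each raw record reports a potency value with a concentration unit, converts it to nanomolar, and derives a negative-log molar value of the kind often used as an ML training label.

```python
import math
from dataclasses import dataclass

# Conversion factors to nanomolar (hypothetical unit set for illustration).
TO_NM = {"M": 1e9, "mM": 1e6, "uM": 1e3, "nM": 1.0, "pM": 1e-3}


@dataclass
class ActivityRecord:
    """One harmonized row; field names are assumptions, not a real schema."""
    compound_id: str
    smiles: str
    target: str
    activity_type: str
    value_nm: float   # potency expressed in nanomolar
    p_value: float    # -log10 of the molar concentration (e.g. pIC50)


def harmonize(raw: dict) -> ActivityRecord:
    """Convert one loosely formatted record into a standardized one."""
    # Map both Unicode micro signs to the ASCII "u" used in TO_NM.
    unit = raw["unit"].strip().replace("µ", "u").replace("μ", "u")
    if unit not in TO_NM:
        raise ValueError(f"unsupported unit: {raw['unit']!r}")

    value_nm = float(raw["value"]) * TO_NM[unit]
    if value_nm <= 0:
        raise ValueError("activity value must be positive")

    # 1 nM = 1e-9 M, so -log10(molar) = 9 - log10(nanomolar).
    p_value = 9.0 - math.log10(value_nm)

    return ActivityRecord(
        compound_id=raw["compound_id"],
        smiles=raw["smiles"].strip(),
        target=raw["target"].strip(),
        activity_type=raw["activity_type"].strip().upper(),
        value_nm=value_nm,
        p_value=p_value,
    )


# Example input: the same kind of measurement reported in different units.
raw_records = [
    {"compound_id": "C1", "smiles": "CCO", "target": "T1",
     "activity_type": "ic50", "value": "0.5", "unit": "µM"},
    {"compound_id": "C2", "smiles": "CCN", "target": "T1",
     "activity_type": "IC50", "value": "10", "unit": "nM"},
]
harmonized = [harmonize(r) for r in raw_records]
```

Once every record shares one unit and one label scale, rows from different sources can be compared directly and used as a single training table.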