Data science

Unlock the power of data

Our data scientists are committed to improving your efficiency and simplifying your analysis process. We provide data science services that accelerate the identification of promising compounds and targets, helping you reduce R&D expenditure and increase accuracy.

Our process

Our data scientists analyze, visualize, and model large amounts of data to meet your research goals. Our process has five phases, each designed to get the most from your data:

Capture

Objective-focused data acquisition, data entry, signal reception, and data extraction. In this stage, we gather unstructured and structured data, the raw material of insight and innovation.

Maintain

Data warehousing, data cleansing, data staging, data processing, and data architecture. We wrangle the raw data to accentuate its structure for enhanced utility.

Process

Data mining, clustering and classification, data modeling, and data summarization. We examine the prepared data’s patterns, ranges, and biases to determine its usability in predictive analyses.

Analyze

Exploration and confirmation, predictive analysis, regression, text mining, and qualitative analysis. This is the functional core of data science. We interrogate and analyze your data to suit your requirements and objectives.

Communicate

Data reporting, data visualization, business intelligence, and decision-making. In this final step, we prepare the results in the form of charts, graphics, and visual reports to summarize and further support your ongoing development journey.

Case study

Hit-calling algorithm on DEL selection data

The client’s objective was to identify unique binders to a pool of target proteins by applying a functional hit-calling algorithm to DNA-encoded library (DEL) selection data. Because the compound library numbered in the millions, the client required a pragmatic statistical method to reliably detect target-specific enrichment.

We started with a detailed analysis of the low-read-count data generated by the next-generation sequencing (NGS) processing pipeline. We chose an appropriate normalization algorithm to reduce the risk of calling false positives or missing true binders (false negatives), and we applied concepts from differentially expressed gene (DEG) analysis, with comparisons defined by the contrast sheet. The pipeline was built to scale and can identify binders as well as non-binders or weak binders. For additional accuracy, we also incorporated information from chemical structure space.

Our work with the client produced a fully functional, automated pipeline that identifies true hits from DEL selection output data, even when raw read counts are low.
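
To make the idea of enrichment-based hit calling on low read counts concrete, here is a minimal sketch in Python. It is not the pipeline delivered to the client, whose normalization and DEG-style statistics are not detailed here. As a stand-in, it assumes library-size scaling, a pseudocount, and a simple Poisson upper-tail test against a no-target control. The function names, thresholds, and compound IDs are all hypothetical.

```python
import math


def poisson_sf(k, lam):
    """Upper-tail probability P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def call_hits(target, control, pseudocount=1.0, min_fold=5.0, alpha=0.01):
    """Flag compounds enriched in the target selection versus the control.

    target, control: dicts mapping compound ID -> raw read count.
    All thresholds are illustrative assumptions, not recommended values.
    """
    # Library-size scaling so the two selections are comparable.
    scale = sum(target.values()) / sum(control.values())
    hits = []
    for compound, t in target.items():
        c = control.get(compound, 0)
        # The pseudocount keeps the expectation positive when the control count is 0.
        expected = (c + pseudocount) * scale
        fold = t / expected
        p_value = poisson_sf(t, expected)
        if fold >= min_fold and p_value <= alpha:
            hits.append((compound, fold, p_value))
    return hits


# Toy counts (made up for illustration only).
target_counts = {"cpd_A": 12, "cpd_B": 2, "cpd_C": 1, "cpd_D": 1}
control_counts = {"cpd_A": 1, "cpd_B": 3, "cpd_C": 6, "cpd_D": 6}
hits = call_hits(target_counts, control_counts)
```

Requiring both a minimum fold change and a significance threshold is one common way to avoid calling hits on the strength of one or two stray reads. A production pipeline would also need replicate handling and multiple-testing correction, which this sketch omits.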

Why Excelra

Led by science, driven by data

We’re a young, agile, cross-functional team of computational biologists, biologists, pharmacologists, and medicinal chemists. We’re tech natives who understand the power of data across the value chain, from discovery to drug development.

Future-facing solutions

We leverage the power of artificial intelligence (AI) and machine learning (ML) to develop customized data solutions for R&D challenges.

A partner that can scale to your needs

We’re the partner of choice for leading global pharmaceutical and biotechnology companies, having delivered more than 125 data science projects.

Ready to get more from data?

Tell us about your objectives. We’ll help get you there.
