It is amply clear that digitization and the use of data science are playing an important role in all aspects of human life. Drug discovery and development is no exception. How far have we successfully implemented digitization and data science in drug discovery and development? Is this a reality or a far-fetched dream?

Question: With the rapid foray of data science and digital transformation technologies in a pan-industry manner, there seem to be various interpretations regarding the identity and role of these terms in the life sciences and biopharma industry. Can you share your perspective on this?

Digital transformation is the buzzword these days; everyone is talking about it. But if we look under the hood, there are two major components:

‘Digitization’ and ‘Digitalization’

In ‘digitization’, unstructured and scattered data is converted into a structured and machine-readable format.

On the other hand, in ‘digitalization’, advanced analytics such as machine learning or deep learning methods are applied on top of the data to derive value from it.

Digital transformation in the pharmaceutical industry is nothing but the adoption of digitization at an organizational level and the implementation of analytics to accelerate drug discovery and development, to bring better medicines for the benefit of humankind.

Why is data transformation important?

  • It helps bring efficiency in operations
  • It enables engagement of the stakeholders effectively in a data-driven approach
  • It helps uncover new trends thereby paving the way for generation of new ideas and innovation

Until recently, digital transformation seemed like a long-term vision across industries. However, recent global events and circumstances have forced enterprises to embrace it rather quickly.

Question: You alluded to the conversion of unstructured information into structured formats, data analytics driven by AI/ML tools and technologies, and the general principles of data transformation. How do all these diverse elements come together to function in tandem, where does it start and end?

Let us understand the journey from data to insights, as this sets the stage for fundamental understanding on this topic.

  • Multiple data points culminate in ‘information’
  • Linking the information together builds ‘knowledge’
  • Identifying patterns in the knowledgebase generates ‘insights’
  • The end-point of this journey is ‘wisdom’ which dictates what to do and what not to do

In this journey, everyone tends to focus on the “attractive” analytics piece, largely driven by AI.

However, it is equally if not more important to focus on the initial part, up to building knowledge, as this forms the foundation of all future analytics.

If enough emphasis is not given to the first steps, we are left with artefactual results that we may not be able to make sense of. This initial phase can be broadly termed ‘data digitization’, which involves structuring, harmonizing and integrating data.

The second phase is ‘data analytics’ where an outcome is predicted by leveraging AI/ML tools on structured data.
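As a rough illustration of the ‘structuring’ step in data digitization, here is a minimal Python sketch that turns a free-text lab note into a machine-readable record. The note format, the regular expression and the `parse_note` helper are assumptions made up for this example, not a description of any real curation pipeline.

```python
import re

# Hypothetical note format, assumed for illustration only:
#   "<compound id> IC50 = <number> <unit> against <target>"
NOTE_PATTERN = re.compile(
    r"(?P<compound>[A-Z]+-\d+)\s+IC50\s*=\s*"
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>[pnuµm]M)\s+"
    r"against\s+(?P<target>\w+)"
)

def parse_note(note):
    """Return a structured record, or None if the note does not fit the pattern."""
    match = NOTE_PATTERN.search(note)
    if match is None:
        # Notes that cannot be parsed would be left for manual curation.
        return None
    return {
        "compound": match.group("compound"),
        "target": match.group("target"),
        "ic50": float(match.group("value")),
        "unit": match.group("unit"),
    }

record = parse_note("CMPD-001 IC50 = 0.25 uM against EGFR")
unparsed = parse_note("potency looked fine in the repeat run")
```

Real notes are far messier than this, which is why anything the pattern cannot handle is returned as `None` instead of being guessed at.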

Question: How are all these “data” principles applied to the vast, multi-domain life sciences industry?

There is a deluge of data in the biopharma industry. If you look at the trajectory from drug target identification all the way to initiation of a clinical study, there are a number of nodes in between.

Each node requires data from various streams such as chemistry, biology, discovery technology, DMPK, efficacy, safety, IND-enabling studies and much more.

While it’s great to generate such rich datasets, unless we practice digitization principles, the data will become useless in no time.

There are several key aspects of digitization we must consider, such as standardization, ontologies, annotation, FAIR (findable, accessible, interoperable, reusable) data principles and data warehouse creation.

Question: What are the aspects of digitization, and how are they specific in context and practice within the biopharma space?

‘Annotation’ and ‘contextualization’ are complex and multi-layered problems, unique to life sciences.

‘Data standardization’ across experiments is another crucial element for performing any advanced analytics. The situation is further complicated today by the availability of many heterogeneous open-access databases that pharma companies wish to combine with their proprietary data assets, a task that cannot be performed unless the data is standardized.
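To make the standardization point concrete, here is a small Python sketch, under simplifying assumptions, of harmonizing potency records from two sources so they can be compared. The field names, the unit table and the sample values are hypothetical and purely illustrative.

```python
# Conversion factors to nanomolar. The accepted unit spellings are an assumption.
TO_NM = {"pM": 1e-3, "nM": 1.0, "uM": 1e3, "µM": 1e3, "mM": 1e6}

def to_nanomolar(value, unit):
    """Convert a concentration to nM, failing loudly on unknown units."""
    key = unit.strip()
    if key not in TO_NM:
        raise ValueError(f"unknown concentration unit: {unit!r}")
    return value * TO_NM[key]

def standardize(record):
    """Normalize identifiers and express IC50 in a single unit (nM)."""
    return {
        "compound": record["compound"].strip().upper(),
        "target": record["target"].strip().upper(),
        "ic50_nm": to_nanomolar(record["ic50"], record["unit"]),
        "source": record["source"],
    }

# Made-up records: one proprietary, one from an open-access database.
in_house = [{"compound": "cmpd-001 ", "target": "egfr",
             "ic50": 0.25, "unit": "uM", "source": "in-house"}]
public = [{"compound": "CMPD-001", "target": "EGFR",
           "ic50": 310.0, "unit": "nM", "source": "public"}]

combined = [standardize(r) for r in in_house + public]
```

After standardization, both records refer to the same compound and target with IC50 values in the same unit, so they can sit side by side in one table. Raising an error on an unknown unit is a deliberate choice in this sketch: a silent guess would contaminate downstream analytics.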

Question: Ontologies and data standards are key aspects to consider within the purview of data digitization in this industry. What is the importance of these topics?

Yes. Another important challenge in digitization is the usage of ‘ontologies’.

Regarding ‘data standards’, digitization is a reasonably well-accepted practice in regulatory submissions in drug discovery and development.

If we have consistent and well digitized data at the foundation level, we do not have to reinvent the wheel to submit the data to regulatory authorities as per their formats.

Question: Considering the sheer volume and diversity of data generated in biopharma research, how does one approach data digitization?

First, there is manual data curation by subject matter experts (SMEs); this ensures good quality but yields low volume.

Second, there is high-throughput automated data curation using machine learning, text mining and NLP.

Finally, we have semi-automation as a middle ground, enabling validation and contextual enrichment.
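One way to picture the semi-automated middle ground is a confidence-based triage: automated extractions that score highly are accepted, borderline ones go to an SME for validation, and the rest are dropped. The Python sketch below assumes each extraction carries a `confidence` score between 0 and 1. The thresholds and field names are illustrative assumptions, not recommended values.

```python
def triage(extractions, accept_at=0.9, review_at=0.5):
    """Split automated extractions into accepted, SME-review and rejected lists.

    The thresholds are hypothetical; in practice they would be tuned
    against a manually curated reference set.
    """
    accepted, needs_review, rejected = [], [], []
    for item in extractions:
        score = item["confidence"]
        if score >= accept_at:
            accepted.append(item)
        elif score >= review_at:
            needs_review.append(item)  # queued for expert validation
        else:
            rejected.append(item)
    return accepted, needs_review, rejected

# Made-up output of an automated text-mining step.
extractions = [
    {"fact": "CMPD-001 inhibits EGFR", "confidence": 0.97},
    {"fact": "CMPD-002 binds KRAS", "confidence": 0.62},
    {"fact": "CMPD-003 activates TP53", "confidence": 0.20},
]

accepted, needs_review, rejected = triage(extractions)
```

With this split, expert time is spent only on the ambiguous middle band, which is where validation and contextual enrichment add the most value.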

Question: Having discussed digitization as the preliminary part of the data journey, where does artificial intelligence come into the journey?

AI is any program that can sense, reason, act and adapt. Machine learning (ML) is a subset of AI, and deep learning (DL) is a further subset of ML.

Irrespective of terminology, these technologies are useful across the pharma value chain.

At Excelra, we have been successful in implementing AI methods, having provided various scientific informatics and data science services to partners accelerating drug discovery and development.

Question: AI technologies support the entire pharma value chain; what are some specific examples of its utility and the major players using AI to optimize drug discovery and development?

Sure, there are several AI applications either under development or being implemented in all aspects of the drug discovery and development paradigm including pre-clinical, clinical, manufacturing, supply chain, commercial and post-market surveillance.

As we know, AI is more prevalent in the clinical stage and thereafter, where the data is more structured and standardized. This further underscores the importance of structured data and the need for digitization at the beginning of the journey.

At Excelra, we have been successful in implementing AI methods, having provided various services to our partners towards accelerating their drug discovery and development programs.

A few more examples come to mind, where several traditional large pharma companies and even younger biotech start-ups have embraced AI and digital transformation in their R&D efforts:

  • Novartis has digital innovation hubs across several geographies, while Boehringer Ingelheim (BI) has digital labs housed within a separate entity called BI-X that supports initiatives within the organization.
  • Companies like Lilly and Teva leverage AI for manufacturing, while others like Pfizer are focused on employing AI for optimizing patient engagement.
  • Insilico Medicine, an AI-based biotech, worked in tandem with WuXi and the University of Toronto and identified a potential drug in a record time of 46 days, which is 10-15 times faster than conventional methods.
  • Another highlight is the collaboration between the pharma company Sumitomo Dainippon and the AI company Exscientia, who recently entered the first AI-predicted drug into clinical trials, against obsessive-compulsive disorder. This was done in under a year, whereas a traditional process would have taken up to 4 years to complete.

The last two examples especially provide substantial testimony to the utility of AI in accelerating drug discovery.

Question: Although it is early days, it surely looks like the life sciences industry has been able to successfully implement AI across the broad spectrum of activities in drug discovery and development. What is the reality of the global acceptance and practice of AI in the pharma world? Finally, what are the pitfalls one must look out for, and where is this transformation heading?

At the outset, it is noteworthy that pharma and biotech companies are now exploring and taking advantage of AI as a mainstream tool to accelerate, optimize and improve numerous processes, functions and stages of drug design and development. However, the pharma industry still lags in monetizing AI.

I recall a study by McKinsey published in The Economist that compared different sectors and their gains from AI. Pharma was at the very bottom with respect to the % share of total analytics, as well as gains in absolute numbers.

This points towards huge room for improvement.

A layperson might wonder why pharma is last on the list. It comes down to what we discussed earlier: we are presently challenged by data, specifically by the confounding factors we noted at the beginning of the data journey, such as heterogeneity, complexity, context-driven challenges and the lack of standard ontologies.

We have to further admit that our field has been relatively slow in transitioning from legacy systems to sophisticated technologies.

Finally, in pharma, data science alone doesn’t work as a standalone solution; rather, a deep understanding of the life science domain is fundamentally needed to draw meaningful insights.

Having said all this, there is a way forward to derive real synergy from bringing together data, data science and digital transformation in this industry. It is important that we develop standards for digitization, democratize data, adopt new technologies, build cross-functional teams and collaborate with external partners wherever necessary.

Only after we have tackled all these aspects can we leverage the true power of AI and ensure that treatments reach the market faster and cheaper, to impact lives and improve outcomes.