When pharmaceutical and biotech companies begin studying a disease, the first step they take is to assess the existing data. They review previous attempts at curing or combating the disease and analyze the existing market conditions to identify commercial opportunities. The most efficient approach to do so is to commission a disease landscape.

Disease landscapes include pre-clinical, clinical, and commercial data. They inform a company’s decision to proceed with its research at the outset and support future decision-making stages throughout the development program. This white paper explores the variety of information available in disease landscape reports and reveals the value derived from the insights they provide.

The challenge of drug development

The COVID-19 pandemic provided a traumatic reminder of the impact of disease. Whether endogenous, exogenous, or infectious, diseases have the power to destroy families, devastate communities, and destabilize nations. To combat COVID-19, governments partnered with pharmaceutical companies to develop vaccines and were eventually able to limit the contagion. The effort the industry expended on COVID-19 vaccine development was enormous. It was not, however, unusual. In truth, the engine of medicinal research runs constantly. Every day, in every part of the world, pharmaceutical companies, academic institutes, charity organizations, and hospitals are engaged in a race against time to combat the harm wrought by diseases.

The challenge facing any group seeking to cure or impair a disease is huge. The odds of success for any drug development program start slim and get slimmer as the program progresses. Success promises great rewards, but pharmaceutical companies proceed with their research at great financial risk. Understandably, they look for any advantage they can gain in their pursuit to discover novel drugs and develop new treatments.

Before they even begin their journey toward drug development, R&D teams must survey the disease landscape to identify market opportunities, review previous failed attempts, and collect all the knowledge available to them. Increasingly, they seek disease landscapes to gain comprehensive, verified, high-quality insights to support the decisions they will make as their research program progresses.

The benefit of disease landscape assessments in drug development programs

Figure 1: Key benefits of Disease Landscape Assessment

Disease landscapes are a collection of all available information on a specific disease and treatments against it. Broadly speaking, the data included in a disease landscape falls into three categories: pre-clinical, clinical, and commercial (Fig. 1). Companies benefit from this data when approaching key research milestones. Facing critical go/no-go decisions, R&D teams analyze disease landscape data through a scientific and then a strategic lens to assess a drug development program’s potential for success.

The scientific assessment typically involves data about a disease’s biology, pathophysiology, treatments, biomarkers, genes, pre-clinical models, approved drugs, and their toxicity, safety, and mechanism of action (MoA).
The strategic assessment focuses on market opportunity, commercial competitors, existing patents, and unmet patient needs.

If either scientific or strategic assessment reveals a low chance of success, the program is paused, re-calibrated, or even terminated. With the stakes so high, it’s imperative that disease landscapes provide accurate, high-quality data covering all the assessment requirements.

Confronting disease biology and data diversity

A disease is an abnormal physiological manifestation of cellular and molecular events that lead to disorder in otherwise normal functions of the body. To fully understand a disease, scientists investigate it from many different aspects.

Those aspects include:

  • Symptoms and complications
  • Pathophysiology
  • Triggers
  • Mechanism of disease onset
  • Disease progression
  • Anatomical correlations
  • The role of innate and acquired immunity
  • Gender bias
  • Disease frequency
  • Geographical correlation
  • Disturbances to the genetic makeup of cells and tissues
  • Gene and protein targets
  • Disease pathway

A broad coalition of scientific disciplines has contributed to our collective knowledge about diseases. Without their combined effort, medical progress would have been much slower. However, the diversity of domain specialisms studying different characteristics of diseases has created a problem: the data produced from their research is naturally heterogeneous, inconsistent, and often incompatible. It’s stored in different formats across disparate databases and requires distinct methods of analysis.

This makes it incredibly difficult to produce disease landscapes of the quality required to inform critical research decisions. Sourcing the data is one thing – extracting, normalizing, and analyzing it is quite another.

Faced with the resource-demanding complexity of selecting data from such a vast open landscape and presenting it in an easily digestible form, research institutions and pharmaceutical companies have sought external partners to fulfill their requirements. Industrious data service providers have taken up the task of supplying disease landscapes for drug development programs, but results have varied. Data specialists have been able to meet the standards required for extraction and normalization, but without scientific domain expertise, the quality of data can fluctuate dramatically, and data coverage can be insufficient. Suppliers with scientific credentials, on the other hand, are able to select relevant, verified data but lack the data-handling expertise to execute requests quickly whilst maintaining data consistency at high volume.

Clients seeking comprehensive, high-quality disease landscapes are ill-served by suppliers that can’t align data fluency with scientific expertise.

Assessing treatment regimens

As COVID-19 proved, diseases still have the potential to destabilize our world. But the outlook is promising. Technological advances and the exponential increase in data have led us to the point at which many diseases have treatments available to combat them or, at the very least, prospective treatments in active development. Before appearing on the market, these prospective treatments must successfully complete a series of trials and prove safe and effective enough to meet regulatory approval. Early in the development cycle, researchers compare their proposed compound against others already available. They do this by assessing the treatment regimens of all known approved drugs in circulation.

Treatment regimen assessments incorporate analyses of a drug’s mechanism of action, interaction with cellular targets, drug metabolism, efficacy, toxicity, and chemistry. Disease landscapes include all of this data, alongside information about previous and ongoing clinical trials, investigational compounds, animal models, unmet medical needs, and key opinion leaders (KOLs). These additional elements concentrate a company’s focus on the potential for commercial success – a crucial consideration before committing to portfolio expansion.

With in-depth, validated data compiled on disease biology and treatment regimens, pharmaceutical companies can strengthen the foundations of their research decisions and accelerate their programs. Disease landscapes facilitate seamless progression from translational research into drug discovery and development, and through each stage of pre-clinical and clinical trials.

An outline of disease landscape assessments

Disease landscapes are composed following a series of assessment exercises. These exercises determine disease biology, druggability, safety, pre-clinical disease models, and clinical landscape.

Disease assessments

A disease biology assessment is the first step in the disease landscape project. For this assessment, scientific curators investigate numerous sources of medical information and select content about a disease’s onset and progression. They also capture information on the primary, secondary, or tertiary effects of the disease, as well as gender bias, pathophysiology, symptoms, and complications. More extensive disease biology assessments even include details regarding the genetic and molecular perturbations associated with the disease. Modern technology can quickly identify genetic associations, so the inclusion of this data should not be neglected.

Druggability assessments

Before committing to new drug development or portfolio expansion, pharmaceutical companies investigate target druggability. Druggable targets are more likely to translate to successful clinical candidates. In silico druggability assessment is widely used to understand the protein-ligand binding site and the conformationally active site. These assessments provide a druggability score, which helps identify targets most likely to respond to the proposed drug.

Safety assessments

Drug safety assessments are another key stage of drug development programs. The correct dose-response relationship and the potential hazards of a drug of interest must be established before the product can be commercialized. To meet the standards required for regulatory approval, pharmaceutical companies qualify a drug’s safety by analyzing its toxicity, adverse events, by-products, and impact on physiological functions. There are many historical examples of insufficient safety assessments leading to public health crises, and governments now set extremely high safety standards to eliminate any possibility of undetected side effects. All potential risks must be averted in the development stage through rigorous and diligent safety evaluations. Pharmaceutical companies employ a variety of analytical approaches to ensure their safety assessments are beyond contention, including adverse outcome pathways (AOP) and on-target and off-target analysis.

Pre-clinical disease model assessments

Pre-clinical disease models are invaluable to effective drug development programs, as the selection of an appropriate pre-clinical model promises a seamless translation to clinical studies. Identifying an appropriate animal model that manifests relevant disease phenotypes is also of huge importance.

Pre-clinical disease models have deepened the understanding of ADMET (absorption, distribution, metabolism, excretion, and toxicity) characteristics of potential drugs as well as their possible effects on the pathophysiology of biological systems. Developing on-site testing facilities requires huge investment, so pharmaceutical companies rely on pre-clinical disease model assessment exercises to provide insights from available test data.

Clinical landscape assessments

Clinical landscape assessments provide a holistic view of ongoing global research in the disease or drug of interest. The assessments capture all relevant information covering the successes and failures of similar drugs in clinical stages, competing drugs in the market, and ongoing clinical trials. These assessments help position the candidate in the context of the other drugs available in the market. They also confirm whether the proposed drug will be a first-in-class compound, informing ongoing commercial and intellectual property (IP) strategies.

Excelra’s contribution to disease landscape analysis

The value of disease landscapes has been established. So too has the importance of exceptional data quantity, quality, and coverage, and the necessity for pharmaceutical companies to partner with a disease landscape provider that combines scientific domain expertise, technological excellence, and data curation experience in equal measure.

The standout partner to meet these requirements is Excelra.

Excelra has been supplying disease landscapes to the world’s biggest pharmaceutical companies for ten years. Its disease landscape curation team includes biologists, chemists, and data analysts collaborating with each other and their clients to provide disease landscapes that deliver precisely what the research program requires. The size and flexibility of Excelra’s team mean that clients can engage them on short projects supporting immediate near-term objectives or on extensive, long-term projects to support every stage of the client’s drug development program.

Excelra’s disease landscape services fall into five broad categories as shown below.

  • Disease dossier
  • Target identification
  • Disease-centric repurposing
  • Indication prioritization
  • Biomarker identification

Disease dossier

A disease dossier is created using text-mining algorithms that extract information from medical literature and data sets related to a particular disease. To ensure quality, the extracted data is manually validated by subject matter experts. The dossier provides insights into the clinical and commercial value associated with the disease or drug of interest. Depending on the objective, the dossier can include the benefits and limitations of different drug regimens, patient responses, clinical trials, possible targets, and disease similarities. All the information is delivered in a clear, easy-to-follow document to facilitate effective searching and easy referencing.

Target identification

Target identification is an early-stage discovery activity conducted by pharmaceutical companies in the pursuit of first-in-class or best-in-class drugs for their disease of interest. A scientifically ideal target must be demonstrably effective, impact relevant downstream pathways to improve the patient’s condition and/or achieve commercial viability by addressing unmet medical needs. Target identification exercises help researchers select targets most relevant to the pathophysiology of a disease and are conducted using gene or tissue expression, knock-out/knock-in studies, downstream pathways analysis, and safety assessments.

Disease-centric repurposing

Pharmaceutical companies don’t only focus on the development of new drugs. Sometimes, repurposing existing drugs is a more effective and efficient approach to combating the causes or symptoms of a disease. Disease-centric repurposing requires the identification of drugs that could be used on alternative indications. Candidates for repurposing are selected by analyzing data on their mechanism of action, efficacies, target assessment, and safety analysis.

Indication prioritization

Advanced bioinformatics techniques and chemical, biological, and clinical intelligence can be applied to high-quality, manually curated data to reveal the most relevant indications for a drug or target under study. These indications are then evaluated and prioritized in order of success probability. The prioritization of indications can accelerate the drug development process and reduce the number of failures.
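The ranking step described above can be pictured with a minimal sketch. It assumes each candidate indication already carries chemical, biological, and clinical evidence scores between 0 and 1. The `Indication` class, the `WEIGHTS` values, and the weighted-sum scoring are illustrative assumptions, not Excelra’s actual method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Indication:
    """A candidate indication with hypothetical evidence scores in [0, 1]."""
    name: str
    chemical: float
    biological: float
    clinical: float


# Assumed weights for illustration only; a real project would justify these.
WEIGHTS = {"chemical": 0.2, "biological": 0.3, "clinical": 0.5}


def score(indication: Indication) -> float:
    """Weighted sum of the three evidence scores."""
    return sum(w * getattr(indication, field) for field, w in WEIGHTS.items())


def prioritize(indications: List[Indication]) -> List[Indication]:
    """Return indications ordered from highest to lowest score."""
    return sorted(indications, key=score, reverse=True)


if __name__ == "__main__":
    candidates = [
        Indication("indication A", chemical=0.9, biological=0.2, clinical=0.1),
        Indication("indication B", chemical=0.1, biological=0.5, clinical=0.9),
    ]
    for ind in prioritize(candidates):
        print(f"{ind.name}: {score(ind):.2f}")
```

A weighted sum is only one possible way to combine evidence; it is used here because it keeps the contribution of each data type visible.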

Biomarker identification

Biomarkers are relevant to the entire research program. They help map a disease’s progression, support patient stratification, contribute to the identification of perturbed pathways, and facilitate the correlation of mechanisms of action. Biomarker identification is, therefore, an essential stage of translational medicine. Using extensive omics data sets and advanced data-mining techniques, Excelra’s experts provide detailed insights on biomarkers associated with a given disease.

Excelra’s disease landscapes deliver exceptional value to pharmaceutical research programs

Excelra’s disease landscape services have been commissioned by pharmaceutical clients to meet a diverse range of objectives. An impressive selection of case studies is shared below.

Preparing a compendium of published data associated with post-acute COVID syndrome (PACS) with a focus on nutraceuticals and amino acids

Diseases are often accompanied by distinguishing changes in the metabolism of certain vital nutrients. Some of these nutrients, like amino acids, can be used to study disease onset, progression, or downstream implications. In this project, Excelra’s experts compiled a disease landscape report with studies related to nutraceuticals and amino acids in post-acute COVID syndrome.

The data was curated from multiple sources, including:

  • LitCOVID: A comprehensive resource providing access to COVID-related articles in PubMed.[i]
  • WHO COVID19: The World Health Organization’s research database, updated daily with bibliographic database searches, hand searches, and expert-referred scientific articles.[ii]
  • Survivor Corps: One of the most active and robust COVID-19 data sets and research tools in the world.[iii]
  • Embase (Excerpta Medica Database): A database of published literature with a focus on the regulatory requirements of licensed drugs.[iv]
  • iSearch COVID-19 Portfolio: A portfolio curated by experts in the field and updated daily for publications and preprints related to either COVID-19 or the novel coronavirus SARS-CoV-2.[v]

Excelra’s experts prepared a lexicon for the curation exercise before identifying suitable articles from the listed data sources. Data was curated from relevant articles and assigned to one of three categories: pharmacology, pathology, or symptoms.
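As a rough illustration of lexicon-driven categorization, the sketch below assigns a piece of text to whichever of the three categories has the most matching lexicon terms. The terms in `LEXICON` and the `categorize` function are hypothetical examples, not the lexicon used in the project, and in practice subject matter experts would still validate each assignment.

```python
import re
from collections import Counter
from typing import Optional

# Hypothetical lexicon; the terms are illustrative assumptions only.
LEXICON = {
    "pharmacology": ["dose", "supplementation", "treatment", "nutraceutical"],
    "pathology": ["inflammation", "metabolism", "fibrosis", "dysregulation"],
    "symptoms": ["fatigue", "dyspnea", "brain fog", "anosmia"],
}


def categorize(text: str) -> Optional[str]:
    """Return the category with the most lexicon hits, or None if there are none."""
    lowered = text.lower()
    counts = Counter()
    for category, terms in LEXICON.items():
        for term in terms:
            # Whole-word (or whole-phrase) matches only.
            pattern = r"\b" + re.escape(term) + r"\b"
            counts[category] += len(re.findall(pattern, lowered))
    if max(counts.values()) == 0:
        return None
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    print(categorize("Patients reported persistent fatigue and brain fog."))
    print(categorize("Arginine supplementation at a fixed daily dose was studied."))
```

Returning `None` when nothing matches leaves unclassified articles for manual review instead of forcing them into a category.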

Finally, Excelra created an extensive document capturing valuable information from each article. The document was delivered to the client’s specified requirements and was updated every two weeks.

Figure 2: Disease data survey to capture details on amino acids, metabolites, disease pathology, and symptoms related to PACS

Value delivered:

  • An exhaustive collection of articles and data associated with pharmacology, symptoms, and mechanism related to the indication.
  • The extracted data was used to find associations between amino acids/nutraceuticals and symptoms associated with PACS.

Conducting a disease-specific phenotype data survey to help prepare disease phenotype monographs for chronic obstructive pulmonary disease (COPD)

COPD is a group of lung diseases known to create breathing obstructions. Some of the phenotypic manifestations of COPD include mucus hypersecretion, mucus clearance impairment, and mucus cell metaplasia.

In an engagement with a large pharma company, Excelra curated valuable information pertaining to disease phenotype, drug landscape, gene expression, and translatability for COPD (Fig. 3).

Figure 3: Disease landscape assessment for chronic obstructive pulmonary disease (COPD)

Disease pathophysiology is the first indicator of a disease in a biological system, so recognition of symptoms is crucial for the early detection of disease. But symptoms are often manifestations of underlying genetic or epigenetic variations. Gene targets, biomarkers, point mutations, and perturbed pathways are included in the exploratory analysis to understand disease onset or progression. Gene expression studies are being extensively used to determine the genetic causes of disease. Excelra’s scientific experts are able to evaluate literature and analyze omics data sets to highlight mechanisms of action of causative mutations, proteomes, and metabolome variations in the perturbed cellular systems.

In this project, Excelra’s curators assessed the drug landscape for COPD to identify pre-clinical, clinical, and FDA-approved drugs. They also identified drug repurposing projects for the same indication. A comprehensive list of information was curated for pre-clinical models under study, drug safety, adverse events, and endpoints.

The first stage of the project was the development of a lexicon for text-mining literature in public and proprietary databases, focusing on the phenotypic manifestations of COPD. Following data collection, several omics datasets were formatted and processed in preparation for analysis. Where there’s a requirement to process multiple data sets, Excelra builds data analysis pipelines that can be deployed in cloud environments for repeat use, and that was the case in this project. As always, Excelra’s curators performed manual validation of search results before reporting the monographs to the client.

Figure 4: Phenotype monograph curation workflow for chronic obstructive pulmonary disease

Value delivered:

  • Monographs were developed for each phenotype: mucus hypersecretion, mucus clearance impairment, and mucus cell hyperplasia.
  • The monographs were consistently structured to streamline queries and improve interpretation.

Identifying cardiac biomarkers in Fabry disease related to disease progression, clinical outcomes, and prognosis

Biomarkers are measurable characteristics that can be used to detect the onset or progression of a disease. They’re especially useful in the study of rare diseases, which has experienced an increase in the application of biomarker research for disease prognosis, treatment, and diagnosis. Fabry disease, for example, is ‘an inherited neurological disorder that occurs when the enzyme alpha-galactosidase-A cannot efficiently break down fatty materials known as lipids into smaller components that provide energy to the body.’[vi] Fabry disease has been subject to a great deal of research toward developing a cure, and biomarkers are an essential feature of the ongoing research. Excelra’s client was engaged in this study area and requested a list of established and putative cardiac biomarkers associated with Fabry disease.

Using the disease name, identifiers, and synonyms as input, text-mining queries were built to search biomarker databases like GOBIOM and MarkersDB. Clinical trial registries and published articles from PubMed were also explored for cardiac biomarkers that have an established association with Fabry disease. Using the disease lexicon, Excelra’s experts shortlisted biomarkers with cardiac manifestations and symptoms. The lists were then annotated with additional information on function, utility, and MoA as it relates to disease progression, prognosis, and clinical outcomes (Fig. 5).
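To make the query-building step concrete, here is a minimal sketch that combines disease synonyms with focus terms into a single boolean search string. The `build_query` helper, the generic AND/OR syntax, and the example terms are assumptions for illustration; they do not reflect the query language of GOBIOM, MarkersDB, or any specific registry.

```python
from typing import Iterable


def build_query(disease_terms: Iterable[str], focus_terms: Iterable[str]) -> str:
    """Combine disease synonyms and focus terms into one boolean query string."""

    def group(terms: Iterable[str]) -> str:
        # De-duplicate while keeping the original order, then quote each term.
        unique = list(dict.fromkeys(terms))
        return "(" + " OR ".join(f'"{t}"' for t in unique) + ")"

    return group(disease_terms) + " AND " + group(focus_terms)


if __name__ == "__main__":
    # Illustrative inputs only; a real lexicon would be curated by experts.
    disease = [
        "Fabry disease",
        "Anderson-Fabry disease",
        "alpha-galactosidase A deficiency",
    ]
    focus = ["cardiac biomarker", "left ventricular hypertrophy", "troponin"]
    print(build_query(disease, focus))
```

Grouping synonyms with OR and joining the groups with AND keeps the search broad across names for the disease while restricting results to the cardiac focus.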

Figure 5: Workflow for biomarker identification in Fabry disease

Value delivered:

  • The client received a comprehensive annotated list of reported and putative biomarkers, including evidence, data source, and secondary data points such as category and utility.

Conducting a comprehensive survey and analysis exercise of publications on clinical trials in the US, EU, and Japan associated with hyperphosphatemia to reveal insights into dose-efficacy and endpoint relationships

Clinical trial data helps pharmaceutical companies make informed decisions along their drug discovery journey. A biotech firm engaged Excelra to collect information on clinical trials associated with hyperphosphatemia to assist in its ongoing research program into that disease. The client requested a comprehensive disease landscape, including all available information on drug doses, efficacy, and endpoints (Fig. 6).

Figure 6: Concepts explored for hyperphosphatemia disease landscape

Excelra’s team collected the required information via in-depth data mining on, JMACTR, UMIN-CTR, and other relevant repositories. Those clinical trials that met the client’s criteria were shortlisted for curation by Excelra’s subject matter experts, who extracted efficacy, dose, and endpoints data. The data were analyzed and prepared for downstream processing, and delivered to the client to the standard required (Fig. 7).

Figure 7: Efficacy, dose, and endpoint profiling of drugs associated with hyperphosphatemia

Excelra also prepared a comprehensive landscape of phosphate binders. The pill burden and adverse effects (particularly gastrointestinal intolerance) associated with phosphate binders often contribute to poor medication adherence. Excelra’s investigation into phosphate binders yielded data on indications, reported efficacy, dosages, and physicochemical properties (Fig. 8).

Figure 8: Comprehensive landscaping of phosphate binders

Value delivered:

  • Comprehensive data were collected on clinical studies associated with hyperphosphatemia and phosphate binders, delivering valuable insight into dose efficacy and endpoint relationships.

Identifying and prioritizing compounds for the treatment of rare monogenic blood disorders

A small pharma company requested Excelra’s help to identify potentially effective drugs for the treatment of rare monogenic blood disorders. Rare monogenic disorders are primarily caused by single gene mutations, so Excelra explored existing literature and datasets to establish disease-drug correlations. Many approaches were used, including disease similarities, drug-gene signatures, and genome-wide association studies (GWAS). The extracted information was used to create disease-drug pairs based on their mechanism of action (Fig. 9).

Figure 9: Approaches for asset identification in relation to rare monogenic blood disorders

The top five mechanisms of action were selected, and drugs were prioritized for each of them. An in-depth analysis of each of the MoAs was completed considering the following points:

  • Relevance of targets and MoAs in the disease
  • Clinical or pre-clinical scientific evidence
  • Known literature on animal models, target safety, and hypotheses availability

Following this stage, the best drugs for each mechanism of action were recommended to the client, and the MoAs and compounds were prioritized according to Excelra’s prioritization process (Fig. 10). Excelra conducted a further relevancy check on the mechanisms of action for compounds/assets to discover if they promoted or alleviated disease and complications.

Figure 10: Prioritization of MoAs and compounds/assets

Value delivered:

  • A scientific rationale for each MoA in the context of disease pathophysiology was constructed
  • A priority list of drugs was delivered, and recommendations were made for each disease
  • Suitable animal models for PoC experiments were recommended to the client

Disease landscapes: a major advantage in the drug development journey

Disease landscapes provide valuable scientific and commercial insights to pharmaceutical companies engaged in drug discovery, development, and repurposing programs.

Excelra is the standout partner to deliver disease landscapes, given its unique combination of scientific domain expertise and data analysis capabilities.

Excelra’s disease landscapes are invaluable resources to support short-term objectives or long-term projects.

If you’re interested in Excelra’s disease landscape services, get in touch. Whatever your goals, we can help you achieve them.



i. LitCovid. (n.d.). Nih.Gov. Retrieved 2 March 2023, from

ii. Coronavirus disease (COVID-19) – World Health Organization. (n.d.). Retrieved 2 March 2023, from website:

iii. Survivor Corps. (n.d.). Retrieved 2 March 2023, from Survivor Corps website:

iv. Welcome – Embase. (n.d.). Retrieved 2 March 2023, from website:

v. COVID-19 Portfolio. (n.d.). Retrieved 2 March 2023, from website:

vi. Spaeth, G. L. (1965). Fabry’s disease: Its ocular manifestations. Archives of Ophthalmology, 74(6), 760. doi:10.1001/archopht.1965.00970040762005
