Using Legacy and Current Data To Accelerate Drug Development

A thought-leadership article.

Dr. Kavita Lamror

Director – Value Evidence Services

“Information is the oil of the 21st century, and analytics is the combustion engine.” – Peter Sondergaard

Randomized controlled trials (RCTs) are the gold standard of evidence for establishing the value of an intervention and obtaining regulatory approval. However, patient-level RCT data from legacy trials is often not readily available for further hypothesis testing and analysis. As interest in retrospective analysis of clinical data grows, it is worth noting that most of this data is stored in the Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM) format, and that permissions need to be sought before it can be aggregated and transformed.

In addition to RCT data, the United States Food and Drug Administration (US FDA) and European Medicines Agency (EMA) have created guidelines to accommodate real world evidence (RWE) generated from claims, registries and electronic medical records (EMR) data sets for demonstrating the efficacy, safety, and effectiveness of drugs. This data, if analysed as per approved guidelines, can be accepted for regulatory and reimbursement approvals. However, real world data (RWD) is coded in multiple formats and needs further processing. Various organizations have started transforming this data into the Observational Medical Outcomes Partnership (OMOP) common data model (CDM) format for RWE generation. This further substantiates the added “value” of the drug.

As the pharmaceutical industry faces rising research and development (R&D) costs per drug brought to market, there is a compelling need to optimize existing data assets and shorten the drug development life-cycle. Heterogeneity of treatment effect (HTE) is another challenge for the industry. HTE is defined here as the difference between patient outcomes measured from post-launch RWD and the results observed in pre-launch RCTs. It might occur due to real-world risk exposures not accounted for in the target population, and it can significantly affect the accessibility of the drug and its acceptance by the end-user. While probable confounding factors and risk factors are identified from existing published research, it is sometimes difficult to estimate the impact of unobserved exposures on the treatment effect. This evidence might be available in the vast amounts of data existing in RCTs and RWE.

If the data from RCTs and RWD could be aggregated in an analyzable format, it could be used for clinical trial planning, segmented patient targeting, predicting clinical outcomes, improving efficiencies in health care systems, and tracking safety outcomes with increased accuracy. The legacy and ongoing RCT data can be transformed by mapping CDISC SDTM data sets into the OMOP CDM used for RWD.
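As a rough illustration of what such a mapping involves, the sketch below converts records from the SDTM Demographics (DM) domain into rows shaped like the OMOP CDM `person` table. It is a minimal sketch under stated assumptions, not a production mapping: the helper name `dm_to_person`, the sample records, and the choice to leave an unknown birth year as `None` are all illustrative. The gender concept IDs 8507 (male) and 8532 (female) are the standard OMOP concepts, and 0 is the OMOP convention for "no matching concept".

```python
from dataclasses import dataclass
from typing import Optional

# Standard OMOP gender concept IDs; 0 means "no matching concept".
GENDER_CONCEPTS = {"M": 8507, "F": 8532}


@dataclass
class OmopPerson:
    """A reduced view of the OMOP CDM person table (illustrative subset of columns)."""
    person_id: int
    gender_concept_id: int
    year_of_birth: Optional[int]
    person_source_value: str
    gender_source_value: str


def dm_to_person(dm_row: dict, person_id: int) -> OmopPerson:
    """Map one SDTM DM record to a person row (hypothetical helper)."""
    sex = (dm_row.get("SEX") or "").strip().upper()
    birth = (dm_row.get("BRTHDTC") or "").strip()  # ISO 8601 date, may be partial or empty
    year = int(birth[:4]) if len(birth) >= 4 and birth[:4].isdigit() else None
    return OmopPerson(
        person_id=person_id,
        gender_concept_id=GENDER_CONCEPTS.get(sex, 0),
        year_of_birth=year,  # None must be resolved before loading a real CDM
        person_source_value=dm_row["USUBJID"],
        gender_source_value=sex,
    )


# Made-up sample records, for illustration only.
dm_records = [
    {"USUBJID": "STUDY01-001", "SEX": "F", "BRTHDTC": "1975-03-12"},
    {"USUBJID": "STUDY01-002", "SEX": "U", "BRTHDTC": ""},
]
persons = [dm_to_person(row, i) for i, row in enumerate(dm_records, start=1)]
```

Keeping the original SDTM values in the `*_source_value` columns preserves the link back to the trial data, which matters for provenance when the mapped records are later pooled with RWD.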

Excelra’s Approach

Excelra understands the need of the scientific community to aggregate, extract-transform-load, standardize, visualize, and analyse this data. As key stakeholders in the research community move towards “findability, accessibility, interoperability, and reusability” (FAIR) data standards for improving biopharma productivity, our data scientists can help create scalable clinical data repositories for interacting with data in a convenient and efficient manner on an automated platform. An acute understanding of data provenance and lineage is key to successful insight generation and our skilled team leaves no stone unturned while transforming big data into effortless data engines for bespoke client solutions.

Excelra’s “Molecule to Market” processes comply with the Health Insurance Portability and Accountability Act (HIPAA), the European Union General Data Protection Regulation (EU GDPR), and Title 21 of the Code of Federal Regulations (CFR) Part 11, ensuring that best practices are followed while transforming confidential data into meaningful insights that accelerate your drug discovery.

Tocilizumab: Repurposing Candidate for Treatment of Cytokine Release Syndrome Caused by SARS-CoV-2 Infection

Cytokine storm is an uncontrolled release of cytokines in the body in response to external stimuli, leading to systemic inflammation. Ferrara et al. (1993) first coined the term “cytokine storm” in graft-versus-host disease. Approximately 15.7% of COVID-19 patients develop severe pneumonia and cytokine storm. At present there are no specific drugs against either COVID-19 or the cytokine storm it causes in virus-infected patients. Studies have reported the involvement of IL-6 in infection-induced cytokine storm. Tocilizumab, an IL-6 receptor antagonist, has been approved by the FDA for the treatment of cytokine release syndrome (CRS) as well as juvenile idiopathic arthritis, rheumatoid arthritis, and giant cell arteritis.

Tocilizumab may be used for the treatment of cytokine storm caused by SARS-CoV-2 infection. The cytokine IL-6 binds to its receptor IL-6R, which in turn binds to the signal transducer glycoprotein 130 (gp-130). This triggers downstream signal transduction. IL-6R may exist in membrane-bound form (mIL-6R) or in soluble form (sIL-6R). In the classical signal transduction pathway, IL-6 binds to mIL-6R to form a complex, which then binds to gp-130. In the trans-signaling pathway, IL-6 forms a complex with sIL-6R and gp-130. In either scenario, IL-6 activates two distinct signaling pathways:

  1. The JAK/STAT tyrosine kinase system (major pathway) and
  2. The Ras/mitogen activated protein kinase (MAPK)/NF-κB-IL-6 pathway.

Macrophages, neutrophils, T-cells, etc. express mIL-6R and are involved in classical signaling at low IL-6 levels. At high IL-6 levels, trans-signaling via sIL-6R can activate virtually all cells of the body and regulate pro-inflammatory reactions. Tocilizumab can bind to both mIL-6R and sIL-6R, inhibiting both classical and trans-signaling to control the cytokine storm caused by SARS-CoV-2 infection.

Excelra’s open-access COVID-19 Drug Repurposing Database is a synoptic compilation of ‘Approved’ small molecules and biologics, which can rapidly enter either Phase 2 or 3, or may even be used directly in clinical settings against COVID-19. The database additionally includes information on promising drug candidates that are in various clinical, pre-clinical and experimental stages of drug discovery and development.

Supported with referenced literature, we provide mechanistic insights into SARS-CoV-2 biology and disease pathogenesis. We hope that these drug repositioning approaches can help the global biotech and pharma community develop treatments to combat COVID-19.

The COVID-19 Vaccine Landscape

The ongoing COVID-19 pandemic is continuing to spread rapidly across the globe. As of 7th Feb 2021, the World Health Organization (WHO) has reported over 105 million cases and over 2 million deaths across 219 countries. Current clinical research is focused on accelerating the development of drugs and vaccines for treatment of SARS-CoV-2 infection.

With the number of clinical trials increasing by the day, identification and selection of potential biomarkers for inclusion in clinical trials is of paramount importance for the success of COVID-19 clinical studies. Excelra’s COVID-19 Database, an open-access biomarker database, is our contribution to the global scientific community to help identify biomarkers from published clinical trials against the novel coronavirus disease.

Vaccines in development

A broad range of candidate COVID-19 vaccines are being investigated globally using various platforms. A handful of vaccines have been approved by various regulatory authorities and many more remain in development at both clinical and pre-clinical stages.

Currently, there are 63 candidate vaccines in clinical development and 175 vaccines in pre-clinical development. Of these, 20 candidate vaccines are in Phase 3 clinical trials and 9 vaccines have been authorized across several countries.

Figure 1: Vaccines in clinical and pre-clinical development.

The vaccines can be broadly categorized into virus vaccines (live attenuated virus, inactivated virus), protein-based vaccines (protein sub-units, virus-like particles), viral vector vaccines (replicating vector, non-replicating vector), and nucleic acid vaccines (DNA vaccine, RNA vaccine).

Among the candidate vaccines in clinical development, the most common are protein-based vaccines, followed by non-replicating vector vaccines and inactivated virus vaccines.

Figure 2: Vaccines in clinical development.

Similarly, protein-based vaccines, followed by RNA-based vaccines and non-replicating vector vaccines, are the most researched vaccines in pre-clinical development.

Figure 3: Vaccines in pre-clinical development.

Vaccines Approved

Below is a list of all vaccines that have achieved regulatory authorization or approval across different countries.

Table 1: List of approved vaccines.

GOBIOM’s free COVID-19 biomarker database

Excelra’s COVID-19 Biomarker Database is a collection of manually curated clinical biomarkers, meticulously annotated by our data scientists, to support the development of drugs and vaccines for treating COVID-19. With the number of clinical trials increasing by the day, identification and selection of potential biomarkers for inclusion in clinical trials is of paramount importance for the success of COVID-19 clinical studies.

Target Dossier Services

Target identification is an essential first step toward the discovery and development of any new therapeutic. Consequently, successful pharma organizations perform due diligence around a target of interest before embarking on the long, expensive and high-risk endeavor of drug development. It is therefore essential to establish a link between the target and the disease, with an overlap between the mechanism of action of the drug and the disease pathophysiology. Ultimately, clinical translatability, which ensures that a drug’s activity against the target is efficacious with minimal side effects, marks the crucial difference between success and failure in drug development.

Excelra’s Custom Target Dossier Services

Excelra is strongly positioned to deliver tailor-made target assessment dossiers based on the unique requirements of our global Biotech and Pharma clients. The dossier is a compendium of information on the complete target profile, including structural, systemic and functional aspects of a protein and the gene encoding it.

To facilitate critical ‘Go/No-go decision making’, we provide a 360-degree view of a target detailing:

  1. Role in healthy tissues as well as its association with disease(s).
  2. Molecular pharmacology, target expression across human tissues/organs, species, gene alterations and target interactions with other proteins/genes.
  3. Competitive landscape of drug development by stage (approved, clinical & pre-clinical drugs).
  4. Adverse events or toxicity data for compounds at any stage of development.
  5. On- and off-target safety assessment and de-risking strategies.

With a host of additional features, Excelra’s custom target dossier reports provide actionable insights that apply to various use-case scenarios in drug discovery and development, as detailed below.

Semantic Metadata Catalog for Metadata Management

A reference point within the Enterprise Data Lakes

Traditional techniques of Data Cataloging at the data storage level create dispersed data silos across the enterprise. An intelligent, automated Data Catalog linked to diverse and distributed data stores enables effective data governance through real-time orchestration of people, processes, and technology, allowing an organization to leverage its data as an Enterprise Asset.

Excelra’s Semantic Metadata Catalog is specially designed to help automate and process organization-wide data, creating an Enterprise Data Lake that extracts maximum advantage from enterprise assets. Semantic Metadata is deeply interlinked, richly contextualized and highly interconnected. Adding semantic metadata to the metadata content allows a higher level of abstraction, enabling a programmatic approach to cross-departmental functions and the use of dispersed data assets for a more holistic view of relationships. Efficient enterprise data collaboration helps maximize value in formats that are easy to comprehend, enabling a partnership between business and IT.

Content is leveraged because of the people, places, organizations, brands, topics that it mentions, rather than just structural metadata itself (e.g. file format, file size, creation date etc.).

Semantic Metadata Catalog is based upon conceptual resources and REST. In effect, every resource of interest to an organization exists as a certain type (such as an employee, a product, a location etc.). Depending upon the granularity of the model and the size of the organization, there may be hundreds of these types, but there is typically an inheritance structure that can create a general taxonomy of entity types.
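The inheritance structure described above can be pictured as a simple parent-pointer taxonomy. The following is only a sketch under assumed names: the entity types (`Resource`, `Agent`, `Employee`, and so on) and the helpers `ancestors` and `is_a` are hypothetical and do not describe Excelra’s actual model.

```python
from typing import Dict, List, Optional

# Hypothetical taxonomy: each entity type points to its parent type (None marks the root).
TAXONOMY: Dict[str, Optional[str]] = {
    "Resource": None,
    "Agent": "Resource",
    "Employee": "Agent",
    "Product": "Resource",
    "Compound": "Product",
    "Location": "Resource",
}


def ancestors(entity_type: str) -> List[str]:
    """Return the chain of parent types, nearest parent first."""
    chain: List[str] = []
    parent = TAXONOMY[entity_type]
    while parent is not None:
        chain.append(parent)
        parent = TAXONOMY[parent]
    return chain


def is_a(entity_type: str, candidate: str) -> bool:
    """True if entity_type equals candidate or inherits from it."""
    return entity_type == candidate or candidate in ancestors(entity_type)


employee_lineage = ancestors("Employee")
```

With a structure like this, a query for every `Product` can also return `Compound` entries, which is how a general taxonomy keeps hundreds of specific types manageable.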

Sources of metadata can be systems, end users and metadata APIs. An essential building block of metadata management is to simplify and automate the enterprise information inventory, and to keep it updated from different databases as part of a future metadata management strategy.

Semantic Data Catalogs are often very useful for a large number of heterogeneous, non-RDF based databases. This is typically the case with Biopharma data.

Semantic Metadata catalogs are ideal for storing information for real-world catalogs as well. Most catalogs are highly referential in nature, with lots of categorization, links to resources, and the need for consistent annotation. Certain aspects of catalog entries, such as transactional content, are a less ideal fit, but these can generally be stored externally and then linked to by reference. It is also worth noting that this content data can be retrieved as part of the generation of output, either within or after a semantic query.

It is noteworthy that semantic catalogs essentially retrieve links to data, not necessarily the data itself. The catalog does not automatically translate from one source to another, though having a semantic data catalog is a necessary precursor for this to happen. Schema-to-schema mapping (also known as ontology-to-ontology mapping) is a surprisingly complex process, very much akin to translating between languages.

Such mapping layers differ from semantic data catalogs because they manage mappings from one ontology to another, and they constitute a crucial step towards a universal data conversion engine.

Excelra has extensive domain capabilities across biology, chemistry, and the clinical and commercial space in developing standard ontologies for linking enterprise research data. An Enterprise Data Lake strategy is always challenged by the variants of data catalog that come into play when organizations deal with differing but conceptually overlapping ontologies. This is typically a problem within a given data catalog environment. However, because of acquisitions, the enterprise still ends up with multiple ontologies that overlap and need to be translated. In this case, the goal is usually to create a single ontology; while the source ontologies are still in use, an intermediate stage is needed to manage the translation until they can be phased out.

Excelra’s solution strategy takes into account the importance of maintaining this intermediate layer with the required level of semi-automation, along with a dynamic UI framework that lets non-technical end users manage it with ease. While it is possible to integrate everything into the database, the better option with a semantic data catalog approach is to bring in intermediate information, transform it, and cache it as appropriate for subsequent queries. This provides a mechanism for importing triples from source files at query time; the triples are put into an intermediate graph, queried, and cached. Once the triples become stale, the graph is deleted.
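The import, query, cache and expire cycle described above could look roughly like the sketch below. It is an assumption-laden illustration rather than Excelra’s implementation: the class `IntermediateGraphCache`, the `load_demo` loader, the source name `demo.csv`, the sample triples and the 60-second time-to-live are all hypothetical, and triples are modeled as plain tuples instead of using an RDF library.

```python
import time
from typing import Callable, Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class IntermediateGraphCache:
    """Loads triples from a source on demand and drops the graph once it is stale."""

    def __init__(self, loader: Callable[[str], List[Triple]], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._graphs: Dict[str, Tuple[float, Set[Triple]]] = {}

    def _graph(self, source: str) -> Set[Triple]:
        now = self._clock()
        entry = self._graphs.get(source)
        if entry is not None and now - entry[0] >= self._ttl:
            del self._graphs[source]  # triples are stale: delete the graph
            entry = None
        if entry is None:
            entry = (now, set(self._loader(source)))  # import at query time
            self._graphs[source] = entry
        return entry[1]

    def query(self, source: str, subject: Optional[str] = None,
              predicate: Optional[str] = None, obj: Optional[str] = None) -> List[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(t for t in self._graph(source)
                      if all(p is None or p == v for p, v in zip(pattern, t)))


load_calls: List[str] = []


def load_demo(source: str) -> List[Triple]:
    """Stand-in for parsing a source file into triples (made-up data)."""
    load_calls.append(source)
    return [("drug:1", "targets", "protein:IL6R"), ("drug:1", "label", "tocilizumab")]


fake_now = [0.0]  # controllable clock so expiry can be demonstrated without sleeping
cache = IntermediateGraphCache(load_demo, ttl_seconds=60, clock=lambda: fake_now[0])
hits = cache.query("demo.csv", predicate="targets")
```

Injecting the clock is a design choice for testability; in practice the default `time.monotonic` would be used and the time-to-live tuned to how often the source systems change.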

The catalog entries enable us to effectively pick and choose the information to work with, while allowing the system to retrieve data from the appropriate systems without the end user having to worry about the source system. Data lineage is always established through the required audit logs, without adding complexity for end users.

Transformation of data is not always reversible, which makes complying with FAIR principles in an automated process a challenge. If, for instance, a transformation creates an attribute whose value depends on the state of two or more variables, disentangling that logic (which is not purely functional) can be extraordinarily complex, if not outright impossible (for instance, calculating the average of a set of values and passing that average as the value of an attribute). However, because we retain the transformation and the associated target property, we can recalculate the attribute whenever the source changes.
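The averaging example can be made concrete with a small sketch. The average cannot be inverted to recover the individual values, but if the transformation and its target property are recorded, the derived attribute can simply be recomputed when a source value changes. The `LineageRecord` structure, the field names and the dose values below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List, Sequence


@dataclass
class LineageRecord:
    """Links a derived target property to its source fields and transformation."""
    target_property: str
    source_fields: List[str]
    transform: Callable[[Sequence[float]], float]


def derive(record: LineageRecord, source: Dict[str, float]) -> float:
    """Re-run the recorded transformation against the current source values."""
    return record.transform([source[field] for field in record.source_fields])


record = LineageRecord("mean_dose_mg", ["dose_day1", "dose_day2", "dose_day3"], mean)

source = {"dose_day1": 10.0, "dose_day2": 20.0, "dose_day3": 30.0}
target = {record.target_property: derive(record, source)}
initial_value = target[record.target_property]

# A source value changes; the lossy transformation is re-applied rather than inverted.
source["dose_day3"] = 60.0
target[record.target_property] = derive(record, source)
```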

This is also critical for working with both content and digital asset management systems. The assets themselves are generally not stored within the same database as the catalog. Instead, they surface enough metadata, performing entity extraction of metadata and storing this annotational information within the Semantic Data Catalog. This also helps in resolving master data, as this makes it possible to identify both the resource identifiers and the associated relationships.

Excelra understands the need for phase-wise data processing steps and data logs, with the flexibility to accommodate changes in metadata during interim processing, and the UI functionality has been built with all these requirements in mind.

Enterprise IoT Integration

Enterprise IoT is about networks and the relationships between resources, not just in terms of simple properties but also in terms of factors such as security, actions, discovery and related areas. Increasingly, IoT systems make use of semantic graphs to keep the complex web of interconnectedness manageable and easy to traverse and query. This approach is cost-effective and FAIR compliant, and it provides an effective data unification layer along with structured representation through Knowledge Graphs.


Bringing semantic technologies into the process of metadata management ensures that data is smarter for content as well as knowledge discovery and transfer. This data allows systems to automatically assign topics and categories to resources and further infer context from that information.

Excelra’s key strength lies in understanding semantic metadata and leveraging it to create and consume more interconnected, richer, well-structured and retrievable resources that can have a direct impact on an organization’s profits and performance. We have built our data understanding, expertise and design, having worked with the Bio-pharma industry over a period of 18 years.

Semantic Data Catalogs are not widely adopted due to their associated complexity. Excelra’s solution offers an intelligent, dynamic UI-based interface that works for non-technical users as well, allowing the organization to adopt and retain the solution more easily. An effective and efficient Semantic Data Catalog is a key component of Excelra’s Enterprise Data Strategy, making metadata the most valuable Enterprise Asset.


The effectiveness of semantic technology usually comes down to maintaining data discipline and governance. This is why effective metadata management is less about tools than it is about the process.


Immune overreaction to a viral infection can lead to a cytokine storm. It may also result in serious complications such as pneumonia and acute respiratory distress syndrome (ARDS). Ruxolitinib blocks the cytokine-mediated response by inhibiting the activation of JAK1 and JAK2, critical intracellular cytokine signaling kinases. Ruxolitinib also acts as a potent suppressor of the secretory phenotype of senescent cells and of the lethal inflammatory response. As a JAK1/JAK2 inhibitor approved for the treatment of polycythemia vera, myelofibrosis and graft-versus-host disease, Ruxolitinib can be repurposed for the treatment of COVID-19 patients. Currently there are 8 clinical trials looking into the safety and efficacy of Ruxolitinib for the treatment of the novel coronavirus disease. Recently, Incyte initiated a Phase 3 clinical trial of Ruxolitinib to treat cytokine storm caused by SARS-CoV-2 infection.



Drug repurposing, or repositioning, is a promising approach for identifying new indications for approved drugs and for investigational drugs (including those that failed in clinical development) that have not been approved by the FDA.

Drug repurposing has emerged as an attractive strategy because it can bring medications to patients much faster and at lower cost than developing new drugs. The safety and ADMET (absorption, distribution, metabolism, elimination, toxicity) profiles of these drugs have already been tested in clinical trials for other indications.

COVID-19 is an acute respiratory disease caused by the RNA virus SARS-CoV-2. The COVID-19 pandemic continues to evolve rapidly in more than 180 countries. Effective treatments are urgently needed, but no drug with consistent performance has yet been found for COVID-19; hence, drug repurposing has emerged as a promising strategy for COVID-19 treatment.

SARS-CoV-2 requires host cellular factors such as angiotensin I converting enzyme 2 (ACE2), transmembrane serine protease 2 (TMPRSS2) and furin for successful replication during infection. Systematic targeting of the SARS-CoV-2 interactome offers a novel strategy for effective drug repurposing for COVID-19.

Mapped through affinity purification mass spectrometry, the SARS-CoV-2 virus-host interactome shows 332 high-confidence protein-protein interactions between 26 viral proteins and human proteins.

List of drugs that may be used against COVID-19

A comprehensive literature search to identify potential pharmacological agents that may be used against COVID-19 has yielded the following four classes of drugs:

  1. Drugs acting on viral replication
  2. Drugs acting on viral entry
  3. Drugs acting on cytokine release
  4. Drugs enhancing immune response/endothelial dysfunction

Figure 1: List of drugs being evaluated against COVID-19. Source: Drug repurposing approach to fight COVID-19.

GOBIOM approach on drug repurposing for COVID‐19 treatment

GOBIOM’s COVID-19 Biomarker Database has been released in support of the ongoing global scientific efforts, aimed at developing safe and effective therapeutic options to treat the novel coronavirus disease.


The COVID-19 Biomarker Database provides data insights on each repurposed drug: the indication it is approved to treat, the drug’s target (biomarker) in the pathology, and its mode of action, together with the clinical outcome of drug response. The database includes information on the highest developmental status of drugs in various ‘clinical, pre-clinical and experimental’ stages of drug discovery and development, with direct access to the supporting scientific literature.


Antiviral Drugs and Targets/Biomarkers


Remdesivir: A monophosphate prodrug of an active C-adenosine nucleoside triphosphate analogue, Remdesivir was originally developed as a potential treatment for Ebola virus disease. Remdesivir significantly reduced the median recovery time to 11 days, compared with 15 days in the placebo group.

Figure 2: Pharmacodynamic/Response biomarkers of COVID-19 patients treated with Remdesivir

Favipiravir: Favipiravir significantly reduces virus clearance time and improves chest imaging, with fewer adverse reactions. It also led to significantly accelerated relief of symptoms, including pyrexia and cough. It is approved for influenza.

Figure 3: Pharmacodynamic/Response biomarkers of COVID-19 patients treated with Favipiravir

Ribavirin: Ribavirin mimics ATP and GTP and is incorporated by RNA-dependent RNA polymerase. Ribavirin efficacy increased in combination with IFN-beta-1b, lopinavir or ritonavir. It is mainly used in the treatment of Hepatitis C and RSV infection.

Figure 4: Pharmacodynamic/Response biomarkers of COVID-19 patients treated with Ribavirin

Virus Entry Inhibitors 


Chloroquine: Chloroquine has been a well-known antimalarial drug for many years. It disrupts virus-receptor binding by interfering with glycosylation of the angiotensin-converting enzyme 2 (ACE2). The SARS-CoV-2 spike protein binds not only to ACE2 but also to host gangliosides, and chloroquine interferes with this process by competing with the spike protein for binding to gangliosides. Chloroquine phosphate was reported to significantly decrease disease duration compared with ritonavir-lopinavir treatment.

Figure 5: Pharmacodynamic/Response biomarkers of COVID-19 patients treated with Chloroquine

Hydroxychloroquine: Hydroxychloroquine is the hydroxylated form of chloroquine and shows a similar antiviral mechanism. Hydroxychloroquine add-on therapy to ritonavir-lopinavir may have many potential adverse effects, including cardiac, metabolic, and neurological symptoms, and should be used with caution.

Figure 6: Pharmacodynamic/Response biomarkers of COVID-19 patients treated with Hydroxychloroquine

Non-virus Targeting Drugs


Tocilizumab: A monoclonal antibody and IL-6 receptor antagonist, mainly used for the treatment of systemic juvenile idiopathic arthritis, polyarticular juvenile idiopathic arthritis and rheumatoid arthritis. In patients with severe or critical COVID-19 infection, immediate use of tocilizumab with repeated doses was reported to improve clinical outcomes.

Figure 7: Pharmacodynamic/Response biomarkers of COVID-19 patients treated with Tocilizumab

Dexamethasone: An FDA-approved synthetic corticosteroid and first-line treatment for immune-related complications, which acts by suppressing naïve T cell proliferation and differentiation in the immune system. It decreased the 28-day mortality rate by one-third in COVID-19 patients receiving invasive ventilation and by one-fifth in patients receiving oxygen support.

Figure 8: Pharmacodynamic/Response biomarkers of COVID-19 patients treated with Dexamethasone

Dapagliflozin: A sodium-glucose cotransporter-2 (SGLT2) inhibitor, hypothesized to prevent serious side effects caused by SARS-CoV-2 infection by preventing low pH in cells.

Figure 9: Pharmacodynamic/Response biomarkers of COVID-19 patients treated with Dapagliflozin

Anticoagulant treatment: Unfractionated heparin and low molecular weight heparin are mainly used to prevent or treat thrombosis. Low molecular weight heparin improves coagulation dysfunction in COVID-19 patients and exerts anti-inflammatory effects by reducing interleukin 6 (IL-6) and increasing the percentage of lymphocytes.


Excelra’s COVID-19 Biomarker Database provides mechanistic insights into SARS-CoV-2 biology and disease pathogenesis, along with drug repurposing approaches that can help the global biotech and pharma community develop treatments for COVID-19.


  • Limited repository of drugs to be repurposed, owing to the high attrition rate in drug development and approval
  • Repurposed drugs might have been optimized for a target, dosing, or tissue in the original indications
  • Substantial investment is required towards the research and clinical trial programs for the repurposed drug as the drug efficacies and safety profiles for COVID-19 are not well-established
  • Rapid clinical tests of existing antiviral, antimalarial, and immunomodulatory drugs have been done or are underway against COVID-19
  • The presence of heterogeneous populations with different genetic backgrounds might also affect outcomes of clinical results



Mechanism of Action: RNA-dependent RNA polymerase (RdRp) inhibition

Remdesivir is an antiviral medication developed by Gilead Sciences for the treatment of Ebola virus disease. Animal studies have found it to be active against SARS and MERS viruses. Following the outbreak of COVID-19 in China, Gilead began laboratory testing of Remdesivir against SARS-CoV-2 in January 2020. Currently, Phase 3 clinical trials are underway for Remdesivir to treat COVID-19 patients. The prodrug (Remdesivir) is metabolized to the active molecule (GS-441524), an adenosine nucleotide analog. Ebola virus studies have shown that Remdesivir inhibits the viral RNA-dependent RNA polymerase (RdRp) enzyme, stopping viral genome replication and viral RNA production. Remdesivir is highly selective for Ebola virus RdRp compared with human mitochondrial RNA polymerase. Thus, specific targeting of RdRp could be an effective way to neutralize SARS-CoV-2 infection.



The coronavirus (COVID-19) pandemic has proven to be a formidable scientific, medical, and social challenge. The complexity of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is centered on the unpredictable clinical course of the disease that can rapidly develop, causing severe and deadly complications.

Overview of the COVID-19 biomarker landscape

Below is a list of biomarkers that play a pivotal role in disease pathogenesis and whose levels change according to the severity of COVID-19 infection. Identifying effective laboratory biomarkers that can classify patients by risk is imperative for guaranteeing prompt treatment.

Inflammatory markers:

In SARS-CoV-2 infected patients, retrospective analysis has demonstrated that initial plasma levels of IL-1beta, IL-1RA, IL-7, IL-8, IL-10, IFN-gamma, monocyte chemoattractant peptide (MCP)-1, macrophage inflammatory protein (MIP)-1A, MIP-1B, granulocyte-colony stimulating factor (G-CSF), and tumor necrosis factor-alpha (TNF-alpha) were increased in patients with COVID-19. Further analysis has shown that the plasma concentrations of IL-2, IL-7, IL-17, IL-10, MCP-1, MIP-1A, and TNF-alpha were higher in ICU patients than in non-ICU patients. Moreover, the plasma levels of IL-2, IL-6, IL-8, IL-10, and TNF-alpha observed in severe infection were markedly greater than those in non-severe infection.

Studies have revealed that levels of IL-6, the most common type of cytokine released by activated macrophages, rise sharply in severe manifestations of COVID-19. However, since most studies to date have been observational, it is difficult to determine whether the rise is large enough to cause the manifestations seen in severe forms.

C-reactive protein (CRP) is a plasma protein produced by the liver and induced by various inflammatory mediators such as IL-6. Despite being non-specific, this acute phase reactant is used clinically as a biomarker for various inflammatory conditions; a rise in CRP levels is associated with an increase in disease severity. In a retrospective cohort, Wang et al. (2020) found that higher CRP values corresponded with the critical group, where groups were determined by the diameter of the largest lung lesion; CRP levels may therefore indicate lung damage and disease progression.

Lactate dehydrogenase (LDH) has been reported to increase during acute and severe lung damage, and elevated LDH values have been found in other interstitial lung infections. In COVID-19 patients, LDH and CRP might be an expression of lung damage and might reflect the respiratory distress that follows the abnormal inflammatory status.

Chemokine (C-X-C MOTIF) Ligand:

Studies by Kyung et al. have shown that C-C motif (CC) chemokines (CC chemokine ligand [CCL] 2, CCL7, CCL8, CCL24, CCL20, CCL13, and CCL3), C-X-C motif (CXC) chemokines (CXC chemokine ligand [CXCL] 2 and CXCL10), and chemokine receptor subfamilies were the most numerous and were significantly (FDR < 0.05) upregulated in both mild and severe COVID-19 patient groups.
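The "FDR < 0.05" threshold refers to a false discovery rate cutoff applied after correcting for multiple testing. The source does not state which correction procedure was used; as a minimal sketch, assuming the widely used Benjamini-Hochberg procedure and made-up p-values (the gene labels and numbers are purely illustrative, not data from the study), adjusted values could be computed as follows:

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five genes (illustrative only, not study data).
raw = {"geneA": 0.001, "geneB": 0.008, "geneC": 0.039, "geneD": 0.041, "geneE": 0.60}
q_values = dict(zip(raw, benjamini_hochberg(list(raw.values()))))
significant = [gene for gene, q in q_values.items() if q < 0.05]
print(significant)
```

In this toy example, geneC and geneD have raw p-values below 0.05 but do not pass the adjusted threshold, which is the point of controlling the false discovery rate when many genes are tested at once.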

Hematologic biomarkers:

Hematologic biomarkers used to stratify COVID-19 patients include WBC count, lymphocyte count, neutrophil count, neutrophil–lymphocyte ratio (NLR), platelet count, eosinophil count, and hemoglobin. Studies by Yang et al. reported lymphopenia in 80% of critically ill adult COVID-19 patients. These observations suggest that lymphopenia may correlate with infection severity. Qin et al. analyzed markers related to dysregulation of the immune response in a cohort of 450 COVID-19 positive patients, reporting that severe cases tended to have lower lymphocyte counts, higher leukocyte counts and higher NLR, as well as lower percentages of monocytes, eosinophils, and basophils, compared with mild cases.
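The neutrophil–lymphocyte ratio is simply the absolute neutrophil count divided by the absolute lymphocyte count. A minimal sketch of the arithmetic follows; the function name and the example counts are hypothetical, and no clinical cutoff is implied.

```python
def neutrophil_lymphocyte_ratio(neutrophils, lymphocytes):
    """NLR = absolute neutrophil count / absolute lymphocyte count (same units)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils / lymphocytes


# Hypothetical counts, both in 10^9 cells/L.
nlr = neutrophil_lymphocyte_ratio(6.0, 1.5)
print(nlr)  # 4.0
```

Because the lymphocyte count is the denominator, lymphopenia combined with a raised neutrophil count pushes the ratio up, which is consistent with the higher NLR reported in severe cases.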

Studies have revealed evident differences in lymphocyte count (lymphocytopenia), platelet count and D-dimer in patients who experienced composite endpoints (ICU admission, invasive mechanical ventilation and death), although no statistical analysis was performed. In another study, the lowest platelet counts were associated with mortality. Moreover, both CD4+ T and CD8+ T cell counts were significantly reduced in the severe COVID-19 group compared with the non-severe group.


D-dimer originates from the lysis of cross-linked fibrin, with rising levels indicating the activation of coagulation and fibrinolysis. Studies have associated COVID-19 with hemostatic abnormalities; one study observed elevated levels of D-dimer, a measure of coagulation, in non-survivors compared with survivors, with levels continuing to rise until death.


Although elevated serum ferritin is a useful marker of possible progression towards critical COVID-19, it remains unclear whether serum ferritin concentration reliably reflects disease severity and whether it can be used to gauge therapeutic effects.


ACE2 protein at the surface of lung alveolar epithelial cells allows infection of the respiratory tract with SARS-CoV-2. It can be hypothesized that ACE2 levels correlate with susceptibility to SARS-CoV-2 infection. Reports suggest that men have higher ACE2 expression in the lung than women, and that Asian populations express ACE2 at higher levels than Caucasian and African American populations. This is in agreement with findings that conversion of Ang II to Ang by ACE2 was higher in males than in females, suggesting an over-expression of ACE2 in men.


Studies by Rui Hi have demonstrated that procalcitonin may be an indicator of disease severity in patients with COVID-19.


Most emerging studies describe serological tests based on detection of SARS-CoV-2-specific IgM and IgG [1–4]. Although detection of SARS-CoV-2-specific IgA in serum has been reported in a few papers, analyses of IgA levels in a larger number of COVID-19 patients are still lacking.

Studies by Huan et al. have demonstrated that serum IgM and IgG levels in moderate and severe COVID-19 patients were significantly higher than in mild cases, while no significant difference was observed between severe and moderate patients. However, IgA levels in severe cases were significantly higher than those in mild or moderate cases.

COVID-19 biomarker findings in GOBIOM database

The diagnostic biomarkers for COVID-19 in the GOBIOM database are in line with the findings in the literature. Below are the top diagnostic biomarkers for COVID-19 from the GOBIOM database:

Figure 1: Top biomarkers in COVID-19 diagnosis from the GOBIOM database

Utility of Excelra's GOBIOM in COVID-19 research

We highlight the important biomarkers involved in COVID-19 diagnosis along with how the levels of biomarkers may change according to severity of COVID-19 infection. This can be used as an adjunct in clinical practice to guide treatment and admission to ICU. By doing so, it may improve prognosis and minimize the mortality rates.

In this manner, focused biomarker databases like GOBIOM can be a very useful resource for identifying biomarkers predictive of drug response or resistance, further facilitating selection of the patient population most likely to respond to treatment.


OMICS has paved the way for incredible advances in clinical and medical research and in drug discovery and development. These advances have provided insights that help identify pathophysiology, understand drug interactions and develop personalized medicines. We have made tremendous progress since the early days of first-generation DNA sequencing machines. The latest Next Generation Sequencing (NGS) systems are high-throughput systems that generate huge amounts of data per run. Advanced data management systems are required to store and process the data from these machines. Platforms for OMICS analysis are currently available from companies such as DNAnexus, SevenBridges Genomics and Pine Bio, to name a few, which provide tools and applications for processing and analysing biomedical and OMICS-related data.

The tools and pipelines from these business solution providers address the needs of the customer, but most of the analyses in the field of OMICS require a more customized solution rather than a standardized pipeline. Even the same pipeline may have many variations according to needs of different research groups. This is the reason why full-fledged service platforms provide variations for standard pipelines. The building blocks/applets are also provided so that customers can add/remove functionality or steps.

The plethora of options available for different pipelines, and the diverse functionalities of the API itself, require a dedicated effort from customers to customize existing solutions from these service providers. The client’s bioinformatics team needs to have sufficient familiarity with these APIs and must dedicate time to gathering requirements from team members and designing the pipelines. This places the additional responsibilities of planning, development, testing and maintenance on research teams. These additional manpower and resource requirements may not be feasible for all research organizations.

The Excelra approach

Excelra has a proven track record of developing and deploying custom computational pipelines for top pharma companies and research groups globally. We provide recommendations and develop custom solutions by understanding client requirements and by leveraging our skillsets in bioinformatics and software programming. This brings about a certain economy, since the solutions are custom-made and our customers are not required to pay for functionalities that their organization does not need.

We have services catering to the different needs of our customers and avoid step-based (tiered) pricing, in which the entry cost for a particular service may be low but scaling up that service causes a huge jump in costs because tiers are based on the number of jobs that can be started at any given point in time.

OMICS service development process

OMICS pipeline development begins once the scope is understood, and it requires expert knowledge of OMICS data and software systems. After the nature of the requirement is clear, the scientific questions that need to be answered and the solutions available for them are evaluated. The software stack for pipeline development is finalized after considering the client’s current infrastructure and interoperability requirements.

After the pipeline is developed, functional testing must be done with sample data to test its stability. Functional testing can be supplemented with regression testing wherever applicable. Reproducibility of the data and results is essential in any domain for validating the results.
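One way to make this kind of functional and regression testing concrete is to run a pipeline step on fixed sample data and compare a checksum of its output against a stored baseline. The sketch below assumes a hypothetical, deterministic step (`count_bases`) standing in for a real pipeline stage; all function names and sample reads are illustrative only.

```python
import hashlib
import json


def count_bases(reads):
    """Hypothetical pipeline step: count each nucleotide across all reads."""
    counts = {}
    for read in reads:
        for base in read:
            counts[base] = counts.get(base, 0) + 1
    return counts


def result_checksum(result):
    """Hash a JSON-serialisable result; sort_keys keeps the hash independent of key order."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def is_reproducible(step, sample, baseline_checksum):
    """Regression check: the step must reproduce the stored baseline checksum."""
    return result_checksum(step(sample)) == baseline_checksum


sample_reads = ["ACGT", "GGCA", "TTAC"]
# Record a baseline once, for example when a pipeline version is released.
baseline = result_checksum(count_bases(sample_reads))
# Later runs on the same sample data should match the baseline.
print(is_reproducible(count_bases, sample_reads, baseline))
```

Comparing checksums rather than full outputs keeps the stored baseline small, at the cost of only reporting that something changed, not what changed.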

Mainly, OMICS activities are divided into 3 major categories:

  • Pipeline development services
  • OMICS data processing and analysis services
  • OMICS platform development services

We now present a detailed take on these categories, which together constitute our holistic range of OMICS data management and analytics services.

Excelra’s OMICS suite

  • Pipeline development services

Figure 1: Typical process for pipeline development

A typical process for pipeline development is shown in Figure 1. It takes into account the reagents and the data processing requirements of the end user. The pipeline should be built using good programming practices so that future improvements and modifications remain feasible. Upgrading or scaling up pre-existing or custom pipelines should not cause major changes to the pipeline’s sensitivity, specificity, efficiency, or performance. The existing source code of a pipeline can also be scaled up to run multiple jobs at the same time or optimized to run jobs in a shorter amount of time.

  • OMICS data processing and analysis

Figure 2: A schematic for data analysis and interpretation from pipelines

The schematic for OMICS data processing and analysis services is shown in Figure 2. This activity involves the processing of data using standard or customized pipelines. The choice of tools and pipeline is based on the scientific objective the end user wants to address. Pipelines that are well known and well referenced in the bioinformatics domain are preferred. They need to be validated with test data to ensure that client expectations are met.

Custom reports and documentation are created according to user requirements. This helps end-users to utilize them for effective communication and presentation to project stakeholders. The bioinformatics analysis can also be supplemented with custom analysis for drug repurposing, target identification etc. Excelra provides a wide array of downstream analysis options.

  • OMICS platform development services

Figure 3: Schematic of OMICS platform development process using latest software stack

The schematic for the platform development process is shown in Figure 3. This service covers the development of a cloud-based platform, with a custom pipeline, that can be deployed on a cloud server. The system gives the end user direct access to run the required pipelines. Pipelines can be modified, developed or deployed efficiently on the hosting server without impacting running jobs or the use of existing pipelines. We use recent technology stacks such as Docker, Nextflow and AWS Batch to provide the end user with advanced capabilities and a lower cost of operation.

OMICS challenges and business opportunities

There are numerous challenges when it comes to building a robust OMICS pipeline. The service providers should be flexible to customer requirements as mentioned earlier.

A major pitfall lies in defining the objective of the study or analysis, as some interpretations and analyses cannot be done without deep knowledge and understanding of the domain [2]. Subject Matter Experts (SMEs) with extensive knowledge in various domains can be leveraged to provide meaningful insights from the analyzed data and to help with experimental challenges such as study design [3]. This requires team members who follow good software programming practices and have the relevant domain experience. Some pharma companies have the bandwidth and resources to maintain such a team; many mid-sized and small pharma companies prefer to use third-party service providers instead.

OMICS datasets go through several pre-processing steps, such as data cleaning, imputation and feature selection (biomarkers, targets of interest), before the actual analysis. Excelra has the right set of tools, skills and resources to execute these tasks with ease and expertise.
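The three pre-processing steps named here (cleaning, imputation, feature selection) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, assuming a small feature table with `None` marking missing values; the thresholds, the choice of median imputation and variance ranking, and the feature names are all assumptions made for illustration, not a description of Excelra's actual method.

```python
from statistics import median, pvariance


def preprocess(table, max_missing=0.5, top_k=2):
    """Clean, impute and select features from {feature: [values or None]}."""
    # 1. Cleaning: drop features with too large a fraction of missing values.
    kept = {
        name: values
        for name, values in table.items()
        if values.count(None) / len(values) <= max_missing
    }
    # 2. Imputation: replace missing values with the feature median.
    imputed = {}
    for name, values in kept.items():
        fill = median(v for v in values if v is not None)
        imputed[name] = [fill if v is None else v for v in values]
    # 3. Feature selection: keep the top_k features ranked by variance.
    ranked = sorted(imputed, key=lambda n: pvariance(imputed[n]), reverse=True)
    return {name: imputed[name] for name in ranked[:top_k]}


# Hypothetical expression values for four samples (illustrative only).
data = {
    "feature_a": [1.0, 5.0, None, 9.0],
    "feature_b": [2.0, 2.1, 2.0, 1.9],
    "feature_c": [None, None, None, 4.0],
    "feature_d": [0.0, 10.0, 0.0, 10.0],
}
selected = preprocess(data)
print(sorted(selected))
```

Here `feature_c` is dropped for having too many missing values, the gap in `feature_a` is filled with its median, and the nearly constant `feature_b` is filtered out by the variance ranking. Real studies typically use domain-specific filters and more careful imputation, so the sketch only shows the order of the steps.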

Because the volume of data is already huge and keeps growing exponentially as new technologies emerge, processing it is a challenge. All the pipelines built can be integrated with AWS Batch to perform rapid and scalable data analysis and to address data storage and archival.


  1. Pillai S, Gopalan V, Lam AK-Y. Review of sequencing platforms and their applications in phaeochromocytoma and paragangliomas. Crit Rev Oncol Hematol. 2017;116:58-67. doi:10.1016/j.critrevonc.2017.05.005
  2. Micheel CM, Nass SJ, Omenn GS, et al. Omics-Based Clinical Discovery: Science, Technology, and Applications. National Academies Press (US); 2012. Accessed February 23, 2021.
  3. Subramanian I, Verma S, Kumar S, Jere A, Anamika K. Multi-omics Data Integration, Interpretation, and Its Application. Bioinforma Biol Insights. 2020;14:1177932219899051. doi:10.1177/1177932219899051