OMICS DATA MANAGEMENT & ANALYTICS: AN OVERVIEW OF THE LANDSCAPE, CHALLENGES AND SOLUTIONS TO EFFECTIVELY INFORM DRUG DEVELOPMENT DECISIONS
OMICS has paved the way for major advances in clinical research, medicine, and drug discovery and development. These advances have provided insights that help identify pathophysiology, understand drug interactions and develop personalized medicines. The field has progressed tremendously since the early days of first-generation DNA sequencing machines. The latest Next Generation Sequencing (NGS) systems are high-throughput instruments that generate huge amounts of data per run [1], and advanced data management systems are required to store and process that data. Platforms for OMICS analysis are currently available from companies such as DNAnexus, Seven Bridges Genomics and Pine Bio, to name a few, which provide tools and applications for processing and analysing biomedical and OMICS-related data.
The tools and pipelines from these solution providers address common customer needs, but most analyses in the field of OMICS require a customized solution rather than a standardized pipeline. Even the same pipeline may have many variations according to the needs of different research groups. This is why full-fledged service platforms offer variations of standard pipelines, along with building blocks (applets) that let customers add or remove functionality or steps.
The plethora of pipeline options and the diverse functionality of the platform APIs mean that customizing existing solutions from these service providers requires a dedicated effort from customers. The client’s bioinformatics team needs to be sufficiently familiar with these APIs and must dedicate time to gathering requirements from team members and designing the pipelines. This places the additional responsibilities of planning, development, testing and maintenance on research teams. These additional manpower and resource requirements may not be feasible for all research organizations.
The Excelra approach
Excelra has a proven track record of developing and deploying custom computational pipelines for top pharma companies and research groups globally. We provide recommendations and develop custom solutions by understanding client requirements and leveraging our skills in bioinformatics and software programming. Because the solutions are custom made, customers do not pay for functionality that their organization does not need.
Our services cater to the different needs of our customers and avoid tiered (step-based) pricing, in which the entry cost for a service may be low but scaling up causes a sharp jump in cost because tiers are defined by the number of jobs that can be started at any given point in time.
OMICS service development process
OMICS pipeline development begins with understanding the scope and requires expert knowledge of OMICS data and software systems. Once the nature of the requirement is understood, we do groundwork on the scientific questions that need to be answered and evaluate the solutions available for them. The software stack for pipeline development is finalized after considering the client’s current infrastructure and interoperability requirements.
After the pipeline is developed, it must undergo functional testing with sample data to verify its stability. Functional testing can be supplemented with regression testing wherever applicable. Reproducibility of data and results is essential in any domain, because results that cannot be reproduced cannot be validated.
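A regression check of the kind described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_pipeline`, `result_digest` and `check_regression` are hypothetical names, and the toy step simply counts reads per sample. The idea is to record a digest of the output for a fixed sample input and to confirm that later versions of the pipeline still reproduce it.

```python
import hashlib
import json


def toy_pipeline(reads):
    """Hypothetical stand-in for a pipeline step: count reads per sample."""
    counts = {}
    for sample_id, _sequence in reads:
        counts[sample_id] = counts.get(sample_id, 0) + 1
    return counts


def result_digest(result):
    """Hash a JSON-serialisable result; sorted keys keep the digest stable."""
    canonical = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def check_regression(pipeline, sample_input, expected_digest):
    """Return True when the pipeline output still matches the baseline."""
    return result_digest(pipeline(sample_input)) == expected_digest


# Illustrative sample data (not real sequencing reads).
SAMPLE_READS = [("S1", "ACGT"), ("S2", "GGCA"), ("S1", "TTAG")]

# Record the baseline once, then re-check it after every pipeline change.
baseline = result_digest(toy_pipeline(SAMPLE_READS))

# Reproducibility check: input order should not change the result.
reversed_run = result_digest(toy_pipeline(list(reversed(SAMPLE_READS))))
```

In practice the baseline digest would be stored alongside the test data, so that any change in output is flagged for review rather than silently accepted.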
OMICS activities are divided into three major categories:
- Pipeline development services
- OMICS data processing and analysis services
- OMICS platform development services
We now present a detailed take on these categories, which together constitute our holistic range of OMICS data management and analytics services.
Excelra’s OMICS suite
- Pipeline development services
Figure 1: Typical process for pipeline development
A typical process for pipeline development is shown in Figure 1. It considers the reagents and the data processing requirements of the end user. The pipeline should be built using good programming practices so that future improvements and modifications remain feasible. Upgrading or scaling up pre-existing or custom pipelines should not cause major changes to the pipeline’s sensitivity, specificity, efficiency or performance. The existing source code of a pipeline can also be scaled up to run multiple jobs at the same time, or optimized to run jobs in a shorter amount of time.
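As a minimal sketch of running multiple jobs at the same time, the Python example below fans a toy per-sample job out over a worker pool from the standard library. The job (`gc_content`), the helper `run_jobs` and the sample data are all hypothetical, and a thread pool is used only to keep the example self-contained; CPU-heavy pipeline steps would more typically be distributed across processes or a batch scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_content(sequence):
    """Toy per-sample job: fraction of G and C bases in a sequence."""
    if not sequence:
        return 0.0
    gc = sum(1 for base in sequence.upper() if base in "GC")
    return gc / len(sequence)


def run_jobs(samples, max_workers=4):
    """Run one job per sample concurrently; results are keyed by sample ID."""
    sample_ids = list(samples)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so IDs and results stay aligned.
        results = pool.map(gc_content, (samples[s] for s in sample_ids))
        return dict(zip(sample_ids, results))


# Illustrative input: sample ID -> sequence.
SAMPLES = {"S1": "GGCC", "S2": "ATAT", "S3": "ACGT"}
job_results = run_jobs(SAMPLES)
```

Because each sample is processed independently, the per-sample results are the same whether the jobs run one at a time or concurrently, which is the property that makes this kind of scale-up safe.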
- OMICS data processing and analysis
Figure 2: A schematic for data analysis and interpretation from pipelines
The schematic for OMICS data processing and analysis services is shown in Figure 2. This activity involves processing data using standard or customized pipelines. The choice of tools and pipeline is driven by the scientific question the end user wants to answer. Pipelines that are well known and well referenced in the bioinformatics domain are preferred, and they are validated with test data to ensure that client expectations are met.
Custom reports and documentation are created according to user requirements, so that end users can use them to communicate and present results to project stakeholders. The bioinformatics analysis can also be supplemented with custom analysis for drug repurposing, target identification etc. Excelra provides a wide array of downstream analysis options.
- OMICS platform development services
Figure 3: Schematic of OMICS platform development process using latest software stack
The schematic for the platform development process is shown in Figure 3. This service involves developing a cloud-based platform, with a custom pipeline, that can be deployed on a cloud server. The system gives the end user direct access to run the required pipelines. Pipelines can be modified, developed or deployed efficiently on the hosting server without impacting running jobs or use of the existing pipelines. We use technologies such as Docker, Nextflow and AWS Batch to provide the end user with advanced capabilities and a lower cost of operation.
OMICS challenges and business opportunities
There are numerous challenges when it comes to building a robust OMICS pipeline. As mentioned earlier, service providers should be flexible to customer requirements.
A major pitfall lies in the objective of the study or analysis, as some interpretations and analyses cannot be done without deep knowledge and understanding of the domain [2]. Subject Matter Experts (SMEs) with extensive knowledge in various domains can be leveraged to provide meaningful insights from the analyzed data and to help with experimental challenges such as designing the study [3]. This requires team members who follow good software programming practices and have the relevant domain experience. Some pharma companies have the bandwidth and resources to maintain such a team, while many small and mid-sized pharma companies prefer to use third-party service providers.
OMICS datasets require pre-processing steps such as data cleaning, imputation and feature selection (biomarkers, targets of interest) before the actual analysis. Excelra has the right set of tools, skills and resources to execute these tasks.
The volume of data is already huge and continues to grow with upcoming technologies, which makes processing it a challenge. All the pipelines we build can be integrated with AWS Batch to perform rapid and scalable data analysis and to address data storage and archival.
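The three pre-processing steps named above can be illustrated with a small Python sketch. It is not a description of any specific Excelra pipeline: the function `preprocess`, its thresholds, the gene names and the expression values are all hypothetical, and median imputation and variance ranking are simply common, easy-to-follow choices.

```python
from statistics import median, pvariance


def preprocess(matrix, max_missing_fraction=0.5, top_k=2):
    """matrix maps feature name -> per-sample values (None means missing)."""
    imputed = {}
    for feature, values in matrix.items():
        observed = [v for v in values if v is not None]
        if not observed:
            continue  # Cleaning: nothing was measured for this feature.
        missing_fraction = 1 - len(observed) / len(values)
        if missing_fraction > max_missing_fraction:
            continue  # Cleaning: missing in too many samples.
        # Imputation: fill the gaps with the feature's median.
        fill = median(observed)
        imputed[feature] = [fill if v is None else v for v in values]
    # Feature selection: keep the top_k most variable features.
    ranked = sorted(imputed, key=lambda f: pvariance(imputed[f]), reverse=True)
    return {f: imputed[f] for f in ranked[:top_k]}


# Illustrative expression matrix (made-up values, four samples per gene).
EXPRESSION = {
    "GENE_A": [1.0, None, 3.0, 5.0],
    "GENE_B": [2.0, 2.0, 2.0, 2.0],
    "GENE_C": [None, None, None, 4.0],
    "GENE_D": [0.0, 10.0, None, 20.0],
}
selected = preprocess(EXPRESSION)
```

Here `GENE_C` is dropped during cleaning because it is missing in three of four samples, the gaps in `GENE_A` and `GENE_D` are filled with their medians, and the constant `GENE_B` is ranked last during feature selection.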
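As a rough sketch of what integration with AWS Batch can look like, the Python example below builds one job request per sample. The request keys (`jobName`, `jobQueue`, `jobDefinition`, `containerOverrides`) follow the parameters of the boto3 Batch `submit_job` call; everything else is an assumption made for illustration, including the helper `build_batch_request`, the queue and job definition names, the bucket, and the `run_pipeline.sh` command. Nothing is sent to AWS here.

```python
def build_batch_request(sample_id, input_uri, job_queue, job_definition):
    """Build keyword arguments shaped like boto3's Batch submit_job call."""
    return {
        "jobName": f"omics-{sample_id}",
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "containerOverrides": {
            # Hypothetical entry point inside the pipeline's container image.
            "command": ["run_pipeline.sh", input_uri],
            "environment": [{"name": "SAMPLE_ID", "value": sample_id}],
        },
    }


# Hypothetical sample IDs, bucket, queue and job definition.
batch_requests = [
    build_batch_request(
        sample_id,
        f"s3://example-bucket/raw/{sample_id}.fastq.gz",
        "example-queue",
        "example-pipeline:1",
    )
    for sample_id in ["S1", "S2"]
]

# With AWS credentials configured, each request could be submitted with:
#   import boto3
#   boto3.client("batch").submit_job(**request)
```

Submitting one job per sample lets the Batch scheduler decide how many run in parallel, so the same code handles two samples or two thousand.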
References
1. Pillai S, Gopalan V, Lam AK-Y. Review of sequencing platforms and their applications in phaeochromocytoma and paragangliomas. Crit Rev Oncol Hematol. 2017;116:58-67. doi:10.1016/j.critrevonc.2017.05.005
2. Micheel CM, Nass SJ, Omenn GS, et al. Omics-Based Clinical Discovery: Science, Technology, and Applications. National Academies Press (US); 2012. Accessed February 23, 2021. https://www.ncbi.nlm.nih.gov/books/NBK202165/
3. Subramanian I, Verma S, Kumar S, Jere A, Anamika K. Multi-omics Data Integration, Interpretation, and Its Application. Bioinforma Biol Insights. 2020;14:1177932219899051. doi:10.1177/1177932219899051