Screening adverse-event-related data
Overview
Accurate adverse-event data is critical for pharmacovigilance, drug safety monitoring, and clinical research. Pharmaceutical and research organizations rely on validated biomedical datasets to identify drug-target associations, monitor toxicity signals, and improve predictive models used in preclinical and clinical development.
Excelra collaborated with a client to enhance their proprietary adverse-events database by integrating high-quality curated data on drugs and targets. Drawing on its data curation services, scientific informatics, and scalable scientific data management, Excelra created a structured and validated dataset to support pharmacovigilance research, predictive analytics, and adverse-event monitoring.
Our client
The client is a European publishing and data solutions organization that owns a proprietary platform used in preclinical toxicity analysis, pharmacovigilance studies, and clinical research. Their platform required regular updates with validated biomedical data related to drugs, targets, and adverse events.
Client’s challenge
The client required reliable adverse-event information to improve the accuracy and sensitivity of their internal text-mining pipeline.
However, meeting this requirement posed several challenges:
- Manually validating large volumes of biomedical literature
- Building structured associations between drugs, targets, and adverse events
- Ensuring data accuracy for pharmacovigilance and toxicity analysis
- Improving the performance of the client’s text-mining pipeline
- Keeping the adverse-events database regularly updated
To address these challenges, the client engaged Excelra to implement a data-curation and validation workflow capable of delivering high-quality analysis-ready data.
Client’s goals
The primary objectives of the project included:
- Updating the adverse-events database with accurate and validated drug-target information
- Extracting adverse-event associations from biomedical literature
- Training and improving the performance of the client’s text-mining pipeline
- Delivering structured biomedical data suitable for pharmacovigilance studies
Our approach
Data mining and literature screening
Excelra applied its text-mining and literature screening capabilities to identify relevant biomedical publications.
The team screened approximately 1,000 articles per day, covering multiple literature categories such as:
- Case reports describing drug-induced adverse events
- Studies investigating associations between drugs and adverse reactions
- Reviews discussing safety data and drug classes
- Preclinical toxicology reports
- Knockdown and knockout genetic studies
- Research linking diseases with genomic findings
These approaches align with modern biomedical knowledge extraction techniques that integrate text mining and data curation for biomedical knowledgebases.
Manual curation and validation
Excelra’s expert biocurators manually validated PubMed identifiers (PMIDs) associated with adverse events.
The curated information included:
- Drug and drug-class associations
- Target information
- Adverse-event descriptions
- Literature references supporting each association
Manual validation ensured the reliability and accuracy of the curated dataset.
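The curated fields listed above could be represented as a simple structured record. The following is a minimal sketch only: the class name, field names, and the PMID check are assumptions made for illustration, not the client's lexicon or Excelra's actual schema, and the sample values are placeholders rather than real data.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CuratedAssociation:
    """One curated drug / target / adverse-event association (hypothetical schema)."""
    drug: str
    drug_class: str
    target: str
    adverse_event: str
    pmids: Tuple[str, ...]  # literature references supporting the association

    def is_supported(self) -> bool:
        # Assumed rule: at least one reference, and every PMID is all digits.
        return bool(self.pmids) and all(p.isdigit() for p in self.pmids)


# Placeholder values for illustration only; not taken from any real dataset.
record = CuratedAssociation(
    drug="ExampleDrug",
    drug_class="ExampleClass",
    target="EXAMPLE1",
    adverse_event="hepatotoxicity",
    pmids=("12345678",),
)
```

Keeping the supporting PMIDs on the record itself means every association can be traced back to the literature during later review.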
Multi-stage data curation workflow
Excelra implemented a structured data-curation pipeline consisting of multiple validation stages:
Initial data transformation
Curators performed manual screening of documents and structured the extracted information based on the client’s lexicon and entity classification framework.
Subject matter expert (SME) review
Subject matter experts reviewed the curated datasets and annotated the extracted information to ensure scientific accuracy.
Final quality assurance
A final QA/QC process ensured that only validated, high-quality data was delivered to the client’s production environment.
This structured workflow maintained strict quality benchmarks for both accuracy and productivity.
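The three stages can be thought of as a gated sequence in which a record only moves forward once the current check approves it. The sketch below illustrates that idea under stated assumptions: the stage names mirror the headings above, but the `advance` and `deliverable` helpers are hypothetical and do not describe Excelra's internal tooling.

```python
from enum import Enum


class Stage(Enum):
    TRANSFORMED = 1   # initial data transformation by curators
    SME_REVIEWED = 2  # subject matter expert review completed
    QA_PASSED = 3     # final QA/QC completed


def advance(current: Stage, approved: bool) -> Stage:
    """Move a record to the next stage only if the pending check approved it."""
    if not approved or current is Stage.QA_PASSED:
        return current
    return Stage(current.value + 1)


def deliverable(stage: Stage) -> bool:
    """Only records that passed final QA go to the production environment."""
    return stage is Stage.QA_PASSED


stage = Stage.TRANSFORMED
stage = advance(stage, approved=True)  # SME review passes
stage = advance(stage, approved=True)  # final QA passes
```

A rejected record simply stays at its current stage, so nothing reaches delivery without passing every check in order.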
Integration with the text-mining pipeline
The validated PMIDs and curated biomedical data were then used to train and refine the client’s text-mining pipeline.
This iterative process improved the pipeline’s ability to identify adverse-event associations automatically.
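One way to track such an iterative improvement is to compare the PMIDs the pipeline flags against the manually validated set after each round. The sketch below assumes sensitivity is measured as the share of curated PMIDs that the pipeline recovers, and it adds precision as a companion measure; the function name and the sample identifiers are hypothetical, and the case study does not state which metrics were actually used.

```python
from typing import Set, Tuple


def sensitivity_and_precision(predicted: Set[str], curated: Set[str]) -> Tuple[float, float]:
    """Compare pipeline-flagged PMIDs with the manually validated set."""
    true_pos = len(predicted & curated)
    # Sensitivity: fraction of validated PMIDs the pipeline found.
    sensitivity = true_pos / len(curated) if curated else 0.0
    # Precision: fraction of pipeline hits that curators validated.
    precision = true_pos / len(predicted) if predicted else 0.0
    return sensitivity, precision


# Placeholder identifiers for illustration only.
curated_pmids = {"1001", "1002", "1003", "1004"}
pipeline_pmids = {"1001", "1002", "1005"}
sens, prec = sensitivity_and_precision(pipeline_pmids, curated_pmids)
```

Recomputing both numbers after each retraining round shows whether newly curated data is helping the pipeline find more true associations without adding false hits.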
Excelra further enhanced scalability by integrating computational approaches similar to those used in scientific data transformation and analytics.
Figure 1: Excelra’s data-curation pipeline
Results
Excelra successfully delivered a curated and validated adverse-events dataset that improved the performance of the client’s pharmacovigilance platform.
Key outcomes
- Curated and validated adverse-event information from biomedical literature
- Improved sensitivity and accuracy of the client’s text-mining pipeline
- Structured associations between drugs, targets, and adverse events
- Regular database updates, performed twice per week
- Reliable datasets supporting pharmacovigilance and drug safety research
Key benefits
- Improved pharmacovigilance data quality: Validated datasets improved the reliability of adverse-event monitoring.
- Enhanced text-mining accuracy: Curated training data significantly improved the performance of automated extraction workflows.
- Scalable biomedical knowledgebase: The curated database enabled ongoing updates and scalable data integration.
Conclusion
Excelra’s expertise in data curation, biomedical text mining, and scientific data management enabled the client to significantly improve the quality and coverage of their adverse-events database. By implementing a multi-stage validation pipeline and integrating curated data into automated workflows, Excelra helped strengthen the client’s pharmacovigilance platform and enhance drug safety insights.
For similar projects, explore additional case studies or learn more about Excelra’s capabilities in data curation and biomedical data management.
