C. Lee Ventola, MS

First published June 2018


Adverse drug events (ADEs), including drug interactions, have a tremendous impact on patient health and generate substantial health care costs.1–9 A “big data” approach to pharmacovigilance involves the identification of drug–ADE associations by data mining various electronic sources, including adverse event reports, the medical literature, electronic health records (EHRs), and social media.1–4,10–22 This approach has been useful in assisting the Food and Drug Administration (FDA) and other regulatory agencies in monitoring and decision-making regarding drug safety.1–4,6,10,23–26 Data mining can also assist pharmaceutical companies in conducting drug safety surveillance, adhering to risk management plans, and gathering real-world evidence to supplement clinical trial data.2,3,10,27–35 The use of data mining for pharmacovigilance purposes provides many unique benefits; however, it also presents many challenges.1–4,10,11,36–39 Various steps can be taken to improve the use of data mining for pharmacovigilance purposes in the future.1–4,10,11


The primary goal of drug safety regulators and researchers is to identify and observe ADEs that can cause public harm.3 Many ADEs are identified only after a drug has been marketed, when it is used by a larger and more diverse population than during clinical trials.1,4 ADEs discovered after a drug is in broad use can be a significant cause of morbidity and mortality, so effective post-marketing drug safety surveillance is critical to the protection of public health.1,3,4

A new drug is granted regulatory approval only after its efficacy and safety have been demonstrated in a series of clinical trials.4 Randomized, controlled, phase 3 studies are considered to be the most rigorous means for studying a drug’s efficacy and safety.4 However, these trials often enroll a relatively small number of patients according to specific inclusion and exclusion criteria that do not always represent all potential users of the drug.3,4 Clinical trials also take place over a relatively short period, making ADEs with a long latency difficult to detect.4 Furthermore, after regulatory approval, drug labeling and/or prescribing practices may evolve to include new indications or patient populations, off-label uses, or concomitant use with other drugs.4 Each of these new variables may contribute to the development of ADEs that had not been observed previously during clinical trials.4 Even over-the-counter medications, such as non-steroidal anti-inflammatory drugs and phenylpropanolamine, have been associated with confirmed adverse drug reactions (ADRs) after regulatory approval, causing withdrawal from the market or changes in labeling.6,23,24

Data mining drug safety report databases, the medical literature, and other digital resources could play an important role in augmenting the information about ADEs that is obtained during short-term clinical trials.3 Data mining for pharmacovigilance purposes may also provide an “early warning system” that could detect drug safety issues more promptly than traditional methods. For these reasons, data mining these sources for ADEs is of great interest to the FDA, the pharmaceutical industry, and drug safety researchers.3


What Is Big Data?

The term “big data” refers to a large volume of diverse, dynamic, distributed data, structured or unstructured, that provides both opportunities and challenges with respect to its interpretation due to its complexity, content, and size.1,11 Traditional methods are often inadequate for processing big data because the data are so voluminous and complex.11 Besides vast volume and variety, big data is characterized by the rapid speed at which it accumulates and is transmitted.11 A glossary of terms pertaining to big data, data mining, and pharmacovigilance is provided on the following page.
