
Data Integrity: From the Basics to Big Data

Data integrity is not a new concept, yet global regulators continue to cite manufacturers for deficiencies in this area.

This was my first takeaway as an attendee at the 2019 PDA Data Integrity Workshop last September in Washington, D.C. The workshop featured presentations from global regulators and industry leaders covering burning data integrity questions: When does data integrity start? What role can mindfulness play? How should data integrity be addressed when big data technologies are involved? The speakers offered their own perspectives on why data integrity remains a challenge for our industry and their own recommendations for solving this critical issue.

As an attendee, I want to share some of what I learned.

In the first plenary session, “Overcoming Data Integrity Challenges,” Carmelo Rosa, Division Director, Office of Manufacturing and Product Quality, CDER, U.S. FDA, explained the data integrity expectations throughout a product’s lifecycle. As a general statement, it is understood that all data (electronic and hardcopy) generated throughout a product’s lifecycle must be accurate, reliable, complete, truthful, correct, and unaltered. This includes data generated during clinical and pre-/post-approval stages. Of course, how certain data are understood, interpreted, or explained (e.g., unexpected or out-of-specification results) may depend on the phase in which they were generated (e.g., during early clinical stages or during process, product, or method development).

The inclusion of data from clinical stages is one area that currently requires more attention. In fact, documentation of the development process is often neglected because it is not specifically audited. Many firms fear not having a “perfect story” during these early phases. Rosa emphasized, however, that regulators do understand that data evolve throughout early phases. Still, data should never be falsified, altered, and/or manipulated to misrepresent information. Deficiencies in this area include, but are not limited to: lack of controlled access to computer systems, “trial” HPLC injections of samples outside or within a quality structure, failure to record activities contemporaneously or backdating, fabricating or falsifying batch records, copying existing data as new data, deleting results without justification, and retesting samples to present better results.

GMPs apply to all Phase II/III stages once the drug is available, including exhibit batches, validation phases, and commercial batches. If a process or analytical method under development requires modifications to equipment parameters or settings, or to the formulation, these changes should be documented and explained. Regulators will ask for the full development story, including its data integrity, because they want to be confident that the drug is safe for patients.

Following Rosa’s presentation, Els Poff, Executive Director, Data Integrity Center of Excellence, Merck & Co., explored the challenges surrounding the transition from paper to digital. While a fully digital future holds much promise, the data integrity challenges center on technology, people/culture, and regulatory requirements. Technology challenges include the lack of fully data integrity-compliant instruments, systems, and solutions; the risk of going backwards; and cyber threats. People/culture challenges include a persistent resistance to change, entrenched legacy ways of working, inadequate socialization (the “why” message), being trapped by the responsibility of daily operations, pace of execution, capacity to execute, return on technology investments, and production scheduling demands. And last but not least, regulatory challenges include revalidation efforts, differing global regulatory expectations, and the impact on filings.

One possible solution for overcoming these challenges is to bridge traditional paper-based approaches and fully digitized environments with a hybrid system. Evolving without disruption will take time, and the risk of falling behind the curve must be balanced with adequate controls and protection from cyber threats.

Being Mindful of Data Integrity

The second day of the workshop featured a breakfast session on the connection between mindfulness and data integrity with a presentation from Amy L. McLaren, Senior Director, Quality and Compliance, and Julie C. Maurhoff, Senior Director, GxP Compliance, both of Ultragenyx Pharmaceutical. They suggested that when thinking of data integrity, consider the second word, “integrity.”

“Integrity” can be defined as the state of being whole and undivided, a state that pairs naturally with mindfulness. Practicing mindfulness in everyday work and private life provides us with enhanced clarity, helps us focus, and supports problem solving through critical thinking. But management has to set the tone. Staff must be given enough “space” (understanding of time constraints, trust, motivation, etc.) to make patient-focused decisions with the available data.

Big Data: Fact versus Fiction

The closing session explored the impact of big data on data integrity. Since this is a new technology for our industry, I appreciated the effort to separate fact from fiction with insights from presenters Mark A. DiMartino, Director, Quality Data Sciences, Amgen, and Peter E. Baker, Vice President, Green Mountain Quality Assurance.

First, the facts. Artificial intelligence (A.I.) applications augment human intelligence and can improve products and processes. Combined with data science-driven solutions, they can surface insights efficiently: ensuring access to the data, applying appropriate analysis methods to unlock the information it holds, and ultimately producing meaningful visualization and interpretation. Multiple such tools are already commercially available.

On the fiction side, there is the belief that A.I. can explain its decision making. In reality, A.I. software can make recommendations or take action based on its knowledge, but it cannot break down its decision-making process to explain how it arrived at those recommendations. Also, consider goal setting. There is a growing fear that A.I. technologies will define their own goals. What does this mean? People should define goals; technologies should execute them. Technologies should not be leading the execution of goals.

Other considerations from this session included the integrity of cloud storage, the appropriateness of data fed into models (e.g., bias), data flow (i.e., do model outputs go back to the source system?), and model management.

My takeaways from this workshop: data integrity must be considered from the very beginning, and mindfulness can lead to enhanced clarity. New technologies offer great opportunities for companies and, by extension, patients, but data integrity remains a consideration no matter the technology.

About the Author

David Hubmayr is a member of the Integrated Commissioning and Qualification Expert Group at CSL Behring. He is responsible for qualification compliance.
