In the previous posts of this series we discussed how the current data integrity expectations emerged and how the health authorities evaluate them during inspections. Globally, the authorities regard data as the evidence of the operations carried out by pharmaceutical companies, so ensuring its integrity is fundamental.

But before any data integrity implementation action is carried out, it is necessary to understand the related key concepts. This is what we are going to review in this post.


In the first place, the scope of data integrity covers any type of data, regardless of format, that is relevant from a GxP point of view. This is crystal clear. However, most of the early Warning Letters concerned problems with electronic data, which created the false impression that data integrity compliance efforts should focus only on computerized systems. Nothing could be further from the truth. Mechanisms are also needed to ensure that data on physical media (forms, records, etc.) keeps its integrity, and effort should be put into implementing them and evaluating them continuously.


Bias is understood as the intentional selection of data, processing operations, etc. in order to justify certain results, deliberately hiding those that could lead to non-compliances or batch rejections. Data management actions should ensure that all data is considered equally and that none of it is favoured without a written scientific justification.


As explained in our post about GMP Indicators, the main guides define the data quality attributes with the term ALCOA:

A     attributable to the person generating the data

L     legible and permanent

C     contemporaneous

O     original record (or ‘true copy’)

A     accurate

More recently, this definition has sometimes been expanded to ALCOA+ (“Plus”), which adds complete, consistent, enduring and available. Both ALCOA and ALCOA+ may be considered equivalent.

Data Integrity Risk Assessment (DIRA)

The concept of DIRA as such appears only in the MHRA guide. Even so, it is implicit in the other data integrity guides, in the sense that pharmaceutical companies should ensure that no DI gaps are present and that adequate control measures exist. The DIRA should be part of the company's Risk Management process, following a strategy aligned with ICH Q9. The outcome of this exercise should ensure that any risk (derived from personnel, technical topics, the quality system, etc.) is identified, evaluated and suitably controlled throughout the entire data life cycle.
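As an illustration only, the identify, evaluate and control steps can be sketched as a simple risk register. The scoring scheme below (severity × probability × detectability on 1 to 5 scales, in the style of an FMEA) and the threshold of 27 are assumptions made for this example; neither the MHRA guide nor ICH Q9 prescribes them, and the class name, function name and sample risks are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataRisk:
    """One identified data integrity risk (all fields are illustrative)."""
    description: str
    lifecycle_stage: str   # e.g. "generation", "processing", "archival"
    severity: int          # 1 (low impact) to 5 (high impact)
    probability: int       # 1 (rare) to 5 (frequent)
    detectability: int     # 1 (easily detected) to 5 (hard to detect)

    def priority(self) -> int:
        # FMEA-style risk priority number (an assumed convention)
        return self.severity * self.probability * self.detectability


def rank_risks(risks, threshold=27):
    """Return the risks at or above the threshold, highest priority first."""
    flagged = [r for r in risks if r.priority() >= threshold]
    return sorted(flagged, key=lambda r: r.priority(), reverse=True)


# Hypothetical entries covering different stages of the data life cycle
risks = [
    DataRisk("Backup restore never tested", "archival", 4, 2, 2),
    DataRisk("Paper form issued without control number", "generation", 3, 3, 3),
    DataRisk("Shared login on laboratory workstation", "generation", 5, 4, 4),
]

for risk in rank_risks(risks):
    print(risk.priority(), risk.lifecycle_stage, risk.description)
```

The risks returned by `rank_risks` are the ones that would need a documented control measure first; the remaining ones stay in the register and are re-evaluated periodically.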


Metadata is data that describes the attributes of other data and provides context and meaning. It usually describes the structure, data elements, interrelationships and other characteristics of the data; the audit trail is one example. Metadata also allows data to be attributed to an individual (or, if generated automatically, to the original data source).

Data needs its associated metadata in order to have meaning. Metadata should therefore be considered part of the original record, and the same data retention and controls apply to it.
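A minimal sketch of this idea is shown below: the value and its metadata are kept together in a single record, and a small check uses the metadata to flag gaps in two ALCOA attributes. The field names, the five-minute tolerance and the function `metadata_gaps` are assumptions chosen for illustration, not requirements taken from any guide.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Metadata:
    recorded_by: Optional[str]  # user ID, or instrument ID if generated automatically
    performed_at: datetime      # when the operation took place
    recorded_at: datetime       # when the value was written down
    unit: str                   # context without which the value has no meaning


@dataclass(frozen=True)
class Record:
    value: float
    meta: Metadata              # kept with the value as part of the original record


def metadata_gaps(record, max_delay=timedelta(minutes=5)):
    """List the gaps that can be detected from the metadata alone."""
    gaps = []
    if not record.meta.recorded_by:
        gaps.append("not attributable")
    if record.meta.recorded_at - record.meta.performed_at > max_delay:
        gaps.append("not contemporaneous")
    if not record.meta.unit:
        gaps.append("missing unit")
    return gaps


t0 = datetime(2024, 1, 15, 9, 0, tzinfo=timezone.utc)
good = Record(7.2, Metadata("analyst01", t0, t0 + timedelta(minutes=1), "pH"))
late = Record(7.2, Metadata(None, t0, t0 + timedelta(hours=2), "pH"))

print(metadata_gaps(good))  # []
print(metadata_gaps(late))  # ['not attributable', 'not contemporaneous']
```

Both records hold the same value, 7.2. Only the metadata distinguishes the acceptable record from the questionable one, which is why it has to be retained and controlled together with the data.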

Audit Trail

FDA defines Audit Trail as a secure, computer generated, time-stamped electronic record that allows reconstruction of the course of events relating to the creation, modification, and deletion of an electronic record.

This concept was first established in the 21 CFR Part 11 regulation, but for a long time it was regarded as something to have in place and to consult only during investigations. The data integrity guides consider that the audit trail, as metadata, is part of the original record and should therefore be included in its review and approval.

This may represent a huge effort in some departments of the company, so suitable Audit Trail reports and a risk-based approach (for example, justifying a review by exception) may help to fulfil this requirement.
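The sketch below illustrates, under stated assumptions, what such an audit trail and a review-by-exception report could look like. The `AuditTrail` class, its field names and the hash chaining used to make tampering evident are choices made for this example; they are not taken from 21 CFR Part 11 or from any specific system, and a real implementation would also need access control and protected storage.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditTrail:
    """Append-only, time-stamped log; each entry is chained to the previous one by hash."""

    def __init__(self):
        self._entries = []

    def log(self, record_id, action, user, old=None, new=None, reason=None):
        entry = {
            "record_id": record_id,
            "action": action,  # "create", "modify" or "delete"
            "user": user,
            "old": old,
            "new": new,
            "reason": reason,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "prev_hash": self._entries[-1]["hash"] if self._entries else "",
        }
        payload = json.dumps(entry, sort_keys=True).encode("utf-8")
        entry["hash"] = hashlib.sha256(payload).hexdigest()
        self._entries.append(entry)

    def history(self, record_id):
        """Reconstruct the course of events for one record, in order."""
        return [e for e in self._entries if e["record_id"] == record_id]

    def is_intact(self):
        """Recompute the hash chain; any edited or removed entry breaks it."""
        prev_hash = ""
        for e in self._entries:
            body = {k: v for k, v in e.items() if k != "hash"}
            payload = json.dumps(body, sort_keys=True).encode("utf-8")
            if e["prev_hash"] != prev_hash:
                return False
            if hashlib.sha256(payload).hexdigest() != e["hash"]:
                return False
            prev_hash = e["hash"]
        return True

    def exceptions(self):
        """Review by exception: report only modifications and deletions."""
        return [e for e in self._entries if e["action"] in ("modify", "delete")]


trail = AuditTrail()
trail.log("S-001", "create", "analyst01", new=7.2)
trail.log("S-001", "modify", "analyst01", old=7.2, new=7.4, reason="transcription error")
trail.log("S-002", "create", "analyst02", new=6.9)

for e in trail.exceptions():
    print(e["record_id"], e["action"], e["old"], "->", e["new"], e["reason"])
```

In this example the reviewer sees one entry out of three, the modification of S-001 together with its reason. Which actions count as exceptions is a decision that should come from the company's own documented risk assessment.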

System Validation

Ensuring that a computerized system complies with the data integrity principles once it is already running may be difficult or even impossible. It is therefore important that data integrity concepts are embedded in the system’s life cycle. Special care should be taken in two aspects: first, a proper definition of the data integrity requirements, based on the operations expected to be carried out; second, the selection of systems with the appropriate technical characteristics.


This is not intended to be a comprehensive list of data integrity concepts. The guides cover several other elements of interest, such as computer system permissions and shared logins, static vs. dynamic records, backup data, etc. All those concepts should be reviewed and taken into account.



Knowing the key concepts related to Data Integrity is the basis for implementing systems and processes that fulfil the expectations of the health authorities. Once these concepts are part of the company's knowledge, the company will be able to carry out the activities that lead to compliance: developing a suitable Data Integrity policy, evaluating existing systems and data management processes, and implementing new ones appropriately.