Legacy clinical data for CDISC SDTM compliance and Data Unification

Legacy clinical study data is study data in a non-standardised format that is not supported by the FDA.
Converting legacy clinical data to a CDISC-compliant format for submission is a growing need for pharmaceutical companies and sponsors.

Legacy studies definition: studies that were conducted previously for the drug currently being studied.
In general, legacy data is reviewed when we need information about the drug from previous studies in a separate indication that have already been approved or terminated, or from studies conducted by another organization before the drug was acquired by the current organization.

The primary purpose of legacy data conversion in the clinical research industry is to support traceability of study data.
Legacy data can also be used in health care analytics, evidence-based medicine, historical control databases, data-driven translational research and predictive analytics.

Study data traceability provides an understanding of the relationships between the analysis results (the tables, listings and figures, or TLFs, in the study report), analysis datasets, tabulation datasets and source data.

Legacy clinical data that is not CDISC compliant may be used to create SDTM and ADaM datasets and to support the results in clinical study reports (CSRs), Integrated Summaries of Safety (ISS) and Integrated Summaries of Efficacy (ISE) for the regulatory review process.
The FDA does not recommend any particular approach to legacy clinical study data conversion, but the converted data should be traceable and able to support the review process.

Data standards like CDISC enable the FDA to modernise and streamline the review process. Study data standards provide a standard way to exchange clinical and non-clinical research data between corporate sponsors and regulatory bodies.

Under the FDA’s data standards requirements in “Study Data for Submission to CDER and CBER”, New Drug Applications (NDAs), Biologics License Applications (BLAs) and Abbreviated NDAs (ANDAs) whose studies started after December 17, 2016 must submit data in CDISC standards; for Investigational New Drug applications (INDs), the requirement applies to studies started after December 17, 2017.
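
As a rough illustration of the date rule above, the sketch below checks whether a study start date falls after the relevant cutoff. The function name `requires_cdisc_standards` and the submission-type labels are hypothetical, chosen for this example only; the cutoff dates are the ones quoted above, and the guidance document itself remains the authoritative source.

```python
from datetime import date

# Cutoff dates as quoted in the text above. Studies that start AFTER
# the cutoff are in scope. The dictionary keys are illustrative labels.
CUTOFFS = {
    "NDA": date(2016, 12, 17),
    "BLA": date(2016, 12, 17),
    "ANDA": date(2016, 12, 17),
    "IND": date(2017, 12, 17),
}


def requires_cdisc_standards(submission_type: str, study_start: date) -> bool:
    """Return True if the study started after the cutoff for its submission type."""
    cutoff = CUTOFFS[submission_type.upper()]
    return study_start > cutoff


# Example: an NDA study started in January 2017 is in scope,
# while an IND study started in June 2017 is not.
print(requires_cdisc_standards("NDA", date(2017, 1, 5)))
print(requires_cdisc_standards("IND", date(2017, 6, 1)))
```

A legacy study, in this sense, is typically one whose start date falls on or before the cutoff, which is why its data often needs conversion.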

Traceability issues for legacy clinical data conversion
It is hard to:
- identify the location of collected CRF variables in the converted SDTM data
- replicate or confirm legacy analysis datasets (i.e. analysis variable imputation or derived variables) using SDTM datasets
- confirm the derivation of intermediate analysis datasets or custom domains
- understand the source or derivation methods for imputed or derived variables in integrated/pooled data, SUPP and RELREC data
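
One way to make the first issue concrete is to keep an explicit mapping from each collected CRF variable to its SDTM target and flag anything without one. The sketch below is only an illustration under assumptions: the legacy CRF variable names, the mapping layout and the helper `find_untraceable` are hypothetical, and only the SDTM targets shown (DM.SUBJID, DM.BRTHDTC, DM.SEX) are standard variables.

```python
from typing import Dict, List, Tuple


def find_untraceable(
    crf_variables: List[str],
    mapping: Dict[str, Tuple[str, str]],
) -> List[str]:
    """Return CRF variables that have no recorded SDTM (domain, variable) target."""
    return [var for var in crf_variables if var not in mapping]


# Hypothetical legacy CRF variables collected in the study.
crf_variables = ["SUBJID", "BRTHDT", "SEX", "WEIGHT_KG"]

# Hypothetical mapping specification: CRF variable -> (SDTM domain, SDTM variable).
mapping = {
    "SUBJID": ("DM", "SUBJID"),
    "BRTHDT": ("DM", "BRTHDTC"),
    "SEX": ("DM", "SEX"),
}

missing = find_untraceable(crf_variables, mapping)
print(missing)  # prints ['WEIGHT_KG']
```

Any variable reported here has no documented location in the converted data, which is exactly the gap a reviewer would struggle with.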

Legacy clinical data unification

Data unification:
Data unification is the process of ingesting source data from multiple sources, transforming and mapping it, de-duplicating it, classifying it and exporting the result. Two types of methodology are used to accomplish this task, ETL (extract, transform, load) and Master Data Management (MDM), and each has its own advantages and disadvantages.
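
The steps named above (ingest, map, de-duplicate, classify, export) can be sketched in a few lines of Python using only the standard library. This is a toy illustration, not how any particular ETL or MDM product works: the source names, field mappings, the `AGEGRP` grouping and all function names are assumptions made for the example.

```python
import csv
import io
from typing import Dict, List

Record = Dict[str, str]

# Hypothetical per-source field mappings onto one common schema.
FIELD_MAPS = {
    "site_a": {"subject": "USUBJID", "gender": "SEX", "age_years": "AGE"},
    "site_b": {"subj_id": "USUBJID", "sex": "SEX", "age": "AGE"},
}


def transform(source: str, record: Record) -> Record:
    """Rename source-specific fields to the common schema."""
    return {target: record[src] for src, target in FIELD_MAPS[source].items()}


def deduplicate(records: List[Record]) -> List[Record]:
    """Keep the first record seen for each subject identifier."""
    seen = set()
    unique = []
    for rec in records:
        if rec["USUBJID"] not in seen:
            seen.add(rec["USUBJID"])
            unique.append(rec)
    return unique


def classify(record: Record) -> Record:
    """Add an illustrative age-group label (hypothetical AGEGRP column)."""
    out = dict(record)
    out["AGEGRP"] = "<65" if int(record["AGE"]) < 65 else ">=65"
    return out


def export_csv(records: List[Record]) -> str:
    """Serialise the unified records as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["USUBJID", "SEX", "AGE", "AGEGRP"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def unify(sources: Dict[str, List[Record]]) -> str:
    """Run ingest -> map -> de-duplicate -> classify -> export."""
    mapped = [transform(name, rec) for name, recs in sources.items() for rec in recs]
    return export_csv([classify(rec) for rec in deduplicate(mapped)])


# Two hypothetical sources that describe subject S-001 twice.
sources = {
    "site_a": [{"subject": "S-001", "gender": "F", "age_years": "70"}],
    "site_b": [
        {"subj_id": "S-001", "sex": "F", "age": "70"},
        {"subj_id": "S-002", "sex": "M", "age": "54"},
    ],
}
unified = unify(sources)
print(unified)
```

Real clinical sources rarely share a clean identifier, so the exact-match de-duplication here stands in for the much harder record-matching problem that unification tools are built to address.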

Unifying clinical data empowers pharmaceutical organizations to gain valuable insights from secondary analysis, but secondary analysis of clinical data has challenges such as data privacy (the Declaration of Helsinki, HIPAA and other regulations) and pooling of the data, which is hard to scale.

There are many analytics platforms on the market that curate and unify data from a variety of sources using ETL and MDM, but the one that caught my interest is Modak's NABU, which combines human expertise, machine learning algorithms, data science and in-house developed fingerprinting technology.

Introduction to Clinical Trials
An Introduction to the Study Data Tabulation Model (SDTM)
Link between Clinical Research and SDTM
Study Data Technical Conformance Guide: Technical Specifications Document
Study Data for Submission to CDER and CBER
Providing Regulatory Submissions in Electronic Format — Standardized Study Data
Study Data Standards Resources
Data_unification_Modak’s NABU