DEVENDER PALSA
Jul 3, 2020

AMALGAMATION OF BIG DATA ANALYTICS, SDTM, AND LEGACY CLINICAL DATA

Advances in technology create new avenues in healthcare, such as optimizing clinical trial data, digitizing electronic medical records (EMRs) and electronic health records (EHRs), streamlining patient enrolment in clinical trials, improving patients' quality of life, supporting the drug review process, and making use of legacy data, lab data, genetics data and so on.

Combining big data analytics with SDTM standards improves how clinical trial data is organized and interpreted.
Several big data analytics techniques can be used to extract the most valuable information from the data, including text and data mining, predictive analytics, and machine learning.

Clinical Research:

Any drug that reaches the market and becomes available for human use must first go through a stringent and elaborate testing process known as clinical trials.
Bringing a drug through development and clinical trials can take around 10 to 15 years, although the time span varies widely. The process costs millions of dollars and is crucial to an organization's survival.
Before looking at what a clinical trial is, we need to understand the “drug development process” and “clinical study”.

The Drug Development Process: This process runs from drug discovery to post-marketing.
1. Discovery and Development/target validation: Research for a new drug begins in the laboratory.
2. Preclinical Research/testing: Drugs undergo laboratory and animal testing to answer basic questions about safety.
3. Investigational New Drug (IND) application filing: Drug developers/sponsors must submit an IND application to the FDA before beginning clinical research.
4. Clinical Research: Drugs are tested on people to make sure they are safe and effective.
5. New Drug Application (NDA) filing and Prescription Drug User Fee Act (PDUFA) date and decision: PDUFA dates are deadlines for the FDA to review new drugs.
6. FDA Review: FDA review teams thoroughly examine all of the submitted data related to the drug or device and decide whether or not to approve it.
7. FDA Post-Market Safety Monitoring: The FDA monitors all drug and device safety once products are available for use by the public.

Clinical Trials: A clinical trial is a “research study conducted to find answers to health related questions in which subjects/patients receive one or more diagnostic, therapeutic, or other types of interventions/treatment (or no intervention)” so that researchers can determine the effects of those interventions on biomedical or health-related outcomes. The interventions include new treatments such as drugs, novel vaccines, dietary supplements, dietary choices, and medical devices.

CDISC SDTM:

For analysts and programmers who work with data in the clinical industry, understanding CDISC standards is a real advantage: clinical data is easier to understand when it is submitted in a consistent, standardized format.
The Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM) defines a standard structure for human clinical trial data and for non-clinical study data tabulations that are to be submitted as part of a product application (IND and NDA) to regulatory authorities such as the United States Food and Drug Administration (FDA) and Japan’s Pharmaceuticals and Medical Devices Agency (PMDA).

Implementing SDTM supports data aggregation and warehousing, promotes mining and re-use, facilitates sharing, helps with due diligence and other important data review activities, and improves the regulatory review and drug approval process.
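To make the idea of a standard structure concrete, here is a minimal Python sketch that checks a Demographics (DM) record for a few key variables. STUDYID, DOMAIN, USUBJID, SUBJID and SEX are standard SDTM DM variable names; the subset chosen, the sample values and the helper name `missing_dm_variables` are assumptions for illustration, and this is not a full conformance check.

```python
# Illustrative subset of SDTM DM variables; the real domain defines more.
DM_KEY_VARIABLES = ("STUDYID", "DOMAIN", "USUBJID", "SUBJID", "SEX")


def missing_dm_variables(record):
    """Return the key DM variables that are absent or empty in a record."""
    return [name for name in DM_KEY_VARIABLES if not record.get(name)]


# Hypothetical subject record with made-up values.
dm_record = {
    "STUDYID": "ABC-001",
    "DOMAIN": "DM",
    "USUBJID": "ABC-001-0001",
    "SUBJID": "0001",
    "SEX": "F",
}

# Hypothetical incomplete record, as might come from a non-standard source.
partial_record = {"STUDYID": "ABC-001", "DOMAIN": "DM"}

print(missing_dm_variables(dm_record))
print(missing_dm_variables(partial_record))
```

Because every study uses the same variable names, a check like this can be written once and reused across studies, which is the practical benefit of a shared structure.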

Legacy Clinical Data:

Legacy clinical study data is study data in a non-standardized format that is not supported by the FDA. Converting legacy clinical data to a CDISC-compliant format for submission is in growing demand among pharmaceutical companies and sponsors.
Legacy studies definition: Studies conducted previously for the drug currently being studied.
In general, legacy data is reviewed when we need information about a drug from previous studies: studies in a separate indication that were already approved or terminated, or studies conducted by another organization before the drug was acquired by the current organization.
In the clinical research industry, the primary purpose of legacy data conversion is traceability of study data.
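As a rough illustration of conversion with traceability, the Python sketch below maps one legacy demographics row to SDTM-style DM variables and records which legacy column each value came from. The legacy column names, the mapping table, the USUBJID derivation and the function name `convert_legacy_row` are all hypothetical; a real conversion follows the study's mapping specification.

```python
# Hypothetical mapping from legacy column names to SDTM DM variables.
LEGACY_TO_SDTM = {"patient_id": "SUBJID", "gender": "SEX", "age_years": "AGE"}

# Hypothetical recoding of legacy gender text to SDTM-style codes.
SEX_CODES = {"male": "M", "female": "F"}


def convert_legacy_row(row, study_id):
    """Convert one legacy row and record the source of each converted value."""
    converted = {"STUDYID": study_id, "DOMAIN": "DM"}
    trace = {}
    for legacy_name, sdtm_name in LEGACY_TO_SDTM.items():
        value = row[legacy_name]
        if sdtm_name == "SEX":
            # Unrecognized values fall back to "U" (unknown).
            value = SEX_CODES.get(str(value).strip().lower(), "U")
        converted[sdtm_name] = value
        trace[sdtm_name] = legacy_name
    # Assumed derivation for this sketch: study identifier plus subject identifier.
    converted["USUBJID"] = f"{study_id}-{converted['SUBJID']}"
    trace["USUBJID"] = "derived from STUDYID and SUBJID"
    return converted, trace


legacy_row = {"patient_id": "0001", "gender": "Female", "age_years": 54}
sdtm_row, trace = convert_legacy_row(legacy_row, "ABC-001")
print(sdtm_row)
print(trace)
```

Keeping the `trace` dictionary alongside the converted row is what lets a reviewer follow any standardized value back to its legacy source.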

Legacy data can be used in healthcare analytics, evidence-based medicine, historical controls databases, data-driven translational research, and predictive analytics.

Big Data Analytics:

In this era of digitalization, it’s not hard to analyze large amounts of data.
Big data is a term for large volumes of structured, semi-structured, and unstructured data that can be analyzed for insights leading to better decisions and strategic business moves.
In the healthcare industry, sources of big data include clinical trial data, hospital records, EMRs, EHRs, biomedical research data, omics data (genomic and transcriptomic) and medical devices that are part of the Internet of Things (IoT).

For data collection, processing and analysis we can use machine learning (ML) and artificial intelligence (AI), and for clinical trial databases we can use big data technologies such as Hadoop- or Apache Spark-based systems.

Traditional data mining approaches are CRISP-DM (CRoss-Industry Standard Process for Data Mining), developed by a consortium that included SPSS, and SEMMA (Sample, Explore, Modify, Model, and Assess) from SAS.
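The five SEMMA steps can be sketched in a few lines of plain Python. This is only a toy walk-through under stated assumptions: the dose and response values are made up, the model is a simple least-squares line, and the fit is assessed on the same rows it was trained on for brevity. It is not SAS's implementation.

```python
import random
import statistics

# Made-up (dose_mg, response) pairs; None marks a missing response.
records = [(10, 1.1), (20, 2.3), (30, None), (40, 3.9),
           (50, 5.2), (60, 6.1), (70, 6.8), (80, 8.1)]

# Sample: draw a reproducible random subset of the data.
random.seed(0)
sample = random.sample(records, 6)

# Explore: count how many sampled rows have a missing response.
n_missing = sum(1 for _, response in sample if response is None)

# Modify: drop the rows with a missing response.
clean = [(dose, response) for dose, response in sample if response is not None]

# Model: fit a least-squares line, response = slope * dose + intercept.
doses = [dose for dose, _ in clean]
responses = [response for _, response in clean]
mean_dose = statistics.mean(doses)
mean_response = statistics.mean(responses)
slope = (sum((d - mean_dose) * (r - mean_response) for d, r in clean)
         / sum((d - mean_dose) ** 2 for d in doses))
intercept = mean_response - slope * mean_dose

# Assess: mean absolute error of the fitted line on the cleaned sample.
mae = statistics.mean(abs(r - (slope * d + intercept)) for d, r in clean)
print(f"missing={n_missing}, slope={slope:.3f}, mae={mae:.3f}")
```

In practice the Assess step would use held-out data, and each step would be far richer, but the order of operations is the point here.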

SAS developed a newer big data approach derived from SEMMA, known as EMSMA (Explore, Modify, Segment, Model, and Assess). Modak developed a product called NABU, which combines human expertise, machine learning algorithms, data science and in-house fingerprinting technology.

We can use Modak’s NABU to standardize legacy clinical trial data with CDISC SDTM. NABU converges data discovery, ingestion, preparation, catalog, unification, quality and profiling into a single enterprise platform, with metadata as the primary driver.

DEVENDER PALSA

SAS Programmer | Data Analytics | Clinical Trials | CDISC