You must remember one thing: to get a job in clinical data management, you need basic knowledge of clinical research and clinical data management. It goes without saying that good communication skills and software skills, i.e. Microsoft Excel, PowerPoint and Word, are also “must-have skills” that bring you closer to your first job.
This blog is written for freshers who want to build a career as a clinical data manager. The language is kept as simple as possible, so it works as a self-explanatory course. If you have any doubts, ask in the comment section or write to us at [email protected]. Let’s start.
The primary goal of Clinical Data Management (CDM) is to ensure timely delivery of high-quality clinical data needed to satisfy both the good clinical practice (GCP) guidelines and the criteria for statistical analysis and regulatory reporting.
In other words, clinical data managers are the caretakers of clinical data. They arrange for the proper collection of clinical data and make sure that the data is clean (minimal errors) and ready for statistical analysis.
CDM team members participate actively in all stages of a clinical trial, from start to finish.
CDM is the process by which subject data are processed, cleaned, and handled in compliance with regulatory standards. We will discuss the regulatory requirements in CDM later in this blog.
First, let’s understand what CDM actually involves.
Clinical Data Management Process
Across the globe, the whole process of clinical data management is divided into three phases:
- Set up (or start-up)
- Conduct
- Close-out
Set up Phase:
We have divided the set up process into 11 steps, and we will explain each step in detail. Please read each step and refer to the flow chart below for a better understanding.
Step 1: Reading the finalized protocol
As we know, clinical trials are run according to the clinical trial protocol.
CDM activities start once the protocol is finalized. They may start earlier if the sponsor is ready to begin work based on a draft protocol while finalizing the protocol in parallel.
Here, I am not talking about such special cases (draft protocol) but about the common case (finalized protocol). So forget about what happens if the protocol is not finalized.
After reading the complete protocol, CDM has to start working on the two activities below:
- Case report form design in the eDC
- Preparation of various documents
Now the questions are:
- What is a case report form?
- Which documents need to be drafted by Clinical Data Management?
Let’s learn about the case report form first.
Case Report Form and eDC
As per ICH E6 (R2),
“A printed, optical, or electronic document designed to record all of the protocol required information to be reported to the sponsor on each trial subject”.
The forms used to capture information as required by the protocol are called case report forms (CRFs). They may be in paper or electronic format. eDC systems use electronic forms to capture clinical trial data.
You will have filled in many forms during school or college. These forms are the same, with one difference: the information is requested and provided as per the clinical trial protocol.
A well-designed CRF reflects the essential contents of the research protocol, so the CRF is designed once the research protocol is finalized.
To read more about case report forms, with examples:
Basics of case report form
Now that we know about the CRF, let’s learn about eDC.
eDC (electronic data capture)
The method to capture clinical data is decided at the organization level.
Trial requirements, tool specifications and budget are the deciding factors in selecting a method to capture data. Organizations can use paper forms to capture clinical trial data or can opt for electronic capture.
An electronic tool or system used to capture clinical data is called eDC (electronic data capture). From a data management point of view, it is also called the clinical database.
Many tools are available on the market. The most popular are Medidata Rave, InForm and OC-RDC.
As per FDA, “Electronic data must meet the same fundamental elements of data quality (e.g. attributable, legible, contemporaneous, original, & accurate) and integrity (complete and consistent) expected of paper records.”
Case report forms are built electronically in these eDC systems by programming.
Now you understand the link between CRF and eDC.
The data manager reviews the protocol and identifies the key information to be captured.
This information is captured in case report forms, for example:
- Subject number
- Inclusion-Exclusion Criteria
- Informed consent form
- Visit schedule, e.g. Visit 1, Visit 2, etc.
- Lab tests to be done at each visit, such as chemistry, hematology, coagulation, thyroid, cytokines and virology
- Biomarker tests
- RECIST check (for oncology studies)
- Dose administration record (for interventional trials)
- Sample collection time points
- Treatment arm
These are general case report forms; the actual set varies from study to study based on the protocol.
Recommended further reading: Difference between Investigator Brochure and Clinical Trial Protocol
Step 2: Search in standard CRF library to identify desired case report forms
Clinical data managers choose case report forms based on protocol requirements. Every sponsor keeps commonly used CRFs (also called standard CRFs) in a library, such as a shared drive, and the data manager chooses the appropriate CRFs from this library.
Recommended further reading: Basics of case report form
Step 3: All the case report forms are identified and the standard specification document (SSD) is ready
The SSD describes the structure of the clinical database (eDC).
It lists the visit schedule (the flow of visits at the site) and all the applicable case report forms. It also lists all the variables of each case report form and the access conditions for each role. The SSD is the skeleton of the database and is drafted in Excel format. The example below is deliberately simple; an actual SSD is much bigger and more complicated.
Let’s understand it with the table below.
|Visit|Condition for opening the visit|Case report form|Access by role|
|---|---|---|---|
|Screening|Subject signs the informed consent form|Hematology|Data manager: read access; CRC: entry access|
|Screening|Subject signs the informed consent form|Chemistry|Data manager: read access; CRC: entry access|
|Screening|Subject signs the informed consent form|Virology|Data manager: read access; CRC: entry access|
|Screening|Subject signs the informed consent form|Inclusion criteria|Data manager: read access; CRC: entry access|
|V1|Subject meets all the inclusion criteria (V1 should then open in the database)|Hematology|Data manager: read access; CRC: entry access|
|V1|Subject meets all the inclusion criteria (V1 should then open in the database)|Chemistry|Data manager: read access; CRC: entry access|
|V1|Subject meets all the inclusion criteria (V1 should then open in the database)|Virology|Data manager: read access; CRC: entry access|
|V2|Subject status is “continue” at V1 (V2 should then open)|Hematology|Data manager: read access; CRC: entry access|
Step 4: Request the programming team to program the case report forms in the eDC
By the end of step 3, we have identified the case report forms and the SSD is ready. Now we share both (CRFs plus SSD) with the programmers so that they can build the database (eDC).
The SSD helps them understand the interaction and flow of the various case report forms in the eDC. For example, Visit 2 should come after Visit 1, and the hematology case report form might be required at V1 but not at V2.
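The visit conditions in the simplified SSD table can be pictured as a small rule set. The sketch below is only an illustration in Python, not how any eDC product is actually configured. The names `SSD_FORMS`, `visit_is_open` and `open_forms`, and the subject fields `consent_signed`, `inclusion_met` and `v1_status`, are hypothetical.

```python
# Forms per visit, copied from the simplified SSD example (hypothetical structure).
SSD_FORMS = {
    "Screening": ["Hematology", "Chemistry", "Virology", "Inclusion criteria"],
    "V1": ["Hematology", "Chemistry", "Virology"],
    "V2": ["Hematology"],
}


def visit_is_open(visit, subject):
    """Return True if the SSD condition for opening this visit is met."""
    if visit == "Screening":
        return bool(subject.get("consent_signed"))
    if visit == "V1":
        return bool(subject.get("inclusion_met"))
    if visit == "V2":
        return subject.get("v1_status") == "continue"
    raise ValueError(f"Visit not defined in the SSD: {visit}")


def open_forms(subject):
    """Map each open visit to the case report forms it should display."""
    return {v: forms for v, forms in SSD_FORMS.items() if visit_is_open(v, subject)}


# Dummy subject: consented and eligible, but not continuing after V1
dummy_subject = {"consent_signed": True, "inclusion_met": True, "v1_status": "discontinued"}
print(open_forms(dummy_subject))
```

Writing the rules down this explicitly is roughly what the programmers need from the SSD: which visit opens under which condition, and which forms belong to it.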
Step 5: Once all the case report forms are programmed, user acceptance testing is performed; this is called UAT-Screen
User Acceptance Testing (UAT) is a critical component of using Electronic Data Capture (EDC) to record data from clinical trials. The sponsor/CRO must conduct UAT before using the EDC to gather data in compliance with a protocol.
Once the programmers have programmed the case report forms in the eDC system as requested, we start UAT.
The data manager first has to raise a request for access to the UAT environment, also called the test environment.
Generally, the data manager then creates a dummy subject in the eDC system and thoroughly tests every eCRF.
All the case report forms in the eDC system are checked to ensure that all the variables (questions or data points) and the flow of information are programmed correctly and that the eDC system displays the data as intended.
All findings need to be documented and shared with the programmer for correction. Once the programmer has corrected the programming errors, the result is considered a “UAT Pass”.
UAT is performed whenever there is a change in the database (eDC system). For example, suppose a protocol amendment requires new CRFs to be added to the database. The data manager drafts the request for those CRFs and the programmer programs them in the database. The data manager then has to perform UAT on the database to ensure that all the changes appear as intended.
Step 6: Once all the issues identified in the UAT are resolved, the database is shown to other associated teams such as Biostatistics, Clinical, Medical, etc.
Once UAT-Screen is passed in step 5, the skeleton of the database is ready and it is shown to all the concerned teams.
This is called online screening 1.
You can think of it as the trailer launch of a movie. All the concerns raised by the other teams are addressed, which may require reprogramming certain things.
Step 7: The edit check specification document is drafted and, once ready, shared with the programming team to program the edit checks
After step 6, a new document needs to be drafted: the edit check specification.
Edit check specification
The edit check specification lists all the manual and programmable checks.
“Edit check programs are written in order to identify the discrepancies in the entered data, which are embedded in the database, to ensure data validity”
Edit checks consist of manual and computerized checks that need to be performed on the clinical data to ensure the data is accurate and consistent. The first stage consists of producing an Edit Check Specifications (ECS) document; the implementation stage involves programming and testing the checks.
Using the edit check specification document, the programming team programs the edit checks in the eDC. Let’s learn more about edit checks.
As per CDISC;
“An auditable process, usually automated, of assessing the content of a data field against its expected logical, format, range, or other properties that is intended to reduce error.
NOTE: Time-of-entry edit checks are a type of the edit check that is run (executed) at the time data are first captured or transcribed to an electronic device at the time entry is completed of each field and/or group of fields on a form. Back-end edit checks are a type that is run against data that has been entered or captured electronically and has also been received by the centralized data store.”
Electronic edit checks let us use the computer’s power to detect illogical, incomplete or inconsistent data in clinical trials efficiently.
Univariate edit checks (including range checks)
These edit checks apply to a single field or variable. For example, we can set up an edit check on subject weight to ensure that an extreme or improbable value is not entered.
Let’s say we set up a range check that fires if the weight entered is less than 40 kg or greater than 90 kg.
Similarly, for lung function testing, we can set up an edit check that fires if predicted FEV1 is less than 20 per cent, because a value below 20 per cent is unlikely.
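A range check of this kind could look roughly like the Python sketch below. It is not taken from any eDC system: the function name `range_check`, the field names and the query wording are assumptions made for illustration, while the limits (40 to 90 kg, FEV1 of at least 20 per cent) come from the examples above.

```python
def range_check(field, value, low=None, high=None):
    """Return a query text if value falls outside [low, high], otherwise None."""
    if value is None:
        return None  # missing values are left to a separate check
    if low is not None and value < low:
        return f"{field} = {value} is below the expected minimum of {low}. Please verify."
    if high is not None and value > high:
        return f"{field} = {value} is above the expected maximum of {high}. Please verify."
    return None


# Limits taken from the examples in the text; field names are hypothetical
print(range_check("WEIGHT_KG", 95, low=40, high=90))
print(range_check("FEV1_PCT_PREDICTED", 15, low=20))
```

A value on the boundary (for example exactly 40 kg) does not fire the check in this sketch; whether the limits are inclusive is something the edit check specification would have to state.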
Multivariate edit checks (also called aggregate edit checks)
To ensure that the data is logical and consistent, these edit checks cross-check entries across more than one field or variable.
For example, if the Gender field is ‘Male’, there should be no data in the pregnancy test result field.
If a subject dies at some point in the study, there should be no visit or lab test entries for that subject after the death date.
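The two cross-field examples can be sketched in Python as follows. This is an illustrative sketch only: the function names, the record keys `gender` and `pregnancy_test_result`, and the sample dates are all made up, and real edit checks are programmed inside the eDC tool rather than as standalone scripts.

```python
from datetime import date


def check_pregnancy_vs_gender(record):
    """Fire when a pregnancy test result exists for a subject entered as Male."""
    has_result = record.get("pregnancy_test_result") not in (None, "")
    if record.get("gender") == "Male" and has_result:
        return "Pregnancy test result entered for a male subject. Please verify."
    return None


def check_entries_after_death(death_date, entry_dates):
    """Return the labels of visit or lab entries dated after the death date."""
    if death_date is None:
        return []
    return [label for label, entry_date in entry_dates.items() if entry_date > death_date]


record = {"gender": "Male", "pregnancy_test_result": "Negative"}
print(check_pregnancy_vs_gender(record))

entries = {"V2 lab sample": date(2020, 1, 10), "V3 lab sample": date(2020, 1, 21)}
print(check_entries_after_death(date(2020, 1, 19), entries))
```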
A common misconception is that implementing edit checks will fix all data issues. Edit checks are just one step in the data cleaning process. Unnecessary edit checks on non-critical fields can also be very irritating for the sites.
Step 8: Once all the edit checks are programmed, a UAT-Edit check is performed. This is the second UAT in the set up phase.
In this UAT, the data manager checks whether the programmed edit checks fire as expected.
Step 9: Once all the issues identified in the UAT-Edit check are resolved, the database is ready for online screening 2 with all the other teams. They can suggest any modifications needed.
Step 10: Other items, such as the set up of any third-party systems or any other protocol-specific requirements, can be programmed here.
Step 11: The clinical database (eDC) is now live and the site (the hospital where the trial is being conducted) can enter data
Data managers draft two other important documents during set up:
- Data Management Plan
- CRF completion guidelines
Data Management Plan
Each clinical trial should have a prospective plan for how to capture, process and store data. It is called the Data Management Plan (DMP), also known as a Data Handling Plan (DHP) or Data Handling Protocol.
It answers the following questions:
- Which data are to be captured
- How the data are captured
- What to review and how to handle the different types of data
- Which tools are used in the study
- Who the team members in the study are
- How serious adverse events are handled
ICH E6 R2 states that trial sponsors should “implement a system to manage quality throughout all stages of the trial process” and goes on to specify that quality management includes tools and procedures for data collection and processing (ICH E6 R2 2018, § 5.0).
CRF completion guidelines:
CRF completion guidelines give site personnel instructions for entering data into the database.
The DMP and the CRF completion guidelines should be ready before database go-live.
Note that the sponsor provides a template for each document. You just need to tailor it to your project/study.
Once the database is released into production, the conduct phase of CDM starts.
In this phase, the site enters the data and CDM reviews it. If the data manager finds any inconsistency, discrepancy or mistake, he/she can raise a query in the database. Generally, the clinical research coordinator (CRC) enters the data and responds to all queries raised by the data manager.
There is another role in clinical research: the clinical research associate (CRA). The CRA works as a liaison between sponsor and site and performs source data verification, which means checking the entries in the database against the original documents. Much of the data, such as local laboratory data, is first recorded on primary (source) documents and then entered into the database by the CRC. The CRA checks whether the data in the primary document and the entered data are the same.
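Conceptually, source data verification is a field-by-field comparison between the source document and the database entry. In practice the CRA does this by reviewing documents at the site, so the Python sketch below only illustrates the comparison itself. The function name `source_data_verification`, the field names and the lab values are invented for this example.

```python
def source_data_verification(source, entered):
    """Return (field, source_value, entered_value) for every field that differs."""
    fields = sorted(set(source) | set(entered))
    return [
        (field, source.get(field), entered.get(field))
        for field in fields
        if source.get(field) != entered.get(field)
    ]


# Made-up local lab values for one subject visit
source_document = {"HGB_G_DL": 13.2, "WBC_10E9_L": 6.1, "PLT_10E9_L": 250}
edc_entry = {"HGB_G_DL": 12.3, "WBC_10E9_L": 6.1}

for field, source_value, entered_value in source_data_verification(source_document, edc_entry):
    print(f"{field}: source={source_value}, entered={entered_value}")
```

Here the hemoglobin value was transposed during entry and the platelet count was never entered; both would be flagged for the site to correct.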
The data manager also has to cross-check the data. For example, a subject is recorded as having died on 19-Jan-2020 but the sample collection date is recorded as 21-Jan-2020. The data manager needs to be quite good at Excel for this kind of review.
Reconciliation of external data:
Not all data flow directly into the database: many lab assessments are handled by vendors, who send their data separately. This external data is verified against the existing data in the database, which is an important task in the conduct phase of the study.
For instance, adverse events are captured in two different databases: the clinical database (the one the data manager designs, containing the CRFs) and the safety database. Argus is the most commonly used software for capturing safety data. The two sets of data are reconciled as per the standard operating procedure and any issues are communicated to the safety team.
As another example, if Lab A must be performed at all visits, then the external data should contain Lab A results for every visit the subject has attended at the site. Once the data is reconciled, any discrepancies are sent to the vendor or site for correction/clarification.
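The Lab A example boils down to comparing two sets of (subject, visit) pairs: the visits recorded in the eDC and the records received from the vendor. A minimal Python sketch, with a hypothetical function name and made-up subject numbers, might look like this:

```python
def reconcile_lab_visits(edc_visits, vendor_records):
    """Compare (subject, visit) pairs expected from the eDC with vendor lab data."""
    expected = set(edc_visits)
    received = set(vendor_records)
    return {
        "missing_from_vendor": sorted(expected - received),
        "not_in_edc": sorted(received - expected),
    }


# Made-up example: Lab A is required at every attended visit
edc_visits = [("1001", "V1"), ("1001", "V2"), ("1002", "V1")]
vendor_records = [("1001", "V1"), ("1002", "V1"), ("1003", "V1")]

print(reconcile_lab_visits(edc_visits, vendor_records))
```

Pairs missing from the vendor file would typically go to the vendor for clarification, and vendor records with no matching visit in the eDC would go to the site. Actual reconciliation rules are defined in the study's own procedures.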
Many types of review are performed during the conduct phase. For example, data from a few subjects may be analysed to check whether it is appropriate to continue with the study. An interim analysis is performed to check the efficacy and safety profile of the drug.
When all the data is entered, all the validation checks are done and no discrepancy remains, the data is considered ready for lock. The pre-lock checklist is signed and permission from all stakeholders is obtained prior to locking the database.
Database lock is a state of the database in which no change is permitted. It denotes that all the relevant information has been collected and reviewed and is free of discrepancies.
It is performed once all the stakeholders, such as CDM, Biostatistics, Safety and the Medical team, have reviewed the data and the pre-lock checklist is signed.
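The lock conditions described here can be summarized as a simple readiness check. The Python sketch below is an assumption-laden illustration: `lock_blockers` and `REQUIRED_SIGNOFFS` are hypothetical names, and a real pre-lock checklist contains many more items than the four modelled here.

```python
# Stakeholders named in the text; the set itself is a hypothetical simplification
REQUIRED_SIGNOFFS = {"CDM", "Biostatistics", "Safety", "Medical"}


def lock_blockers(open_queries, missing_entries, checklist_signed, signoffs):
    """Return the reasons the database is not yet ready for lock (empty list = ready)."""
    blockers = []
    if missing_entries > 0:
        blockers.append(f"{missing_entries} data entries still missing")
    if open_queries > 0:
        blockers.append(f"{open_queries} queries still open")
    if not checklist_signed:
        blockers.append("pre-lock checklist not signed")
    pending = sorted(REQUIRED_SIGNOFFS - set(signoffs))
    if pending:
        blockers.append("sign-off pending from: " + ", ".join(pending))
    return blockers


print(lock_blockers(open_queries=2, missing_entries=0,
                    checklist_signed=True, signoffs={"CDM", "Safety"}))
```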
There are two types of lock:
Soft lock: It is done before the hard lock. Access is quite limited at this stage, and CDM personnel confirm the suitability of the data for final analysis.
If required, data can be unlocked at this stage by requesting a privileged user.
Hard lock: It is also called a freeze. All access to the database is removed and no further changes are allowed.
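The difference between the two lock types can be pictured as a small state machine. The Python sketch below is a toy model built on assumptions: the class and method names are hypothetical, and the "quite limited" access during a lock before the final one is simplified to "no edits at all".

```python
from enum import Enum


class LockState(Enum):
    OPEN = "open"
    SOFT_LOCKED = "soft lock"
    HARD_LOCKED = "hard lock"


class StudyDatabase:
    """Toy model of the lock states of a clinical database."""

    def __init__(self):
        self.state = LockState.OPEN

    def soft_lock(self):
        if self.state is not LockState.OPEN:
            raise RuntimeError("Only an open database can be soft locked")
        self.state = LockState.SOFT_LOCKED

    def unlock(self, privileged_user):
        # Assumption: unlocking is possible only from soft lock, via a privileged user
        if self.state is not LockState.SOFT_LOCKED or not privileged_user:
            raise PermissionError("Unlock refused")
        self.state = LockState.OPEN

    def hard_lock(self):
        if self.state is not LockState.SOFT_LOCKED:
            raise RuntimeError("Hard lock follows a soft lock")
        self.state = LockState.HARD_LOCKED

    def can_edit(self):
        # Simplification: edits are allowed only while the database is open
        return self.state is LockState.OPEN


db = StudyDatabase()
db.soft_lock()
db.unlock(privileged_user=True)  # a late correction is needed
db.soft_lock()
db.hard_lock()
print(db.state, db.can_edit())
```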
Data is extracted once the database lock is done, and statistical analysis is performed on the extracted data. Results are documented in well-structured reports and submitted to the regulatory authority.
All the essential documents, such as the clinical trial protocol, IB, UAT-related documents, ECS and other database-related documents, are archived in sponsor-specific archival systems. Each stakeholder has to archive the documents related to their own work.
Recommended: What is Clinical Trial Master File?
CDM has certain guidelines and standards that must be met:
- Code of Federal Regulations (CFR), 21 CFR Part 11
- Good Clinical Data Management Practices (GCDMP)
- ICH E6 GCP
- Clinical Data Interchange Standards Consortium (CDISC)
Electronic records must comply with the Code of Federal Regulations (CFR), 21 CFR Part 11.
“This regulation is applicable to records in an electronic format that are created, modified, maintained, archived, retrieved, or transmitted. This demands the use of validated systems to ensure accuracy, reliability, and consistency of data with the use of secure, computer-generated, time-stamped audit trails to independently record the date and time of operator entries and actions that create, modify, or delete electronic records”
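To make the idea of a computer-generated, time-stamped audit trail more concrete, here is a rough Python sketch. It is not a compliant or validated implementation and does not reflect how any particular eDC system stores its audit trail; the class names `AuditEntry` and `AuditedRecord` and the user IDs are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    timestamp: datetime
    user: str
    action: str  # "create", "modify" or "delete"
    field: str
    old_value: object
    new_value: object


class AuditedRecord:
    """Record whose every change is logged with who, when, what, old and new value."""

    def __init__(self):
        self._data = {}
        self._trail = []

    def _log(self, user, action, field, old_value, new_value):
        # The timestamp is generated by the system, not supplied by the operator
        now = datetime.now(timezone.utc)
        self._trail.append(AuditEntry(now, user, action, field, old_value, new_value))

    def set(self, user, field, value):
        action = "modify" if field in self._data else "create"
        self._log(user, action, field, self._data.get(field), value)
        self._data[field] = value

    def delete(self, user, field):
        old_value = self._data.pop(field)  # raises KeyError if the field does not exist
        self._log(user, "delete", field, old_value, None)

    @property
    def trail(self):
        return tuple(self._trail)  # read-only copy; entries themselves are frozen


rec = AuditedRecord()
rec.set("crc01", "WEIGHT_KG", 70)
rec.set("crc01", "WEIGHT_KG", 72)
rec.delete("dm01", "WEIGHT_KG")
for entry in rec.trail:
    print(entry.user, entry.action, entry.field, entry.old_value, entry.new_value)
```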
Good Clinical Data Management Practices (GCDMP) guidelines are published by the Society for Clinical Data Management (SCDM).
As per SCDM:
“The Good Clinical Data Management Practices (GCDMP©) standard provides a reference to clinical data managers in their implementation of high-quality Clinical Data Management processes and is used as a guidance tool for clinical data managers when preparing for CDM training and education.”
They are drafting a new GCDMP in 2020.
The ICH E6 GCP guideline also applies to CDM, as CDM is part of core clinical trial activity. Other guidelines also refer to the ICH guidelines.
Clinical Data Interchange Standards Consortium (CDISC):
The Study Data Tabulation Model Implementation Guide for Human Clinical Trials (SDTMIG) and the Clinical Data Acquisition Standards Harmonization (CDASH) standards are the most important standards published by CDISC.
SDTMIG: It provides a model and standard terminology for the data.
CDASH: It defines the standards to be followed for the collection of data and provides a list of basic information needed from a clinical, scientific and regulatory point of view.
Clinical data management is an important part of clinical research, responsible for data collection, data cleaning and the generation of reliable, statistically sound data. There are three phases of clinical data management: set up, conduct and close-out.
The set up phase is divided into 11 steps. It starts with reading the finalized protocol and selecting CRFs. Database developers program the selected CRFs in the eDC.
UAT-Screen is performed to check whether the database performs as expected.
Next, edit checks are programmed and the UAT-Edit check is done to confirm that the edit checks fire as expected. Edit checks fire auto-queries in the database whenever inconsistent or inaccurate data is entered.
Once the UAT-Edit check is passed, the database (eDC) is moved to production (database go-live).
Now the conduct phase of CDM starts. In this phase, site personnel enter the data into the database, and clinical data management reviews the data and raises queries whenever there is inaccurate information (discrepancy management and data cleaning). Reconciliation of various data points also takes place in the conduct phase.
Once all subject visits have ended, CDM has to review all the data and make sure that all data entries are complete, clean and ready for lock.
Once all the cleaning and data entry is completed, it is time to perform database lock. Database lock activities are close-out activities. The data can then be extracted for statistical analysis.