Common Data Elements, Part 4 of 4

The Introduction to the Principles and Practice of Clinical Research (IPPCR) is a course to train participants on how to effectively conduct clinical research. The course focuses on the spectrum of clinical research and the research process by highlighting epidemiologic methods, study design, protocol preparation, patient monitoring, quality assurance, and Food and Drug Administration (FDA) issues.

Summary

Introduction

This final installment (Part 4 of 4) focuses on the NIH Common Data Elements (CDEs)—what they are, how they’re governed, where to find them, and how to use them to make your research accurate, consistent, and interoperable. You’ll learn to navigate the NIH CDE Repository, understand endorsement criteria, and align your studies with FAIR data principles.

What Are NIH Common Data Elements?

Common Data Elements (CDEs) are standardized, precisely defined questions, variables, and response options used across studies. Their purpose is to enable harmonized data collection, facilitating reuse, pooling, and meta-analysis across diseases, disciplines, and institutes.

The NIH CDE Task Force & Governance

The NIH CDE Task Force is a trans-NIH community of practice that:

  • Reviews CDE collections submitted by NIH-recognized bodies (Institutes, Centers, trans-NIH committees).
  • Determines whether they meet endorsement criteria for NIH-funded research.
  • Maintains the NIH CDE Repository, a central access point for recommended or required CDEs.
  • Coordinates with disease-focused CDE programs (e.g., NINDS CDEs).

NIH CDE Repository: What It Is and Why It Matters

Launched in 2015, the NIH Common Data Element Repository:

  • Provides human- and machine-readable CDE definitions.
  • Hosts CDEs recommended or required by NIH Institutes and Centers.
  • Advances FAIR data sharing (Findable, Accessible, Interoperable, Reusable).
  • Was prioritized in the NLM strategic plan as a key enabler of data standardization.

The Repository includes:

  • Cross-disciplinary toolkits and toolboxes.
  • Disease-focused CDEs (e.g., NINDS).
  • Consistent controlled vocabularies and metadata for computational readiness.

Using the Repository: How to Search and What You’ll Find

The Repository lets you:

  • Browse by collection (e.g., NINDS, NINR) and see counts of available CDEs.
  • Search by keyword, data type, domain, or value set.
  • View each CDE’s:
    • Name and definition
    • Prompt/question text
    • Permissible values / value sets
    • Data type, format, coding system(s)
    • Usage and provenance
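
The fields listed above map naturally onto a small record type. What follows is a minimal sketch, assuming a hypothetical in-memory model: the class `CommonDataElement`, its field names, and the `search` helper are illustrative and do not reflect the Repository's actual schema or API. The sample entries are made up for demonstration.

```python
from dataclasses import dataclass, field


@dataclass
class CommonDataElement:
    """Hypothetical model of one CDE record; field names are assumptions."""
    name: str
    definition: str
    prompt: str
    data_type: str
    collection: str  # provenance, e.g. the submitting Institute
    permissible_values: dict = field(default_factory=dict)  # code -> label
    coding_system: str = ""


def search(cdes, keyword="", data_type=None):
    """Filter CDEs by keyword (name or definition) and, optionally, data type."""
    needle = keyword.lower()
    return [
        cde for cde in cdes
        if (needle in cde.name.lower() or needle in cde.definition.lower())
        and (data_type is None or cde.data_type == data_type)
    ]


# Illustrative entries only; not copied from the NIH CDE Repository.
SAMPLE_CDES = [
    CommonDataElement("City Name", "Name of the city in a postal address.",
                      "City", "text", "Example collection"),
    CommonDataElement("County Name", "Name of the county in a postal address.",
                      "County", "text", "Example collection"),
    CommonDataElement("Visit Count", "Number of clinic visits.",
                      "How many visits?", "number", "Example collection"),
]

if __name__ == "__main__":
    for cde in search(SAMPLE_CDES, keyword="address", data_type="text"):
        print(cde.name, "-", cde.definition)
```

Keeping the permissible values and coding system on the same record as the definition is the point of the design: a collection form and an analysis script can both read one source of truth.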

Example: Administrative/Address CDEs

  • City Name, County Name, Address Line
    Each includes a standard definition, format constraints, and (where applicable) code systems—so your administrative data are collected uniformly across sites.

Example: Patient-Reported Outcomes (NINR)

  • Survey item (e.g., “It is hard for me to play or go out with my friends as much as I’d like.”)
    The repository shows the exact prompt, response options/value set (with labels and codes), and context of use—allowing direct reuse with interoperable coding.
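
To illustrate what reusing a coded value set looks like in practice, here is a small sketch that maps a response label to its code and rejects anything outside the value set. The labels and codes in `VALUE_SET` and the helper `encode_response` are assumptions made for this example; the actual response options should be taken from the Repository entry for the item.

```python
PROMPT = "It is hard for me to play or go out with my friends as much as I'd like."

# Hypothetical value set (code -> label); the real labels and codes
# should be copied from the Repository entry for this item.
VALUE_SET = {
    0: "Never",
    1: "Almost never",
    2: "Sometimes",
    3: "Often",
    4: "Almost always",
}


def encode_response(label, value_set=VALUE_SET):
    """Map a response label to its code, rejecting labels outside the value set."""
    lookup = {text.lower(): code for code, text in value_set.items()}
    key = label.strip().lower()
    if key not in lookup:
        raise ValueError(f"{label!r} is not a permissible value")
    return lookup[key]


if __name__ == "__main__":
    print(PROMPT)
    print(encode_response("Sometimes"))
```

Storing the code rather than free text is what lets responses from different sites be pooled without manual recoding.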

Approval Criteria for NIH-Endorsed CDEs

The NIH CDE Governance Committee reviews and approves CDE collections using the NIH Scientific Data Council criteria:

  1. Clear, unambiguous definition of the variable and measure (prompt + response).
  2. Documented reliability and validity (psychometrics/evidence).
  3. Human- and machine-readable specification (structure + metadata).
  4. Designation by a recognized NIH body (Institute, Center, or NIH committee).
  5. Licensing/IP clarity (open use preferred; if restricted, conditions are explicit).
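
The five criteria above can be read as a checklist. The sketch below is one hypothetical way to encode that checklist for a self-assessment before submission; the `CollectionSubmission` class, its attribute names, and `unmet_criteria` are assumptions for illustration and are not how the Governance Committee actually records its reviews.

```python
from dataclasses import dataclass


@dataclass
class CollectionSubmission:
    """Hypothetical self-assessment of a CDE collection against the five criteria."""
    has_clear_definitions: bool
    has_reliability_validity_evidence: bool
    is_human_and_machine_readable: bool
    designated_by_nih_body: bool
    licensing_is_clear: bool


# Labels paraphrase the five criteria listed above, in the same order.
CRITERIA = {
    "has_clear_definitions": "Clear definition of variable and measure",
    "has_reliability_validity_evidence": "Documented reliability and validity",
    "is_human_and_machine_readable": "Human- and machine-readable specification",
    "designated_by_nih_body": "Designation by a recognized NIH body",
    "licensing_is_clear": "Licensing/IP clarity",
}


def unmet_criteria(submission):
    """Return the labels of criteria the submission does not yet satisfy."""
    return [label for attr, label in CRITERIA.items()
            if not getattr(submission, attr)]


if __name__ == "__main__":
    draft = CollectionSubmission(True, False, True, True, False)
    for label in unmet_criteria(draft):
        print("Still needed:", label)
```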

FAIR Data, Interoperability, and Metadata

CDEs operationalize FAIR:

  • Findable: Persistent identifiers and searchable metadata.
  • Accessible: Public repository access with clear use conditions.
  • Interoperable: Standardized vocabularies, codified value sets, computable formats.
  • Reusable: Clear definitions, provenance, reliability/validity evidence, and licensing.

Because each CDE carries its own definition, data type, and coded value set, datasets built from CDEs already include the metadata that machine processing and cross-study synthesis depend on.

Recommendations for Future-Ready Data Sharing

A core best practice, taken from a paper on reviewing data sharing plans discussed earlier in this series:

  • Ensure every shared dataset/document is paired with concise, public, consistently structured discovery metadata that describes:
    • What the object is,
    • How it can be accessed,
    • Under what conditions (licensing, access controls).
  • Build/participate in metadata repositories that align with FAIR, use standard terminologies, and expose computable CDEs to maximize discoverability (human + machine) and reuse.
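
As a rough illustration of the recommendation above, the following sketch builds a discovery metadata record covering what the object is, how it can be accessed, and under what conditions, then checks that none of those parts is blank. The field names, the `REQUIRED_FIELDS` list, the identifier, and the `example.org` URL are all assumptions for this example, not a published metadata standard.

```python
import json

# Hypothetical minimum set of discovery fields: what, how, and conditions.
REQUIRED_FIELDS = ("title", "description", "access_url",
                   "access_conditions", "license")


def missing_fields(record):
    """Return required fields that are absent or blank in the record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if not isinstance(value, str) or not value.strip():
            missing.append(name)
    return missing


# Illustrative record; every value here is a placeholder.
record = {
    "identifier": "example-dataset-001",
    "title": "Example study baseline dataset",
    "description": "De-identified baseline assessments collected with CDEs.",
    "access_url": "https://example.org/datasets/example-dataset-001",
    "access_conditions": "Controlled access; data use agreement required",
    "license": "",  # left blank on purpose to show the check
    "cde_collections": ["NINDS"],
}

if __name__ == "__main__":
    print(json.dumps(record, indent=2, sort_keys=True))
    print("Missing:", missing_fields(record))
```

Serializing the record as JSON keeps it readable by people while remaining easy for a repository or search index to ingest.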

Conclusion

The NIH CDE ecosystem—anchored by the Task Force, Governance Committee, and the CDE Repository—gives researchers practical tools to standardize data capture, satisfy FAIR expectations, and accelerate interoperable science. By selecting endorsed CDEs and pairing them with high-quality metadata, you boost your study’s immediate rigor and its long-term impact through reusability and integration.

Key Takeaways

  • CDEs = standardized variables (prompts, responses, codes) enabling consistent, reusable research data.
  • The NIH CDE Repository (since 2015) hosts endorsed CDEs that are human- and machine-readable.
  • Governance criteria require clear definitions, psychometric support, computability, NIH designation, and transparent licensing.
  • CDEs operationalize FAIR—making datasets findable, accessible, interoperable, and reusable.
  • Pair all shared data with concise, public, structured discovery metadata—this is essential for discoverability and reuse.
  • Start with disease-focused CDEs (e.g., NINDS) and cross-cutting toolkits; reuse value sets and formats to ensure interoperability across studies.

Raw Transcript

[00:00] Common Data Elements, Part 4 of 4. My name again is John

[00:20] Walter McKeevey reviewing common data elements. So, NIH has a common data element task force. It's a trans-NIH community of practice, includes a governance subcommittee. The primary charge is to decide whether common data elements submitted to them by NIH-recognized

[00:40] bodies meet the criteria that should be identified for use in NIH-funded research. Maintain NIH Common Data Element Repository, providing that central access point to data elements that have been recommended or required by NIH Institutes.

[01:00] and centers for use in research and for other purposes. So it is a repository. It has all the NIH common data elements. It shares the NINDS common data elements. So there is the disease-focused common data elements.

[01:20] NIH Common Data Elements repository was launched in 2015. It provides access to the structured human and machine-readable definitions of data elements that we recommend and in some cases are required by NIH ICs for clinical research use.

[01:40] It was identified as part of NLM's strategic plan to identify common data elements in facilitating the repository and expanding the repository. NIH encourages researchers to use common data elements to improve accuracy, consistency, and

[02:00] interoperability among datasets within various areas of health and disease research. Across disciplines and domains of common data elements, there's toolkits, there's toolboxes, there's different tools, and again, NINDS is disease-focused common data elements.

[02:20] And so there's a link to the common data element about page that has links to many other pages. There is a guide to the NIH common data element repository. The committee reviews submissions and endorses collections that meet meaningful

[02:40] criteria. The NIH-endorsed common data elements are published in the NIH common data element repository. It supports FAIR data sharing that we reviewed in part one. It adheres to the FAIR principles as we reviewed in part one.

[03:00] It provides high-quality computational-ready data with standardized vocabularies and readable metadata retrieved by identifiers. The Governance Committee reviews and approves collections of common data elements at the NIH and again a link of those guides. So searching

[03:20] the NIH Common Data Element Repository. It has the various collections, different institutes that provide data. You see NINDS. All their data is provided, the 1,200 and 427 common data elements that they have. It's available

[03:40] in their site, but also available in the NIH Common Data Element Repository. You can search across these similar to what we saw in NINDS. And so this reviews the search. You see the search bar. You see the data types you can search.

[04:00] across different values that you can search. You can see this is an example of formats for address, city name, county name, address line, what the standard is, and what the codes are for that value and where it was used.

[04:20] Here's an example from NINR forum questions. It identifies the question, "It is hard for me to play or go out with my friends as much as I'd like." Then they identify what values in a different place.

[04:40] for that question. And so that's an example of the Nursing Research Institute utilizing common data elements. NIH common data element repository governance, the Governance Committee reviews and

[05:00] approves collections of common data elements. The NIH Scientific Data Council criteria for approval is clear definition of variable and measure with prompt and response. Documented evidence of reliability and validity. Human and machine readable

[05:20] preferred. Recommended or designated by a recognized NIH body, either an institute, center, or committee. And licensing and intellectual property status is clear. Is it open use, open source, can it be used by everybody, or are there conditions on

[05:40] its use. A recommendation for the future. So this is a paper that we reviewed previously on reviewing data sharing plans and it identifies common data element usage without using that term and that's why it's important to review. It identifies that

[06:00] any dataset or document made available for sharing should be associated with concise, publicly available, and consistently structured discovery metadata, describing not just the data object itself, but also how it can be accessed. This is to maximize

[06:20] its discoverability by both humans and machines. And this refers to common data elements. This refers to FAIR guidance. This refers to terminologies. And so it identifies having a metadata repository of all these items.

[06:40] to make sharing across clinical research and sharing for other purposes to be efficient and effective.
