About & license
What this is
Registry Forge - Patient Edition is an open-source notebook + documentation site that turns a folder of patient-downloaded C-CDA XML documents into a research-ready long-format data set, a searchable browser dashboard, and an ML-ready feature matrix.
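As a rough illustration of what "long format" means here, the sketch below flattens the coded observations in a C-CDA document into one row per observation. It is not the project's parser: the function name `observations_to_long`, the column names, and the inline sample document are all assumptions made for this example. The only detail taken from the standard is the HL7 v3 namespace URI, `urn:hl7-org:v3`.

```python
import xml.etree.ElementTree as ET

# C-CDA documents declare the HL7 v3 namespace on the root element.
NS = {"hl7": "urn:hl7-org:v3"}

# Tiny made-up document, for illustration only.
SAMPLE = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <component><structuredBody><component><section>
    <entry><observation classCode="OBS" moodCode="EVN">
      <code code="8867-4" codeSystem="2.16.840.1.113883.6.1" displayName="Heart rate"/>
      <effectiveTime value="20240115"/>
      <value xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:type="PQ" value="72" unit="/min"/>
    </observation></entry>
  </section></component></structuredBody></component>
</ClinicalDocument>"""


def observations_to_long(xml_text, patient_id):
    """Return one dict per coded observation (hypothetical column names)."""
    root = ET.fromstring(xml_text)
    rows = []
    for obs in root.iterfind(".//hl7:observation", NS):
        code = obs.find("hl7:code", NS)
        value = obs.find("hl7:value", NS)
        when = obs.find("hl7:effectiveTime", NS)
        if code is None or value is None:
            continue  # skip observations without a code or a value
        rows.append({
            "patient_id": patient_id,
            "code": code.get("code"),
            "display": code.get("displayName"),
            "value": value.get("value"),
            "unit": value.get("unit"),
            "date": when.get("value") if when is not None else None,
        })
    return rows


rows = observations_to_long(SAMPLE, patient_id="demo-patient")
```

Each row carries the patient, the code, the value, and the date, so rows from many documents can be stacked into a single table. Real patient-portal exports are far messier than this sample and will need vendor-specific handling.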
What this is not
- Not a clinical decision support tool. No clinical claims are made about its outputs. Use it for research, exploration, and methods development.
- Not validated against any specific EHR vendor's C-CDA implementation. It has been tested against generic CDA R2 documents conforming to the HL7 v3 namespace. Patient-portal exports vary; expect to tune the parser for edge cases your specific source vendor introduces.
- Not an official Registry Forge product. It reuses the Registry Forge long-format schema and dashboard concept by design, so outputs are interoperable, but the two projects are independently maintained.
Relationship to Registry Forge
Registry Forge (boycelab.github.io/RegistryForge) is a comprehensive SMART-on-FHIR ETL pipeline developed by Danielle Boyce, MPH, DPA at the ALS Therapy Development Institute, supported by CDC grant R01-TS000341. It targets organizations doing institutional EHR extracts: Databricks chunked CSVs, FHIR R4 Bundles, C-CDA from system-level exports, plus add-on modules for OMOP CDM, GA4GH Phenopackets, and more.
Patient Edition occupies a smaller niche: the researcher working with files downloaded by the patient themselves. That workflow has different inputs (no Databricks, no SMART on FHIR, no chunked CSVs - just XML files on disk), different identity-resolution needs (name + DOB hashing instead of MRN-driven joins), and a different scale (often one patient, sometimes a few dozen, rarely thousands).
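The name + DOB hashing mentioned above could look something like the following. This is a minimal sketch under stated assumptions, not the project's actual scheme: the function name `patient_key`, the normalization rules, the `YYYY-MM-DD` date format, and the use of a salted HMAC-SHA256 are all choices made for this example.

```python
import hashlib
import hmac
import unicodedata


def _normalize(text):
    """Strip accents, fold case, and collapse whitespace (assumed rules)."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.casefold().split())


def patient_key(given, family, dob, salt):
    """Derive a stable pseudonymous key from name + date of birth.

    `dob` is assumed to be an ISO date string (YYYY-MM-DD); `salt` is a
    secret byte string kept outside the data set.
    """
    message = "|".join([_normalize(given), _normalize(family), dob.strip()])
    return hmac.new(salt, message.encode("utf-8"), hashlib.sha256).hexdigest()


# Spelling variants of the same person map to the same key.
key_a = patient_key("José ", "GARCÍA", "1980-02-03", salt=b"example-salt")
key_b = patient_key("jose", "Garcia", "1980-02-03", salt=b"example-salt")
```

Normalizing before hashing lets documents from different portals, which may differ in case, accents, or spacing, resolve to the same patient. A keyed hash is used here rather than a bare SHA-256 because names and birth dates are easy to guess, so an unsalted digest could be reversed by enumeration.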
By matching Registry Forge's output schema, anything you build downstream - analyses, visualizations, ML pipelines - works on either project's outputs.
License
Released under the MIT License: permissive, provided without warranty, and free to use, modify, and redistribute.
Citation
If this tool contributes to a paper or a poster, please cite both:
- This project: Registry Forge - Patient Edition. Companion tool to Registry Forge for patient-mediated C-CDA data exchange. https://boycelab.github.io/RegistryForge4Patients/
- Original Registry Forge: see their citation page for the canonical reference.
Contact
Open a GitHub issue on the repository for bug reports, parser edge cases, or feature requests.