Output schema
The pipeline emits dashboard_data.json -- a single JSON object containing
the full processed bundle. This page documents its structure.
Top-level shape
{
  "metadata": { "generated_at": "...", "total_patients": 96 },
  "patients": [...],
  "documents": [...],
  "encounters": [...],
  "problems": [...],
  "medications": [...],
  "labs": [...],
  "vitals": [...],
  "labs_vitals": [...],
  "procedures": [...],
  "allergies": [...],
  "immunizations": [...],
  "careplans": [...],
  "diagnostic_reports": [...],
  "goals": [...],
  "notes": [...],
  "document_references": [...]
}
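As a quick orientation, the sketch below counts the records under each list-valued top-level key. The `SAMPLE` bundle and the `summarize_bundle` helper are hypothetical and exist only for illustration; the real file has every key shown above and far more records.

```python
import json

# Hypothetical miniature bundle; values are made up for illustration.
SAMPLE = json.loads("""
{
  "metadata": {"generated_at": "2024-01-01T00:00:00Z", "total_patients": 1},
  "patients": [{"patient_id": "demo-patient-001"}],
  "medications": [{"patient_id": "demo-patient-001", "code": "9322"}],
  "labs": []
}
""")


def summarize_bundle(bundle: dict) -> dict[str, int]:
    """Return the record count for every list-valued top-level key."""
    return {
        key: len(value)
        for key, value in bundle.items()
        if isinstance(value, list)  # skips "metadata", which is an object
    }


counts = summarize_bundle(SAMPLE)
print(counts)  # {'patients': 1, 'medications': 1, 'labs': 0}
```

To run the same summary on real output, load the file with `json.load(open("dashboard_data.json"))` and pass the result to `summarize_bundle`.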
Per-record schemas
Most clinical records share a common set of base fields. Each per-tab record type extends that base with its own fields.
Common fields (every clinical record)
| Field | Type | Description |
|---|---|---|
| patient_id | string | Foreign key to patients[].patient_id |
| code | string | Primary clinical code (e.g. SNOMED, RxNorm, LOINC, CPT) |
| code_system | string | System name (SNOMED-CT, RxNorm, LOINC, etc.) |
| display_name | string | Human-readable name for the code |
| all_codings | array | All codings present on the source CodeableConcept |
| effective_date | string (ISO date) | Primary date for the record |
| source | string | "fhir" or "ccda" -- which input produced this record |
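A consumer may want to confirm that a record carries the common fields before relying on them. The sketch below is one way to do that; `check_common_fields` and the sample records are hypothetical. Record types such as CarePlan and Goal list no code fields on this page, so a reported missing field is not necessarily an error.

```python
# Field names taken from the common-fields table above.
COMMON_FIELDS = (
    "patient_id", "code", "code_system", "display_name",
    "all_codings", "effective_date", "source",
)
KNOWN_SOURCES = {"fhir", "ccda"}


def check_common_fields(record: dict) -> list[str]:
    """Return human-readable problems found in one clinical record."""
    problems = [
        f"missing field: {name}" for name in COMMON_FIELDS if name not in record
    ]
    source = record.get("source")
    if source is not None and source not in KNOWN_SOURCES:
        problems.append(f"unexpected source: {source!r}")
    return problems


# Hypothetical records for illustration only.
complete = {
    "patient_id": "demo-patient-001", "code": "9322", "code_system": "RxNorm",
    "display_name": "Riluzole 50 MG Oral Tablet", "all_codings": [],
    "effective_date": "2023-06-01", "source": "fhir",
}
partial = {"patient_id": "demo-patient-001", "source": "hl7v2"}

print(check_common_fields(complete))  # []
print(check_common_fields(partial))   # five missing fields plus the source
```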
Patient
| Field | Type |
|---|---|
| patient_id | string |
| first_name, last_name | string |
| mrn | string |
| dob | string (YYYY-MM-DD) |
| gender | string |
| num_documents | int -- count of documents linked to this patient |
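The sketch below cross-checks num_documents against the documents list. It assumes that "linked" means a documents[] entry whose patient_id equals the patient's patient_id, which this page does not state outright; `find_count_mismatches` and the sample data are hypothetical.

```python
from collections import Counter


def find_count_mismatches(patients: list[dict], documents: list[dict]) -> list[str]:
    """Return patient_ids whose num_documents differs from the linked count."""
    # Documents with a null patient_id are not linked to anyone, so skip them.
    linked = Counter(
        doc["patient_id"] for doc in documents if doc.get("patient_id") is not None
    )
    return [
        p["patient_id"]
        for p in patients
        if p.get("num_documents") != linked.get(p["patient_id"], 0)
    ]


# Hypothetical data for illustration only.
patients = [
    {"patient_id": "p1", "num_documents": 2},
    {"patient_id": "p2", "num_documents": 1},
]
documents = [
    {"patient_id": "p1", "document_uuid": "a"},
    {"patient_id": "p1", "document_uuid": "b"},
    {"patient_id": None, "document_uuid": "c"},
]

mismatches = find_count_mismatches(patients, documents)
print(mismatches)  # ['p2']
```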
Document
| Field | Type |
|---|---|
| patient_id | string (nullable) |
| document_uuid | string |
| source_file | string |
| source_format | string -- "ccda_xml", "html_fragment", "rtf", "pdf", "unknown" |
| source_type | string -- "rtf_note" for RTF, otherwise same as source_format |
| plain_text | string -- extracted text, capped at 30,000 chars |
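The source_type rule and the plain_text cap can be expressed as two small checks. This is a sketch, not the pipeline's own code: the helper names are hypothetical, and it assumes "chars" means Python string length (code points).

```python
MAX_PLAIN_TEXT_CHARS = 30_000  # cap stated in the Document table


def expected_source_type(source_format: str) -> str:
    """Apply the documented rule: RTF maps to "rtf_note", all else is unchanged."""
    return "rtf_note" if source_format == "rtf" else source_format


def check_document(doc: dict) -> list[str]:
    """Return problems found in one document record."""
    problems = []
    if doc["source_type"] != expected_source_type(doc["source_format"]):
        problems.append("source_type does not match source_format")
    if len(doc["plain_text"]) > MAX_PLAIN_TEXT_CHARS:
        problems.append("plain_text exceeds the 30,000-character cap")
    return problems


# Hypothetical documents for illustration only.
doc_ok = {"source_format": "rtf", "source_type": "rtf_note", "plain_text": "Note"}
doc_bad = {"source_format": "pdf", "source_type": "rtf_note",
           "plain_text": "x" * 30_001}

print(check_document(doc_ok))   # []
print(check_document(doc_bad))  # both problems reported
```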
Encounter
source, display_name (class display), type (visit type), code
(class code), effective_date, end_date, status. Aliases:
start_date, visit_type.
Problem
source, code, code_system, display_name, effective_date,
clinical_status. Aliases: onset_datetime.
Medication
source, code, code_system, display_name, effective_date,
status, fhir_resource_type (MedicationRequest /
MedicationStatement / MedicationAdministration). Aliases:
authored_on, start_date, medication_subtype.
Observation (lab or vital)
source, code, code_system, display_name, value, unit,
category, effective_date. Aliases: effective_datetime.
The category field distinguishes vitals from labs. The labs_vitals
top-level key is a convenience union.
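Since labs_vitals combines both kinds of observation, a consumer can split it back apart on category. The sketch below assumes vitals carry the category value "vital-signs" (the FHIR observation-category code); this page does not list the actual category values, so treat that default, the `split_observations` helper, and the sample data as hypothetical.

```python
def split_observations(
    labs_vitals: list[dict],
    vital_categories: frozenset[str] = frozenset({"vital-signs"}),  # assumed value
) -> tuple[list[dict], list[dict]]:
    """Partition observations into (labs, vitals) using the category field."""
    labs, vitals = [], []
    for obs in labs_vitals:
        if obs.get("category") in vital_categories:
            vitals.append(obs)
        else:
            labs.append(obs)
    return labs, vitals


# Hypothetical observations for illustration only.
sample = [
    {"display_name": "Heart rate", "value": 72, "unit": "/min",
     "category": "vital-signs"},
    {"display_name": "Hemoglobin A1c", "value": 5.6, "unit": "%",
     "category": "laboratory"},
]

labs, vitals = split_observations(sample)
print(len(labs), len(vitals))  # 1 1
```

Passing a different `vital_categories` set adapts the split to whatever labels the pipeline actually emits.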
Allergy
source, code, code_system, display_name, effective_date,
severity, clinical_status. Aliases: allergen, allergen_code,
allergen_system, recorded_date.
Immunization
source, code, code_system, display_name, effective_date,
status. Aliases: vaccine, vaccine_code, vaccine_system,
occurrence_datetime, recorded_date.
Procedure
source, code, code_system, display_name, effective_date,
status. Aliases: performed_datetime.
CarePlan
source, title, description, status, effective_date.
DiagnosticReport
source, code, code_system, display_name, status,
effective_date, conclusion.
Goal
source, description, status, effective_date.
Section narrative (CCDA)
| Field | Type |
|---|---|
| patient_id | string |
| section_title | string -- e.g. "Problems", "Medications", "Vital Signs" |
| narrative_text | string -- full text content of the CCDA <text> element |
| character_count | int |
| source_file | string |
| source | string -- always "ccda" |
DocumentReference (FHIR)
patient_id, type_display, status, effective_date, source.
Field aliasing
Many records carry a field under both its canonical name and one or more aliases. This is intentional -- it lets downstream code use whichever naming scheme is most convenient. Aliases are added in Stage 4.
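In the example record on this page, every alias holds the same value as its canonical field. The sketch below checks that property for a medication record. The alias-to-canonical mapping is inferred from that example rather than stated by the pipeline, and `inconsistent_aliases` is a hypothetical helper.

```python
# Mapping inferred from the example record on this page (an assumption).
MEDICATION_ALIASES = {
    "effective_date": ("effective_datetime", "authored_on", "start_date"),
    "fhir_resource_type": ("medication_subtype",),
}


def inconsistent_aliases(record: dict, alias_map: dict) -> list[str]:
    """Return aliases present in the record whose value differs from the canonical field."""
    bad = []
    for canonical, aliases in alias_map.items():
        for alias in aliases:
            if alias in record and record[alias] != record.get(canonical):
                bad.append(alias)
    return bad


# Subset of the example record's fields.
record = {
    "effective_date": "2023-06-01",
    "effective_datetime": "2023-06-01",
    "authored_on": "2023-06-01",
    "start_date": "2023-06-01",
    "fhir_resource_type": "MedicationRequest",
    "medication_subtype": "MedicationRequest",
}
print(inconsistent_aliases(record, MEDICATION_ALIASES))  # []

# A deliberately altered copy shows what a disagreement looks like.
altered = {**record, "start_date": "2023-07-15"}
print(inconsistent_aliases(altered, MEDICATION_ALIASES))  # ['start_date']
```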
Example record
{
  "patient_id": "demo-patient-001",
  "code": "9322",
  "code_system": "RxNorm",
  "display_name": "Riluzole 50 MG Oral Tablet",
  "all_codings": [
    { "code": "9322", "system_name": "RxNorm",
      "display": "Riluzole 50 MG Oral Tablet" }
  ],
  "effective_date": "2023-06-01",
  "effective_datetime": "2023-06-01",
  "authored_on": "2023-06-01",
  "start_date": "2023-06-01",
  "status": "active",
  "source": "fhir",
  "fhir_resource_type": "MedicationRequest",
  "medication_subtype": "MedicationRequest"
}