Stage 5 -- Display-name enrichment

Purpose

Backfill missing or generic display names using built-in LOINC and SNOMED lookup tables.

What it fixes

EHR exports sometimes carry codes without display strings, or with non-human-readable placeholders like "no display" or "unknown". Without enrichment, dashboard users see the Display column below; the last column shows the same rows after enrichment:

Code       Display      After enrichment
8480-6     (empty)      Systolic blood pressure
4548-4     no display   Hemoglobin A1c
86044005   (empty)      Amyotrophic lateral sclerosis

Lookup tables

The pipeline ships two embedded lookup tables:

  • LOINC_DISPLAY -- about 80 commonly seen lab and vital-sign codes (CBC, CMP, lipid panel, thyroid, coagulation, cardiac markers, pulmonary tests, and the ALSFRS-R itself).
  • SNOMED_DISPLAY -- ALS and common comorbidities (hypertension, diabetes, COPD, atrial fibrillation, depression, anxiety).

These are intentionally narrow. The goal is to fix the most-likely missing displays for an ALS-focused cohort, not to mirror the full LOINC/SNOMED distributions.

Adding entries

Both tables are plain Python dicts at the top of run_pipeline.py. To add codes:

LOINC_DISPLAY = {
    # ... existing entries ...
    '12345-6': 'Your new lab name',  # code -> human-readable display
}

For comprehensive enrichment, load a full LOINC or SNOMED release into these dicts. The lookup is a plain dict.get(code), so any code-to-display mapping you supply works the same way.
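One way to build such a mapping from a release file is sketched below. It is an illustration, not part of the pipeline: load_display_table is a hypothetical helper, and the column names used in the demo (LOINC_NUM, LONG_COMMON_NAME) are assumed to match the CSV you download, so check them against the header row of your release before relying on them.

```python
import csv
import io
from typing import Dict, TextIO


def load_display_table(handle: TextIO, code_column: str, display_column: str) -> Dict[str, str]:
    """Build a code -> display dict from a CSV file with a header row.

    Hypothetical helper: the column names are passed in because they
    depend on the release file you are reading.
    """
    table: Dict[str, str] = {}
    for row in csv.DictReader(handle):
        code = (row.get(code_column) or "").strip()
        display = (row.get(display_column) or "").strip()
        # Skip rows that would not improve on an empty display.
        if code and display:
            table[code] = display
    return table


# Small in-memory demo standing in for a real release file.
sample = io.StringIO(
    "LOINC_NUM,LONG_COMMON_NAME\n"
    "8480-6,Systolic blood pressure\n"
    "0000-0,\n"  # no display text, so this row is skipped
)
demo = load_display_table(sample, "LOINC_NUM", "LONG_COMMON_NAME")
```

With a real file, the same function could be fed an open handle (opened with newline="" as the csv module recommends) and the result merged with LOINC_DISPLAY.update(...), which leaves the dict.get(code) lookup unchanged.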

Scope of the rule

enrich_one(code, display) returns the existing display unless it's empty or matches one of the placeholder strings ('no display', 'unknown', ''). Any non-empty, non-placeholder display is preserved as-is -- the EHR's own display text wins over the lookup table.
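The rule can be sketched roughly as follows. This is a minimal reconstruction from the description above, not the actual code in run_pipeline.py. Three details are assumptions: placeholders are compared after trimming whitespace and lower-casing, LOINC_DISPLAY is consulted before SNOMED_DISPLAY, and a code missing from both tables keeps whatever display it arrived with. The two dicts here are tiny stand-ins holding only the codes from the table earlier in this section.

```python
from typing import Optional

# Stand-ins for the embedded tables (the real ones are larger).
LOINC_DISPLAY = {
    "8480-6": "Systolic blood pressure",
    "4548-4": "Hemoglobin A1c",
}
SNOMED_DISPLAY = {
    "86044005": "Amyotrophic lateral sclerosis",
}

# Placeholder strings named in this section.
PLACEHOLDERS = {"no display", "unknown", ""}


def enrich_one(code: str, display: Optional[str]) -> Optional[str]:
    # Assumption: matching ignores case and surrounding whitespace.
    normalized = (display or "").strip().lower()
    if normalized not in PLACEHOLDERS:
        # A real display from the EHR wins over the lookup table.
        return display
    # Assumption: try LOINC first, then SNOMED.
    looked_up = LOINC_DISPLAY.get(code) or SNOMED_DISPLAY.get(code)
    # Assumption: unknown codes keep their original display.
    return looked_up if looked_up is not None else display
```

Under these assumptions, enrich_one("4548-4", "no display") yields the table entry, while enrich_one("8480-6", "SBP, sitting") returns the EHR text untouched.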