Registry Forge · cohort EDA report

Synthetic demonstration cohort (220 patients) — cohort exploration

Aggregate views of the codes and coverage exported through Registry Forge. No PHI is included; identifiers are pseudonymized and dates are aggregated.
Generated: 2026-05-11 | Patients: 220 | Categories: 2 | k-anonymity threshold: 5
Privacy protections applied: patient identifiers replaced with synthetic IDs (PT-NNNN); dates of birth converted to 10-year age bands (90+ collapsed per HIPAA Safe Harbor); observation period shown as duration only; cross-tab cells with N<5 suppressed as "<5"; no free-text content included; per-patient codes/diagnoses not shown.
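The small-cell suppression rule listed above can be sketched as follows. This is a minimal illustration, not Registry Forge's actual code: the function name `suppress_small_cells` and the sample cross-tab are hypothetical, and only the threshold of 5 and the "<5" display string come from the report.

```python
from collections import Counter

K_THRESHOLD = 5  # the k-anonymity threshold stated in the report header


def suppress_small_cells(counts, k=K_THRESHOLD):
    """Return display values: the exact count when it is >= k, else the string '<k'."""
    return {key: (n if n >= k else f"<{k}") for key, n in counts.items()}


# Hypothetical cross-tab of (sex, age band) -> patient count, for illustration only.
cells = Counter({("F", "60-69"): 12, ("M", "60-69"): 3, ("F", "90+"): 5})
display = suppress_small_cells(cells)
```

A cell holding exactly 5 patients is shown as-is, because the rule suppresses only N<5.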

Cohort overview

Demographics of the 220 patients in the bundle. Distributions are shown only where group size is at least 5; smaller groups are aggregated as "Other".

Sex distribution

Proportion of cohort by reported sex.

Age band (at report generation date)

10-year age bands; ages over 89 collapsed per HIPAA Safe Harbor.
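One way the banding might work, assuming age is computed in whole years at the report generation date and that bands start at multiples of 10 (0-9, 10-19, and so on). The band boundaries and the function names `age_on` and `age_band` are assumptions; the report states only the 10-year width and the 90+ collapse.

```python
from datetime import date


def age_on(birth_date, as_of):
    """Whole years of age on the as_of date."""
    years = as_of.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in the as_of year.
    if (as_of.month, as_of.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def age_band(age_years):
    """Map an age to a 10-year band label; ages over 89 collapse to '90+'."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    if age_years >= 90:
        return "90+"
    low = (age_years // 10) * 10
    return f"{low}-{low + 9}"


# Illustrative call with a made-up birth date and the report's generation date.
band = age_band(age_on(date(1950, 3, 2), date(2026, 5, 11)))
```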

Race

Self-reported race; categories below k-anonymity threshold collapsed.

Ethnicity

Self-reported ethnicity; categories below k-anonymity threshold collapsed.
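Collapsing categories below the k-anonymity threshold, as described for race and ethnicity, could look like the sketch below. The function name and the sample labels are hypothetical; the "Other" label follows the wording used in the cohort overview.

```python
from collections import Counter


def collapse_small_groups(values, k=5, other_label="Other"):
    """Count category values, folding any category with fewer than k members into other_label."""
    counts = Counter(values)
    collapsed = Counter()
    for label, n in counts.items():
        collapsed[label if n >= k else other_label] += n
    return dict(collapsed)


# Placeholder category labels, not values from the report.
sample = ["A"] * 10 + ["B"] * 3 + ["C"] * 2
result = collapse_small_groups(sample)
```

Note that the merged "Other" group can itself fall below the threshold when few patients are collapsed; this sketch does not handle that case, and how the report treats it is not stated.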

Data volume

What's in the bundle, by clinical category.

Total records per category

Number of records the pipeline extracted, by bundle category.

Unique codes per category

Distinct (vocabulary, code) pairs appearing in each category.
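Counting distinct (vocabulary, code) pairs per category can be sketched with a set per category. The record layout (dicts with `category`, `vocabulary`, and `code` keys) and all sample values are assumptions made for illustration.

```python
from collections import defaultdict


def unique_codes_per_category(records):
    """Return {category: number of distinct (vocabulary, code) pairs}."""
    seen = defaultdict(set)
    for record in records:
        seen[record["category"]].add((record["vocabulary"], record["code"]))
    return {category: len(pairs) for category, pairs in seen.items()}


# Hypothetical records; category names, vocabularies, and codes are placeholders.
records = [
    {"category": "conditions", "vocabulary": "VOCAB-A", "code": "X1"},
    {"category": "conditions", "vocabulary": "VOCAB-A", "code": "X1"},  # repeat, counted once
    {"category": "conditions", "vocabulary": "VOCAB-B", "code": "X1"},  # same code, other vocabulary
    {"category": "observations", "vocabulary": "VOCAB-C", "code": "Y7"},
]
unique_counts = unique_codes_per_category(records)
```

Keying on the pair rather than the code alone keeps identical code strings from different vocabularies separate.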

Records per patient (distribution)

Histogram of total records (any category) per patient.

Source format breakdown

Where in the original SMART on FHIR pull each record came from.

Observation period

Per-patient duration from earliest to latest dated record. Reported as years; absolute dates are never shown.
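A minimal sketch of the duration calculation, assuming each patient's dated records are available as `datetime.date` values and that years are approximated as days divided by 365.25. The rounding to one decimal place and the function name are assumptions; the report does not state its precision.

```python
from datetime import date


def observation_years(record_dates):
    """Years between the earliest and latest dated record, or None if there are no dates."""
    if not record_dates:
        return None
    span_days = (max(record_dates) - min(record_dates)).days
    return round(span_days / 365.25, 1)


# Only the duration leaves this function; the absolute dates are not returned.
duration = observation_years([date(2020, 1, 1), date(2021, 3, 15), date(2022, 1, 1)])
```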

Observation period bands

Patients grouped by length of observed history.

Distribution (years)

Continuous distribution of observation period across the cohort.

Vocabulary distribution

Which coding systems carry the volume across all categories.

Records by vocabulary

Coding system used. A single record can count toward more than one coding system when it carries translated codes.

ALS-specific markers

Patients with at least one motor-neuron-disease-spectrum diagnosis code, ALSFRS-R / FVC observation, or ALS-directed medication. Counts below k-anonymity threshold are suppressed.

Patients per marker type

Each patient may have multiple marker types; counts sum to more than the patient total.
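The reason the counts can exceed the patient total is that each patient is counted once per marker type they carry. A small sketch, with hypothetical marker labels (`diagnosis`, `observation`, `medication`) and made-up patients:

```python
from collections import Counter


def patients_per_marker(patient_markers):
    """Count patients per marker type; a patient with several types is counted in each."""
    counts = Counter()
    for markers in patient_markers.values():
        counts.update(set(markers))  # set() guards against duplicate labels per patient
    return dict(counts)


# Hypothetical marker assignments keyed by synthetic ID.
patient_markers = {
    "PT-0001": ["diagnosis", "medication"],
    "PT-0002": ["diagnosis"],
    "PT-0003": [],
}
marker_counts = patients_per_marker(patient_markers)
patients_with_any = sum(1 for m in patient_markers.values() if m)
```

Here the per-marker counts sum to 3 while only 2 patients carry any marker.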

Top codes by vocabulary

Top 25 codes per vocabulary, ranked by total references. Patient counts below k-anonymity threshold are suppressed.

Vocabulary | Code | Display name | Category | Records | Patients
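The ranking behind this table might be computed as in the sketch below. It assumes "total references" means the number of records carrying the code, and it uses a hypothetical record layout with `vocabulary`, `code`, and `patient_id` keys; neither is confirmed by the report.

```python
from collections import defaultdict


def top_codes_by_vocabulary(records, n=25, k=5):
    """Per vocabulary, return up to n rows of (code, record count, patient count or '<k')."""
    record_counts = defaultdict(int)
    patients = defaultdict(set)
    for record in records:
        key = (record["vocabulary"], record["code"])
        record_counts[key] += 1
        patients[key].add(record["patient_id"])

    by_vocab = defaultdict(list)
    for (vocab, code), n_records in record_counts.items():
        n_patients = len(patients[(vocab, code)])
        shown = n_patients if n_patients >= k else f"<{k}"  # suppress small patient counts
        by_vocab[vocab].append((code, n_records, shown))

    # Rank by record count (descending), breaking ties by code for a stable order.
    return {
        vocab: sorted(rows, key=lambda row: (-row[1], row[0]))[:n]
        for vocab, rows in by_vocab.items()
    }


# Placeholder data: code "A" on 6 patients, code "B" twice on one patient.
demo = [{"vocabulary": "V1", "code": "A", "patient_id": f"PT-{i:04d}"} for i in range(6)]
demo += [{"vocabulary": "V1", "code": "B", "patient_id": "PT-0001"}] * 2
ranked = top_codes_by_vocabulary(demo)
```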

Per-patient summary

One row per patient using pseudonymized identifiers. Diagnostic codes are not shown per patient to prevent re-identification of rare disease cases; record counts are also banded into ranges.

Synthetic ID | Sex | Age band | Race | Ethnicity | Observation | Records | ALS markers
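Banding the Records column into ranges could be done with a sorted list of edges. The edges and labels below are purely hypothetical, since the report does not state which ranges it uses.

```python
import bisect

# Hypothetical band edges and labels; the report's actual ranges are not given.
BAND_EDGES = [10, 50, 100, 500]
BAND_LABELS = ["<10", "10-49", "50-99", "100-499", "500+"]


def record_band(count):
    """Map an exact record count to a coarse range label."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return BAND_LABELS[bisect.bisect_right(BAND_EDGES, count)]


bands = [record_band(c) for c in (3, 10, 49, 250, 500)]
```

Showing a range rather than an exact count makes it harder to match a row to a known patient by record volume.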