Installation
Registry Forge - Patient Edition ships as a pip-installable Python package with a command-line interface and importable modules. Notebooks are demos that import the package.
Prerequisites
- Python 3.9 or newer
- About 200 MB of disk space for dependencies (pandas, numpy)
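The Python requirement can be checked before installing. The following is a minimal sketch, not part of the package; the helper name `python_is_supported` is made up for illustration, and only the 3.9 minimum comes from the list above.

```python
import sys

MIN_PYTHON = (3, 9)  # minimum stated in the prerequisites


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the interpreter's (major, minor) meets the minimum."""
    # Hypothetical helper: compares only the major and minor components.
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_is_supported() else "too old"
    print(f"Python {current}: {status}")
```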
Install (recommended)
The source repository is private during early development. Releases are published as public zip assets on GitHub Releases, and that is the supported install path for external users:
pip install https://github.com/BoyceLab/RegistryForge4Patients/releases/latest/download/RegistryForgePatient.zip
To pin a specific version, use the tag URL instead of latest:
pip install https://github.com/BoyceLab/RegistryForge4Patients/releases/download/v0.1.0/RegistryForgePatient.zip
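The two URLs differ only in how the release is addressed: `releases/latest/download/` for the newest release, `releases/download/<tag>/` for a pinned one. As a small sketch, a script that provisions environments could build either form from one function. The function name `release_asset_url` is hypothetical; the repository URL and asset name are copied from the commands above.

```python
REPO_URL = "https://github.com/BoyceLab/RegistryForge4Patients"
ASSET_NAME = "RegistryForgePatient.zip"


def release_asset_url(tag=None):
    """Build the Release asset URL; tag=None selects the latest release."""
    # Hypothetical helper: mirrors the two URL shapes shown in the install commands.
    if tag is None:
        return f"{REPO_URL}/releases/latest/download/{ASSET_NAME}"
    return f"{REPO_URL}/releases/download/{tag}/{ASSET_NAME}"


if __name__ == "__main__":
    print("pip install", release_asset_url("v0.1.0"))
```

Pinning a tag is the safer choice for reproducible environments, since the latest URL changes target whenever a new release is published.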
The Release page also publishes a demo output bundle (registry_forge_patient_bundle.zip) showing every artifact the pipeline produces against the synthetic sample patients.
Install from PyPI (when published)
pip install registryforge-patient
This becomes the simplest path once the package is published on PyPI. Until then, use the Release asset URL above.
Install from a local checkout (maintainers / collaborators with repo access)
git clone https://github.com/BoyceLab/RegistryForge4Patients.git
cd RegistryForge4Patients
# Optional but recommended: virtual environment
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# Install in editable mode
pip install -e .
The -e (editable) install means any local changes you make to the source are picked up immediately.
Verify the install
You should see version 0.1.0 and a help message listing four subcommands: parse, omop, phenopackets, mondo.
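The same check can be done from Python without invoking the CLI. This is a sketch using only the standard library; it assumes the distribution and the entry point are both named `registryforge-patient`, as they appear elsewhere on this page, and the helper names are illustrative.

```python
import shutil
from importlib import metadata

# Assumption: distribution name and CLI name as they appear on this page.
DIST_NAME = "registryforge-patient"
CLI_NAME = "registryforge-patient"


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def cli_on_path(command):
    """Return True if the command resolves to an executable on PATH."""
    return shutil.which(command) is not None


if __name__ == "__main__":
    print("version:", installed_version(DIST_NAME))
    print("CLI on PATH:", cli_on_path(CLI_NAME))
```

If the version prints but the CLI is not on PATH, the usual cause is that the environment's scripts directory is not on PATH or the virtual environment is not activated.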
Run the smoke test against the included synthetic data:
This produces the full output bundle (master CSV, dashboard, feature matrix, OMOP tables, Phenopackets v2 JSONs, Mondo-mapped CSV, and a zip of everything) in ./out/.
Optional extras
# EDA reports (ydata-profiling, sweetviz)
pip install "registryforge-patient[eda]"
# Notebooks (Jupyter, matplotlib, seaborn)
pip install "registryforge-patient[notebook]"
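If you installed from a Release asset, request extras with a direct reference of the form name[extra] @ URL. Extras appended to the end of a bare URL are read as part of the URL:
pip install "registryforge-patient[eda] @ https://github.com/BoyceLab/RegistryForge4Patients/releases/latest/download/RegistryForgePatient.zip"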
What gets installed
The package installs:
- A Python module registryforge_patient with submodules: parser, builder, dashboard, identity, phi, harmonize, omop, vocab, mondo, phenopackets, eda, notes, pipeline, cli.
- A command-line entry point, registryforge-patient.
Dependencies: pandas>=1.3 and numpy>=1.21. Everything else is in the Python standard library.
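To confirm that an existing environment already satisfies these minimums, the installed versions can be compared against them. The sketch below is illustrative and uses only the standard library. It compares the leading numeric release segment and ignores pre-release suffixes, which is a simplification of full version ordering; the function names are hypothetical.

```python
import re
from importlib import metadata

# Minimums as stated in the dependency line above.
MINIMUMS = {"pandas": "1.3", "numpy": "1.21"}


def release_tuple(version):
    """Parse the leading numeric release segment, e.g. '2.0.0rc1' -> (2, 0, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed, minimum):
    """Compare release tuples numerically, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_dependencies(minimums=MINIMUMS):
    """Map each dependency to True/False, or None when it is not installed."""
    results = {}
    for name, minimum in minimums.items():
        try:
            results[name] = meets_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            results[name] = None
    return results


if __name__ == "__main__":
    print(check_dependencies())
```

Comparing integer tuples rather than strings matters here: as strings, "1.10" sorts before "1.3", but as release numbers 1.10 is newer.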
For the demo notebooks under notebooks/, you'll also want Jupyter (or use the [notebook] extras).
Run the demo notebook
The quickstart notebook is a thin wrapper around the package: it imports registryforge_patient.build_outputs and runs it. It is useful for interactive exploration of the outputs; for production runs, the CLI is faster.
Privacy note for installation
The package itself makes no network calls. Pip needs network access to download dependencies during install, but that's standard pip behavior - your patient data never leaves your machine through this package.
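If you want to check the no-network claim in your own environment, one common approach is to make socket creation fail while your code runs. The sketch below is not part of the package, and `no_network` is a hypothetical helper. It only catches Python-level sockets opened in the current process, not subprocesses or native code that bypasses the `socket` module.

```python
import socket
from contextlib import contextmanager


@contextmanager
def no_network():
    """Make any attempt to create a Python socket raise while the block runs."""
    original = socket.socket

    def blocked(*args, **kwargs):
        raise RuntimeError("network access attempted")

    # Temporarily swap the socket constructor; always restore it afterwards.
    socket.socket = blocked
    try:
        yield
    finally:
        socket.socket = original


if __name__ == "__main__":
    with no_network():
        # Place the package call you want to audit here; any socket
        # creation inside this block raises RuntimeError.
        pass
    print("no sockets were opened")
```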
If your IT setup blocks pip from reaching pypi.org, you can install offline:
# On an internet-connected machine (same OS and Python version as the target, since wheels are platform-specific)
pip download registryforge-patient pandas numpy -d ./offline_wheels
# Transfer ./offline_wheels to the target machine, then
pip install --no-index --find-links ./offline_wheels registryforge-patient
See Privacy & PHI for more.