252 changes: 252 additions & 0 deletions monai/tutorials/health_kpi_analysis/health_kpi_analysis.ipynb
@@ -0,0 +1,252 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
Comment on lines +4 to +8
⚠️ Potential issue | 🟡 Minor

Make the intro cell markdown.
Lines 4-8 define a code cell, but the content is markdown, so the headings are parsed as Python comments instead of rendering as headings.

💡 Suggested fix
-        {
-            "cell_type": "code",
-            "execution_count": null,
-            "metadata": {},
-            "outputs": [],
+        {
+            "cell_type": "markdown",
+            "metadata": {},
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-            "cell_type": "code",
-            "execution_count": null,
-            "metadata": {},
-            "outputs": [],
-            "source": [
+            "cell_type": "markdown",
+            "metadata": {},
+            "source": [
🤖 Prompt for AI Agents
In `@monai/tutorials/health_kpi_analysis/health_kpi_analysis.ipynb` around lines
4-8, the first cell is declared as a code cell ("cell_type": "code") but
contains markdown content. Change its "cell_type" value to "markdown" (it is the
cell with "execution_count": null and empty "outputs"), keep its "source"
unchanged so the headings render correctly, and remove the "execution_count"
and "outputs" keys, which apply only to code cells.
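
If editing the notebook JSON by hand is awkward, the same change could be scripted. The following is a minimal sketch using only the standard-library `json` module; the helper name `code_cell_to_markdown` and the in-memory `example` notebook are illustrative assumptions, not part of this PR.

```python
import json


def code_cell_to_markdown(notebook: dict, index: int = 0) -> dict:
    """Return a copy of a notebook dict with one code cell converted to markdown.

    Hypothetical helper for illustration; it assumes an nbformat 4 style dict.
    """
    converted = json.loads(json.dumps(notebook))  # deep copy of JSON-safe data
    cell = converted["cells"][index]
    if cell.get("cell_type") == "code":
        cell["cell_type"] = "markdown"
        # These keys are only meaningful on code cells.
        cell.pop("execution_count", None)
        cell.pop("outputs", None)
    return converted


# Tiny in-memory notebook that mirrors the first cell of this diff.
example = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["# Healthcare KPI Analysis with MONAI\n"],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 2,
}

fixed = code_cell_to_markdown(example)
print(json.dumps(fixed["cells"][0], indent=1))
```

To apply this to the file itself, one would load it with `json.load`, pass the resulting dict through the helper, and write it back with `json.dump`.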

"# Healthcare KPI Analysis with MONAI\n",
"\n",
"## Overview\n",
"\n",
"In this tutorial we demonstrate how to compute common healthcare Key Performance Indicators (KPIs) using synthetic inpatient admissions data. We illustrate how tabular health data can be integrated into analytical workflows relevant to medical AI research.\n",
"\n",
"The KPIs computed in this tutorial include:\n",
"\n",
"- **Average Length of Stay (LOS)**\n",
"- **30-day Readmission Rate**\n",
"- **Daily Bed Occupancy**\n",
"\n",
"Although this tutorial does not use real patient data, the methodology is representative of analytics performed in hospital operations, population health management and clinical ML evaluation contexts.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Motivation\n",
"\n",
"Healthcare operations and clinical pathways generate complex tabular datasets that include admission, diagnosis and discharge patterns. These datasets complement medical imaging and can support:\n",
"\n",
"- risk stratification\n",
"- resource allocation\n",
"- quality metrics\n",
"- patient flow analysis\n",
"- clinical outcome modelling\n",
"\n",
"Using synthetic EHR-like tabular data, we show how such metrics can be derived reproducibly and without exposing patient records, for research and prototyping alongside MONAI imaging workflows.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install required dependencies if running standalone (uncomment the next line)\n",
"# %pip install health-analytics-toolkit pandas matplotlib seaborn\n",
"\n",
"import matplotlib.pyplot as plt\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"\n",
"import health_analytics_toolkit as hat\n",
"\n",
"sns.set_theme(style=\"whitegrid\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Generate Synthetic Data\n",
"\n",
"We generate a synthetic inpatient admission dataset using the `health-analytics-toolkit` package. The dataset includes:\n",
"\n",
"- demographic attributes\n",
"- diagnosis codes\n",
"- admission and discharge timestamps\n",
"- hospital site codes\n",
"- readmission indicators\n",
"\n",
"Synthetic data avoids patient privacy concerns while preserving realistic structure.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df = hat.generate_synthetic_patients(n=2000)\n",
"df.head()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Define Cohorts\n",
"\n",
"To emulate analytic workflows, we create specific patient cohorts. Cohorts can be defined by:\n",
"\n",
"- age thresholds\n",
"- diagnosis categories\n",
"- hospital sites\n",
"- admission period\n",
"\n",
"In real environments this supports service line analysis and operational decision-making.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# The filters are chained, so each cohort is a subset of the previous one.\n",
"\n",
"# Step 1: patients aged 65+ (elderly cohort)\n",
"elderly = hat.filter_by_age(df, min_age=65)\n",
"\n",
"# Step 2: elderly patients with selected chronic diagnoses (ICD-10 codes)\n",
"chronic_codes = [\"I10\", \"E11\", \"N18\"]  # hypertension, type 2 diabetes, chronic kidney disease\n",
"chronic = hat.filter_by_diagnosis_codes(elderly, chronic_codes)\n",
"\n",
"# Step 3: chronic elderly patients admitted to a specific hospital site\n",
"siteA = hat.filter_by_hospital_site(chronic, [\"NHS-TRUST-A\"])\n",
"\n",
"print(f\"Original dataset: {len(df)} patients\")\n",
"print(f\"Elderly cohort: {len(elderly)} patients\")\n",
"print(f\"Chronic elderly cohort: {len(chronic)} patients\")\n",
"print(f\"Site A chronic elderly cohort: {len(siteA)} patients\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Compute Healthcare KPIs\n",
"\n",
"We compute three common hospital operations KPIs:\n",
"\n",
"### Average Length of Stay (LOS)\n",
"The mean number of days between admission and discharge. It reflects patient acuity and informs throughput and discharge planning.\n",
"\n",
"### 30-day Readmission Rate\n",
"The share of admissions followed by a readmission within 30 days. It is a proxy for care quality and coordination, commonly monitored in public healthcare systems.\n",
"\n",
"### Daily Bed Occupancy\n",
"The number of beds occupied on each day, used to estimate capacity utilisation across inpatient wards.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Compute the three KPIs for the Site A chronic elderly cohort\n",
"alos = hat.average_length_of_stay(siteA)\n",
"readmit_rate = hat.readmission_rate(siteA)\n",
"bed_occ = hat.daily_bed_occupancy(siteA)  # plotted in section 4\n",
"\n",
"print(f\"Average LOS: {alos:.2f} days\")\n",
"print(f\"30-day Readmission Rate: {readmit_rate:.1%}\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Visualise Outputs\n",
"\n",
"Operational analytics frequently rely on visualisation to communicate patterns to clinical and administrative stakeholders.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plt.figure(figsize=(12, 4))\n",
"bed_occ.plot()\n",
"plt.title(\"Daily Bed Occupancy (Site A Chronic Elderly Cohort)\")\n",
"plt.ylabel(\"Occupied Beds\")\n",
"plt.xlabel(\"Date\")\n",
"plt.show()\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plt.figure(figsize=(6, 4))\n",
"df_los = siteA.copy()\n",
"# Length of stay in whole days, derived from the admission and discharge timestamps\n",
"df_los[\"los\"] = (pd.to_datetime(df_los[\"discharge_date\"]) - pd.to_datetime(df_los[\"admission_date\"])).dt.days\n",
"\n",
"sns.histplot(df_los[\"los\"], bins=20, kde=False)\n",
"plt.title(\"Distribution of Length of Stay (days)\")\n",
"plt.xlabel(\"Length of Stay\")\n",
"plt.ylabel(\"Count\")\n",
"plt.show()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Discussion\n",
"\n",
"This workflow demonstrates that synthetic EHR-like tabular data can support healthcare analytics tasks such as:\n",
"\n",
"- resource planning (bed occupancy)\n",
"- quality benchmarking (readmission)\n",
"- pathway efficiency measurement (LOS)\n",
"- cohort stratification (diagnosis, age, site)\n",
"\n",
"These metrics can complement MONAI workflows that analyse medical imaging datasets, and could feed into multimodal clinical ML model development.\n",
"\n",
"In real-world settings such analyses may contribute to:\n",
"\n",
"- population health management\n",
"- service line optimisation\n",
"- discharge planning\n",
"- clinical commissioning\n",
"- digital clinical transformation programmes\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Reproducibility & Notes\n",
"\n",
"- This tutorial uses only synthetic data; no Protected Health Information (PHI) is involved.\n",
"- Underlying distributions are configurable and can be adapted for benchmarking scenarios.\n",
"- All code runs on a standard CPU; no specialised hardware is required.\n",
"\n",
"### Dependencies\n",
"\n",
"- Python \u2265 3.9\n",
"- pandas \u2265 1.5\n",
"- health-analytics-toolkit \u2265 0.1.0\n",
"- matplotlib, seaborn (imported in the setup cell; used for the plots)\n",
"\n",
"### Suggested Extensions\n",
"\n",
"- incorporate imaging-derived features (e.g., MONAI outputs)\n",
"- integrate survival analysis packages for clinical outcomes research\n",
"- map the data to a FHIR-like schema for interoperability\n"
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
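
For readers who want to see what the three KPI calls in section 3 compute, here is a rough sketch in plain Python. It is not the `health-analytics-toolkit` implementation, whose internals are not shown in this PR. The helper names (`mean_los_days`, `readmit_fraction`, `beds_by_day`), the `readmitted_30d` field and the sample rows are all assumptions made for illustration; only `admission_date` and `discharge_date` come from the notebook.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical admission records. The readmitted_30d flag is an assumed field name.
admissions = [
    {"admission_date": date(2024, 1, 1), "discharge_date": date(2024, 1, 4), "readmitted_30d": False},
    {"admission_date": date(2024, 1, 2), "discharge_date": date(2024, 1, 3), "readmitted_30d": True},
    {"admission_date": date(2024, 1, 3), "discharge_date": date(2024, 1, 8), "readmitted_30d": False},
    {"admission_date": date(2024, 1, 5), "discharge_date": date(2024, 1, 6), "readmitted_30d": False},
]


def mean_los_days(rows):
    """Average of (discharge - admission) in whole days."""
    stays = [(r["discharge_date"] - r["admission_date"]).days for r in rows]
    return sum(stays) / len(stays)


def readmit_fraction(rows):
    """Fraction of admissions flagged as readmitted within 30 days."""
    return sum(r["readmitted_30d"] for r in rows) / len(rows)


def beds_by_day(rows):
    """Count occupied beds per calendar day.

    Convention assumed for this sketch: a bed counts as occupied from the
    admission date up to, but not including, the discharge date.
    """
    counts = Counter()
    for r in rows:
        day = r["admission_date"]
        while day < r["discharge_date"]:
            counts[day] += 1
            day += timedelta(days=1)
    return dict(sorted(counts.items()))


alos = mean_los_days(admissions)
rate = readmit_fraction(admissions)
occupancy = beds_by_day(admissions)

print(f"Average LOS: {alos:.2f} days")
print(f"30-day Readmission Rate: {rate:.1%}")
for day, beds in occupancy.items():
    print(day.isoformat(), beds)
```

The occupancy convention (discharge day not counted) is a choice made for this sketch; the toolkit may count days differently, so the numbers are not guaranteed to match its output.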