feat: Serve resource usage metrics #58

mfranczel · 2026-01-15T15:32:10Z

Summary by CodeRabbit

New Features
- Added a lightweight HTTP resource-usage metrics server exposing real-time CPU and memory metrics, saturation, and a health endpoint. Returns safe JSON responses with appropriate headers and robust sampling across multiple runtime environments.
Chores
- Installer now launches the new resource metrics server alongside existing monitoring services at runtime.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

linear · 2026-01-15T15:32:13Z

BLU-5457 Serve resource usage metrics from deepnote toolkit

coderabbitai · 2026-01-15T15:32:28Z

📝 Walkthrough

Walkthrough

Adds a lightweight HTTP metrics server exposing memory and CPU metrics. Introduces ResourceMonitor (auto-detects cgroupv2, cgroupv1, psutil, or none; provides get_memory and get_cpu with thread-safe CPU delta and MEM_LIMIT override), MetricsHandler (serves JSON at /, /resource-usage, and /health, sets safe headers and logs requests), and main() to run a ThreadingHTTPServer (port from RESOURCE_USAGE_METRICS_PORT, default 9104). The installer is updated to start this script as an extra server process.

Sequence Diagram(s)

sequenceDiagram
    participant Client as HTTP Client
    participant Server as ThreadingHTTPServer
    participant Handler as MetricsHandler
    participant Monitor as ResourceMonitor
    participant Backend as cgroup/psutil

    Client->>Server: GET /resource-usage
    Server->>Handler: dispatch request
    Handler->>Monitor: get_memory()
    Monitor->>Backend: read memory metrics
    Backend-->>Monitor: memory values
    Handler->>Monitor: get_cpu()
    Monitor->>Monitor: compute CPU delta (lock)
    Monitor->>Backend: read cpu metrics
    Backend-->>Monitor: cpu values
    Monitor-->>Handler: metrics JSON
    Handler-->>Client: HTTP 200 JSON (meta, memory, cpu)

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly and concisely summarizes the main change: adding a resource usage metrics server. It accurately reflects the primary feature introduced.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-01-15T15:32:40Z

📦 Python package built successfully!

Version: 1.2.1.dev8+4794871
Wheel: deepnote_toolkit-1.2.1.dev8+4794871-py3-none-any.whl

Install:

pip install "deepnote-toolkit @ https://deepnote-staging-runtime-artifactory.s3.amazonaws.com/deepnote-toolkit-packages/1.2.1.dev8%2B4794871/deepnote_toolkit-1.2.1.dev8%2B4794871-py3-none-any.whl"

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@installer/__main__.py`:
- Around line 385-389: Replace os.path.join usage for resource_usage_script with
pathlib.Path to match project guidelines and mirror the existing prometheus
setup: construct resource_usage_script as Path(config_directory_path) /
"scripts" / "resource_usage.py" and pass its string (or Path) into the
ExtraServerSpec(command=["python", str(resource_usage_script)]). Also update
prometheus_script similarly so both prometheus_script and resource_usage_script
consistently use pathlib.Path; adjust any imports to include pathlib.Path if
missing and ensure ExtraServerSpec(command=...) receives the correct stringified
path.

📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6de286f and 67859fa.

📒 Files selected for processing (2)

deepnote_core/resources/scripts/resource_usage.py
installer/__main__.py

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

📄 CodeRabbit inference engine (.cursorrules)

**/*.py: Write clean, readable Python code following PEP 8 style guide
Use type hints with Optional[T] for parameters that can be None (not T = None)
Use explicit type hints for function parameters and return values
Maximum line length: 88 characters (Black default)
Use f-strings instead of .format() for string formatting
Use pathlib.Path for file path operations instead of os.path
Use black for code formatting
Use isort for import sorting (black profile)
Use flake8 for linting
Use early returns to reduce nesting and extract common checks into variables for readability
Use snake_case for variable and function names
Use PascalCase for class names
Use snake_case for file names
Support Python versions 3.9, 3.10, 3.11, 3.12, and 3.13

**/*.py: Follow PEP 8 with Black formatting (line length: 88)
Use isort with Black profile for import sorting
Use type hints consistently
Use docstrings for all functions/classes
Use f-strings instead of .format() for string formatting
Use pathlib.Path for file path operations instead of os.path
Always use Optional[T] for parameters that can be None (not T = None)
Use explicit type hints for function parameters and return values
Use snake_case for files, functions, and variables
Use PascalCase for classes
Use appropriate exception types with context logging for error handling
Handle Jupyter/IPython specific exceptions properly
Use early returns to reduce nesting and extract common checks into variables for readability
Use dictionary unpacking for headers (e.g., headers = {"Content-Type": "application/json", **auth_headers})
Use space-separated format for CLI arguments (e.g., --port 8080)

Files:

deepnote_core/resources/scripts/resource_usage.py
installer/__main__.py

🧬 Code graph analysis (1)

deepnote_core/resources/scripts/resource_usage.py (2)

installer/module/virtual_environment.py (1)

path (149-151)

installer/__main__.py (1)

main (407-409)

🪛 Ruff (0.14.11)

deepnote_core/resources/scripts/resource_usage.py

97-97: Consider moving this statement to an else block

(TRY300)

172-172: Dynamically typed expressions (typing.Any) are disallowed in *args

(ANN401)

172-172: Dynamically typed expressions (typing.Any) are disallowed in **kwargs

(ANN401)

234-234: Dynamically typed expressions (typing.Any) are disallowed in *args

(ANN401)

244-244: Possible binding to all interfaces

(S104)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)

GitHub Check: Test - Python 3.9
GitHub Check: Test - Python 3.10
GitHub Check: Build and push artifacts for Python 3.10
GitHub Check: Build and push artifacts for Python 3.13
GitHub Check: Test - Python 3.11
GitHub Check: Build and push artifacts for Python 3.12
GitHub Check: Build and push artifacts for Python 3.9
GitHub Check: Test - Python 3.13
GitHub Check: Build and push artifacts for Python 3.11

🔇 Additional comments (6)

deepnote_core/resources/scripts/resource_usage.py (6)

1-19: LGTM!

Clean imports with proper optional dependency handling.

21-39: LGTM!

Backend detection and initialization are sound. Thread safety properly addressed.

101-122: LGTM!

CPU seconds parsing handles edge cases appropriately.

226-235: LGTM!

Good security header. Response handling is clean.

238-252: LGTM!

Baseline initialization and graceful shutdown handled correctly. Binding to 0.0.0.0 is appropriate for container metrics exposure.

200-202: The cpu_sat formula is correct; no change needed.

cpu_percent represents total CPU usage across all cores (0-100*cores), not per-core percentage. The division by cpu_limit (cores) correctly normalizes this to saturation percentage (0-100), matching the field name "saturation_percent."

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

deepnote_core/resources/scripts/resource_usage.py

installer/__main__.py

codecov · 2026-01-15T15:37:43Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.27%. Comparing base (6de286f) to head (9e9d988).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@           Coverage Diff           @@
##             main      #58   +/-   ##
=======================================
  Coverage   73.27%   73.27%           
=======================================
  Files          93       93           
  Lines        5193     5193           
  Branches      757      757           
=======================================
  Hits         3805     3805           
  Misses       1144     1144           
  Partials      244      244

Flag	Coverage Δ
combined	`73.27% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In `@deepnote_core/resources/scripts/resource_usage.py`:
- Around line 234-235: The parameter name "format" in the method
log_message(self, format: str, *args: Any) shadows the built-in name; rename it
(for example to "msg_format" or "fmt") in the log_message signature and update
the function body to use the new name (e.g., self.address_string() - {msg_format
% args}). Also search for any overrides or callers referring to log_message's
parameter name and ensure they remain compatible (only the parameter name
changes, not its position), and run tests to confirm no other references break.
- Around line 238-244: Add a module-level or main() docstring describing purpose
and behavior, and make main() robust to invalid RESOURCE_USAGE_METRICS_PORT
values by validating/parsing the env var with a try/except (ValueError) and
falling back to the default port 9104 while logging or printing a warning; keep
use of ResourceMonitor(), the initial monitor.get_cpu() baseline call, and
creation of the HTTP server via ThreadingHTTPServer(("0.0.0.0", port), handler)
and reference MetricsHandler when constructing the handler.
- Around line 109-116: The cgroupv1 parsing code can raise ValueError when
converting file contents to int (the branch using self.backend == "cgroupv1"
reading cpuacct/cpuacct.usage and the earlier parts using parts[1]); wrap the
int(...) conversions in a try/except ValueError block (around both int(parts[1])
and int(content)) to catch malformed data, return 0.0 on parse error, and
optionally log/debug the raw content via the existing logging mechanism; ensure
you still handle empty content as before by keeping the content truthy checks
and only attempting conversion inside the try block.

📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 67859fa and 95b873a.

📒 Files selected for processing (1)

deepnote_core/resources/scripts/resource_usage.py

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

📄 CodeRabbit inference engine (.cursorrules)

**/*.py: Write clean, readable Python code following PEP 8 style guide
Use type hints with Optional[T] for parameters that can be None (not T = None)
Use explicit type hints for function parameters and return values
Maximum line length: 88 characters (Black default)
Use f-strings instead of .format() for string formatting
Use pathlib.Path for file path operations instead of os.path
Use black for code formatting
Use isort for import sorting (black profile)
Use flake8 for linting
Use early returns to reduce nesting and extract common checks into variables for readability
Use snake_case for variable and function names
Use PascalCase for class names
Use snake_case for file names
Support Python versions 3.9, 3.10, 3.11, 3.12, and 3.13

**/*.py: Follow PEP 8 with Black formatting (line length: 88)
Use isort with Black profile for import sorting
Use type hints consistently
Use docstrings for all functions/classes
Use f-strings instead of .format() for string formatting
Use pathlib.Path for file path operations instead of os.path
Always use Optional[T] for parameters that can be None (not T = None)
Use explicit type hints for function parameters and return values
Use snake_case for files, functions, and variables
Use PascalCase for classes
Use appropriate exception types with context logging for error handling
Handle Jupyter/IPython specific exceptions properly
Use early returns to reduce nesting and extract common checks into variables for readability
Use dictionary unpacking for headers (e.g., headers = {"Content-Type": "application/json", **auth_headers})
Use space-separated format for CLI arguments (e.g., --port 8080)

Files:

deepnote_core/resources/scripts/resource_usage.py

🪛 Ruff (0.14.11)

deepnote_core/resources/scripts/resource_usage.py

97-97: Consider moving this statement to an else block

(TRY300)

171-171: Dynamically typed expressions (typing.Any) are disallowed in *args

(ANN401)

171-171: Dynamically typed expressions (typing.Any) are disallowed in **kwargs

(ANN401)

234-234: Dynamically typed expressions (typing.Any) are disallowed in *args

(ANN401)

244-244: Possible binding to all interfaces

(S104)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)

GitHub Check: Build and push artifacts for Python 3.11
GitHub Check: Build and push artifacts for Python 3.12
GitHub Check: Build and push artifacts for Python 3.9
GitHub Check: Build and push artifacts for Python 3.10
GitHub Check: Build and push artifacts for Python 3.13

🔇 Additional comments (5)

deepnote_core/resources/scripts/resource_usage.py (5)

1-19: LGTM!

Clean import structure with proper optional dependency handling for psutil.

150-165: LGTM!

Thread-safe delta calculation with proper local copies before releasing lock.

183-217: LGTM!

Metrics response structure is clean. MEM_LIMIT override and saturation calculation are correct.

244-244: Binding to all interfaces.

0.0.0.0 exposes the server on all network interfaces. Expected for container metrics but verify this is intentional.

248-252: LGTM!

Clean shutdown handling.

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

deepnote_core/resources/scripts/resource_usage.py

deepnote-bot · 2026-01-15T15:42:29Z

🚀 Review App Deployment Started

📝 Description	🌐 Link / Info
🌍 Review application	ra-58
🔑 Sign-in URL	Click to sign-in
📊 Application logs	View logs
🔄 Actions	Click to redeploy
🚀 ArgoCD deployment	View deployment
⏰ Last deployed	2026-01-16 10:44:42 (UTC)
📜 Deployed commit	`496d332526fc76670efc2b18ff725b390f2fb077`
🛠️ Toolkit version	`4794871`

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@deepnote_core/resources/scripts/resource_usage.py`:
- Around line 150-165: _calc_percent can return bogus negative percentages when
cgroup CPU counters reset and cpu_delta becomes negative; detect cpu_delta < 0
after computing it, treat the reset as a counter rollover/reset by updating
self._last_cpu_sec and self._last_time and returning 0.0 (or None if you prefer)
instead of computing a negative percentage. Ensure the check occurs while
holding the same lock context used to read/update _last_cpu_sec/_last_time so
state stays consistent, and document the behavior in a comment near
_calc_percent.

♻️ Duplicate comments (3)

deepnote_core/resources/scripts/resource_usage.py (3)
109-115: Unhandled ValueError in cgroup parsing.

int(parts[1]) at line 109 and int(content) at line 115 can raise if data is malformed. Wrap in try/except like _parse_int.

61-63: psutil returns host memory.

In containers, psutil.virtual_memory().total shows host memory, not the cgroup limit. Consider logging a warning.

238-244: Missing docstring; invalid port crashes.

Add docstring. Wrap int() in try/except for malformed RESOURCE_USAGE_METRICS_PORT.
Proposed fix
 def main() -> None:
-    port = int(os.environ.get("RESOURCE_USAGE_METRICS_PORT", 9104))
+    """Start the resource metrics HTTP server."""
+    try:
+        port = int(os.environ.get("RESOURCE_USAGE_METRICS_PORT", "9104"))
+    except ValueError:
+        port = 9104
+        print("Invalid RESOURCE_USAGE_METRICS_PORT, defaulting to 9104")
     monitor = ResourceMonitor()

📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 95b873a and 5ba4638.

📒 Files selected for processing (1)

deepnote_core/resources/scripts/resource_usage.py

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

📄 CodeRabbit inference engine (.cursorrules)

**/*.py: Write clean, readable Python code following PEP 8 style guide
Use type hints with Optional[T] for parameters that can be None (not T = None)
Use explicit type hints for function parameters and return values
Maximum line length: 88 characters (Black default)
Use f-strings instead of .format() for string formatting
Use pathlib.Path for file path operations instead of os.path
Use black for code formatting
Use isort for import sorting (black profile)
Use flake8 for linting
Use early returns to reduce nesting and extract common checks into variables for readability
Use snake_case for variable and function names
Use PascalCase for class names
Use snake_case for file names
Support Python versions 3.9, 3.10, 3.11, 3.12, and 3.13

**/*.py: Follow PEP 8 with Black formatting (line length: 88)
Use isort with Black profile for import sorting
Use type hints consistently
Use docstrings for all functions/classes
Use f-strings instead of .format() for string formatting
Use pathlib.Path for file path operations instead of os.path
Always use Optional[T] for parameters that can be None (not T = None)
Use explicit type hints for function parameters and return values
Use snake_case for files, functions, and variables
Use PascalCase for classes
Use appropriate exception types with context logging for error handling
Handle Jupyter/IPython specific exceptions properly
Use early returns to reduce nesting and extract common checks into variables for readability
Use dictionary unpacking for headers (e.g., headers = {"Content-Type": "application/json", **auth_headers})
Use space-separated format for CLI arguments (e.g., --port 8080)

Files:

deepnote_core/resources/scripts/resource_usage.py

🪛 Ruff (0.14.11)

deepnote_core/resources/scripts/resource_usage.py

97-97: Consider moving this statement to an else block

(TRY300)

171-171: Dynamically typed expressions (typing.Any) are disallowed in *args

(ANN401)

171-171: Dynamically typed expressions (typing.Any) are disallowed in **kwargs

(ANN401)

234-234: Dynamically typed expressions (typing.Any) are disallowed in *args

(ANN401)

244-244: Possible binding to all interfaces

(S104)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)

GitHub Check: Test - Python 3.11
GitHub Check: Test - Python 3.10
GitHub Check: Test - Python 3.12
GitHub Check: Test - Python 3.9
GitHub Check: Build and push artifacts for Python 3.9
GitHub Check: Build and push artifacts for Python 3.12
GitHub Check: Build and push artifacts for Python 3.13
GitHub Check: Build and push artifacts for Python 3.11
GitHub Check: Build and push artifacts for Python 3.10

🔇 Additional comments (3)

deepnote_core/resources/scripts/resource_usage.py (3)
1-19: Clean setup.

Proper optional dependency handling and imports follow guidelines.

244-244: Binding to all interfaces.

0.0.0.0 exposes the server to all networks. Expected for containers, but confirm this is intentional for your deployment model.

198-200: saturation_percent is a ratio, not a percent.

cpu_percent / cpu_limit yields a ratio (e.g., 50% / 2 cores = 25). Either multiply by 100 or rename to saturation_ratio.
Proposed fix (option A: fix calculation)
         cpu_sat = None
         if cpu_percent is not None and cpu_limit:
-            cpu_sat = round(cpu_percent / cpu_limit, 2)
+            cpu_sat = round((cpu_percent / 100) / cpu_limit * 100, 2)
Likely an incorrect or invalid review comment.

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

deepnote_core/resources/scripts/resource_usage.py

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@deepnote_core/resources/scripts/resource_usage.py`:
- Around line 50-55: In the block that computes memory usage (using _parse_int
to read "memory.current", _parse_memory_stat("inactive_file"), _read_file for
"memory.max" and _parse_limit), guard against negative used values by ensuring
used = current - inactive_file is floored at zero (i.e., if inactive_file >
current then set used to 0) before returning used and limit; keep all existing
reads/parses and only add the non-negative check for the used variable.

♻️ Duplicate comments (3)

deepnote_core/resources/scripts/resource_usage.py (3)

125-133: Unhandled ValueError in CPU parsing.

int(parts[1]) (line 126) and int(content) (line 132) can raise on malformed data.

Proposed fix

                     if len(parts) >= 2:
-                        return int(parts[1]) / 1_000_000.0
+                        try:
+                            return int(parts[1]) / 1_000_000.0
+                        except ValueError:
+                            return 0.0
             return 0.0
 
         if self.backend == "cgroupv1":
             content = self._read_file(self.root / "cpuacct/cpuacct.usage")
             if content:
-                return int(content) / 1_000_000_000.0
+                try:
+                    return int(content) / 1_000_000_000.0
+                except ValueError:
+                    return 0.0
             return 0.0

175-182: Negative CPU delta unguarded.

Counter resets yield negative cpu_delta, producing bogus percentages.

Proposed fix

         if time_delta <= 0:
             return 0.0
+        if cpu_delta < 0:
+            return 0.0
         return (cpu_delta / time_delta) * 100.0

255-261: Invalid port crashes startup.

int(os.environ.get(...)) raises ValueError on non-numeric input.

Proposed fix

 def main() -> None:
-    port = int(os.environ.get("RESOURCE_USAGE_METRICS_PORT", 9104))
+    """Start the resource metrics HTTP server."""
+    try:
+        port = int(os.environ.get("RESOURCE_USAGE_METRICS_PORT", "9104"))
+    except ValueError:
+        port = 9104
+        print("Invalid RESOURCE_USAGE_METRICS_PORT, using default 9104")
     monitor = ResourceMonitor()

📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5ba4638 and d0397d4.

📒 Files selected for processing (1)

deepnote_core/resources/scripts/resource_usage.py

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

📄 CodeRabbit inference engine (.cursorrules)

**/*.py: Write clean, readable Python code following PEP 8 style guide
Use type hints with Optional[T] for parameters that can be None (not T = None)
Use explicit type hints for function parameters and return values
Maximum line length: 88 characters (Black default)
Use f-strings instead of .format() for string formatting
Use pathlib.Path for file path operations instead of os.path
Use black for code formatting
Use isort for import sorting (black profile)
Use flake8 for linting
Use early returns to reduce nesting and extract common checks into variables for readability
Use snake_case for variable and function names
Use PascalCase for class names
Use snake_case for file names
Support Python versions 3.9, 3.10, 3.11, 3.12, and 3.13

**/*.py: Follow PEP 8 with Black formatting (line length: 88)
Use isort with Black profile for import sorting
Use type hints consistently
Use docstrings for all functions/classes
Use f-strings instead of .format() for string formatting
Use pathlib.Path for file path operations instead of os.path
Always use Optional[T] for parameters that can be None (not T = None)
Use explicit type hints for function parameters and return values
Use snake_case for files, functions, and variables
Use PascalCase for classes
Use appropriate exception types with context logging for error handling
Handle Jupyter/IPython specific exceptions properly
Use early returns to reduce nesting and extract common checks into variables for readability
Use dictionary unpacking for headers (e.g., headers = {"Content-Type": "application/json", **auth_headers})
Use space-separated format for CLI arguments (e.g., --port 8080)

Files:

deepnote_core/resources/scripts/resource_usage.py

🪛 Ruff (0.14.11)

deepnote_core/resources/scripts/resource_usage.py

99-99: Consider moving this statement to an else block

(TRY300)

188-188: Dynamically typed expressions (typing.Any) are disallowed in *args

(ANN401)

188-188: Dynamically typed expressions (typing.Any) are disallowed in **kwargs

(ANN401)

251-251: Dynamically typed expressions (typing.Any) are disallowed in *args

(ANN401)

261-261: Possible binding to all interfaces

(S104)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)

GitHub Check: Test - Python 3.10
GitHub Check: Build and push artifacts for Python 3.12
GitHub Check: Test - Python 3.11
GitHub Check: Test - Python 3.9
GitHub Check: Test - Python 3.13
GitHub Check: Build and push artifacts for Python 3.13
GitHub Check: Build and push artifacts for Python 3.10
GitHub Check: Build and push artifacts for Python 3.9
GitHub Check: Build and push artifacts for Python 3.11

🔇 Additional comments (5)

deepnote_core/resources/scripts/resource_usage.py (5)

1-18: Clean setup.

Imports and optional psutil handling look good.

21-39: Solid initialization.

Backend detection order and thread-safe setup are appropriate.

76-116: Parsing helpers well-guarded.

ValueError handled consistently across all parse methods.

141-165: CPU limit handling looks correct.

Exception handling covers ValueError and ZeroDivisionError.

185-252: Handler implementation clean.

Routing, headers, and env override handling are appropriate. Any typing acceptable for HTTP handler inheritance.

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

deepnote_core/resources/scripts/resource_usage.py

m1so

looks good for first iteration 👍

consider splitting the "backends" to their own classes with a common base class to make the implementation easier to follow

also wondering whether we want to run a background thread every X seconds that would calculate the cpu usage and the http handler would simply return the current value instead of relying on http handled being the "ticker"

deepnote_core/resources/scripts/resource_usage.py

m1so · 2026-01-15T17:21:42Z

deepnote_core/resources/scripts/resource_usage.py

+        for line in content.splitlines():
+            if line.startswith(key):
+                parts = line.split()
+                if len(parts) >= 2:
+                    try:
+                        return int(parts[1])
+                    except ValueError:
+                        pass
+        return 0


nitpick: early returns could make the code more readable and reduce the nesting

generally applies to other methods as well

deepnote_core/resources/scripts/resource_usage.py

…etter monitoring

…turns in resource usage methods

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@deepnote_core/resources/scripts/resource_usage.py`:
- Around line 32-118: Add concise docstrings to the helper/handler methods that
currently lack them: _detect_backend, _read_file, _parse_int, _parse_limit,
_parse_memory_stat, and any handler methods like do_GET and the _send_* helper
functions referenced in the review; each docstring should state what the
function does, its key parameters and return value (including types and
semantics such as when None is returned), and any important behavior (e.g.,
error handling or thresholds) following the existing project docstring style so
callers understand purpose and edge cases.
- Around line 269-270: The server is currently bound to all interfaces via
ThreadingHTTPServer(("0.0.0.0", port), handler) which may expose the metrics
endpoint publicly; make the bind address configurable and default it to loopback
(e.g., use a configurable variable like bind_address or metrics_host), update
the ThreadingHTTPServer call to use (bind_address, port), and ensure any
configuration source (env var or settings) is read and validated before creating
MetricsHandler/ThreadingHTTPServer so callers can choose 127.0.0.1 for
local-only or 0.0.0.0 when intentional.

♻️ Duplicate comments (5)

deepnote_core/resources/scripts/resource_usage.py (5)
49-55: Clamp negative cgroupv2 memory usage.
inactive_file can exceed memory.current, yielding negative used and bogus percentages.
Suggested fix
-            used = current - inactive_file
+            used = max(0, current - inactive_file)
63-65: Psutil limit can misreport in containers.
psutil.virtual_memory().total reflects host memory, not cgroup limits, so metrics can be misleading. Consider logging a warning or setting limit to None when using psutil.
Does psutil.virtual_memory().total reflect cgroup/container limits or host memory?
120-137: Guard cgroup CPU parsing against invalid data.
int(...) on malformed cgroup files will raise and take down the server.
Suggested fix
                 if len(parts) >= 2:
-                    return int(parts[1]) / 1_000_000.0
+                    try:
+                        return int(parts[1]) / 1_000_000.0
+                    except ValueError:
+                        return 0.0
@@
-            return int(content) / 1_000_000_000.0
+            try:
+                return int(content) / 1_000_000_000.0
+            except ValueError:
+                return 0.0
171-186: Avoid negative CPU percentages on counter resets.
If counters reset, cpu_delta can go negative and produce bogus values.
Suggested fix
         if time_delta <= 0:
             return 0.0
+        if cpu_delta < 0:
+            return 0.0
         return (cpu_delta / time_delta) * 100.0
259-266: Handle invalid port env var and add a docstring.
RESOURCE_USAGE_METRICS_PORT with a non-int value will crash the server.
Suggested fix
 def main() -> None:
+    """Start the resource metrics HTTP server."""
@@
-    port = int(os.environ.get("RESOURCE_USAGE_METRICS_PORT", 9104))
+    try:
+        port = int(os.environ.get("RESOURCE_USAGE_METRICS_PORT", "9104"))
+    except ValueError:
+        port = 9104
+        logging.warning("Invalid RESOURCE_USAGE_METRICS_PORT, using default 9104")
As per coding guidelines, add docstrings.

📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d0397d4 and 9e9d988.

📒 Files selected for processing (1)

deepnote_core/resources/scripts/resource_usage.py

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

📄 CodeRabbit inference engine (.cursorrules)

**/*.py: Write clean, readable Python code following PEP 8 style guide
Use type hints with Optional[T] for parameters that can be None (not T = None)
Use explicit type hints for function parameters and return values
Maximum line length: 88 characters (Black default)
Use f-strings instead of .format() for string formatting
Use pathlib.Path for file path operations instead of os.path
Use black for code formatting
Use isort for import sorting (black profile)
Use flake8 for linting
Use early returns to reduce nesting and extract common checks into variables for readability
Use snake_case for variable and function names
Use PascalCase for class names
Use snake_case for file names
Support Python versions 3.9, 3.10, 3.11, 3.12, and 3.13

**/*.py: Follow PEP 8 with Black formatting (line length: 88)
Use isort with Black profile for import sorting
Use type hints consistently
Use docstrings for all functions/classes
Use f-strings instead of .format() for string formatting
Use pathlib.Path for file path operations instead of os.path
Always use Optional[T] for parameters that can be None (not T = None)
Use explicit type hints for function parameters and return values
Use snake_case for files, functions, and variables
Use PascalCase for classes
Use appropriate exception types with context logging for error handling
Handle Jupyter/IPython specific exceptions properly
Use early returns to reduce nesting and extract common checks into variables for readability
Use dictionary unpacking for headers (e.g., headers = {"Content-Type": "application/json", **auth_headers})
Use space-separated format for CLI arguments (e.g., --port 8080)

Files:

deepnote_core/resources/scripts/resource_usage.py

🪛 Ruff (0.14.11)

deepnote_core/resources/scripts/resource_usage.py

99-99: Consider moving this statement to an else block

(TRY300)

192-192: Dynamically typed expressions (typing.Any) are disallowed in *args

(ANN401)

192-192: Dynamically typed expressions (typing.Any) are disallowed in **kwargs

(ANN401)

255-255: Dynamically typed expressions (typing.Any) are disallowed in *args

(ANN401)

256-256: info() call on root logger

(LOG015)

256-256: Logging statement uses f-string

(G004)

270-270: Possible binding to all interfaces

(S104)

272-272: info() call on root logger

(LOG015)

272-272: Logging statement uses f-string

(G004)

277-277: info() call on root logger

(LOG015)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: Test - Python 3.9
GitHub Check: Test - Python 3.11
GitHub Check: Test - Python 3.10
GitHub Check: Build and push artifacts for Python 3.9
GitHub Check: Build and push artifacts for Python 3.12
GitHub Check: Build and push artifacts for Python 3.13
GitHub Check: Build and push artifacts for Python 3.11
GitHub Check: Build and push artifacts for Python 3.10

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

deepnote_core/resources/scripts/resource_usage.py

mfranczel · 2026-01-16T10:47:08Z

@m1so Splitting into multiple classes feels like overkill here - the current structure is already simple and easy to follow. Each backend branch is only 3-5 lines, and we're unlikely to add new backends, so it's stable as-is. This is a standalone script, not a library meant for extension.

As for the background thread, we can handle that in the executor if needed. Keeping the implementation in toolkit simple was my goal.

mfranczel added 2 commits January 15, 2026 16:20

feat(resource-usage): Add lightweight resource metrics server

fb25768

fix(resource-usage): Handle None values for CPU count and usage percent

67859fa

mfranczel self-assigned this Jan 15, 2026

fix: formatting issues in resource_usage

95b873a

coderabbitai bot requested changes Jan 15, 2026

View reviewed changes

deepnote_core/resources/scripts/resource_usage.py Show resolved Hide resolved

installer/__main__.py Show resolved Hide resolved

coderabbitai bot requested changes Jan 15, 2026

View reviewed changes

deepnote_core/resources/scripts/resource_usage.py Outdated Show resolved Hide resolved

deepnote_core/resources/scripts/resource_usage.py Outdated Show resolved Hide resolved

deepnote_core/resources/scripts/resource_usage.py Show resolved Hide resolved

fix: dont shadow builtin format

5ba4638

coderabbitai bot requested changes Jan 15, 2026

View reviewed changes

deepnote_core/resources/scripts/resource_usage.py Show resolved Hide resolved

coderabbitai bot previously approved these changes Jan 15, 2026

View reviewed changes

fix: subtract inactive fie from current memory

d0397d4

mfranczel dismissed coderabbitai[bot]’s stale review via d0397d4 January 15, 2026 16:26

coderabbitai bot requested changes Jan 15, 2026

View reviewed changes

deepnote_core/resources/scripts/resource_usage.py Show resolved Hide resolved

coderabbitai bot previously approved these changes Jan 15, 2026

View reviewed changes

mfranczel requested a review from m1so January 15, 2026 16:42

mfranczel marked this pull request as ready for review January 15, 2026 16:42

mfranczel requested a review from a team as a code owner January 15, 2026 16:42

m1so previously approved these changes Jan 15, 2026

View reviewed changes

mfranczel added 2 commits January 16, 2026 11:27

refactor(resource-usage): Replace print statements with logging for b…

0a619ff

…etter monitoring

refactor(resource-usage): Simplify parsing logic and improve early re…

9e9d988

…turns in resource usage methods

mfranczel dismissed stale reviews from m1so and coderabbitai[bot] via 9e9d988 January 16, 2026 10:37

coderabbitai bot requested changes Jan 16, 2026

View reviewed changes

deepnote_core/resources/scripts/resource_usage.py Show resolved Hide resolved

deepnote_core/resources/scripts/resource_usage.py Show resolved Hide resolved

coderabbitai bot approved these changes Jan 16, 2026

View reviewed changes

mfranczel requested a review from m1so January 16, 2026 10:47

m1so approved these changes Jan 16, 2026

View reviewed changes

mfranczel merged commit 4d76844 into main Jan 16, 2026
35 checks passed

mfranczel deleted the michal/blu-5457-serve-resource-usage-metrics-from-deepnote-toolkit branch January 16, 2026 11:55

feat: Serve resource usage metrics #58

feat: Serve resource usage metrics #58

Uh oh!

Conversation

mfranczel commented Jan 15, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

linear bot commented Jan 15, 2026

Uh oh!

coderabbitai bot commented Jan 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Sequence Diagram(s)

Uh oh!

github-actions bot commented Jan 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

codecov bot commented Jan 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

deepnote-bot commented Jan 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

m1so left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

m1so Jan 15, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

mfranczel commented Jan 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

mfranczel commented Jan 15, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 15, 2026 •

edited

Loading

github-actions bot commented Jan 15, 2026 •

edited

Loading

codecov bot commented Jan 15, 2026 •

edited

Loading

deepnote-bot commented Jan 15, 2026 •

edited

Loading