Merged
20 commits
14b2fc1
chore: include specs otherwise claude cannot read them
fbraza Jul 18, 2025
5018c8e
build: github action for claude code
fbraza Jul 18, 2025
59200cb
feature: implement the score2 algorithm
fbraza Jul 18, 2025
793a853
chore: move the specs file for the agent
fbraza Jul 19, 2025
cb352bf
fix: standardize import statements for package consistency
fbraza Jul 19, 2025
1a25f48
fix: rename test_phenoage to test_score2 in test_score2.py
fbraza Jul 19, 2025
0aae159
Merge pull request #11 from fbraza/fix/issue-3-import-consistency
fbraza Jul 19, 2025
36e889d
Merge pull request #12 from fbraza/fix/issue-4-test-function-name
fbraza Jul 19, 2025
ffe31e0
test: expand SCORE2 test coverage with diverse patient profiles
fbraza Jul 19, 2025
d5aad27
refactor: rename test directories and files for clarity
fbraza Jul 19, 2025
aefafec
fix: standardize unit suffix casing to lowercase
fbraza Jul 19, 2025
eac24b4
refactor: reorganize fodlers and test files
fbraza Jul 19, 2025
e55be22
Merge pull request #13 from fbraza/test/issue-7-score2-coverage
fbraza Jul 19, 2025
e77d6bc
refactor: add comprehensive hints issue #8
fbraza Jul 19, 2025
141b1c6
Merge pull request #14 from fbraza/fix/issue-8-comprehensive-type-hints
fbraza Jul 19, 2025
2f44ba3
Fix type hint issues with TypedDict approach
fbraza Jul 19, 2025
415ff8a
refactor: remove total=False for the BiomarkerData as cannot be None
fbraza Jul 19, 2025
4b0fdda
fix: type hint. We are overengineering it
fbraza Jul 19, 2025
7b888e1
fix: mypy error
fbraza Jul 19, 2025
d8a9eb4
Merge pull request #15 from fbraza/fix/issue-8-comprehensive-type-hints
fbraza Jul 19, 2025
19 changes: 9 additions & 10 deletions .github/workflows/claude-code-review.yml
@@ -17,14 +17,14 @@ jobs:
# github.event.pull_request.user.login == 'external-contributor' ||
# github.event.pull_request.user.login == 'new-developer' ||
# github.event.pull_request.author_association == 'FIRST_TIME_CONTRIBUTOR'

runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: read
issues: read
id-token: write

steps:
- name: Checkout repository
uses: actions/checkout@v4
@@ -39,7 +39,7 @@ jobs:

# Optional: Specify model (defaults to Claude Sonnet 4, uncomment for Claude Opus 4)
# model: "claude-opus-4-20250514"

# Direct prompt for automated review (no @claude mention needed)
direct_prompt: |
Please review this pull request and provide feedback on:
@@ -48,31 +48,30 @@ jobs:
- Performance considerations
- Security concerns
- Test coverage

Be constructive and helpful in your feedback.

# Optional: Use sticky comments to make Claude reuse the same comment on subsequent pushes to the same PR
# use_sticky_comment: true

# Optional: Customize review based on file types
# direct_prompt: |
# Review this PR focusing on:
# - For TypeScript files: Type safety and proper interface usage
# - For API endpoints: Security, input validation, and error handling
# - For React components: Performance, accessibility, and best practices
# - For tests: Coverage, edge cases, and test quality

# Optional: Different prompts for different authors
# direct_prompt: |
# ${{ github.event.pull_request.author_association == 'FIRST_TIME_CONTRIBUTOR' &&
# 'Welcome! Please review this PR from a first-time contributor. Be encouraging and provide detailed explanations for any suggestions.' ||
# 'Please provide a thorough code review focusing on our coding standards and best practices.' }}

# Optional: Add specific tools for running tests or linting
# allowed_tools: "Bash(npm run test),Bash(npm run lint),Bash(npm run typecheck)"

# Optional: Skip review for certain conditions
# if: |
# !contains(github.event.pull_request.title, '[skip-review]') &&
# !contains(github.event.pull_request.title, '[WIP]')

13 changes: 6 additions & 7 deletions .github/workflows/claude.yml
@@ -39,26 +39,25 @@ jobs:
# This is an optional setting that allows Claude to read CI results on PRs
additional_permissions: |
actions: read

# Optional: Specify model (defaults to Claude Sonnet 4, uncomment for Claude Opus 4)
# model: "claude-opus-4-20250514"

# Optional: Customize the trigger phrase (default: @claude)
# trigger_phrase: "/claude"

# Optional: Trigger when specific user is assigned to an issue
# assignee_trigger: "claude-bot"

# Optional: Allow Claude to run specific commands
# allowed_tools: "Bash(npm install),Bash(npm run build),Bash(npm run test:*),Bash(npm run lint:*)"

# Optional: Add custom instructions for Claude to customize its behavior for your project
# custom_instructions: |
# Follow our coding standards
# Ensure all new code has tests
# Use TypeScript for new files

# Optional: Custom environment variables for Claude
# claude_env: |
# NODE_ENV: test

1 change: 0 additions & 1 deletion .gitignore
@@ -211,4 +211,3 @@ __marimo__/
CLAUDE.md
AGENTS.md
.aider*
specs
112 changes: 112 additions & 0 deletions specs/coding_style.md
@@ -0,0 +1,112 @@
# Python Coding Style Specification

## Core Principles

### 1. Favor Simplicity Over Complexity
- **Always choose the simple, straightforward solution** over complex or "sophisticated" alternatives
- **Avoid over-engineering** - resist the urge to build elaborate abstractions unless clearly needed
- **No premature optimization** - especially avoid blind optimization without measurement
- **Use simple building blocks** that can be composed elegantly rather than complex features
- **Principle**: If there are two ways to solve a problem, choose the one that is easier to understand

### 2. Clarity is Key
- **Readable code beats clever code** - optimize for the reader, not the writer
- **Use clear, descriptive names** for variables, functions, and classes
- **Format code for maximal scanning ease** - use whitespace and structure intentionally
- **Document intent and organization** with comments and docstrings where helpful
- **Reduce cognitive load** - code should express intent clearly at a glance
- **Principle**: The easier your code is to understand immediately, the better it is

### 3. Write Pythonic Code
- **Follow Python community standards and idioms** for naming, formatting, and programming paradigms
- **Cooperate with the language** rather than fighting it
- **Leverage Python features** like generators, itertools, collections, and functional programming
- **Write code that looks like Python wrote it** - use established patterns and conventions
- **Examples of Pythonic patterns**:
  - List comprehensions over explicit loops when appropriate
  - Context managers (`with` statements) for resource management
  - Generator expressions for memory efficiency
  - `enumerate()` instead of manual indexing
  - `zip()` for parallel iteration
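
The patterns listed above can be illustrated in a few lines. This is only a sketch: the variable names and the sample biomarker values are made up for the example and do not come from the repository.

```python
import tempfile
from pathlib import Path

names = ["age", "sbp", "hdl"]
values = [55, 130.0, 1.4]

# zip() for parallel iteration over two sequences
record = dict(zip(names, values))

# enumerate() instead of manual indexing, inside a list comprehension
labelled = [f"{index}: {name}" for index, name in enumerate(names, start=1)]

# Generator expression: items are consumed lazily, no intermediate list is built
total = sum(value for value in values if isinstance(value, float))

# Context managers: the directory and the file are cleaned up even on error
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "record.txt"
    with path.open("w", encoding="utf-8") as handle:
        handle.write("\n".join(labelled))
    written = path.read_text(encoding="utf-8")
```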

### 4. Don't Repeat Yourself (DRY)
- **Avoid code duplication** to make code more maintainable and extendable
- **Use functions and modules** to encapsulate common logic in single authoritative locations
- **Consider inheritance** to avoid duplicate code between related classes
- **Leverage language features** like default arguments, variable argument lists (`*args`, `**kwargs`), and parameter unpacking
- **Eliminate duplication through abstraction** - but don't abstract too early

### 5. Focus on Readability First
- **PEP8 is a guide, not a law** - readability trumps mechanical adherence to style rules
- **Make code as easy to understand as possible** - this is the ultimate goal
- **Deliberately violate guidelines** if it makes specific code more readable
- **Consider the human reader** first when making formatting and style decisions
- **Principle**: Rules serve readability, not the other way around

### 6. Embrace Conventions
- **Follow established conventions** to eliminate trivial decision-making
- **Use PEP8 as a baseline** but prioritize readability when there's conflict
- **Establish consistent patterns** in your codebase for common tasks:
  - Variable naming patterns
  - Exception handling approaches
  - Logging configuration
  - Import organization
- **Consistency enables focus** - familiar patterns let readers focus on logic rather than parsing

## Specific Implementation Guidelines

### Naming Conventions
- **Variables and functions**: `snake_case`
- **Classes**: `PascalCase`
- **Constants**: `UPPER_SNAKE_CASE`
- **Private attributes**: `_single_leading_underscore`
- **Choose descriptive names** that clearly indicate purpose and content
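
As a quick illustration of these conventions, here is a small sketch. `RiskCalculator` and the other names are hypothetical and chosen only to show each casing rule in one place.

```python
BASELINE_SURVIVAL_MALE = 0.9605  # constant: UPPER_SNAKE_CASE


class RiskCalculator:  # class: PascalCase
    def __init__(self, baseline_survival: float) -> None:
        # private attribute: single leading underscore
        self._baseline_survival = baseline_survival

    def baseline_risk(self) -> float:  # method: snake_case
        return 1.0 - self._baseline_survival


calculator = RiskCalculator(BASELINE_SURVIVAL_MALE)
reference_risk = calculator.baseline_risk()  # variable: snake_case
```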

### Code Structure
- **Organize imports** in this order: standard library, third-party, local imports
- **Use blank lines** to separate logical sections
- **Keep functions focused** on a single responsibility
- **Prefer composition over inheritance** when appropriate
- **Write functions that do one thing well**

### Documentation
- **Write docstrings** for modules, classes, and functions that aren't immediately obvious
- **Use comments** to explain why, not what
- **Keep comments up to date** with code changes
- **Focus on intent** rather than implementation details

### Error Handling
- **Use specific exception types** rather than generic `Exception`
- **Follow the "easier to ask for forgiveness than permission" (EAFP) principle**
- **Handle errors at the appropriate level** - don't catch exceptions you can't handle meaningfully
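
A minimal sketch of the EAFP style with specific exception types follows. The helper `parse_biomarker` is hypothetical and not part of the repository; it attempts the lookup and conversion directly and translates the two failures it can explain into a clear error.

```python
def parse_biomarker(raw: dict[str, str], name: str) -> float:
    """Return the biomarker `name` from `raw` as a float."""
    try:
        # EAFP: attempt the operation instead of checking `name in raw` first
        return float(raw[name])
    except KeyError:
        # Specific exception: the key is absent
        raise ValueError(f"missing biomarker: {name}") from None
    except ValueError:
        # Specific exception: the value is present but not numeric
        raise ValueError(f"biomarker {name!r} is not numeric: {raw[name]!r}") from None
```

Any other exception is deliberately left uncaught, since this function could not handle it meaningfully.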

### Performance and Optimization
- **Write clear code first** - optimize only when necessary and after measurement
- **Use appropriate data structures** for the task
- **Leverage built-in functions** and library functions when they're clearer
- **Profile before optimizing** - don't guess where bottlenecks are

## Code Review Checklist

When generating or reviewing Python code, ensure:
- [ ] The simplest solution that works is chosen
- [ ] Names clearly communicate purpose
- [ ] Code is easily scannable and readable
- [ ] Pythonic patterns are used appropriately
- [ ] No unnecessary duplication exists
- [ ] Conventions are followed consistently
- [ ] Comments explain intent where needed
- [ ] Error handling is appropriate
- [ ] The code would be easy for another developer to understand and maintain

## Decision Framework

When faced with coding choices, ask:
1. **Is this the simplest solution that works?**
2. **Will this be clear to someone reading it in 6 months?**
3. **Am I using Python idioms appropriately?**
4. **Have I avoided duplicating logic that could be abstracted?**
5. **Does this follow our established conventions?**
6. **Is this optimized for readability?**

The answer to all these questions should be "yes" for beautiful Python code.
135 changes: 135 additions & 0 deletions specs/score2.md
@@ -0,0 +1,135 @@
# CVD Risk Prediction Formula

## Overview

This formula calculates the 10-year risk of cardiovascular disease (CVD) using a sex-specific Cox proportional hazards model. The model incorporates multiple risk factors with specific transformations and interaction terms to provide personalized risk estimates.

**Target Population**: European patients aged 40-69 years without prior CVD or diabetes.

## Model Coefficients and Baseline Survival

The model coefficients and baseline survival to calculate 10-year risk of CVD are as follows:

| Risk Factor | Transformation | Male | Female |
|-------------|----------------|------|--------|
| Age, years | cage = (age - 60)/5 | 0.3742 | 0.4648 |
| Smoking | current = 1, other = 0 | 0.6012 | 0.7744 |
| SBP, mm Hg | csbp = (sbp - 120)/20 | 0.2777 | 0.3131 |
| Total cholesterol, mmol/L | ctchol = tchol - 6 | 0.1458 | 0.1002 |
| HDL cholesterol, mmol/L | chdl = (hdl - 1.3)/0.5 | -0.2698 | -0.2606 |
| Smoking*age interaction | smoking*cage | -0.0755 | -0.1088 |
| SBP*age interaction | csbp*cage | -0.0255 | -0.0277 |
| Total cholesterol*age interaction | ctchol*cage | -0.0281 | -0.0226 |
| HDL cholesterol*age interaction | chdl*cage | 0.0426 | 0.0613 |
| **Baseline survival** | | **0.9605** | **0.9776** |
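
One possible way to hold the table above in code is an immutable dataclass per sex. This is a sketch under assumptions: the class name `Score2Coefficients`, the field names, and the `"male"`/`"female"` keys are illustrative and may differ from the implementation in this PR. Only the numeric values come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score2Coefficients:
    """Sex-specific coefficients, one field per row of the coefficient table."""

    age: float
    smoking: float
    sbp: float
    total_cholesterol: float
    hdl_cholesterol: float
    smoking_age: float
    sbp_age: float
    total_cholesterol_age: float
    hdl_cholesterol_age: float
    baseline_survival: float


COEFFICIENTS = {
    "male": Score2Coefficients(
        age=0.3742, smoking=0.6012, sbp=0.2777,
        total_cholesterol=0.1458, hdl_cholesterol=-0.2698,
        smoking_age=-0.0755, sbp_age=-0.0255,
        total_cholesterol_age=-0.0281, hdl_cholesterol_age=0.0426,
        baseline_survival=0.9605,
    ),
    "female": Score2Coefficients(
        age=0.4648, smoking=0.7744, sbp=0.3131,
        total_cholesterol=0.1002, hdl_cholesterol=-0.2606,
        smoking_age=-0.1088, sbp_age=-0.0277,
        total_cholesterol_age=-0.0226, hdl_cholesterol_age=0.0613,
        baseline_survival=0.9776,
    ),
}
```

Freezing the dataclass keeps the published constants from being modified by accident at runtime.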

## Risk Calculation Formula

The uncalibrated 10-year risk of CVD is calculated by the following:

**10-year risk = 1 - (baseline survival)^exp(x)**

where **x = Σ[β*(transformed variables)]**
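
The formula translates almost directly into Python. The sketch below assumes the linear predictor `x` has already been computed; the function name is hypothetical.

```python
import math


def uncalibrated_risk(linear_predictor: float, baseline_survival: float) -> float:
    """Return the uncalibrated 10-year risk as a fraction between 0 and 1."""
    # 10-year risk = 1 - (baseline survival)^exp(x)
    return 1.0 - baseline_survival ** math.exp(linear_predictor)
```

For a patient whose transformed variables are all zero, `x = 0` and the risk reduces to `1 - baseline survival`.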

## Regional Calibration

The region and sex-specific scales to calculate calibrated 10-year risk are as follows:

| Risk Region | Male Scale 1 | Male Scale 2 | Female Scale 1 | Female Scale 2 |
|-------------|--------------|--------------|----------------|----------------|
| Low | -0.5699 | 0.7476 | -0.7380 | 0.7019 |
| Moderate | -0.1565 | 0.8009 | -0.3143 | 0.7701 |
| High | 0.3207 | 0.9360 | 0.5710 | 0.9369 |
| Very high | 0.5836 | 0.8294 | 0.9412 | 0.8329 |

### Calibrated Risk Calculation Formula

The calibrated 10-year risk of CVD is calculated by the following:

**Calibrated 10-year risk, % = [1 - exp(-exp(scale1 + scale2*ln(-ln(1 - 10-year risk))))] * 100**
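
A sketch of the calibration step, using the scales from the table above. The dictionary layout and the function name are assumptions made for illustration; the input `risk` is the uncalibrated risk as a fraction strictly between 0 and 1.

```python
import math

# (scale1, scale2) keyed by (region, sex), copied from the calibration table
SCALES = {
    ("low", "male"): (-0.5699, 0.7476),
    ("low", "female"): (-0.7380, 0.7019),
    ("moderate", "male"): (-0.1565, 0.8009),
    ("moderate", "female"): (-0.3143, 0.7701),
    ("high", "male"): (0.3207, 0.9360),
    ("high", "female"): (0.5710, 0.9369),
    ("very high", "male"): (0.5836, 0.8294),
    ("very high", "female"): (0.9412, 0.8329),
}


def calibrated_risk_percent(risk: float, scale1: float, scale2: float) -> float:
    """Apply the complementary log-log recalibration and return a percentage."""
    log_log = math.log(-math.log(1.0 - risk))
    return (1.0 - math.exp(-math.exp(scale1 + scale2 * log_log))) * 100.0
```

With `scale1 = 0` and `scale2 = 1` the transformation is the identity, which is a convenient sanity check for an implementation.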

### Regional Risk Classification

- **Belgium**: Classified as a **Low Risk** region
- For initial development, use the Low Risk calibration scales:
  - Males: Scale 1 = -0.5699, Scale 2 = 0.7476
  - Females: Scale 1 = -0.7380, Scale 2 = 0.7019

## Model Components Explained

### Risk Factor Transformations

1. **Age (cage)**: Centered at 60 years and scaled by 5-year intervals
   - `cage = (age - 60)/5`

2. **Smoking**: Binary indicator
   - `current = 1, other = 0`

3. **Systolic Blood Pressure (csbp)**: Centered at 120 mmHg and scaled by 20 mmHg intervals
   - `csbp = (sbp - 120)/20`

4. **Total Cholesterol (ctchol)**: Centered at 6 mmol/L
   - `ctchol = tchol - 6`

5. **HDL Cholesterol (chdl)**: Centered at 1.3 mmol/L and scaled by 0.5 mmol/L intervals
   - `chdl = (hdl - 1.3)/0.5`
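
The five transformations can be collected in one small function. This is a sketch: the function name, the parameter names, and the returned dictionary are illustrative choices, while the formulas are the ones listed above.

```python
def transform_inputs(
    age: float, sbp: float, tchol: float, hdl: float, is_smoker: bool
) -> dict[str, float]:
    """Center and scale the raw inputs as described in the spec."""
    return {
        "cage": (age - 60) / 5,              # years, per 5-year step
        "smoking": 1.0 if is_smoker else 0.0,  # current = 1, other = 0
        "csbp": (sbp - 120) / 20,            # mmHg, per 20 mmHg step
        "ctchol": tchol - 6,                 # mmol/L
        "chdl": (hdl - 1.3) / 0.5,           # mmol/L, per 0.5 mmol/L step
    }
```

A 60-year-old non-smoker with SBP 120 mmHg, total cholesterol 6 mmol/L, and HDL 1.3 mmol/L maps to zero on every transformed variable, which makes this profile a useful reference case.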

### Interaction Terms

The model includes four age interaction terms that capture how the effect of risk factors changes with age:

1. **Smoking × Age**: `smoking × cage`
2. **SBP × Age**: `csbp × cage`
3. **Total Cholesterol × Age**: `ctchol × cage`
4. **HDL Cholesterol × Age**: `chdl × cage`
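
One way to read an interaction term: because the main effect and its interaction appear in the linear predictor as `β·v + β_int·v·cage = v·(β + β_int·cage)`, the interaction shifts the effective coefficient of the risk factor `v` as age changes. The helper below is a hypothetical illustration of that identity, not part of the spec.

```python
def effective_coefficient(main_effect: float, age_interaction: float, age: float) -> float:
    """Return the coefficient that a risk factor effectively has at a given age."""
    cage = (age - 60) / 5
    return main_effect + age_interaction * cage


# Male smoking coefficients from the table: main effect 0.6012, interaction -0.0755
smoking_at_40 = effective_coefficient(0.6012, -0.0755, age=40)
smoking_at_60 = effective_coefficient(0.6012, -0.0755, age=60)
smoking_at_65 = effective_coefficient(0.6012, -0.0755, age=65)
```

The negative interaction means the relative effect of smoking is largest in younger patients and shrinks with age.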

### Sex-Specific Differences

- **Females** have higher baseline survival (0.9776 vs 0.9605)
- **Smoking** has a stronger effect in females (0.7744 vs 0.6012)
- **Age** has a stronger effect in females (0.4648 vs 0.3742)
- **SBP** has a slightly stronger effect in females (0.3131 vs 0.2777)
- **HDL cholesterol** has a similar protective effect in both sexes (-0.2606 vs -0.2698)

## Implementation Workflow

1. **Calculate uncalibrated risk** using the base formula with model coefficients
2. **Apply regional calibration** using the appropriate scales for the patient's location and sex
3. **Output calibrated percentage** as the final 10-year CVD risk estimate
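
The three steps can be sketched end to end as follows. This is an illustrative outline rather than the implementation merged in this PR: the function name, argument order, and data layout are assumptions, and only the Low Risk scales are included because Belgium is the default region in this spec.

```python
import math

# Order: age, smoking, sbp, tchol, hdl, then the four age interactions
BETAS = {
    "male": (0.3742, 0.6012, 0.2777, 0.1458, -0.2698,
             -0.0755, -0.0255, -0.0281, 0.0426),
    "female": (0.4648, 0.7744, 0.3131, 0.1002, -0.2606,
               -0.1088, -0.0277, -0.0226, 0.0613),
}
BASELINE_SURVIVAL = {"male": 0.9605, "female": 0.9776}
LOW_RISK_SCALES = {"male": (-0.5699, 0.7476), "female": (-0.7380, 0.7019)}


def score2_low_risk_percent(
    sex: str, age: float, sbp: float, tchol: float, hdl: float, is_smoker: bool
) -> float:
    """Return the calibrated 10-year CVD risk (%) for a Low Risk region."""
    cage = (age - 60) / 5
    smoking = 1.0 if is_smoker else 0.0
    csbp = (sbp - 120) / 20
    ctchol = tchol - 6
    chdl = (hdl - 1.3) / 0.5
    terms = (cage, smoking, csbp, ctchol, chdl,
             smoking * cage, csbp * cage, ctchol * cage, chdl * cage)

    # Step 1: uncalibrated risk from the linear predictor
    x = sum(beta * term for beta, term in zip(BETAS[sex], terms))
    risk = 1.0 - BASELINE_SURVIVAL[sex] ** math.exp(x)

    # Step 2: regional calibration with the sex-specific scales
    scale1, scale2 = LOW_RISK_SCALES[sex]
    log_log = math.log(-math.log(1.0 - risk))
    calibrated = 1.0 - math.exp(-math.exp(scale1 + scale2 * log_log))

    # Step 3: report as a percentage
    return calibrated * 100.0


reference_male = score2_low_risk_percent("male", 60, 120, 6.0, 1.3, False)
```

For the reference male profile every transformed variable is zero, so the result depends only on the baseline survival and the two scales and comes out at roughly 5%.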

## Implementation Notes

1. **Input Units**:
   - Age: years
   - SBP: mmHg
   - Total cholesterol: mmol/L
   - HDL cholesterol: mmol/L
   - Smoking: binary (1 = current smoker, 0 = other)

2. **Output**: 10-year CVD risk as a percentage (0-100%)

3. **Model Type**: Cox proportional hazards model with sex-specific coefficients and regional calibration

4. **Default Region**: Belgium (Low Risk region) for initial application development

## Risk Stratification

### Age-Specific Risk Categories

#### Patients <50 years old

- Low to moderate risk: <2.5%
- High risk: 2.5% to <7.5%
- Very high risk: ≥7.5%

#### Patients 50-69 years old

- Low to moderate risk: <5%
- High risk: 5% to <10%
- Very high risk: ≥10%
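
The age-specific thresholds can be expressed as a small lookup. The function name and the returned labels are hypothetical; rejecting ages outside 40-69 is an assumption based on the target population stated in the overview, since the spec does not say how such inputs should be handled.

```python
def risk_category(age: float, risk_percent: float) -> str:
    """Map a calibrated 10-year risk (%) to the age-specific category."""
    if not 40 <= age < 70:
        # Assumption: the spec targets patients aged 40-69 only
        raise ValueError("age must be in the 40-69 range covered by the spec")

    # (high, very high) lower bounds in percent
    high, very_high = (2.5, 7.5) if age < 50 else (5.0, 10.0)

    if risk_percent >= very_high:
        return "very high"
    if risk_percent >= high:
        return "high"
    return "low to moderate"
```

The same 5% risk is classified as "high" at any age in the 40-69 range, while a 3% risk is "high" below 50 and "low to moderate" from 50 onward.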

### Treatment Recommendations

- **Low to moderate risk**: Risk factor treatment plan generally not recommended.
- **High risk**: Risk factor treatment plan should be considered (i.e., blood pressure and LDL-C control).
- **Very high risk**: Risk factor treatment plan should be recommended (i.e., blood pressure and LDL-C control).