@s373nZ
Contributor

@s373nZ s373nZ commented Apr 1, 2025

This PR expands upon #7884 to demo some potential DX improvements through adding pre-commit hooks. It contains the following features:

  • Run shellcheck.
  • Re-implements make check-amount-access as a local pygrep hook.
  • Implements a clang-format hook, run manually, to fail on suggestion warnings with additional configuration to sort includes.
  • Re-implements make check-discouraged-functions as a local pygrep hook.
  • Runs codespell, manually, to check for common spelling errors.
  • Runs check-jsonschema to validate schemas and metaschemas in the doc/ directory.
    • Last checked, this exposed validation failures in a few of the schema definitions, and seems particularly useful.
  • Pretty-formats the JSON schemas
  • Checks Git commit message conforms to Core Lightning prefix conventions using commitlint.
  • Adds a devtools/fix-style-errors convenience script which runs the more destructive fixers like clang-format and codespell.
  • Adds a devtools/include-order-fixer.py script to scan the build's source and header files and fix them for Core Lightning conventions (agent assisted by Cursor).
  • Replaces existing ruff with flake8 in pre-commit config.
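
For readers unfamiliar with pygrep hooks: pre-commit runs a regular expression against each file in the changeset and fails the hook when it matches. A minimal Python sketch of the same idea for the amount-access check follows. The member names `millisatoshis` and `satoshis` and the exact pattern are assumptions for illustration; the authoritative pattern is whatever `make check-amount-access` uses.

```python
import re

# Assumed member names; the real pattern lives in the project's Makefile.
AMOUNT_ACCESS = re.compile(r"(->|\.)(milli)?satoshis\b")


def find_amount_access(source: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that touch an amount member directly."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if AMOUNT_ACCESS.search(line):
            hits.append((lineno, line.strip()))
    return hits


# Hypothetical C snippet used only to exercise the check.
sample = """\
struct amount_msat fee = AMOUNT_MSAT(1000);
u64 raw = fee.millisatoshis;
if (amount_msat_greater(fee, limit))
    total = in->satoshis + 1;
"""

for lineno, line in find_amount_access(sample):
    print(f"{lineno}: {line}")
```

A non-empty result corresponds to a failing hook; the discouraged-functions check could follow the same shape with a different pattern.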

Many of these experiments output a lot of errors/warnings against the existing code. Generally, pre-commit should only check the current changeset, prompting the developer to address issues incrementally. Interested parties can run specific hooks on the entire codebase with pre-commit run --all-files [hook-id].

Particular checks and standards may not be applicable or aligned with the workflow of the maintainers. I'd be happy and interested to break this work into separate PRs for specific hooks where there is interest, or explore adding more.

Relates to #7765.

Checklist

Before submitting the PR, ensure the following tasks are completed. If an item is not applicable to your PR, please mark it as checked:

  • The changelog has been updated in the relevant commit(s) according to the guidelines.
  • Documentation has been reviewed and updated as needed.
  • Related issues have been listed and linked, including any that this PR closes.

Changelog-None

@madelinevibes
Collaborator

@s373nZ hi! noting your recent reference to this PR.... do you plan on returning to this PR anytime soon? I'm doing some spring cleaning of our open PRs

@madelinevibes madelinevibes added the Status::Ready for Review The work has been completed and is now awaiting evaluation or approval. label Dec 8, 2025
@s373nZ s373nZ marked this pull request as ready for review December 8, 2025 12:36
@s373nZ s373nZ changed the title [WIP] pre-commit hooks demo pre-commit hooks demo Dec 8, 2025
@s373nZ
Contributor Author

s373nZ commented Dec 8, 2025

Hi @madelinevibes! 👋 Updated this by rebasing against master. Happy to return to it if parts seem useful, but don't want to make too many assumptions about which checks would be acceptable to the team.

Marking this ready for review and removing the [WIP] to better flag it for someone to take a quick look, and hopefully provide initial feedback or further direction on requirements.

@madelinevibes madelinevibes added this to the v26.03 milestone Dec 8, 2025
@sangbida
Collaborator

sangbida commented Dec 8, 2025

Hey! Thank you for this awesome PR! I had been running this heavily vibecoded (and very verbose) precommit hook to fix up my nits but this looks very neat and tidy! The repos used for the most part also seem reliable and maintained. I would also add whitespace fixing and flake8 fixing in this PR as well, I find that most of the time my prebuild checks fail because of whitespace and flake8 errors :(

Comment on lines +1 to +3
connectd
crate
mut
Collaborator

Is this a comprehensive list based on what's triggered by codespell? Does this also run on all files, or all C files?

Contributor Author

@s373nZ s373nZ Dec 9, 2025

Is this a comprehensive list based on what's triggered by codespell?

Not by any means. This is just a demo of how to add words to the ignore list. Attached is recent output for pre-commit run --all-files codespell > codespell.txt

codespell.txt

I suggest going through these items to decide which typos to fix and which to add to the ignore list. Perhaps that is worth a separate PR?

Does this also run on all files, or all C files?

This runs on all files submitted with the changeset (during a commit) and all files in total when using pre-commit run --all-files. The mut in the ignore list is actually for Rust code.

- refactor
- perf
- test

Collaborator

You may be able to use https://github.com/hhatto/autopep8 and https://github.com/PyCQA/flake8 for Python style fixes.

Contributor Author

I intentionally didn't include flake8 due to this discussion: #7884 (comment)

I can add it and/or autopep8 in if you require in order to accept this work. Just LMK! :)

@sangbida
Collaborator

sangbida commented Dec 9, 2025

Also, have you checked how long it takes to run this? Just wondering because we're making a fair few external requests.

@s373nZ
Contributor Author

s373nZ commented Dec 9, 2025

Rebased against master and added trailing-whitespace and end-of-file-fixer. Also responded to other review comments.

Also, have you checked how long it takes to run this? Just wondering because we're making a fair few external requests.

First run ALL FILES (fresh install):

pre-commit clean
time pre-commit run --all-files
Executed in   37.23 secs    fish           external
   usr time   47.16 secs    0.33 millis   47.16 secs
   sys time   22.12 secs    1.19 millis   22.12 secs

Regular run ALL FILES:

time pre-commit run --all-files
Executed in   14.27 secs    fish           external
   usr time   35.73 secs    0.00 micros   35.73 secs
   sys time   19.07 secs  759.00 micros   19.07 secs

Regular run on common/* (a supposedly massive change set):

time pre-commit run --files common/*
Executed in    1.43 secs    fish           external
   usr time    6.85 secs    1.15 millis    6.85 secs
   sys time    2.31 secs    1.05 millis    2.31 secs

Regular run on cli/* (a supposedly more minimal change set):

time pre-commit run --files cli/*
Executed in  574.59 millis    fish           external
   usr time  461.00 millis    0.03 millis  460.98 millis
   sys time  399.49 millis    1.02 millis  398.47 millis

During a commit, pre-commit only runs on the files in the changeset, so most of the time it is processing very few files. Hooks are installed into the system at ~/.cache/pre-commit so we aren't making many external requests except for the first run.

@rustyrussell
Contributor

It's always good to have an opinionated list. The downside is, we might all disagree with that list!

In particular, the conventionalcommits.org is crap. I hate "chore:" as a prefix. It's non-informative.

Prefixes should refer to subsystems, e.g:

  • lightningd: xxx
  • plugins: xxx
  • askrene: xxx
  • pytest: xxx
  • common: xxx
  • docs: xxx

If there's no logical subsystem, I sometimes omit it, and sometimes use "global". But usually I talk about what system, or subdaemon, or plugin is hit (even if, as a side effect, other code needs to be adapted!).

@ShahanaFarooqui
Collaborator

Will this be integrated into the CI, or is it intended only for local testing? My local run is currently failing on almost all steps :(.

@s373nZ
Contributor Author

s373nZ commented Dec 10, 2025

  • Rebased against master.
  • REMOVED the conventional-commit and clang-format hooks because they are highly experimental and will likely be far more intrusive than helpful for now. Just let me know if you want them back! Having clang-format active is like opting in to fix every C file you touch and there are so many failures, I imagine it to be really annoying. Could be good to reintegrate after some configuration tuning or in a separate initiative. See below for conventional commits rationale.
  • Replaced shellcheck with shellcheck-py to avoid dependency on Docker per @sangbida's feedback.
  • Added @sangbida's script in pre-commit hooks demo #8193 (comment) to tools/fix-style-errors (still uses clang-format FYI).
  • Added pre-commit to the dev group in pyproject.toml so it's installed along with other Python dependencies during a new development environment setup.
  • Added a section to the Contributor Workflow documentation to describe pre-commit and how to opt-in / install it to a local repo.
  • Replied to all the review feedback, any unresolved comments are awaiting your resolution, feedback or direction.

It's always good to have an opinionated list. The downside is, we might all disagree with that list!

@rustyrussell this list was a best-guess demo derived by scanning a few pages of the commit history in April and configuring based on observed conventions.

In particular, the conventionalcommits.org is crap. I hate "chore:" as a prefix. It's non-informative.

Noted, and removed it as a commit linter. The prior configuration would have required chore(subsystem) or fix(subsystem), which isn't what you're looking for. It would probably be far more annoying than helpful for you, as well as for new contributors.

Prefixes should refer to subsystems

If the team wanted to formalize conventions for commit messages and add them to the developer documentation, it may be possible to leverage the Conventional Commit hook to enforce them. That said, it might not make sense to bend that tool too far; a customized implementation could be worth considering instead.

Will this be integrated into the CI, or is it intended only for local testing? My local run is currently failing on almost all steps :(.

@ShahanaFarooqui Definitely not ready for CI yet, IMO. An idea for a "mini road map" toward this might be something like:

  1. Reach internal consensus on which hooks to accept for now in this PR.
  2. Merge this PR.
  3. Open follow up issues / PRs to address outstanding failures for the accepted hooks (spelling, formatting, schema etc).
  4. Interested developers opt-in and test-drive it daily to discover bugs, quirks and improvement ideas.
  5. When all the checks are passing using --all-files (PRs in 3 addressed) and deemed non-intrusive, integrate into CI alongside existing checks.
  6. Incrementally replace the checks in make check with pre-commit hook equivalents.

I would be interested / curious to help out with the follow ups in 3 to start with.

@s373nZ s373nZ requested a review from sangbida December 10, 2025 13:40
@ShahanaFarooqui ShahanaFarooqui linked an issue Dec 12, 2025 that may be closed by this pull request
@sangbida
Collaborator

sangbida commented Dec 12, 2025

Hey! Here's a list of requirements that @ShahanaFarooqui and I came up with:

Must-haves:

  • Compatible with the existing codebase: if code passes prebuild checks, it should also pass the pre-commit hook.
  • Opt-in: developers should be able to commit "ugly" code if needed.
  • Detect shell scripting bugs (check-shellcheck).
  • Prevent direct access to amount_msat and amount_sat members (check-amount-access).
  • Flag usage of discouraged functions (check-discouraged-functions).
  • Enforce commit message format with subsystem: prefix.
  • Automatic fixes for:
    • Sorting #include statements in C files (check-hdr-include-order).
    • Formatting C files (clang-format).
    • Python code style (check-python-flake8).
    • JSON schemas (check-fmt-schemas).
    • Trailing whitespace, end-of-file issues, and extra lines.

Nice-to-haves:

  • Spelling fixes.
    • From @ShahanaFarooqui - I attempted to do this locally and after fixing 100+ files, I was only ~25% done. So yes, it's useful but should not be added at this time. We will revisit once the hook is ready to be added to CI directly.

@s373nZ Let me know what you think?

@s373nZ
Contributor Author

s373nZ commented Dec 12, 2025

@sangbida @ShahanaFarooqui This is a great list. Thanks! I'm happy you're interested in restoring clang-format and commit linting. A few questions:

Slightly concerned about a conflict between the requirements "if code passes prebuild checks, it should also pass the pre-commit hook" and "Automatic fixes for...". If pre-commit performs automatic changes to the code, it will also fail the hook run. Should the automatic fixes be applied prior to the commit, or in a separate process? For example, if clang-format runs on a C file in the change set, it won't pass the check until its feedback has been resolved.
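
To make the concern concrete, here is a minimal Python sketch of how a fixer-style hook typically behaves: it rewrites the file and reports failure, so the developer has to re-stage and commit again, at which point the same hook passes. The function and file names are hypothetical and this is not how any particular hook in this PR is implemented; pre-commit additionally marks a hook as failed whenever it modifies files.

```python
import os
import tempfile


def strip_trailing_whitespace(path: str) -> int:
    """Rewrite the file in place; return 1 if it changed, else 0."""
    with open(path, encoding="utf-8") as f:
        original = f.read()
    # Strip whitespace at the end of every line, keeping the line structure.
    fixed = "\n".join(line.rstrip() for line in original.split("\n"))
    if fixed == original:
        return 0
    with open(path, "w", encoding="utf-8") as f:
        f.write(fixed)
    return 1


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.c")
    with open(path, "w", encoding="utf-8") as f:
        f.write("int x;   \nint y;\n")
    first = strip_trailing_whitespace(path)   # file is modified: hook "fails"
    second = strip_trailing_whitespace(path)  # file is clean: hook passes
    print(first, second)
```

So code that passes the prebuild checks should pass on the second attempt at the latest, provided the fixer only changes things the prebuild checks also accept.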

Python code style (check-python-flake8).

I assume this means you prefer flake8 over ruff, so I'll remove the existing ruff hook?

Enforce commit message format with subsystem: prefix.

Would you like to supply a list of available subsystems? I can also just take a first guess based on Rusty's and the commit log...

Given the present information, here's my current plan:

  • Restore the clang-format hook, explore removing --dry-run and -Werror flags to accept its changes automatically.
  • Explore checking (and fixing?) the #include sort order outside of clang-format with a custom hook.
    • There is an interesting challenge in that the list of checked files is built dynamically from the project's included Makefiles. Looking into a clean way to leverage the list into a hook process.
  • Define which hooks should be run during the manual stage, meaning they aren't run automatically for each commit, but must be invoked explicitly.
    • Could do this for codespell for now, to make it available, but not intrusive.
    • clang-format might also be better used like this; it may be more intrusive than codespell.
  • Explore commitizen as an alternative to conventionalcommits.org. Either could be suitable. commitizen seems to have some other nice release process management features. 👀

A bit of work and a lot of testing ahead, but optimistic many of the requirements are already mostly addressed. Let me know if you have feedback or corrections.

@ShahanaFarooqui
Collaborator

@s373nZ Let's start with the following list of subsystem prefixes for now. We can continue refining it iteratively until it's sufficiently mature:

  ---------------
  Daemons:
  ---------------
    channeld
    closingd
    connectd
    gossipd
    hsmd
    lightningd
    onchaind
    openingd
  
  ---------------
  Related:
  ---------------
    bitcoin
    cli
    cln-grpc
    cln-rpc
    db
    wallet
    wire
  
  ---------------
  Extensions:
  ---------------
    plugin-* (any string value after `plugin-` should be allowed like xpay, renepay, askrene...)
    pyln-* (lightning, client, grpc-proto, proto, spec, testing)
    tool-* (reckless, hsmtool, downgrade)
  
  ---------------
  Others:
  ---------------
    ci
    common
    contrib
    devtools
    docs
    docker
    github
    global
    meta
    nit
    nix
    release
    script
    tests
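
As a rough illustration of what enforcing this list could look like, here is a minimal Python sketch of a commit-subject check. It is only a sketch: the function name is hypothetical, and accepting any suffix after `plugin-`, `pyln-`, or `tool-` is an assumption (the parenthesised names above could also be read as the only allowed values for `pyln-*` and `tool-*`).

```python
import re

# Exact prefixes copied from the list above.
EXACT = {
    "channeld", "closingd", "connectd", "gossipd", "hsmd", "lightningd",
    "onchaind", "openingd",
    "bitcoin", "cli", "cln-grpc", "cln-rpc", "db", "wallet", "wire",
    "ci", "common", "contrib", "devtools", "docs", "docker", "github",
    "global", "meta", "nit", "nix", "release", "script", "tests",
}

# Wildcard families; any non-empty lowercase suffix is accepted (assumption).
FAMILY = re.compile(r"^(plugin|pyln|tool)-[a-z0-9][a-z0-9-]*$")


def has_valid_prefix(subject: str) -> bool:
    """Check that a commit subject looks like '<subsystem>: <message>'."""
    prefix, sep, rest = subject.partition(":")
    if not sep or not rest.strip():
        return False
    return prefix in EXACT or FAMILY.match(prefix) is not None


for subject in [
    "lightningd: fix crash on restart",
    "plugin-xpay: retry on temporary failure",
    "chore: tidy up",
    "fix a typo",
]:
    print(f"{subject!r}: {has_valid_prefix(subject)}")
```

Keeping the exact names and the wildcard families separate makes it easy to refine the list iteratively, as suggested above.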

@ShahanaFarooqui
Collaborator

@sangbida @ShahanaFarooqui This is a great list. Thanks! I'm happy you're interested in restoring clang-format and commit linting.

I "truely" tested clang-format today and they are quite intrusive. Now I am a bit wary of enabling it broadly :).

Slightly concerned about a conflict between the requirements "if code passes prebuild checks, it should also pass the pre-commit hook" and "Automatic fixes for...". If pre-commit performs automatic changes to the code, it will also fail the hook run. Should the automatic fixes be applied prior to the commit, or in a separate process? For example, if clang-format runs on a C file in the change set, it won't pass the check until its feedback has been resolved.

My understanding is that we should run all prebuild checks (i.e., anything that would cause the prebuild checks to fail) in this stage. Then extra checks like clang-format or codespell can be added as manual-stage hooks so that they don’t block commits. @sangbida, please correct me if I am mistaken.

I assume this means you prefer flake8 over ruff, so I'll remove the existing ruff hook?

AFAIU, both Ruff and flake8 are Python-only tools. I may be mistaken, but if you and @cdecker prefer Ruff, then Ruff it is. I just want to ensure Ruff doesn’t miss any flake8 checks that would cause the hook to pass but the prebuild check to fail.

Would you like to supply a list of available subsystems? I can also just take a first guess based on Rusty's and the commit log...

Shared in my earlier comment above.

Given the present information, here's my current plan:
Restore the clang-format hook, explore removing --dry-run and -Werror flags to accept its changes automatically.

Umm..How about we let clang-format and codespell take a back seat as manual hooks for now? Sorry for the mix-up earlier 😄. We can enable them once the codebase is more cleaned up accordingly.

Define which hooks should be run during the manual stage, meaning they aren't run automatically for each commit, but must be invoked explicitly.
Could do this for codespell for now, to make it available, but not intrusive.
clang-format might also be better used like this; it may be more intrusive than codespell.

Yes, agreed. These two seem like the right candidates for now, but feel free to include additional tools if needed.

Explore commitizen as an alternative to conventionalcommits.org. Either could be suitable. commitizen seems to have some other nice release process management features. 👀

It looks great, but is it not too big of a change at this stage?

A bit of work and a lot of testing ahead, but optimistic many of the requirements are already mostly addressed. Let me know if you have feedback or corrections.

Thank you so much. Truly appreciate it!!! ❤️

@s373nZ
Contributor Author

s373nZ commented Jan 13, 2026

Apologies for the lengthy time between feedback. Was mostly offline during the holidays and it's been slow coming out of hibernation and amidst other priorities. The latest push:

  • Rebased against master.
  • Replaces ruff with flake8 - no preference on my end and the argument to preserve parity with the pre-build checks makes sense.
  • Reinstates the clang-format hook running under the manual stage. I've left --dry-run and -Werror on for now as it is taking a back seat.
  • Set codespell to run under the manual stage.
  • Selected commitlint to handle the commit messages. The opinionated defaults of Conventional Commits were not preferred, and its hook didn't allow for defining wildcard rules for the extensions. Wrote a custom plugin rule in commitlint.config.js defining the subsystem requirements from @ShahanaFarooqui. It seems like a flexible solution.
  • Added a custom hook to run make check-includes to ensure this pre-build check is running for parity. It can take some time as it checks all the includes for each commit as-is.
  • Implemented a new script devtools/include-order-fixer.py to address the requirement for automatic sorting for include statements. Research into existing implementations like clang-format and include-what-you-use revealed that this type of implementation is not trivial, at least for me. Still acclimating to agentic workflows, but I developed this script iteratively using Cursor and tested it as well as I could. Particular challenges included:
    • Both single-line and block comments should be preserved.
    • A convention was detected in files like onchain/onchaind.c and wire/wire-io.c where includes with leading spaces were assumed to be ignored in the order-checking. This script should preserve the whitespace and position for these includes, as well as includes which end in _gen.h assuming special conditions for generated code.
  • Added print-src-to-check and print-hdr-to-check targets to the Makefile to support include-order-fixer.py. The list of files to be checked/fixed is built dynamically by recursively calling the Makefiles in subdirectories and gathering lists of headers and source code. These targets expose the lists for reuse in the script by calling make in a subprocess.
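
To give a feel for the constraints described above, here is a heavily simplified Python sketch of the sorting step. It is not the code in devtools/include-order-fixer.py: the function name, the plain byte-order sort, and the example header names (including `wire/example_gen.h`) are assumptions for illustration. Includes with leading whitespace and includes ending in `_gen.h` keep their position; comments and other lines are never moved because they end the run of includes being sorted.

```python
import re

# Includes of generated headers are left where they are (per the notes above).
GENERATED = re.compile(r'_gen\.h[">]\s*$')


def is_include(line: str) -> bool:
    return line.lstrip().startswith("#include")


def is_movable(line: str) -> bool:
    # Leading whitespace marks an include that is exempt from ordering.
    return line.startswith("#include") and not GENERATED.search(line)


def sort_includes(lines: list[str]) -> list[str]:
    """Sort each contiguous run of includes, keeping exempt lines in place."""
    result = list(lines)
    i = 0
    while i < len(lines):
        if not is_include(lines[i]):
            i += 1
            continue
        j = i
        while j < len(lines) and is_include(lines[j]):
            j += 1
        # Only the movable slots are reordered; exempt lines stay put.
        slots = [k for k in range(i, j) if is_movable(lines[k])]
        for k, line in zip(slots, sorted(lines[s] for s in slots)):
            result[k] = line
        i = j
    return result


before = [
    "#include <common/utils.h>",
    " #include <wire/wire.h>",
    "#include <ccan/tal/tal.h>",
    "#include <wire/example_gen.h>",
    "#include <bitcoin/tx.h>",
    "",
    "int main(void);",
]
after = sort_includes(before)
print("\n".join(after))
```

Only the three unexempted includes change places in this example; a real implementation also has to cope with block comments and conditional compilation, which this sketch ignores.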

This could always use more testing, but I wanted to aggregate the developments over the last month and provide an update.

cc @sangbida @ShahanaFarooqui @madelinevibes

@sangbida
Collaborator

This looks amazing to me! I love that it runs when you install the hook and does not run when you do not install the hook! I tested most of the features and they are working great. Sorry to be a pain, but my only suggestion is if the include order fixer could also detect duplicate includes and fix them? Other than that, it looks great and I would be happy to approve.

@ShahanaFarooqui what do you think?

@s373nZ
Contributor Author

s373nZ commented Jan 20, 2026

my only suggestion is if the include order fixer could also detect duplicate includes and fix them?

Good call. I've updated the script to de-duplicate the includes. Thanks @sangbida! Latest push:

  • Rebased against master.
  • Updated devtools/include-order-fixer.py to de-duplicate includes.
  • Updated PR description to reflect up-to-date work items.
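
For illustration, de-duplication can be sketched in a few lines of Python. This is a simplified stand-in rather than the code in devtools/include-order-fixer.py: the function name and regular expression are assumptions, and it deliberately ignores cases where the same header is legitimately included in different `#if` branches.

```python
import re

# Captures the include target, e.g. <ccan/tal/tal.h> or "config.h".
INCLUDE = re.compile(r'\s*#\s*include\s*([<"][^>"]+[>"])')


def dedupe_includes(lines: list[str]) -> list[str]:
    """Keep the first occurrence of each include target and drop repeats."""
    seen = set()
    out = []
    for line in lines:
        match = INCLUDE.match(line)
        if match:
            target = match.group(1)
            if target in seen:
                continue  # duplicate include: skip it
            seen.add(target)
        out.append(line)
    return out


source = [
    "#include <ccan/tal/tal.h>",
    "#include <common/utils.h>",
    "#include <ccan/tal/tal.h> /* again */",
    "int x;",
]
print("\n".join(dedupe_includes(source)))
```

Keying on the include target rather than the whole line means a repeat is still caught when spacing or a trailing comment differs.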

Collaborator

@ShahanaFarooqui ShahanaFarooqui left a comment

LGTM, just one update: I had to add either language: system or language_version: "18.19.0" for commitlint to run successfully.

ACK f9ad52e

@s373nZ
Contributor Author

s373nZ commented Jan 21, 2026

I had to add either language: system or language_version: "18.19.0" for commitlint to run successfully.

Interesting. My local NodeJS for testing was v25.3.0 and it worked ok for me. LMK if you want me to specify the version explicitly. We could also consider attempting to standardize the various language dependencies using default_language_version at the top level.

@ShahanaFarooqui
Collaborator

Interesting. My local NodeJS for testing was v25.3.0 and it worked ok for me.

It also worked for me with Node v20 and v22. The point I was trying to make is that I had to explicitly specify either the Node version or request the system-installed Node.js, as it did not work implicitly for me.

LMK if you want me to specify the version explicitly.

This may be related to my use of nvm. If it’s specific to my setup, we can ignore it but if others encounter the same issue, it might be better to make this explicit.

We could also consider attempting to standardize the various language dependencies using default_language_version at the top level.

Setting defaults usually sounds like a good idea to me… until proven otherwise 😄

Default to Python 3 and NodeJS to use that which is on the system,
avoiding conflicts with `nodeenv`.
@s373nZ
Contributor Author

s373nZ commented Jan 22, 2026

This may be related to my use of nvm.

There might be some unexpected behavior here. pre-commit uses nodeenv under the hood to pin versions, so there could be a conflict between them. IMO it's a good idea to pin it -- I used to use both nodeenv and nvm (and currently use asdf). It's reasonable to expect developers to have such tools.

Added a new commit to set default_language_version at the top level for node: system and python: python3. After experimenting with pinning Node to a version, it was downloading and compiling a new instance just to run commitlint! A little too crazy, so let's default to the system for now, until proven otherwise?

Labels

Status::Ready for Review The work has been completed and is now awaiting evaluation or approval.

Development

Successfully merging this pull request may close these issues.

Add pre-commit hook to auto-fix style issues

5 participants