-
Notifications
You must be signed in to change notification settings - Fork 1k
Lock download file to avoid races #4606
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
4b0e0d9 to
19c23a7
Compare
See #TBD for details. This is one of two possible implementations, the easier one is using a temporary directory or a unique filename to ensure processes don't collide. Using locks has less platform support, but OTOH means that we don't create stray temp files that don't get cleaned up (because we never know if one of those files is actually used by another process). This does not fix #TBD completely. It only fixed the downloaded, parallel `rustup-init` calls still fail at the transaction stage. This PR introduces a `LockedFile` primitive that's modeled after the uv type of the same name. It's currently very simple and locks forever. We can extend it with a timeout that makes rustup exit after a certain duration and print an error message, instead of an inscrutable waiting without any message.
19c23a7 to
e69fb23
Compare
src/download/mod.rs
Outdated
| // Otherwise, create a temporary file with 777 permissions, to handle races between | ||
| // processes running under different UIDs (e.g., in Docker containers). We must set | ||
| // permissions _after_ creating the file, to override the `umask`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. I guess Cargo's file locks also have the same issue?
Would be great if you can a link to it :) |
|
The uv type: https://github.com/astral-sh/uv/blob/fc0cf90795dbfa961f22fc0934a53afc9c13c349/crates/uv-fs/src/lib.rs#L672. I have a branch that adds async timeouts, but that's in review still. |
src/download/mod.rs
Outdated
| }; | ||
| if let Err(err) = file | ||
| .as_file() | ||
| .set_permissions(std::fs::Permissions::from_mode(0o777)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the permissions be cached in a static variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method is a no-op constructor.
|
Link to the merged async uv implementation: https://github.com/astral-sh/uv/blob/3e22637c93068418506b0e4f79b47dbe956d2836/crates/uv-fs/src/locked_file.rs#L91-L353 |
|
@konstin Many thanks again! I'm OTOH looking for another way of resolving #988 (currently my 3rd attempt now XD) so I don't want to rush to decisions, currently I'm open to all options 🙏 It's nice that this avoids lock files or similar per toolchain, but I'm working on the transaction level so I don't have enough info to say at this moment if this relatively local solution is the best way forward, I'm afraid maybe a lock is still required after all 🙇 |
|
You definitely know better than me, this is built on my naive assumption that we'll need this plus something for the rest of #988, where we lock the install stage too, similar to how we handle it in uv (though for packages we also do streaming install-and-unpack in a temp dir and use atomic replace, that's the other version mentioned in the PR description, but for installing a Python interpreter (our version of a toolchain), we do have a file lock: https://github.com/astral-sh/uv/blob/f7d1215a980c3e332cbe602db0a170a7967e7850/crates/uv-python/src/installation.rs#L286). Otherwise, we can also splice out the lock file implementation and use it for some other place. |
See #4607 for details.
This is one of two possible implementations, the easier one is using a temporary directory or a unique filename to ensure processes don't collide. Using locks has less platform support, but OTOH means that we don't create stray temp files that don't get cleaned up (because we never know if one of those files is actually used by another process).
This does not fix #4607 completely. It only fixed the downloaded, parallel
rustup-initcalls still fail at the transaction stage.This PR introduces a
LockedFileprimitive that's modeled after the uv type of the same name. It's currently very simple and locks forever. We can extend it with a timeout that makes rustup exit after a certain duration and print an error message, instead of an inscrutable waiting without any message.