-
-
Notifications
You must be signed in to change notification settings - Fork 396
File.dirname: add a spec for Shift JIS handling #1330
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
byroot
wants to merge
1
commit into
ruby:master
Choose a base branch
from
byroot:shiftjs-windows-path-operations
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+29
−0
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
e8c8ec8 to
7cfe29b
Compare
trinistr
reviewed
Jan 19, 2026
7cfe29b to
38a2dbf
Compare
trinistr
reviewed
Jan 19, 2026
eregon
reviewed
Jan 19, 2026
eregon
reviewed
Jan 19, 2026
eregon
reviewed
Jan 19, 2026
eregon
reviewed
Jan 19, 2026
eregon
reviewed
Jan 19, 2026
While trying to speedup various `File.*` methods, I realized they were way slower and complicated than they should for no apparent reason. However after asking Nobu he explained that Shift JIS encoded text can contain `0x5C` (ASCII backslash) as the second byte of a two byte character sequence. Since on Windows `0x5C` is `File::ALT_SEPARATOR`, this can easily break naive path related algorithms searching for directory separators.
38a2dbf to
5c63521
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
While trying to speedup various
File.*methods, I realized they were way slower and complicated than they should for no apparent reason. However after asking Nobu he explained that Shift JIS encoded text can contain0x5C(ASCII backslash) as the second byte of a two byte character sequence.Since on Windows
0x5CisFile::ALT_SEPARATOR, this can easily break naive path related algorithms searching for directory separators.cc @eregon what do you think of that spec? I think it would have helped me figure things out way sooner, so would have been valuable. If you agree I can generalize it to more "path handling" methods.