-
Notifications
You must be signed in to change notification settings - Fork 1k
Use compressed sort on unsorted chunks if possible #9133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Use compressed sort on unsorted chunks if possible #9133
Conversation
|
@melihmutlu, @akuzm: please review this pull request.
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #9133 +/- ##
==========================================
+ Coverage 82.45% 82.53% +0.08%
==========================================
Files 243 243
Lines 47938 47917 -21
Branches 12234 12232 -2
==========================================
+ Hits 39525 39547 +22
- Misses 3544 3556 +12
+ Partials 4869 4814 -55 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
a99c727 to
8082a05
Compare
| d2 | A | ||
| d2 | C | ||
|
|
||
| :PREFIX select device, sensor, count(*), max(sensor) from metrics group by device, sensor order by 1,2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is expected not to use skipscan, right? Could you add a comment describing expected behaviour
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh nvm this is not only about SkipScan, would be good to have some negative example where we cant use the optimization atm because of filters on time column
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use this optimization with filters on time column though.
Because if we only care about sorting on segmentby columns, it doesn't matter if we have unsorted time or other non-segmentby columns in the result or during the aggregation or SkipScan.
@akuzm pointed it out in #9128 (comment), and it is actually correct.
I.e. we don't have any limits on using compressed sort on unordered chunks if a sort is on segmentby columns only. The only limit is when the sort is on other than segmentby columns.
| SET enable_bitmapscan=0; | ||
| SET enable_seqscan=0; | ||
|
|
||
| SET timezone TO PST8PDT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use the default test timezone for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. It's the same timezone for the column time, so the column time can also be created with default timezone.
|
|
||
| SET max_parallel_workers_per_gather = 0; | ||
| SET enable_bitmapscan=0; | ||
| SET enable_seqscan=0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not strictly tied to index scans, so many test queries would be just as fine with sort over seq scan I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SkipScan is tied to index scans, that was the original issue i.e. we could not use compressed sort on unordered chunks, therefore could not use IndexScan and therefore could not use SkipScan.
So here we show that we can use IndexScan and not do any extra sorting.
| /* Can use compressed sort on segmentby cols for unordered chunks as well, | ||
| * unless this option is turned OFF | ||
| */ | ||
| if (!ts_guc_enable_compressed_unordered_sort && ts_chunk_is_unordered(chunk)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change looks simple and obviously correct, I think it might be OK w/o a GUC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, will remove the guc.
| } | ||
|
|
||
| /* | ||
| * Cannot push down sort on (segmentby + non-segmentby) columns if the chunk is unordered |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to extend the commend slightly, because the name "unordered" is somewhat confusing. The reason why we can't use sort pushdown is that the "unordered" chunks have batches that overlap on the orderby columns axis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will clarify. Basically the only difference between pushing down sort into ordered vs unordered chunks is that for unordered chunk we cannot push down sort with keys on orderby columns.
Fixes #9116
Use compressed sort on unordered chunks when we sort on segmentby columns only.
Implemented in a simpler, different way from #9128 after discussions on that PR.