Add GizmoSQL #745
Conversation
Hi @prmoore77, thanks for your submission! However, I'm having trouble getting it to run. When I run benchmark.sh, I get the following output when creating the table (line 33): The only change I made was adding 'sudo' to the docker run command, since it wasn't working for me otherwise. Any ideas?
Hi @george-larionov - could you share the
Hi @george-larionov - I think I figured it out. In my AWS EC2 provisioning scripts (attached), I mount the recommended EBS volume to path: Could you try with a volume mounted at that path? Here are the scripts: If you would like me to put the provisioning scripts in the codebase, just let me know. I didn't do so because I didn't see that others had done so. Thanks for your help!
Hi @george-larionov - were you able to test?
Hi @prmoore77, sorry for the late reply; it's been a busy week. I tried again after replacing /nfs_data with an existing path, but I'm still getting the same error, so I don't think it is related to the mount path.

The expectation is that the benchmark.sh script can run on a fresh Ubuntu machine on AWS, so please include any specific mounting you need (although it would perhaps be simpler to just use the default mount paths). Take a look at the CedarDB or DuckDB scripts. Additionally, the benchmarks are run semi-automatically using the run-benchmark.sh script, which fills in the cloud-init.sh script with the correct details and then runs it on the AWS instance, so take a look there for the exact way things are run.

I'm also attaching the log.txt file - is this the one you meant? Let me know if you have any questions; I will try to be more timely in my replies 😳
"column-oriented",
"arrow-flight-sql",
"duckdb",
"lukewarm-cold-run"
Please see here for more details on what warm, lukewarm, and cold runs mean in the context of ClickBench.
Two comments related to that.
run.sh contains this:
# Execute all queries in one session (so authentication overhead is minimized)
echo "Running benchmark with $(wc -l < queries.sql) queries, ${TRIES} tries each..."
gizmosqlline \
-u 'jdbc:arrow-flight-sql://localhost:31337?useEncryption=true&disableCertificateVerification=true' \
-n clickbench \
-p clickbench \
-f "${TEMP_SQL_FILE}"
The script does not clear the OS page cache between query runs, which it should do to qualify as a "lukewarm run". If authentication is costly, then feel free to disable it entirely (the script already seems to disable certificate verification).
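For illustration, here is a minimal sketch of what per-query cache clearing could look like. This is not the PR's actual run.sh; the `drop_caches`/`run_lukewarm` helper names and the per-query `-e` flag on gizmosqlline are assumptions, while the connection flags mirror the snippet above.

```shell
#!/bin/bash
# Hypothetical sketch: flush the OS page cache before each query so the
# runs qualify as lukewarm. Requires root privileges for drop_caches.

drop_caches() {
  sync                                          # flush dirty pages first
  echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null
}

run_lukewarm() {
  local query="$1"
  drop_caches
  gizmosqlline \
    -u 'jdbc:arrow-flight-sql://localhost:31337?useEncryption=true&disableCertificateVerification=true' \
    -n clickbench \
    -p clickbench \
    -e "${query}"
}

# Example usage (one cache drop per query line):
# while read -r q; do run_lukewarm "$q"; done < queries.sql
```

Running one session per query does lose the amortized authentication cost, which is why disabling authentication entirely would pair well with this approach.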
The other thing is that we are moving away from lukewarm runs. New submissions ideally do cold runs right from the start. Would it be possible to kill/start the GizmoSQL docker container between query runs?
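As a rough sketch of what restarting the container between queries could look like; the container name "gizmosql", the readiness check, and the `run_query` placeholder are assumptions, not details from this PR.

```shell
#!/bin/bash
# Hypothetical sketch: restart the GizmoSQL docker container between
# queries so each query starts against a cold server process, and drop
# the OS page cache so the filesystem cache is cold as well.

cold_restart() {
  sudo docker stop gizmosql
  sync
  echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null   # cold OS cache too
  sudo docker start gizmosql
  # wait until the Flight SQL port accepts connections again
  until nc -z localhost 31337; do
    sleep 1
  done
}

# Example usage (run_query is a placeholder for the actual query command):
# while read -r q; do cold_restart; run_query "$q"; done < queries.sql
```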
Closes #733
Thank You for Your Contribution!
We appreciate your effort and contribution to the project. To ensure that your Pull Request (PR) adheres to our guidelines, please ensure to review the rules mentioned in our contribution guidelines:
ClickHouse/ClickBench Contribution Rules
Thank you for your attention to these details and for helping us maintain the quality and integrity of the project.