-
Notifications
You must be signed in to change notification settings - Fork 52
Make ref table function way simpler and more efficient #209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
jzemmels
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good. Just one minor typo in the example and a question about dependencies. Otherwise approved.
| "flake8", | ||
| ] | ||
| doc = [ | ||
| "docutils<0.22", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these new dependencies from the changes made in this PR?
I don't see docutils or mapclassify used explicitly in the package files anywhere (from a CTRL + ALT + F through the package files)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docutils is one of the dependencies for building the documentation, and they made some update to the latest version that breaks the pipeline. Keeping it at an earlier version while the other packages catch up seems to fix it for now.
The other dependencies are needed to use the gpd.explore() interactive mapping functionality of geopandas, and are not installed if you simply pip install geopandas. The waterdata demo notebook I merged in earlier this week uses gpd.explore() and currently shows errors in the Jupyter notebook in the documentation: https://doi-usgs.github.io/dataretrieval-python/examples/WaterData_demo.html
Co-authored-by: Joe Zemmels (he/him) <jzemmels@gmail.com>
thodson-usgs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice in the work! In future, the docutils issue would be better as a separate PR.
I realized I initially built the reference table function to match
dataRetrieval, thinking that there was a reason we couldn't reuse theget_ogc_data()function, but I think this is inefficient and incorrect. Swapped out the repeated code forget_ogc_dataand changed the "id" column to the singular subject name of the reference table, e.g. "site-type-codes" has the "id" column changed to "site_type_code" (the rest of the columns swap underscores for dashes, so keeping consistent).I also learned that if there aren't any geometries present in the dataframe,
geopandaswill simply return apandasDataFrame. This avoids the issue I created in the last MR where I thought I had to specify "no geopandas" to get back a regular dataframe, and the logger was yelling about geopandas not being installed due to my oversight.The only risk is that URLs contain the argument
skipGeometry=FALSE, like this:https://api.waterdata.usgs.gov/ogcapi/v0/collections/time-zone-codes/items?skipGeometry=False&limit=50000
However, it doesn't seem to actually affect what comes back.