Feature kpoland public datasets view #245
base: master
2c69a33 to 9bee670
9bee670 to b233070
Pull request overview
This PR adds a public datasets search feature, allowing users to discover and search published datasets without authentication. The implementation includes new view classes for searching published datasets, a search form with filtering capabilities (text search, keywords, and frequency range), and corresponding templates and JavaScript components.
Changes:
- Added `SearchPublishedDatasetsView` and `HomePageView` classes with search and filtering functionality
- Created `PublishedDatasetSearchForm` for dataset search with query, keywords, and frequency filters
- Refactored existing templates to use partials for better code reuse
- Added JavaScript components for keyword chip input and dataset search handling
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| gateway/sds_gateway/users/views.py | Added SearchPublishedDatasetsView and HomePageView classes with search filtering methods; methods duplicated from ListDatasetsView |
| gateway/sds_gateway/users/urls.py | Added URL pattern for search_datasets endpoint |
| gateway/sds_gateway/users/forms.py | Added PublishedDatasetSearchForm with query, keywords, and frequency fields |
| gateway/sds_gateway/templates/users/search_datasets.html | New template for dataset search page (has critical template variable naming bugs) |
| gateway/sds_gateway/templates/users/published_datasets_list.html | New template that includes search tab partial |
| gateway/sds_gateway/templates/users/partials/search_published_datasets_tab.html | Partial template for search results display |
| gateway/sds_gateway/templates/users/partials/my_datasets_tab.html | Refactored partial extracted from dataset_list.html |
| gateway/sds_gateway/templates/users/partials/dataset_search_form.html | Reusable search form partial (has critical form ID and duplicate name attribute bugs) |
| gateway/sds_gateway/templates/users/dataset_list.html | Refactored to use my_datasets_tab.html partial |
| gateway/sds_gateway/templates/pages/home.html | Enhanced with latest datasets display and search form (has critical JavaScript syntax error) |
| gateway/sds_gateway/static/js/search/KeywordChipInput.js | New JavaScript component for keyword chip input functionality |
| gateway/sds_gateway/static/js/search/DatasetSearchHandler.js | New JavaScript handler for dataset search interactions |
| gateway/sds_gateway/static/css/components.css | Added CSS styles for keyword chip components |
| gateway/config/urls.py | Updated home page to use new HomePageView instead of TemplateView |
gateway/sds_gateway/templates/users/partials/search_published_datasets_tab.html
    <div class="alert alert-info" role="alert">
      <h4 class="alert-heading">No datasets found</h4>
      <p>
        {% if form.query.value or form.keywords.value or form.min_frequency.value or form.max_frequency.value %}
Copilot AI, Jan 22, 2026
The template uses "form" as the variable name (e.g., line 34) but the view passes "search_form" in the context (line 2914 in views.py). This inconsistency will cause template rendering errors. The template should use "search_form" to match the context variable name.
Suggested change:

    - {% if form.query.value or form.keywords.value or form.min_frequency.value or form.max_frequency.value %}
    + {% if search_form.query.value or search_form.keywords.value or search_form.min_frequency.value or search_form.max_frequency.value %}
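In stock Django, a name that is missing from the template context normally resolves to an empty value instead of raising, so a mismatch like this may show up as the "filters applied" branch silently never rendering rather than as a visible error. Below is a minimal sketch of a check that flags such names using only the standard library. The helper `undefined_roots` is hypothetical and not part of this PR, and its regex only covers the simple dotted lookups inside `{{ ... }}` and `{% ... %}` seen in this template.

```python
import re

# Matches a dotted lookup such as "form.query.value" and captures its root name.
_DOTTED = re.compile(r"\b([A-Za-z_]\w*)\.[A-Za-z_]")
# Matches the inside of {{ ... }} and {% ... %} template tags.
_TAG = re.compile(r"\{\{(.*?)\}\}|\{%(.*?)%\}", re.S)


def undefined_roots(template_source: str, context_keys: set[str]) -> set[str]:
    """Return root names used in dotted lookups that the context does not provide."""
    roots: set[str] = set()
    for match in _TAG.finditer(template_source):
        body = match.group(1) or match.group(2) or ""
        roots.update(_DOTTED.findall(body))
    return roots - context_keys


if __name__ == "__main__":
    source = "{% if form.query.value or form.keywords.value %}...{% endif %}"
    # The view is assumed to pass "search_form", so "form" is reported as missing.
    print(undefined_roots(source, {"search_form", "page_obj"}))
```

A check of this kind is only a heuristic (string literals containing dots would be misreported), but it could run in a unit test against the context keys a view is known to provide.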
           href="?page=1{% if form.query.value %}&query={{ form.query.value|urlencode }}{% endif %}{% if form.keywords.value %}&keywords={{ form.keywords.value|urlencode }}{% endif %}{% if form.min_frequency.value %}&min_frequency={{ form.min_frequency.value }}{% endif %}{% if form.max_frequency.value %}&max_frequency={{ form.max_frequency.value }}{% endif %}">First</a>
      </li>
      <li class="page-item">
        <a class="page-link"
           href="?page={{ page_obj.previous_page_number }}{% if form.query.value %}&query={{ form.query.value|urlencode }}{% endif %}{% if form.keywords.value %}&keywords={{ form.keywords.value|urlencode }}{% endif %}{% if form.min_frequency.value %}&min_frequency={{ form.min_frequency.value }}{% endif %}{% if form.max_frequency.value %}&max_frequency={{ form.max_frequency.value }}{% endif %}">Previous</a>
      </li>
    {% else %}
      <li class="page-item disabled">
        <span class="page-link">First</span>
      </li>
      <li class="page-item disabled">
        <span class="page-link">Previous</span>
      </li>
    {% endif %}
    <li class="page-item active">
      <span class="page-link">Page {{ page_obj.number }} of {{ page_obj.paginator.num_pages }}</span>
    </li>
    {% if page_obj.has_next %}
      <li class="page-item">
        <a class="page-link"
           href="?page={{ page_obj.next_page_number }}{% if form.query.value %}&query={{ form.query.value|urlencode }}{% endif %}{% if form.keywords.value %}&keywords={{ form.keywords.value|urlencode }}{% endif %}{% if form.min_frequency.value %}&min_frequency={{ form.min_frequency.value }}{% endif %}{% if form.max_frequency.value %}&max_frequency={{ form.max_frequency.value }}{% endif %}">Next</a>
      </li>
      <li class="page-item">
        <a class="page-link"
           href="?page={{ page_obj.paginator.num_pages }}{% if form.query.value %}&query={{ form.query.value|urlencode }}{% endif %}{% if form.keywords.value %}&keywords={{ form.keywords.value|urlencode }}{% endif %}{% if form.min_frequency.value %}&min_frequency={{ form.min_frequency.value }}{% endif %}{% if form.max_frequency.value %}&max_frequency={{ form.max_frequency.value }}{% endif %}">Last</a>
Copilot AI, Jan 22, 2026
The pagination links also use "form" variable (e.g., "form.query.value") instead of "search_form" which is the variable name passed in the view context. This will cause template rendering errors.
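Because the same filter chain is repeated in all four pagination links, the variable name has to be corrected in each of them. One possible way to reduce that repetition is to build the query string once in Python and pass it to the template. The sketch below uses only the standard library; `page_querystring` and `FILTER_FIELDS` are hypothetical names, not code from this PR.

```python
from urllib.parse import urlencode

# Filter names mirror the query parameters used in the pagination links.
FILTER_FIELDS = ("query", "keywords", "min_frequency", "max_frequency")


def page_querystring(page: int, filters: dict[str, object]) -> str:
    """Build "?page=N&..." keeping only the filters that have a value."""
    params: list[tuple[str, object]] = [("page", page)]
    for name in FILTER_FIELDS:
        value = filters.get(name)
        # Skip unset filters; unlike the template's truthiness test, 0 is kept.
        if value not in (None, ""):
            params.append((name, value))
    return "?" + urlencode(params)


if __name__ == "__main__":
    print(page_querystring(2, {"query": "wifi 5g", "min_frequency": 2.4}))
```

Note that `urlencode` encodes spaces as `+`, whereas the template's `|urlencode` filter percent-encodes them; both forms decode to the same value in a query string.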
    def _get_published_datasets(self) -> QuerySet[Dataset]:
        """Get all published datasets (status=FINAL or is_public=True)."""
        return (
            Dataset.objects.filter(
                Q(status=DatasetStatus.FINAL) | Q(is_public=True),
                is_deleted=False,
            )
            .prefetch_related("keywords", "owner")
            .distinct()
            .order_by("-created_at")
        )

    def _apply_search_filters(
        self,
        datasets: QuerySet[Dataset],
        form_data: dict[str, Any],
        user: User,
    ) -> QuerySet[Dataset]:
        """Apply search filters to the dataset queryset."""
        query = form_data.get("query", "").strip()
        keywords_str = form_data.get("keywords", "").strip()
        min_freq = form_data.get("min_frequency")
        max_freq = form_data.get("max_frequency")

        # Apply text search
        if query:
            datasets = datasets.filter(
                Q(name__icontains=query)
                | Q(abstract__icontains=query)
                | Q(description__icontains=query)
                | Q(authors__icontains=query)
                | Q(doi__icontains=query)
            )

        # Apply keyword filter
        if keywords_str:
            # Split and slugify keywords
            keyword_slugs = {
                slugify(k.strip())
                for k in keywords_str.split(",")
                if k.strip() and slugify(k.strip())
            }
            if keyword_slugs:
                datasets = datasets.filter(keywords__name__in=keyword_slugs).distinct()

        # Apply frequency range filter
        if min_freq is not None or max_freq is not None:
            datasets = self._filter_by_frequency_range(datasets, min_freq, max_freq)

        return datasets

    def _check_center_frequency_match(
        self,
        center_freq_hz: float,
        min_freq_hz: float | None,
        max_freq_hz: float | None,
    ) -> bool:
        """Check if center frequency is within the search range."""
        return not (
            (min_freq_hz is not None and center_freq_hz < min_freq_hz)
            or (max_freq_hz is not None and center_freq_hz > max_freq_hz)
        )

    def _check_frequency_range_overlap(
        self,
        capture_min_hz: float,
        capture_max_hz: float,
        min_freq_hz: float | None,
        max_freq_hz: float | None,
    ) -> bool:
        """Check if capture frequency range overlaps with search range."""
        # Overlap occurs if: capture_min <= search_max AND capture_max >= search_min
        return not (
            (min_freq_hz is not None and capture_max_hz < min_freq_hz)
            or (max_freq_hz is not None and capture_min_hz > max_freq_hz)
        )

    def _process_capture_for_frequency_match(
        self,
        capture: Capture,
        freq_info: dict[str, Any],
        min_freq_hz: float | None,
        max_freq_hz: float | None,
    ) -> bool:
        """Check if a capture matches the frequency range."""
        center_freq_hz = freq_info.get("center_frequency")
        freq_min_hz = freq_info.get("frequency_min")
        freq_max_hz = freq_info.get("frequency_max")

        if center_freq_hz is None and freq_min_hz is None and freq_max_hz is None:
            return False

        # Check if we have explicit min/max range
        if freq_min_hz is not None and freq_max_hz is not None:
            capture_min_hz = float(freq_min_hz)
            capture_max_hz = float(freq_max_hz)
            return self._check_frequency_range_overlap(
                capture_min_hz, capture_max_hz, min_freq_hz, max_freq_hz
            )

        # If we only have center frequency, check if it's in range
        if center_freq_hz is not None:
            center_freq = float(center_freq_hz)
            return self._check_center_frequency_match(
                center_freq, min_freq_hz, max_freq_hz
            )

        return False

    def _filter_by_frequency_range(
        self,
        datasets: QuerySet[Dataset],
        min_freq: float | None,
        max_freq: float | None,
    ) -> QuerySet[Dataset]:
        """Filter datasets by frequency range of their captures."""
        if min_freq is None and max_freq is None:
            return datasets

        # Get all captures for these datasets
        dataset_uuids = list(datasets.values_list("uuid", flat=True))
        captures = Capture.objects.filter(
            dataset__uuid__in=dataset_uuids, is_deleted=False
        )

        if not captures.exists():
            return datasets.none()

        # Bulk load frequency metadata
        try:
            frequency_data = Capture.bulk_load_frequency_metadata(captures)
        except (DatabaseError, AttributeError) as e:
            log.warning(f"Error loading frequency metadata: {e}", exc_info=True)
            return datasets

        # Convert frequency to Hz for comparison
        min_freq_hz = min_freq * 1e9 if min_freq is not None else None
        max_freq_hz = max_freq * 1e9 if max_freq is not None else None

        # Find datasets with captures in the frequency range
        matching_dataset_uuids = set()
        for capture in captures:
            capture_uuid = str(capture.uuid)
            freq_info = frequency_data.get(capture_uuid, {})
            if (
                self._process_capture_for_frequency_match(
                    capture, freq_info, min_freq_hz, max_freq_hz
                )
                and capture.dataset_id
            ):
                matching_dataset_uuids.add(capture.dataset_id)

        if not matching_dataset_uuids:
            return datasets.none()

        return datasets.filter(uuid__in=matching_dataset_uuids)
Copilot AI, Jan 22, 2026
These methods (_get_published_datasets, _apply_search_filters, _check_center_frequency_match, _check_frequency_range_overlap, _process_capture_for_frequency_match, _filter_by_frequency_range) are duplicated in both ListDatasetsView and SearchPublishedDatasetsView classes (lines 2721-2876 and 2919-3074). This code duplication violates DRY principles and makes maintenance harder. Consider extracting these methods into a shared mixin class or a separate utility class that both views can use.
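The mixin approach suggested above could look roughly like the following. This is a simplified sketch, not the PR's code: `FrequencyMatchMixin` is a hypothetical name, the Django base classes and queryset-dependent methods are left out so the example runs on its own, and the unused `capture` argument of `_process_capture_for_frequency_match` is dropped.

```python
from __future__ import annotations

from typing import Any


class FrequencyMatchMixin:
    """Hypothetical mixin holding the frequency checks shared by both views."""

    @staticmethod
    def _check_center_frequency_match(
        center_freq_hz: float, min_freq_hz: float | None, max_freq_hz: float | None
    ) -> bool:
        """Return True if the center frequency lies within the search range."""
        return not (
            (min_freq_hz is not None and center_freq_hz < min_freq_hz)
            or (max_freq_hz is not None and center_freq_hz > max_freq_hz)
        )

    @staticmethod
    def _check_frequency_range_overlap(
        capture_min_hz: float,
        capture_max_hz: float,
        min_freq_hz: float | None,
        max_freq_hz: float | None,
    ) -> bool:
        """Return True if the capture range overlaps the search range."""
        return not (
            (min_freq_hz is not None and capture_max_hz < min_freq_hz)
            or (max_freq_hz is not None and capture_min_hz > max_freq_hz)
        )

    def _process_capture_for_frequency_match(
        self,
        freq_info: dict[str, Any],
        min_freq_hz: float | None,
        max_freq_hz: float | None,
    ) -> bool:
        """Prefer an explicit min/max range; fall back to the center frequency."""
        center = freq_info.get("center_frequency")
        low = freq_info.get("frequency_min")
        high = freq_info.get("frequency_max")
        if low is not None and high is not None:
            return self._check_frequency_range_overlap(
                float(low), float(high), min_freq_hz, max_freq_hz
            )
        if center is not None:
            return self._check_center_frequency_match(
                float(center), min_freq_hz, max_freq_hz
            )
        return False


class ListDatasetsView(FrequencyMatchMixin):
    """Stand-in for the real view; the Django base class is omitted here."""


class SearchPublishedDatasetsView(FrequencyMatchMixin):
    """Stand-in for the real view; the Django base class is omitted here."""


if __name__ == "__main__":
    view = SearchPublishedDatasetsView()
    info = {"frequency_min": 2.4e9, "frequency_max": 2.5e9}
    print(view._process_capture_for_frequency_match(info, 2.45e9, None))
```

With this layout both views inherit a single copy of the checks, so a fix to the overlap logic would only need to be made in one place.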