@LuD1161 LuD1161 commented Jan 22, 2026

Summary

This PR adds a Security Analytics platform to ShipSec Studio that enables users to index workflow output data into OpenSearch and visualize it through dashboards. This transforms raw scan outputs into actionable intelligence for security teams.

Key Features

  • Analytics Sink Component: New workflow node (core.analytics.sink) that indexes output data from any upstream node to OpenSearch

    • Supports array and object inputs with automatic bulk indexing
    • Auto-detects asset correlation keys (host, domain, subdomain, url, ip, etc.)
    • Configurable index suffix and fail-on-error modes
    • Fire-and-forget by default with retry logic (3 attempts with exponential backoff)
  • OpenSearch Integration:

    • Daily index rotation pattern: security-findings-{orgId}-{YYYY.MM.DD}
    • Index template with standard metadata fields
    • Multi-tenant data isolation per organization
  • Analytics API:

    • POST /api/analytics/query endpoint supporting OpenSearch DSL
    • Auto-scopes queries to organization's index pattern
    • Rate limiting: 100 requests/minute per user
  • Analytics Settings Page:

    • Tier-based retention configuration (Free: 30d, Pro: 90d, Enterprise: 365d)
    • Admin-only access controls
  • UI Integration:

    • "Dashboards" link in sidebar (opens OpenSearch Dashboards)
    • "Analytics Settings" page for retention configuration
    • "View Analytics" button on workflow detail page
  • Component SDK Extensions:

    • generateFindingHash() utility for deduplication
    • Workflow context (workflowId, workflowName, organizationId) passed to components
    • Results output port added to nuclei, trufflehog, and supabase-scanner components
  • Docker Compose Setup:

    • OpenSearch 2.11.1 and OpenSearch Dashboards for development
    • Health checks and persistent volumes configured

Files Changed

65 files across backend, frontend, worker, component-sdk, and documentation.

Test plan

  • Run npm run typecheck to verify no type errors
  • Run npm run lint to verify code quality
  • Start OpenSearch stack: docker compose -f docker/docker-compose.infra.yml up opensearch opensearch-dashboards -d
  • Configure environment variables (OPENSEARCH_URL, etc.)
  • Run index template setup: npm run setup:opensearch
  • Create workflow with Analytics Sink component and verify data indexed
  • Test Analytics API endpoint: POST /api/analytics/query
  • Verify Dashboards link in sidebar opens OpenSearch Dashboards
  • Verify Analytics Settings page shows retention options

🤖 Generated with Claude Code


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 42044b8c24


@LuD1161 LuD1161 force-pushed the eng-42/workflow-analytics-dashboards branch 8 times, most recently from 9360d45 to f3c553a on January 23, 2026 01:07
…ntegration

Add complete Analytics Sink infrastructure for indexing workflow outputs to OpenSearch:

**OpenSearch Infrastructure:**
- Install @opensearch-project/opensearch package
- Create OpenSearchClient with connection management and graceful degradation
- Implement SecurityAnalyticsService with indexDocument and bulkIndex methods
- Create index template for security-findings-* indices with standard mappings
- Add setup script for OpenSearch initialization (npm run setup:opensearch)
- Daily index rotation pattern: security-findings-{orgId}-{YYYY.MM.DD}
- Auto-detect asset_key from common fields (host, domain, subdomain, url, ip)
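
The index naming and asset key detection described above can be sketched as follows. This is a minimal illustration, not the PR's code: the helper names `buildIndexName` and `detectAssetKey` are hypothetical, the date is assumed to be taken in UTC, and the fields are assumed to be checked in the order the commit message lists them.

```typescript
// Hypothetical helpers; names and field priority are assumptions for illustration.
const ASSET_KEY_FIELDS = ["host", "domain", "subdomain", "url", "ip"];

// Builds security-findings-{orgId}-{YYYY.MM.DD}, using the UTC date (assumption).
function buildIndexName(orgId: string, date: Date): string {
  const day = date.toISOString().slice(0, 10).replace(/-/g, ".");
  return `security-findings-${orgId}-${day}`;
}

// Returns the first non-empty string found among the common asset fields.
function detectAssetKey(doc: Record<string, unknown>): string | undefined {
  for (const field of ASSET_KEY_FIELDS) {
    const value = doc[field];
    if (typeof value === "string" && value.length > 0) {
      return value;
    }
  }
  return undefined;
}

const indexName = buildIndexName("org1", new Date(Date.UTC(2026, 0, 23)));
console.log(indexName);
console.log(detectAssetKey({ domain: "example.com", ip: "10.0.0.1" }));
```

OpenSearch index names must be lowercase, so a real implementation would likely need to normalize `orgId` before interpolating it.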

**Analytics Sink Component:**
- Create core.analytics.sink component schema with configurable parameters
- Implement execute() handler with array and single object indexing support
- Add OpenSearchIndexer utility in worker with retry mechanism
- Retry logic: 3 attempts with exponential backoff (1s/2s/4s delays)
- Fire-and-forget by default with optional failOnError parameter
- Include workflow metadata: workflow_id, workflow_name, run_id, node_ref, component_id
- Add trace logging for successful and failed indexing operations
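
A rough sketch of the retry and fire-and-forget behaviour listed above, under stated assumptions. The names `backoffDelayMs`, `withRetry`, and `indexWithPolicy` are hypothetical, and the sketch waits only between attempts, so three attempts use the 1s and 2s delays; the commit message lists 1s/2s/4s, so the actual OpenSearchIndexer may count attempts differently.

```typescript
// Delay before the next attempt: 1s, 2s, 4s, ... for attempt 1, 2, 3, ...
function backoffDelayMs(attempt: number, baseDelayMs = 1000): number {
  return baseDelayMs * Math.pow(2, attempt - 1);
}

// Runs the operation up to maxAttempts times, sleeping between failed attempts.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        const delayMs = backoffDelayMs(attempt, baseDelayMs);
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Fire-and-forget by default: errors are logged and swallowed unless failOnError is set.
async function indexWithPolicy(
  operation: () => Promise<void>,
  failOnError = false,
): Promise<boolean> {
  try {
    await withRetry(operation);
    return true;
  } catch (error) {
    if (failOnError) throw error;
    console.warn("indexing failed after retries:", error);
    return false;
  }
}
```

With `failOnError` left at `false`, a failed indexing call resolves to `false` instead of failing the workflow node, which matches the fire-and-forget default described in the commit message.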

**Workflow Editor Integration:**
- Create OpenSearchModule as global NestJS module for dependency injection
- Add Analytics Sink to Output category in workflow editor palette
- Implement configuration panel with enhanced parameter controls:
  - assetKeyField dropdown (Auto-detect, host, domain, subdomain, url, ip, asset, target, custom)
  - customAssetKeyField with conditional visibility
  - indexSuffix parameter with placeholder showing default behavior
  - failOnError toggle for error handling behavior

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
…ucture

Add user-facing analytics features and backend API:

**Dashboard Integration:**
- Add "Dashboards" navigation item to Studio sidebar (BarChart3 icon)
- Link opens OpenSearch Dashboards in new tab with organization filtering
- Add VITE_OPENSEARCH_DASHBOARDS_URL configuration

**Analytics Query API:**
- Create POST /api/analytics/query endpoint supporting OpenSearch DSL
- Auto-scope queries to organization index pattern: security-findings-{orgId}-*
- Support query, pagination (size/from), and aggregations
- Add rate limiting with @nestjs/throttler: 100 requests/minute per user
- Redis storage support for distributed rate limiting
- Return 429 Too Many Requests when limit exceeded
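
One way the auto-scoping could work is sketched below. The function name `scopeToOrganization`, the request shape, and the `aggs` field name are assumptions; only `query`, `size`, `from`, and the index pattern come from the commit message. The key idea is that the index pattern is derived from the caller's organization on the server, never from the request body.

```typescript
// Assumed request shape; the PR's actual DTO may differ.
interface AnalyticsQueryRequest {
  query?: Record<string, unknown>;
  size?: number;
  from?: number;
  aggs?: Record<string, unknown>;
}

// Combines the client's DSL with a server-chosen, organization-scoped index pattern.
function scopeToOrganization(orgId: string, request: AnalyticsQueryRequest) {
  return {
    index: `security-findings-${orgId}-*`,
    body: {
      query: request.query ?? { match_all: {} },
      size: request.size ?? 10,
      from: request.from ?? 0,
      ...(request.aggs ? { aggs: request.aggs } : {}),
    },
  };
}

const scoped = scopeToOrganization("org-123", {
  query: { term: { "shipsec.workflow_id": "wf-1" } },
  size: 5,
});
console.log(scoped.index);
```

The `shipsec.workflow_id` field path in the usage example is illustrative; the real mapping is defined by the index template in the PR.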

**Retention Settings:**
- Create AnalyticsSettingsPage component with tier-based retention UI
- Display subscription tier (Free/Pro/Enterprise) with max retention limits
- Dropdown to select retention period based on tier (7d to 365d)
- Create organization_settings table with Drizzle ORM schema
- Implement OrganizationSettingsService for settings management
- Add GET /api/analytics/settings endpoint (returns settings + tier limits)
- Add PUT /api/analytics/settings endpoint (admin-only with validation)
- Tier limits: Free (30d), Pro (90d), Enterprise (365d)
- Add "Analytics Settings" navigation item in AppLayout
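
The validation behind the PUT endpoint might look roughly like this. It is a sketch only: `validateRetentionDays` and the tier identifiers are hypothetical names, and the 7-day minimum is inferred from the "7d to 365d" dropdown range rather than stated as a server-side rule.

```typescript
type Tier = "free" | "pro" | "enterprise";

// Tier limits as listed in the commit message.
const MAX_RETENTION_DAYS: Record<Tier, number> = {
  free: 30,
  pro: 90,
  enterprise: 365,
};

// Assumed lower bound, taken from the dropdown range.
const MIN_RETENTION_DAYS = 7;

// Returns an error message, or null when the requested retention is acceptable.
function validateRetentionDays(tier: Tier, days: number): string | null {
  if (!Number.isInteger(days)) {
    return "retention must be a whole number of days";
  }
  if (days < MIN_RETENTION_DAYS) {
    return `retention must be at least ${MIN_RETENTION_DAYS} days`;
  }
  const max = MAX_RETENTION_DAYS[tier];
  if (days > max) {
    return `retention for the ${tier} tier cannot exceed ${max} days`;
  }
  return null;
}

console.log(validateRetentionDays("free", 90));
console.log(validateRetentionDays("pro", 90));
```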

**View Analytics Feature:**
- Add "View Analytics" button to workflow detail page TopBar
- Button opens OpenSearch Dashboards with workflow_id and run_id filters
- Position between Synced and Run buttons

**Docker Setup:**
- Add OpenSearch 2.11.1 and OpenSearch Dashboards to docker-compose.infra.yml
- Configure single-node OpenSearch on ports 9200/9600
- Configure OpenSearch Dashboards on port 5601
- Add health checks and persistent volumes
- Create opensearch-init service for index pattern initialization
- Update installation documentation with setup instructions

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
@LuD1161 LuD1161 force-pushed the eng-42/workflow-analytics-dashboards branch from f3c553a to fbfc644 on January 23, 2026 01:56
Add analytics output ports to security components and implement workflow context:

**Component Output Ports:**
- Add analytics output port design specification documentation
- Restructure indexed documents with shipsec context metadata
- Serialize nested objects for OpenSearch compatibility
- Update Analytics Sink to accept structured list<json> inputs
- Add results output port to nuclei component
- Add results output port to trufflehog component
- Add results output port to supabase-scanner component
- Add analytics output port guidelines for component development

**Finding Deduplication:**
- Add finding_hash field for deduplication across workflow runs
- Implement finding_hash in Nuclei: hash(templateId + host + matchedAt)
- Implement finding_hash in TruffleHog: hash(DetectorType + Redacted + filePath)
- Implement finding_hash in Supabase Scanner: hash(check_id + projectRef + resource)
- Add generateFindingHash() utility to component SDK
- Update component development docs with SDK usage
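
To illustrate the deduplication idea, here is a minimal sketch of a stable hash over a finding's identifying fields. It is not the SDK's `generateFindingHash()`, whose signature and algorithm are not shown in this PR page; the name `sketchFindingHash`, the use of SHA-256, and the full hex digest are all assumptions.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for the SDK utility: hashes the identifying parts of a finding.
// Parts are JSON-encoded as an array so ("ab", "c") and ("a", "bc") hash differently.
function sketchFindingHash(...parts: Array<string | number | undefined>): string {
  const normalized = parts.map((part) => String(part ?? ""));
  return createHash("sha256").update(JSON.stringify(normalized)).digest("hex");
}

// Nuclei-style usage: templateId + host + matchedAt (values are made up).
const first = sketchFindingHash("cve-2021-1234", "example.com", "https://example.com/login");
const second = sketchFindingHash("cve-2021-1234", "example.com", "https://example.com/login");
console.log(first === second);
```

Because the hash depends only on the identifying fields and not on run metadata, the same finding reported by two workflow runs yields the same `finding_hash`, which is what allows deduplication across runs.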

**Workflow Context:**
- Add organization_id to analytics context for multi-tenant isolation
- Pass workflow context (workflowId, workflowName, organizationId) to components
- Update workflow runner to inject context into component execution
- Update OpenSearchIndexer to include organization_id in index pattern
- Document finding_hash and shipsec context usage
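
A sketch of how a finding might be combined with the workflow context before indexing. The metadata field names come from the commit messages, but nesting them under a `shipsec` key in this exact shape, the helper name `toIndexedDocument`, and serializing nested values with `JSON.stringify` are assumptions made for illustration.

```typescript
// Metadata fields named in the commit messages; the grouping is an assumption.
interface ShipsecContext {
  organization_id: string;
  workflow_id: string;
  workflow_name: string;
  run_id: string;
  node_ref: string;
  component_id: string;
}

// Serializes nested objects and arrays to strings, then attaches the context.
// A "shipsec" key already present on the finding would be overwritten.
function toIndexedDocument(
  finding: Record<string, unknown>,
  context: ShipsecContext,
): { [key: string]: unknown; shipsec: ShipsecContext } {
  const serialized: Record<string, unknown> = {};
  for (const key of Object.keys(finding)) {
    const value = finding[key];
    serialized[key] =
      value !== null && typeof value === "object" ? JSON.stringify(value) : value;
  }
  return { ...serialized, shipsec: context };
}

const exampleDoc = toIndexedDocument(
  { host: "a.example.com", info: { severity: "high" } },
  {
    organization_id: "org-123",
    workflow_id: "wf-1",
    workflow_name: "Nightly scan",
    run_id: "run-42",
    node_ref: "analytics-sink-1",
    component_id: "core.analytics.sink",
  },
);
console.log(exampleDoc);
```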

**Bug Fixes and Enhancements:**
- Rename _shipsec to shipsec for UI visibility (underscore-prefixed fields are hidden in OpenSearch)
- Update View Analytics button URL to include run_id filter and extend time range
- Add manual worker environment loading for PM2 configuration
- Add error sample logging to bulk indexing for debugging
- Add comprehensive workflow analytics documentation
- Replace ASCII diagram with Mermaid in documentation

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
@LuD1161 LuD1161 force-pushed the eng-42/workflow-analytics-dashboards branch 2 times, most recently from 86835cc to a80521c on January 23, 2026 02:17
The analytics integration added a 'results' port to trufflehog and nuclei
components for analytics-ready output. Updated tests to expect this new field.

Fixes:
- trufflehog: 3 tests now expect results array with scanner metadata
- nuclei: 3 tests now include results field in output validation

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
@LuD1161 LuD1161 force-pushed the eng-42/workflow-analytics-dashboards branch from 0284482 to 8c83d0b on January 23, 2026 02:39
@LuD1161 LuD1161 requested a review from betterclever January 23, 2026 02:44