From 1371f0bb9fc3f7d4c2e2e6f0e5875ad5a3206b92 Mon Sep 17 00:00:00 2001
From: Jason Hefkey
Date: Sun, 18 Jan 2026 17:13:22 -0500
Subject: [PATCH] Add CLAUDE.md with project instructions

Include UV-only requirement for package management.

Co-Authored-By: Claude Opus 4.5
---
 .gitignore |  4 ++-
 CLAUDE.md  | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 75 insertions(+), 1 deletion(-)
 create mode 100644 CLAUDE.md

diff --git a/.gitignore b/.gitignore
index 41b4384b8..1799577ec 100644
--- a/.gitignore
+++ b/.gitignore
@@ -28,4 +28,6 @@ uploads/
 
 # OS
 .DS_Store
-Thumbs.db
\ No newline at end of file
+Thumbs.db
+
+CLAUDE.local.md
\ No newline at end of file
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 000000000..e23fade2d
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,72 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+This is a Course Materials RAG (Retrieval-Augmented Generation) chatbot - a full-stack web application that answers questions about course materials using semantic search and Claude AI.
+
+## Commands
+
+**Always use UV to run the server and manage dependencies, never pip.**
+
+```bash
+# Install dependencies
+uv sync
+
+# Run the application (starts FastAPI server on port 8000)
+./run.sh
+# Or manually:
+cd backend && uv run uvicorn app:app --reload --port 8000
+
+# Web interface: http://localhost:8000
+# API docs: http://localhost:8000/docs
+```
+
+## Environment Setup
+
+Copy `.env.example` to `.env` and set `ANTHROPIC_API_KEY`.
+
+## Architecture
+
+### Query Flow
+
+```
+Frontend (script.js)
+  → POST /api/query
+  → app.py endpoint
+  → rag_system.query()
+  → ai_generator.generate_response() [calls Claude API with tools]
+  → Claude decides whether to use search_course_content tool
+  → If tool used: search_tools.py → vector_store.py → ChromaDB
+  → Response + sources returned to frontend
+```
+
+### Key Components
+
+- **rag_system.py**: Central orchestrator that coordinates all subsystems
+- **ai_generator.py**: Claude API integration with tool-use support. Claude decides when to search (tool_choice: auto)
+- **search_tools.py**: Tool definitions for Claude's function calling. `CourseSearchTool` wraps vector store searches
+- **vector_store.py**: ChromaDB wrapper with two collections:
+  - `course_catalog`: Course metadata for semantic course name resolution
+  - `course_content`: Document chunks for content search
+- **document_processor.py**: Parses course documents, identifies lessons, chunks text with sentence-aware splitting
+- **session_manager.py**: Conversation history management per session
+
+### Configuration (backend/config.py)
+
+Key settings: `CHUNK_SIZE=800`, `CHUNK_OVERLAP=100`, `MAX_RESULTS=5`, `MAX_HISTORY=2`
+
+### Data Models (backend/models.py)
+
+- `Course`: Title, link, instructor, lessons
+- `Lesson`: Number, title, link
+- `CourseChunk`: Text chunk with course/lesson metadata
+
+### Frontend
+
+Vanilla HTML/CSS/JS in `frontend/`. Uses marked.js for markdown rendering. No build step required.
+
+### Course Documents
+
+Sample documents in `docs/` are auto-loaded on server startup by `app.py` lifespan handler.
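
Note on the chunking settings: the new CLAUDE.md says `document_processor.py` "chunks text with sentence-aware splitting" and lists `CHUNK_SIZE=800` and `CHUNK_OVERLAP=100`, but the patch does not show that code. The sketch below is one plausible reading of how those two settings could interact, not the repository's implementation. The function name `chunk_text`, the regex-based sentence split, and the character-based size accounting are all assumptions made for illustration.

```python
import re

CHUNK_SIZE = 800     # value quoted in CLAUDE.md
CHUNK_OVERLAP = 100  # value quoted in CLAUDE.md


def chunk_text(text: str, chunk_size: int = CHUNK_SIZE,
               overlap: int = CHUNK_OVERLAP) -> list[str]:
    """Hypothetical sentence-aware chunker (illustration only)."""
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: list[str] = []
    current: list[str] = []
    length = 0  # characters in `current`, counting one separator per sentence
    for sentence in sentences:
        # Close the current chunk if this sentence would push it past the limit.
        if current and length + len(sentence) + 1 > chunk_size:
            chunks.append(" ".join(current))
            # Carry whole trailing sentences forward, up to `overlap` characters.
            carried: list[str] = []
            carried_len = 0
            for prev in reversed(current):
                if carried_len + len(prev) + 1 > overlap:
                    break
                carried.insert(0, prev)
                carried_len += len(prev) + 1
            current, length = carried, carried_len
        current.append(sentence)
        length += len(sentence) + 1
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because whole sentences are never split, a sentence longer than `chunk_size` (or a long sentence following carried-over text) can produce an oversized chunk in this sketch; a real implementation would need to decide how to handle that case.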