A production-ready personal AI assistant that answers questions about you using RAG (Retrieval-Augmented Generation) with optional LLM synthesis.
Perfect for: Personal websites, portfolios, professional profiles, or any scenario where you want an AI to answer questions about you.
Repository: https://github.com/ice13ball/interactive_cv
Live Demo: Ask about this project in the deployed application - it knows about its own repository!
Created using vibe coding via Windsurf
- Smart Query Detection: Automatically detects what users are asking for (intro, projects, skills, full CV)
- Dual Mode: Works without LLM (free, fast) or with LLM (natural, conversational)
- Token-Gated: Secure access with token authentication
- Token Usage Tracking: Per-token limits (default 1M tokens) to control API costs
- Rate Limited: Prevents abuse (30 requests per 10 minutes per token)
- Response Caching: 5-minute TTL reduces redundant processing
- Auto-Rebuild: Embeddings automatically rebuild when corpus files change
- Safety Filter: Detects and blocks/masks secrets and toxic content
- Non-Indexable: Robots meta tags and headers prevent search engine indexing
- Modern UI: Clean Next.js interface with prominent LLM toggle and keyboard shortcuts
- Frontend: Next.js 14 (React, TypeScript, TailwindCSS)
- Backend: FastAPI (Python 3.10+)
- Vector DB: FAISS (CPU-only, no GPU needed)
- Embeddings: SentenceTransformers (all-MiniLM-L6-v2)
- LLM (Optional): OpenAI (gpt-4o-mini) or OpenRouter (claude-3-haiku)
- Deployment: Google Cloud Run (recommended) or Render/Netlify
- Containerized deployment
- Python: 3.10 or higher
- Node.js: 18 or higher
- OS: macOS, Linux, or Windows (Apple Silicon compatible)
git clone <your-repo-url>
cd <repository-directory>

Create your personal content files in backend/app/data/corpus/:
cd backend/app/data/corpus/
# Copy example files and customize them
cp intro.example.txt intro.txt
cp cv.example.md cv.md
cp projects.example.md projects.md
cp about_by_ai.example.md about_by_ai.md # Optional
cp engineering_notes.example.md engineering_notes.md # Optional
# Edit each file with your information
vim intro.txt
vim cv.md
vim projects.md

Required files:
- intro.txt - Brief introduction (2-3 sentences)
- cv.md - Your complete CV/resume
- projects.md - Your projects portfolio
Optional files:
- about_by_ai.md - AI-written evaluation (unique perspective)
- engineering_notes.md - Technical philosophy and notes
python3 scripts/generate_token.py

Copy the generated token - you'll need it for the .env file.
# Create .env file from example
cp .env.example .env
# Edit .env and add your token
vim .env

Required settings:
ACCESS_TOKEN=<paste-your-generated-token-here>
SAFETY_ENABLE=true
SAFETY_MODE=warn

Optional LLM settings (leave empty for retrieval-only mode):
LLM_PROVIDER=openai # or 'openrouter' or 'none'
OPENAI_API_KEY=sk-... # if using OpenAI
OPENROUTER_API_KEY=sk-...   # if using OpenRouter

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -U pip
pip install -e backend

# The embeddings will build automatically on first startup
# Or build manually:
PYTHONPATH=backend python scripts/make_embeddings.py

source .venv/bin/activate
uvicorn app.main:app --host 0.0.0.0 --port 8000

The backend will:
- Check whether the corpus files have changed
- Automatically rebuild embeddings if needed (see the sketch below)
- Start on http://localhost:8000
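The change check can be as simple as fingerprinting the corpus directory and comparing it against a fingerprint stored alongside the FAISS index. A minimal sketch, with hypothetical paths and helper names (the repository's actual startup logic may differ):

```python
import hashlib
from pathlib import Path

# Hypothetical locations; the project keeps its index under backend/app/data/vectors/.
CORPUS_DIR = Path("backend/app/data/corpus")
FINGERPRINT_FILE = Path("backend/app/data/vectors/corpus.sha256")

def corpus_fingerprint() -> str:
    """Hash every corpus file's name and contents into a single digest."""
    digest = hashlib.sha256()
    for path in sorted(CORPUS_DIR.iterdir()):
        if path.suffix in {".txt", ".md"}:
            digest.update(path.name.encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()

def needs_rebuild() -> bool:
    """True if the corpus changed since the embeddings were last built."""
    previous = FINGERPRINT_FILE.read_text() if FINGERPRINT_FILE.exists() else ""
    return corpus_fingerprint() != previous
```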
# In a new terminal
cd frontend
npm install
npm run dev

Frontend will start on http://localhost:3000
Open your browser:
http://localhost:3000?token=<your-access-token>
Brief, punchy introduction (2-3 sentences):
Your Name is a [role] who [what you do].
You [your approach/methodology].
Complete CV with sections:
- Professional Summary
- Key Competencies
- Professional Experience
- Education
- Languages
- Personal Interests
Project portfolio with this structure for each project:
## Project Name - Brief Description
### Problem
What problem does it solve?
### Solution
How does it solve it?
### Stack
Technologies used
### Role
Your responsibilities
### Outcome
Results and impact

AI-written evaluation with:
- Who I Am
- Strengths
- Weaknesses
- Operating Principles
Technical philosophy:
- Technical Approach
- Preferred Technologies
- Design Principles
- Best Practices
The system intelligently detects what users are asking for:
| Query | Returns | Mode |
|---|---|---|
| "Who is [Name]?" | intro.txt + about_by_ai.md | Introduction |
| "What projects are you working on?" | projects.md | Projects list |
| "What are your skills?" | CV skills section | Skills |
| "Show me your full CV" | Complete cv.md | Full document |
| "Tell me about [specific project]" | RAG search | Specific info |
# Required
ACCESS_TOKEN=your-secure-token-here
# Safety Filter
SAFETY_ENABLE=true
SAFETY_MODE=warn # or 'block' or 'mask'
# Embeddings
EMBED_MODEL=sentence-transformers/all-MiniLM-L6-v2
DOCS_DIR=backend/app/data/corpus
FAISS_INDEX_PATH=backend/app/data/vectors/index.faiss
# LLM (Optional - leave empty for retrieval-only)
LLM_PROVIDER=none # or 'openai' or 'openrouter'
OPENAI_API_KEY=
OPENROUTER_API_KEY=
# Rate Limiting
RATE_LIMIT_WINDOW=600 # seconds
RATE_LIMIT_MAX=30 # requests per window
# Frontend
NEXT_PUBLIC_API_URL=http://localhost:8000

Safety filter modes:
- warn: Shows warnings but allows content through
- mask: Replaces detected secrets/toxicity with ***
- block: Blocks requests entirely if issues are detected
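A minimal sketch of reading these variables at startup, assuming plain `os.getenv` with the defaults listed above (illustrative; the project's actual settings live in backend/app/config.py and may differ):

```python
import os

class Settings:
    """Illustrative settings reader mirroring the .env variables above."""
    access_token: str = os.environ["ACCESS_TOKEN"]                     # required
    safety_enable: bool = os.getenv("SAFETY_ENABLE", "true") == "true"
    safety_mode: str = os.getenv("SAFETY_MODE", "warn")                # warn | mask | block
    embed_model: str = os.getenv("EMBED_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
    llm_provider: str = os.getenv("LLM_PROVIDER", "none")              # none | openai | openrouter
    rate_limit_window: int = int(os.getenv("RATE_LIMIT_WINDOW", "600"))
    rate_limit_max: int = int(os.getenv("RATE_LIMIT_MAX", "30"))

settings = Settings()
```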
pip install -e "backend[dev]"
pytest -v

cd frontend
npm install
npm test

# Health check
curl http://localhost:8000/health
# Readiness check
curl http://localhost:8000/health/ready
# Chat (replace TOKEN)
curl -X POST http://localhost:8000/chat \
-H "Content-Type: application/json" \
-H "X-Access-Token: YOUR_TOKEN" \
-d '{"q":"Who are you?","useLLM":false}'
# Cache stats
curl http://localhost:8000/chat/cache/stats \
-H "X-Access-Token: YOUR_TOKEN"- Create new Web Service on Render
- Create a new Web Service on Render
- Connect your repository
- Configure:
  - Build Command: pip install -e backend
  - Start Command: uvicorn app.main:app --host 0.0.0.0 --port 10000
  - Environment: Add all variables from .env
- Deploy
- Create new site on Netlify
- Connect your repository
- Configure:
  - Base directory: frontend
  - Build command: npm ci && npm run build
  - Publish directory: .next
- Add environment variable: NEXT_PUBLIC_API_URL=https://your-render-api.onrender.com
- Deploy
The application logs extensively:
- Request timing
- RAG search results and scores
- LLM call success/failures
- Cache hits/misses
- Safety filter findings
- Rate limit warnings
- /health - Basic health check
- /health/ready - Readiness check (verifies index and model loaded)
- /chat/cache/stats - View cache size and configuration
Always use the provided script to generate secure tokens:
python3 scripts/generate_token.py
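The core of such a script is typically just a cryptographically secure random string; a sketch of what it plausibly does, using Python's standard secrets module (the actual script may differ):

```python
import secrets

# Print a URL-safe token with 32 bytes of entropy.
print(secrets.token_urlsafe(32))
```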
The safety filter detects:
- Secrets: API keys, tokens, private keys, AWS credentials, etc. (see the sketch below)
- Toxicity: Offensive language and toxic content
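Secret detection is usually regex-based; a minimal illustrative sketch (the project's real rules live in backend/app/safety_filter.py and cover more credential formats):

```python
import re

# Illustrative patterns only.
SECRET_PATTERNS = {
    "openai_key": re.compile(r"sk-[A-Za-z0-9_-]{20,}"),
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

def find_secrets(text: str) -> list[str]:
    """Return the names of the secret types detected in the text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]
```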
- In-memory rate limiting (resets on restart)
- 30 requests per 10 minutes per token (configurable)
- Returns 429 status when exceeded (see the sketch below)
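A minimal sketch of this kind of in-memory, sliding-window limiter, using the defaults above (illustrative; the real middleware lives under backend/app/middleware/):

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 600  # RATE_LIMIT_WINDOW
MAX_REQUESTS = 30     # RATE_LIMIT_MAX

# Per-token request timestamps; in-memory, so limits reset on restart.
_requests: dict[str, deque] = defaultdict(deque)

def allow_request(token: str) -> bool:
    """True if this token may make another request; otherwise respond with HTTP 429."""
    now = time.monotonic()
    timestamps = _requests[token]
    while timestamps and now - timestamps[0] > WINDOW_SECONDS:
        timestamps.popleft()  # drop requests that fell out of the window
    if len(timestamps) >= MAX_REQUESTS:
        return False
    timestamps.append(now)
    return True
```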
- Robots meta tags on all pages
- X-Robots-Tag headers
- /robots.txt returns Disallow: /
- Create a new .md or .txt file in backend/app/data/corpus/
- Restart the application (embeddings rebuild automatically)
- Content is now searchable
Edit frontend/app/page.tsx:
const SUGGESTIONS = [
'Your custom question 1',
'Your custom question 2',
'Your custom question 3',
]

Edit backend/app/llm.py:
"max_tokens": 1200, # Increase for longer responsesEdit backend/app/routes/chat.py to add new keyword detection:
custom_keywords = ["keyword1", "keyword2"]
is_custom_request = any(keyword in query_lower for keyword in custom_keywords)

# Manually rebuild
PYTHONPATH=backend python scripts/make_embeddings.py
# Or use API
curl -X POST http://localhost:8000/embed/rebuild \
-H "X-Access-Token: YOUR_TOKEN"- Check
NEXT_PUBLIC_API_URLin frontend.envor Netlify settings - Verify backend is running on correct port
- Check CORS settings in
backend/app/main.py
- Rate limits are in-memory and reset on restart
- Adjust RATE_LIMIT_MAX and RATE_LIMIT_WINDOW in .env
- Verify LLM_PROVIDER is set correctly
- Check API key is valid
- Ensure you have API credits
- Check logs for error messages
<repository-name>/
├── backend/
│   ├── app/
│   │   ├── main.py              # FastAPI app
│   │   ├── config.py            # Configuration
│   │   ├── rag.py               # RAG logic
│   │   ├── llm.py               # LLM integration
│   │   ├── safety_filter.py     # Safety checks
│   │   ├── cache.py             # Response caching
│   │   ├── logger.py            # Logging utilities
│   │   ├── middleware/          # Middleware
│   │   ├── routes/              # API routes
│   │   └── data/
│   │       ├── corpus/          # Your content (gitignored)
│   │       └── vectors/         # FAISS index (gitignored)
│   └── pyproject.toml
├── frontend/
│   ├── app/
│   │   ├── page.tsx             # Main UI
│   │   ├── layout.tsx           # Layout
│   │   └── globals.css          # Styles
│   └── package.json
├── scripts/
│   ├── generate_token.py        # Token generator
│   └── make_embeddings.py       # Manual rebuild
├── tests/                       # Backend tests
├── .env.example                 # Environment template
├── .gitignore                   # Git ignore rules
└── README.md                    # This file
- New API Endpoint: Add to backend/app/routes/
- New Middleware: Add to backend/app/middleware/
- Frontend Component: Add to frontend/app/ or frontend/components/
- Tests: Add to tests/ for backend, frontend/__tests__/ for frontend
- Backend: Render Free (750 hours/month)
- Frontend: Netlify Free (100GB bandwidth)
- Total: $0/month
- Input: $0.150 per 1M tokens
- Output: $0.600 per 1M tokens
- Typical query: ~$0.0007
- 1000 queries: ~$0.70
- Monthly (moderate use): $5-20
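For example, a query that sends roughly 2,000 prompt tokens and returns 700 completion tokens costs about 2,000 × $0.15/1M + 700 × $0.60/1M ≈ $0.0007, which is where the per-query and 1,000-query figures above come from (actual token counts vary with corpus and question length).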
- Similar pricing to OpenAI
- Slightly different token costs
- Check OpenRouter for current rates
MIT License - See LICENSE file for details
This is a personal project template. Feel free to:
- Fork and customize for your own use
- Submit issues for bugs
- Suggest improvements via pull requests
- FAISS: Facebook AI Similarity Search
- SentenceTransformers: Sentence embeddings library
- FastAPI: Modern Python web framework
- Next.js: React framework
- OpenAI/OpenRouter: LLM providers
Made with ❤️ for personal AI assistants
For questions or issues, please open a GitHub issue.