A production-ready personal AI assistant that answers questions about you using RAG (Retrieval-Augmented Generation) with optional LLM synthesis.
Perfect for: Personal websites, portfolios, professional profiles, or any scenario where you want an AI to answer questions about you.
Repository: https://github.com/ice13ball/interactive_cv
Live Demo: Ask about this project in the deployed application - it knows about its own repository!
Created using vibe coding via Windsurf
- Smart Query Detection: Automatically detects what users are asking for (intro, projects, skills, full CV)
- Dual Mode: Works without LLM (free, fast) or with LLM (natural, conversational)
- Token-Gated: Secure access with token authentication
- Token Usage Tracking: Per-token limits (default 1M tokens) to control API costs
- Rate Limited: Prevents abuse (30 requests per 10 minutes per token)
- Response Caching: 5-minute TTL reduces redundant processing
- Auto-Rebuild: Embeddings automatically rebuild when corpus files change
- Safety Filter: Detects and blocks/masks secrets and toxic content
- Non-Indexable: Robots meta tags and headers prevent search engine indexing
- Modern UI: Clean Next.js interface with prominent LLM toggle and keyboard shortcuts
- Frontend: Next.js 14 (React, TypeScript, TailwindCSS)
- Backend: FastAPI (Python 3.10+)
- Vector DB: FAISS (CPU-only, no GPU needed)
- Embeddings: SentenceTransformers (all-MiniLM-L6-v2)
- LLM (Optional): OpenAI (gpt-4o-mini) or OpenRouter (claude-3-haiku)
- Deployment: Google Cloud Run (recommended) or Render/Netlify
- Containerized deployment
- Python: 3.10 or higher
- Node.js: 18 or higher
- OS: macOS, Linux, or Windows (Apple Silicon compatible)
git clone <your-repo-url>
cd <repository-directory>

Create your personal content files in backend/app/data/corpus/:
cd backend/app/data/corpus/
# Copy example files and customize them
cp intro.example.txt intro.txt
cp cv.example.md cv.md
cp projects.example.md projects.md
cp about_by_ai.example.md about_by_ai.md # Optional
cp engineering_notes.example.md engineering_notes.md # Optional
# Edit each file with your information
vim intro.txt
vim cv.md
vim projects.md

Required files:
- intro.txt - Brief introduction (2-3 sentences)
- cv.md - Your complete CV/resume
- projects.md - Your projects portfolio
Optional files:
- about_by_ai.md - AI-written evaluation (unique perspective)
- engineering_notes.md - Technical philosophy and notes
python3 scripts/generate_token.py

Copy the generated token - you'll need it for the .env file.
# Create .env file from example
cp .env.example .env
# Edit .env and add your token
vim .env

Required settings:
ACCESS_TOKEN=<paste-your-generated-token-here>
SAFETY_ENABLE=true
SAFETY_MODE=warn

Optional LLM settings (leave empty for retrieval-only mode):
LLM_PROVIDER=openai # or 'openrouter' or 'none'
OPENAI_API_KEY=sk-... # if using OpenAI
OPENROUTER_API_KEY=sk-...   # if using OpenRouter

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -U pip
pip install -e backend

# The embeddings will build automatically on first startup
# Or build manually:
PYTHONPATH=backend python scripts/make_embeddings.py

source .venv/bin/activate
uvicorn app.main:app --host 0.0.0.0 --port 8000

The backend will:
- Check whether the corpus files have changed
- Automatically rebuild embeddings if needed (see the sketch below)
- Start on http://localhost:8000
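The change check can be as simple as fingerprinting the corpus directory and comparing it against a fingerprint stored alongside the FAISS index. A minimal sketch, with hypothetical paths and helper names (the repository's actual startup logic may differ):

```python
import hashlib
from pathlib import Path

# Hypothetical locations; the project keeps its index under backend/app/data/vectors/.
CORPUS_DIR = Path("backend/app/data/corpus")
FINGERPRINT_FILE = Path("backend/app/data/vectors/corpus.sha256")

def corpus_fingerprint() -> str:
    """Hash every corpus file's name and contents into a single digest."""
    digest = hashlib.sha256()
    for path in sorted(CORPUS_DIR.iterdir()):
        if path.suffix in {".txt", ".md"}:
            digest.update(path.name.encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()

def needs_rebuild() -> bool:
    """True if the corpus changed since the embeddings were last built."""
    previous = FINGERPRINT_FILE.read_text() if FINGERPRINT_FILE.exists() else ""
    return corpus_fingerprint() != previous
```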
# In a new terminal
cd frontend
npm install
npm run dev

Frontend will start on http://localhost:3000
Open your browser:
http://localhost:3000?token=<your-access-token>
Brief, punchy introduction (2-3 sentences):
Your Name is a [role] who [what you do].
You [your approach/methodology].
Complete CV with sections:
- Professional Summary
- Key Competencies
- Professional Experience
- Education
- Languages
- Personal Interests
Project portfolio with this structure for each project:
## Project Name - Brief Description
### Problem
What problem does it solve?
### Solution
How does it solve it?
### Stack
Technologies used
### Role
Your responsibilities
### Outcome
Results and impact

AI-written evaluation with:
- Who I Am
- Strengths
- Weaknesses
- Operating Principles
Technical philosophy:
- Technical Approach
- Preferred Technologies
- Design Principles
- Best Practices
The system intelligently detects what users are asking for:
| Query | Returns | Mode |
|---|---|---|
| "Who is [Name]?" | intro.txt + about_by_ai.md | Introduction |
| "What projects are you working on?" | projects.md | Projects list |
| "What are your skills?" | CV skills section | Skills |
| "Show me your full CV" | Complete cv.md | Full document |
| "Tell me about [specific project]" | RAG search | Specific info |
# Required
ACCESS_TOKEN=your-secure-token-here
# Safety Filter
SAFETY_ENABLE=true
SAFETY_MODE=warn # or 'block' or 'mask'
# Embeddings
EMBED_MODEL=sentence-transformers/all-MiniLM-L6-v2
DOCS_DIR=backend/app/data/corpus
FAISS_INDEX_PATH=backend/app/data/vectors/index.faiss
# LLM (Optional - leave empty for retrieval-only)
LLM_PROVIDER=none # or 'openai' or 'openrouter'
OPENAI_API_KEY=
OPENROUTER_API_KEY=
# Rate Limiting
RATE_LIMIT_WINDOW=600 # seconds
RATE_LIMIT_MAX=30 # requests per window
# Frontend
NEXT_PUBLIC_API_URL=http://localhost:8000

Safety filter modes:
- warn: Shows warnings but allows content through
- mask: Replaces detected secrets/toxicity with ***
- block: Blocks requests entirely if issues are detected
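A minimal sketch of reading these variables at startup, assuming plain `os.getenv` with the defaults listed above (illustrative; the project's actual settings live in backend/app/config.py and may differ):

```python
import os

class Settings:
    """Illustrative settings reader mirroring the .env variables above."""
    access_token: str = os.environ["ACCESS_TOKEN"]                     # required
    safety_enable: bool = os.getenv("SAFETY_ENABLE", "true") == "true"
    safety_mode: str = os.getenv("SAFETY_MODE", "warn")                # warn | mask | block
    embed_model: str = os.getenv("EMBED_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
    llm_provider: str = os.getenv("LLM_PROVIDER", "none")              # none | openai | openrouter
    rate_limit_window: int = int(os.getenv("RATE_LIMIT_WINDOW", "600"))
    rate_limit_max: int = int(os.getenv("RATE_LIMIT_MAX", "30"))

settings = Settings()
```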
pip install -e "backend[dev]"
pytest -v

cd frontend
npm install
npm test

# Health check
curl http://localhost:8000/health
# Readiness check
curl http://localhost:8000/health/ready
# Chat (replace TOKEN)
curl -X POST http://localhost:8000/chat \
-H "Content-Type: application/json" \
-H "X-Access-Token: YOUR_TOKEN" \
-d '{"q":"Who are you?","useLLM":false}'
# Cache stats
curl http://localhost:8000/chat/cache/stats \
-H "X-Access-Token: YOUR_TOKEN"- Create new Web Service on Render
- Create a new Web Service on Render
- Connect your repository
- Configure:
  - Build Command: pip install -e backend
  - Start Command: uvicorn app.main:app --host 0.0.0.0 --port 10000
  - Environment: Add all variables from .env
- Deploy
- Create new site on Netlify
- Connect your repository
- Configure:
  - Base directory: frontend
  - Build command: npm ci && npm run build
  - Publish directory: .next
- Add environment variable: NEXT_PUBLIC_API_URL=https://your-render-api.onrender.com
- Deploy
The application logs extensively:
- Request timing
- RAG search results and scores
- LLM call success/failures
- Cache hits/misses
- Safety filter findings
- Rate limit warnings
- /health - Basic health check
- /health/ready - Readiness check (verifies index and model loaded)
- /chat/cache/stats - View cache size and configuration
Always use the provided script to generate secure tokens:
python3 scripts/generate_token.py
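The core of such a script is typically just a cryptographically secure random string; a sketch of what it plausibly does, using Python's standard secrets module (the actual script may differ):

```python
import secrets

# Print a URL-safe token with 32 bytes of entropy.
print(secrets.token_urlsafe(32))
```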
The safety filter detects:
- Secrets: API keys, tokens, private keys, AWS credentials, etc. (see the sketch below)
- Toxicity: Offensive language and toxic content
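Secret detection is usually regex-based; a minimal illustrative sketch (the project's real rules live in backend/app/safety_filter.py and cover more credential formats):

```python
import re

# Illustrative patterns only.
SECRET_PATTERNS = {
    "openai_key": re.compile(r"sk-[A-Za-z0-9_-]{20,}"),
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

def find_secrets(text: str) -> list[str]:
    """Return the names of the secret types detected in the text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]
```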
- In-memory rate limiting (resets on restart)
- 30 requests per 10 minutes per token (configurable)
- Returns 429 status when exceeded (see the sketch below)
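A minimal sketch of this kind of in-memory, sliding-window limiter, using the defaults above (illustrative; the real middleware lives under backend/app/middleware/):

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 600  # RATE_LIMIT_WINDOW
MAX_REQUESTS = 30     # RATE_LIMIT_MAX

# Per-token request timestamps; in-memory, so limits reset on restart.
_requests: dict[str, deque] = defaultdict(deque)

def allow_request(token: str) -> bool:
    """True if this token may make another request; otherwise respond with HTTP 429."""
    now = time.monotonic()
    timestamps = _requests[token]
    while timestamps and now - timestamps[0] > WINDOW_SECONDS:
        timestamps.popleft()  # drop requests that fell out of the window
    if len(timestamps) >= MAX_REQUESTS:
        return False
    timestamps.append(now)
    return True
```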
- Robots meta tags on all pages
- X-Robots-Tag headers
- /robots.txt returns Disallow: /
- Create a new .md or .txt file in backend/app/data/corpus/
- Restart the application (embeddings rebuild automatically)
- Content is now searchable
Edit frontend/app/page.tsx:
const SUGGESTIONS = [
'Your custom question 1',
'Your custom question 2',
'Your custom question 3',
]

Edit backend/app/llm.py:
"max_tokens": 1200, # Increase for longer responsesEdit backend/app/routes/chat.py to add new keyword detection:
custom_keywords = ["keyword1", "keyword2"]
is_custom_request = any(keyword in query_lower for keyword in custom_keywords)

# Manually rebuild
PYTHONPATH=backend python scripts/make_embeddings.py
# Or use API
curl -X POST http://localhost:8000/embed/rebuild \
-H "X-Access-Token: YOUR_TOKEN"- Check
NEXT_PUBLIC_API_URLin frontend.envor Netlify settings - Verify backend is running on correct port
- Check CORS settings in
backend/app/main.py
- Rate limits are in-memory and reset on restart
- Adjust RATE_LIMIT_MAX and RATE_LIMIT_WINDOW in .env
- Verify LLM_PROVIDER is set correctly
- Check API key is valid
- Ensure you have API credits
- Check logs for error messages
<repository-name>/
├── backend/
│   ├── app/
│   │   ├── main.py              # FastAPI app
│   │   ├── config.py            # Configuration
│   │   ├── rag.py               # RAG logic
│   │   ├── llm.py               # LLM integration
│   │   ├── safety_filter.py     # Safety checks
│   │   ├── cache.py             # Response caching
│   │   ├── logger.py            # Logging utilities
│   │   ├── middleware/          # Middleware
│   │   ├── routes/              # API routes
│   │   └── data/
│   │       ├── corpus/          # Your content (gitignored)
│   │       └── vectors/         # FAISS index (gitignored)
│   └── pyproject.toml
├── frontend/
│   ├── app/
│   │   ├── page.tsx             # Main UI
│   │   ├── layout.tsx           # Layout
│   │   └── globals.css          # Styles
│   └── package.json
├── scripts/
│   ├── generate_token.py        # Token generator
│   └── make_embeddings.py       # Manual rebuild
├── tests/                       # Backend tests
├── .env.example                 # Environment template
├── .gitignore                   # Git ignore rules
└── README.md                    # This file
- New API Endpoint: Add to backend/app/routes/
- New Middleware: Add to backend/app/middleware/
- Frontend Component: Add to frontend/app/ or frontend/components/
- Tests: Add to tests/ for backend, frontend/__tests__/ for frontend
- Backend: Render Free (750 hours/month)
- Frontend: Netlify Free (100GB bandwidth)
- Total: $0/month
- Input: $0.150 per 1M tokens
- Output: $0.600 per 1M tokens
- Typical query: ~$0.0007
- 1000 queries: ~$0.70
- Monthly (moderate use): $5-20
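For example, a query that sends roughly 2,000 prompt tokens and returns 700 completion tokens costs about 2,000 × $0.15/1M + 700 × $0.60/1M ≈ $0.0007, which is where the per-query and 1,000-query figures above come from (actual token counts vary with corpus and question length).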
- Similar pricing to OpenAI
- Slightly different token costs
- Check OpenRouter for current rates
MIT License - See LICENSE file for details
This is a personal project template. Feel free to:
- Fork and customize for your own use
- Submit issues for bugs
- Suggest improvements via pull requests
- FAISS: Facebook AI Similarity Search
- SentenceTransformers: Sentence embeddings library
- FastAPI: Modern Python web framework
- Next.js: React framework
- OpenAI/OpenRouter: LLM providers
Made with ❤️ for personal AI assistants
For questions or issues, please open a GitHub issue.