Portfolio RAG system / AI product artefact
How the assistant works.
A small portfolio does not strictly need RAG. This one uses it deliberately, because the system is part of the work: source modelling, retrieval, bot protection, privacy, model routing, and answer grounding are all visible AI product-design decisions.
01 / Source model
Markdown stays canonical, even when the model answers.
The assistant indexes reviewed Markdown sources, not the whole repository. Each source has frontmatter for title, type, tags, keywords, aliases, confidence, visibility, reviewed date, and public citation URL.
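As a rough illustration of what a reviewed source might look like to the build step, the sketch below splits `---`-delimited frontmatter from the Markdown body and checks that the listed fields are present. The key names (`reviewed`, `citation_url`, and so on) are assumptions made for this example; the page names the fields but not their exact spelling, and the real build may well use a full YAML parser rather than this minimal one.

```python
from dataclasses import dataclass

# Hypothetical key names; the page lists the fields but not their exact spelling.
REQUIRED_KEYS = (
    "title", "type", "tags", "keywords", "aliases",
    "confidence", "visibility", "reviewed", "citation_url",
)


@dataclass
class Source:
    meta: dict
    body: str


def parse_value(raw):
    """Parse a scalar or a simple inline list such as [a, b, c]."""
    raw = raw.strip()
    if raw.startswith("[") and raw.endswith("]"):
        return [item.strip() for item in raw[1:-1].split(",") if item.strip()]
    return raw


def parse_source(text):
    """Split '---' delimited frontmatter from the Markdown body and validate it."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("source is missing frontmatter")
    try:
        end = lines.index("---", 1)
    except ValueError:
        raise ValueError("frontmatter is not closed") from None
    meta = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        # Split on the first colon only, so URL values keep their own colons.
        key, _, value = line.partition(":")
        meta[key.strip()] = parse_value(value)
    missing = [key for key in REQUIRED_KEYS if key not in meta]
    if missing:
        raise ValueError(f"missing frontmatter keys: {missing}")
    return Source(meta=meta, body="\n".join(lines[end + 1:]).strip())
```

Rejecting a source with missing fields at build time keeps unreviewed or uncitable material out of the index instead of relying on the model to ignore it.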
Raw PDFs are not linked from public HTML and are not treated as public citations. CV material should be extracted into reviewed Markdown or HTML before it becomes part of the assistant corpus.
- Sources: docs/andy-knowledge
- Built index: data/rag/index.json
- Public citations: HTML pages only
02 / Retrieval
Small corpus, real retrieval discipline.
The build script chunks each source by H2 section and adds contextual prefixes so every retrieved chunk can stand alone. Long sections can split again at H3.
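The chunking described above could be sketched roughly as follows. This is not the repository's build script: the `Chunk` type, the `max_chars` threshold for what counts as a "long" section, and the `Title > H2 > H3` prefix format are all assumptions chosen for illustration, and the sketch ignores edge cases such as `##` lines inside fenced code.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    heading_path: list
    text: str


def split_on_heading(markdown, level):
    """Return (heading, body) pairs for headings with exactly `level` hashes.

    Text before the first matching heading is returned with heading None.
    """
    pattern = re.compile(rf"^{'#' * level} +(.+)$", re.MULTILINE)
    matches = list(pattern.finditer(markdown))
    sections = []
    preamble = markdown[: matches[0].start()] if matches else markdown
    if preamble.strip():
        sections.append((None, preamble.strip()))
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        sections.append((match.group(1).strip(), markdown[match.end():end].strip()))
    return sections


def chunk_source(title, markdown, max_chars=1200):
    """Chunk by H2, splitting again at H3 when a section exceeds max_chars."""
    chunks = []
    for h2, body in split_on_heading(markdown, 2):
        parts = [(None, body)] if len(body) <= max_chars else split_on_heading(body, 3)
        for h3, text in parts:
            if not text:
                continue
            # Contextual prefix: document title plus heading trail, so the
            # chunk still makes sense when retrieved on its own.
            path = [part for part in (title, h2, h3) if part]
            chunks.append(Chunk(path, f"{' > '.join(path)}\n\n{text}"))
    return chunks
```

Because the prefix travels with the chunk text, both the sparse and dense indexes see the document and heading context, not just the section body.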
At runtime, the API retrieves sparse BM25 matches and dense-vector matches, then combines the two rankings with reciprocal rank fusion. The initial index uses deterministic, locally computed lexical vectors so the repository can be validated without secrets; Gemini embeddings can be enabled for production builds.
1. Parse frontmatter and Markdown source files.
2. Chunk by section with document and heading context.
3. Build sparse terms and dense vectors.
4. Fuse retrieval results and send compact context to Gemini.
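The fusion step can be sketched in a few lines. Reciprocal rank fusion scores each chunk as the sum of `1 / (k + rank)` over every ranking it appears in, so it needs only rank positions, not comparable scores. The function name, the `top_n` cut-off, and `k = 60` (a commonly used default) are assumptions for this example; the page does not state the values the assistant uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=5):
    """Fuse several best-first rankings of chunk ids into one list.

    `rankings` is a list of lists, for example one from BM25 and one
    from dense-vector search. Ties are broken by chunk id so the
    output is deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [chunk_id for chunk_id, _ in ordered[:top_n]]
```

A chunk that ranks reasonably well in both lists tends to beat one that ranks first in only a single list, which is useful when BM25 and the dense vectors disagree.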
03 / Security
The browser never gets the API key.
The homepage calls same-origin API routes. Anonymous questions are verified with Cloudflare Turnstile and capped at a very small quota before Gemini is called. Unlocking Clyde creates a server-signed session with a higher quota, while Redis-backed limits keep abuse controls consistent across serverless invocations.
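A minimal sketch of the two server-side pieces described here, a signed session and a tiered quota, might look like the following. It is illustrative only: the token format, the quota numbers, the one-day window, and the in-memory `QuotaStore` (standing in for Redis counters) are all assumptions, and Turnstile verification is left out.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical limits; the page says only "very small" and "higher".
QUOTAS = {"anonymous": 3, "unlocked": 30}
WINDOW_SECONDS = 24 * 60 * 60


def sign_session(payload, secret):
    """Encode the payload and append an HMAC-SHA256 signature."""
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode())
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{signature}"


def verify_session(token, secret, now=None):
    """Return the payload if the signature is valid and unexpired, else None."""
    body, sep, signature = token.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body))
    now = time.time() if now is None else now
    return payload if payload.get("exp", 0) > now else None


class QuotaStore:
    """In-memory stand-in for shared counters such as Redis keys with an expiry."""

    def __init__(self):
        self._counts = {}

    def hit(self, key, now):
        bucket = (key, int(now // WINDOW_SECONDS))
        self._counts[bucket] = self._counts.get(bucket, 0) + 1
        return self._counts[bucket]


def allow_request(store, client_id, session, now):
    """Count the request against the caller's tier and report whether it fits."""
    tier = "unlocked" if session else "anonymous"
    return store.hit(f"{tier}:{client_id}", now) <= QUOTAS[tier]
```

In a serverless deployment the counter has to live in a shared store, because each invocation may start with empty memory; that is the role the page assigns to Redis.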
The Gemini API key, Turnstile secret, Redis token, session secret, and password hash live in deployment environment variables. The browser only receives public configuration such as the Turnstile site key and source metadata.
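One way to keep that boundary explicit is to load every value on the server and pass the browser only an allow-listed subset. The environment variable names below are hypothetical; the page names the secrets but not the keys they are stored under.

```python
import os

# Hypothetical variable names, chosen for this example only.
SECRET_KEYS = (
    "GEMINI_API_KEY", "TURNSTILE_SECRET", "REDIS_TOKEN",
    "SESSION_SECRET", "PASSWORD_HASH",
)
PUBLIC_KEYS = ("TURNSTILE_SITE_KEY",)


def load_server_config(env=os.environ):
    """Read all required values, failing fast if any are missing."""
    required = SECRET_KEYS + PUBLIC_KEYS
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError(f"missing environment variables: {missing}")
    return {key: env[key] for key in required}


def public_config(config):
    """Return only the values that are safe to send to the browser."""
    return {key: config[key] for key in PUBLIC_KEYS}
```

Building the browser payload from an allow-list, not by removing known secrets, means a newly added secret stays private by default.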
This is why GitHub Pages alone is not enough for the assistant. GitHub Pages can still host static HTML, but Gemini-backed answers require an API proxy on Vercel, Netlify, Cloudflare, or another backend-capable platform.
First deployment target
The repository includes Vercel-compatible API functions and a local Node preview server. The recommended low-cost deployment pairs Vercel with free Cloudflare Turnstile and an Upstash Redis REST database.