Overview
FastRAG is a production-ready RAG (Retrieval-Augmented Generation) starter kit built with Next.js, LangChain, Pinecone, and OpenAI. It saves you 40+ hours of setting up vector ingestion pipelines, handling streamed responses, and managing context windows.
Prerequisites
Make sure you have accounts and API keys ready before starting. All services have free tiers, though OpenAI API usage requires prepaid credit (see below).
OpenAI API Key (Required): Used for embeddings and chat completions. Requires $5 of credit added to the account; a new API key alone isn't enough.
Pinecone API Key (Required): Vector database for storing and querying embeddings. The free Starter plan is sufficient.
Browserless.io Token (Optional): Headless browser for URL scraping. Only needed if you use the web scraping feature.
Installation
Clone or unzip the project
If you have GitHub repo access:
git clone fastrag.git
cd fastrag
Or unzip the downloaded file and open the folder in your terminal.
Install dependencies
npm install
If that fails with peer dependency conflicts, retry with:
npm install --legacy-peer-deps
Environment Setup
Rename .env.example to .env.local and fill in your keys:
# OpenAI — platform.openai.com/api-keys
OPENAI_API_KEY=sk-proj-...
# Pinecone — app.pinecone.io
PINECONE_API_KEY=pc-sk-...
# Must match the index name you create in Pinecone
PINECONE_INDEX=fast-rag
# Optional: only needed for URL scraping
BROWSERLESS_TOKEN=your-token-here
OPENAI_API_KEY (Required): Powers both text-embedding-3-small (for ingestion) and GPT-3.5-turbo (for chat).
PINECONE_API_KEY (Required): Used to upsert and query your vector index.
PINECONE_INDEX (Required): Must exactly match the index name in your Pinecone dashboard. Case-sensitive: 'fast-rag' ≠ 'Fast-RAG'.
BROWSERLESS_TOKEN (Optional): Powers the headless Chromium instance for scraping JS-rendered sites. Skip if you don't use URL ingestion.
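A missing or blank key usually only surfaces on the first API call, so it can help to check the environment up front. The following is a minimal sketch of such a check; `checkEnv` and `findMissingEnv` are hypothetical helpers and are not part of FastRAG itself.

```javascript
// Hypothetical helper: FastRAG's own startup checks, if any, may differ.
const REQUIRED_VARS = ["OPENAI_API_KEY", "PINECONE_API_KEY", "PINECONE_INDEX"];
const OPTIONAL_VARS = ["BROWSERLESS_TOKEN"];

// Returns the names of variables that are unset or blank.
function findMissingEnv(env, names) {
  return names.filter((name) => !env[name] || env[name].trim() === "");
}

// Splits the result so optional keys can be reported as warnings only.
function checkEnv(env) {
  return {
    missingRequired: findMissingEnv(env, REQUIRED_VARS),
    missingOptional: findMissingEnv(env, OPTIONAL_VARS),
  };
}

// Usage with a sample object; in the app you would pass process.env.
const report = checkEnv({
  OPENAI_API_KEY: "sk-proj-example",
  PINECONE_API_KEY: "",
  PINECONE_INDEX: "fast-rag",
});
console.log(report.missingRequired); // [ 'PINECONE_API_KEY' ]
console.log(report.missingOptional); // [ 'BROWSERLESS_TOKEN' ]
```

Only the required keys should block startup; a missing BROWSERLESS_TOKEN matters only if URL ingestion is used.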
Pinecone Setup
Go to app.pinecone.io and sign in
Click "Create Index"
Use these exact settings:
| Setting | Value | Notes |
| Name | fast-rag | Must match PINECONE_INDEX in .env.local |
| Dimensions | 1024 | ⚠️ Do not use the default of 1536 |
| Metric | Cosine | |
| Cloud | AWS, us-east-1 | Recommended |
Click Create and wait ~30 seconds for the index to initialize
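The Dimensions value in the table above has to agree with the vector size requested from the embedding model. As a rough sketch, this is what a matching request body for OpenAI's embeddings endpoint could look like, using the `dimensions` field that the text-embedding-3 models accept. `buildEmbeddingRequest` and `assertDimensions` are hypothetical names, and FastRAG's actual ingestion code may be organized differently.

```javascript
// Assumption: the index was created with 1024 dimensions, as in the table above.
const INDEX_DIMENSIONS = 1024;

// Builds the JSON body for a POST to OpenAI's /v1/embeddings endpoint.
// The `dimensions` field shortens text-embedding-3 vectors to the given size.
function buildEmbeddingRequest(texts) {
  return {
    model: "text-embedding-3-small",
    input: texts,
    dimensions: INDEX_DIMENSIONS,
  };
}

// Hypothetical guard to run before upserting a vector into the index.
function assertDimensions(vector, expected = INDEX_DIMENSIONS) {
  if (vector.length !== expected) {
    throw new Error(
      `Vector dimension ${vector.length} does not match index ${expected}`
    );
  }
  return vector;
}

// Usage
console.log(buildEmbeddingRequest(["hello world"]));
assertDimensions(new Array(1024).fill(0)); // passes silently
```

Checking the length locally gives a clearer failure than waiting for the vector database to reject the upsert.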
FastRAG uses text-embedding-3-small at 1024 dimensions instead of the default 1536. This cuts Pinecone storage costs by ~33% with negligible quality loss.
Running Locally
npm run dev
Open http://localhost:3000 in your browser.
Architecture
FastRAG is a standard two-phase RAG pipeline:
pages/api/ingest.js: PDF uploads, chunking, vector upsert
pages/api/ingest-url.js: URL scraping, Puppeteer, vectorize
pages/api/chat.js: Retrieve chunks, stream GPT response
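To make the two phases concrete, here is a toy sketch that replaces Pinecone with an in-memory array and uses tiny hand-written vectors instead of real embeddings. It only illustrates the ingest-then-retrieve flow with cosine similarity; `createStore` and its methods are hypothetical and do not mirror FastRAG's code.

```javascript
// Cosine similarity: the same metric chosen for the Pinecone index.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy in-memory stand-in for the vector index.
function createStore() {
  const items = [];
  return {
    // Phase 1 (ingestion): keep each chunk's text next to its vector.
    ingest(text, vector) {
      items.push({ text, vector });
    },
    // Phase 2 (retrieval): rank stored chunks against the query vector.
    retrieve(queryVector, topK = 2) {
      return items
        .map((item) => ({
          text: item.text,
          score: cosineSimilarity(queryVector, item.vector),
        }))
        .sort((a, b) => b.score - a.score)
        .slice(0, topK);
    },
  };
}

// Usage with made-up 3-dimensional vectors.
const demoStore = createStore();
demoStore.ingest("Refund policy", [1, 0, 0]);
demoStore.ingest("Shipping times", [0, 1, 0]);
demoStore.ingest("Refund deadlines", [0.9, 0.1, 0]);
console.log(demoStore.retrieve([1, 0, 0], 2).map((r) => r.text));
// [ 'Refund policy', 'Refund deadlines' ]
```

In the real pipeline the vectors come from the embedding model and the ranking happens inside Pinecone, but the division of labor between the two phases is the same.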
PDF Ingestion
Handled by pages/api/ingest.js. Supports multiple files simultaneously.
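Chunking is the step that turns extracted PDF text into pieces small enough to embed. The sketch below shows one common approach, fixed-size chunks with overlap, so that a sentence cut at a boundary still appears whole in the next chunk. The function name and default sizes are assumptions; FastRAG's actual splitter and settings are not documented here.

```javascript
// Hypothetical defaults: the real chunk size and overlap are not stated in this guide.
function chunkText(text, chunkSize = 1000, overlap = 200) {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be at least 0 and smaller than chunkSize");
  }
  const chunks = [];
  const step = chunkSize - overlap; // how far each new chunk advances
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this chunk reached the end
  }
  return chunks;
}

// Usage with tiny sizes so the overlap is easy to see.
console.log(chunkText("abcdefghij", 4, 1)); // [ 'abcd', 'defg', 'ghij' ]
```

Larger overlap improves continuity between chunks at the cost of more vectors to store and query.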
URL Ingestion
Handled by pages/api/ingest-url.js. Paste any URL to scrape, clean, and vectorize it.
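The "clean" step matters because raw HTML is mostly markup, scripts, and styles that would pollute the embeddings. As an illustration only, a simplified regex-based cleaner might look like the following. This is not FastRAG's implementation, `cleanHtml` is a hypothetical name, and a DOM parser is generally more robust than regular expressions for real pages.

```javascript
// Simplified cleaner for illustration; handles only a few common cases.
function cleanHtml(html) {
  return html
    // Drop blocks whose contents are never readable page text.
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1>/gi, " ")
    // Strip all remaining tags, leaving their text content.
    .replace(/<[^>]+>/g, " ")
    // Decode two very common entities.
    .replace(/&nbsp;/g, " ")
    .replace(/&amp;/g, "&")
    // Collapse runs of whitespace left behind by the removals.
    .replace(/\s+/g, " ")
    .trim();
}

// Usage
console.log(
  cleanHtml("<h1>Docs</h1><script>track()</script><p>Hello &amp; welcome</p>")
);
// Docs Hello & welcome
```

The cleaned string can then go through the same chunking and embedding path as PDF text.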
Chat & Retrieval
Handled by pages/api/chat.js.
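Between retrieval and the model call, the retrieved chunks have to fit inside the model's context window. A minimal sketch of that step, assuming chunks arrive sorted by relevance and using a rough four-characters-per-token estimate rather than a real tokenizer, could look like this. All function names and the 2000-token default are hypothetical.

```javascript
// Rough heuristic: about 4 characters per token for English text.
// This is an approximation, not a real tokenizer.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Keeps the highest-ranked chunks that fit within the token budget.
// `chunks` is assumed to be sorted by relevance, best first.
function buildContext(chunks, maxTokens) {
  const selected = [];
  let used = 0;
  for (const chunk of chunks) {
    const cost = estimateTokens(chunk);
    if (used + cost > maxTokens) break; // stop at the first chunk that does not fit
    selected.push(chunk);
    used += cost;
  }
  return selected.join("\n---\n");
}

// Assembles chat messages in the role/content shape used by chat completion APIs.
function buildMessages(question, chunks, maxTokens = 2000) {
  const context = buildContext(chunks, maxTokens);
  return [
    { role: "system", content: `Answer using only the context below.\n\n${context}` },
    { role: "user", content: question },
  ];
}

// Usage
const messages = buildMessages("What is the refund window?", [
  "Refunds are accepted within 30 days.",
  "Shipping takes 3 to 5 business days.",
]);
console.log(messages[0].content);
```

Stopping at the first chunk that does not fit keeps the most relevant material and avoids silently reordering the context.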
Frontend
Lives in pages/index.js. A single-page chat interface with two modes.
Deploy to Vercel
FastRAG is optimized for Vercel. Deployment takes about 5 minutes.
Push your code to GitHub
git init && git add .
git commit -m "initial"
git remote add origin https://github.com/you/fastrag.git
git push -u origin main
Import to Vercel
Go to vercel.com/new, import your repo, select Next.js as the framework.
Add all environment variables
In Vercel project settings → Environment Variables, add all keys from your .env.local:
OPENAI_API_KEY
PINECONE_API_KEY
PINECONE_INDEX
BROWSERLESS_TOKEN
Click Deploy. Your app goes live in ~2 minutes.
Troubleshooting
429: "You exceeded your current quota"
Cause: OpenAI accounts require prepaid credits. A new API key alone isn't enough.
Fix: Go to platform.openai.com/settings/organization/billing and add $5. It may take 5–10 minutes to activate.
"Vector dimension 1024 does not match index 1536"
Cause: Your Pinecone index was created with default settings (1536 dims), but FastRAG sends 1024-dimension vectors.
Fix: Delete the index and recreate it with Dimensions: 1024. See Pinecone Setup above.
"PineconeNotFoundError: 404"
Cause: The PINECONE_INDEX env var doesn't match the index name in your dashboard.
Fix: Check .env.local. The value must match exactly, including case.
Scraping returns empty content
Cause: Site may be heavily client-side rendered, behind auth, or blocking scrapers.
Fix: Try a different URL. Docs sites and blogs work best. Paywalled pages won't work.
FAQ
Q: Do I need a paid Pinecone plan?
A: No. Free Starter plan supports 1 index and up to 100K vectors — plenty for development and small projects.
Q: Can I swap OpenAI for another model?
A: Yes. Chat and embedding logic is in pages/api/chat.js and ingest.js. Swap to any LangChain-compatible provider — Anthropic, Mistral, Cohere, etc.
Q: Can I use this commercially?
A: Yes. MIT license. Build products on top of FastRAG and sell them.
Q: What file types are supported besides PDF?
A: Currently only PDF. Extend ingest.js with LangChain's other loaders to support .txt, .docx, or .md.
Q: How do I clear all uploaded documents?
A: In Pinecone dashboard, go to your index → Namespaces → delete the 'global' namespace.
Q: Will I get future updates?
A: Yes. All buyers get lifetime access to the private GitHub repo. v1.3 is current.
Ready to ship?
Get the full source code and start building your AI app this weekend.
Get FastRAG: $29 →
One-time · MIT License · Lifetime updates