AI Engineering

7 articles on AI engineering.

Building Production RAG: Retrieval, Chunking, and the Parts That Break

RAG demos are easy; production RAG is not. The full pipeline, the chunking and retrieval choices that determine quality, and the failure modes nobody warns you about.

June 25, 2026

AI Engineering·9 min read

Choosing a Vector Database in 2026: pgvector, Pinecone, Qdrant, and When Postgres Is Enough

You probably don't need a dedicated vector database. A decision framework for pgvector vs Pinecone/Qdrant, index types, and the scale where Postgres stops being enough.

June 21, 2026

AI Engineering·10 min read

Structured Outputs and Tool Calling: Making LLMs Reliable

Parsing free text out of an LLM is a bug waiting to happen. How to get typed, schema-valid output and wire up tool calling so the model can act, safely.

June 17, 2026

AI Engineering·12 min read

LLM Agents That Actually Work: Tools, Loops, and Guardrails

Most 'agents' should have been a function call. When you genuinely need an agent loop, how to build one that is safe, bounded, and doesn't burn your budget.

June 11, 2026

AI Engineering·11 min read

Evaluating LLM Apps: How to Test Something Non-Deterministic

You can't unit-test a probability distribution with assertEquals. How to build evals, golden datasets, and LLM-as-judge scorers that catch regressions before users do.

June 5, 2026

AI Engineering·11 min read

Cutting LLM Cost and Latency in Production: Caching, Routing, and Streaming

LLM bills scale with traffic and latency kills UX. The caching, model-routing, and streaming tactics that cut both, with the tradeoffs spelled out.

May 25, 2026

AI Engineering·10 min read

Securing LLM Applications: Prompt Injection and the OWASP LLM Top 10

Prompt injection has no clean fix, and treating model output as trusted is how data leaks. A practical security guide for LLM apps using the OWASP LLM Top 10.

May 12, 2026
