Category

AI Engineering

7 articles on AI engineering.

AI Engineering·12 min read

Building Production RAG: Retrieval, Chunking, and the Parts That Break

RAG demos are easy; production RAG is not. The full pipeline, the chunking and retrieval decisions that determine quality, and the failure modes nobody warns you about.

AI Engineering·9 min read

Choosing a Vector Database in 2026: pgvector, Pinecone, Qdrant, and When Postgres Is Enough

You probably don't need a dedicated vector database. A decision framework for pgvector vs Pinecone/Qdrant, index types, and the scale where Postgres stops being enough.

AI Engineering·10 min read

Structured Outputs and Tool Calling: Making LLMs Reliable

Parsing free text out of an LLM is a bug waiting to happen. How to get typed, schema-valid output and wire up tool calling so the model can act, safely.

AI Engineering·12 min read

LLM Agents That Actually Work: Tools, Loops, and Guardrails

Most 'agents' should have been a function call. When you genuinely need an agent loop, how to build one that is safe, bounded, and doesn't burn your budget.

AI Engineering·11 min read

Evaluating LLM Apps: How to Test Something Non-Deterministic

You can't unit-test a probability distribution with assertEquals. How to build evals, golden datasets, and LLM-as-judge scorers that catch regressions before users do.

AI Engineering·11 min read

Cutting LLM Cost and Latency in Production: Caching, Routing, and Streaming

LLM bills scale with traffic and latency kills UX. The caching, model-routing, and streaming tactics that cut both, with the tradeoffs spelled out.

AI Engineering·10 min read

Securing LLM Applications: Prompt Injection and the OWASP LLM Top 10

Prompt injection has no clean fix, and treating model output as trusted is how data leaks. A practical security guide for LLM apps using the OWASP LLM Top 10.