TokenTrace
Production observability for AI systems.
One developer burned through $638 in a month on AI coding assistance and could not tell where the tokens went. TokenTrace records every request — tokens in, tokens out, cost, latency — so you can see exactly what your models are doing.
go get github.com/greynewell/tokentrace
Know what you're spending
Agent loops burn tokens on dead-end reasoning paths. RAG pipelines stuff context windows with irrelevant chunks. Without per-request tracing, you get a monthly bill and no way to attribute it. TokenTrace instruments your inference calls so you can find what is wasting money and fix it.
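As a sketch of the kind of per-request record this enables, the snippet below wraps an inference call, times it, and computes cost from token counts. This is illustrative only: the struct fields, the `traced` helper, and the per-million-token prices are all hypothetical, not TokenTrace's actual API or schema.

```go
package main

import (
	"fmt"
	"time"
)

// TraceRecord captures the per-request fields described above:
// tokens in, tokens out, cost, latency. Field names are illustrative.
type TraceRecord struct {
	Model     string
	TokensIn  int
	TokensOut int
	CostUSD   float64
	Latency   time.Duration
}

// traced wraps an inference call, timing it and recording token usage.
// inPrice and outPrice are hypothetical USD rates per million tokens.
func traced(model string, inPrice, outPrice float64,
	call func() (in, out int)) TraceRecord {
	start := time.Now()
	in, out := call()
	return TraceRecord{
		Model:     model,
		TokensIn:  in,
		TokensOut: out,
		CostUSD:   float64(in)/1e6*inPrice + float64(out)/1e6*outPrice,
		Latency:   time.Since(start),
	}
}

func main() {
	// Stand-in for a real model call: pretend it consumed 1200 input
	// tokens and produced 300 output tokens.
	rec := traced("example-model", 3.0, 15.0, func() (int, int) {
		return 1200, 300
	})
	fmt.Printf("%s in=%d out=%d cost=$%.4f latency=%s\n",
		rec.Model, rec.TokensIn, rec.TokensOut, rec.CostUSD, rec.Latency)
}
```

With records like this attached to every call, attribution becomes a query over traces rather than a guess from the monthly bill.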
TokenTrace is the observability layer of the MIST Stack, alongside MatchSpec (evaluation), InferMux (routing), and SchemaFlux (data). All four are written in Go with zero external dependencies and share a common message protocol. Read about the methodology at evaldriven.org.