## Benchmarks
Repetitive knowledge-graph edge-extraction task over a 10,000-row corpus with a
fixed prompt, tokenized with OpenAI's o200k_base tokenizer. Each format was
scored on payload size, output token cost, malformed-row rate, and downstream
mechanical parse throughput.
| Format | Avg payload / row | Output tokens / row (approx.) | Malformed-row rate | Parse throughput |
|---|---|---|---|---|
| Verbose JSON | 507 B | ~127 | 0.2% | ~45 k rows/s |
| YAML | 312 B | ~85 | 1.1% | ~22 k rows/s |
| NDJSON | 410 B | ~105 | 0.1% | ~60 k rows/s |
| ASHRU | 99 B | ~25 | 0.8% | ~115 k rows/s |
### Reading the table
- Output tokens dominate operational cost for repetitive extraction. ASHRU is roughly 5× cheaper than verbose JSON and 4× cheaper than NDJSON per row.
- Parse speed reflects pipe-split throughput on a single core. ASHRU's positional shape is cheaper to tokenize than nested formats.
- Malformation rate is the trade-off. ASHRU sits at 0.8%, higher than NDJSON's 0.1% but lower than YAML's 1.1%. Recoverable rows can be salvaged with the patterns in the spec §5; persistent failures should be dropped and logged.
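The drop-and-log path can be sketched as follows. This is a minimal sketch, not the SDK's implementation: it assumes ASHRU rows are pipe-delimited positional fields, and the three-field shape and the `parse_rows` name are illustrative choices, not taken from the spec.

```python
import logging

logger = logging.getLogger("ashru")

EXPECTED_FIELDS = 3  # illustrative: e.g. source|relation|target


def parse_rows(lines):
    """Split each row on '|'; keep well-formed rows, log and drop the rest."""
    good, malformed = [], 0
    for line in lines:
        fields = [f.strip() for f in line.split("|")]
        if len(fields) == EXPECTED_FIELDS and all(fields):
            good.append(tuple(fields))
        else:
            malformed += 1
            logger.warning("dropping malformed row: %r", line)
    return good, malformed
```

A positional split like this is also why the pipe-split parse is cheap: there is no tokenizer state, just one `split` per row.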
### Methodology
Each format used the same source prompt structure (header instructions + one example row + the input text). Model output was scored against a canonical ASHRU row set generated from the same source documents to determine malformation rate. Parse-speed numbers are single-threaded on an M3-class Apple Silicon core.
The benchmark scripts, corpus, and scorers will be open-sourced alongside the SDK at github.com/sumaproai/ashru. Reproduction runs are welcome.
Full whitepaper with the complete methodology and analysis: /whitepaper.