Integrate
From LLM emission to graph ingestion. The official reference
implementation is the suma_ashru Rust
crate; bindings for other languages are pure-protocol wrappers around
the same 11-field contract.
1. Prompt the LLM
Give the model the 11-field shape and a single example. The model emits one row per fact, one row per line.
You are extracting facts. Output one ASHRU row per fact, one row per line.
Schema (11 fields, pipe-delimited):
V | action | subject | object | instrument | recipient | source | location | tense | negated | attributes
Example:
V|deploy|engineer|api||||staging|p|0|
Rules:
- Empty fields are sequential pipes (||).
- Escape literal pipes inside values as \|.
- tense: p past, n present, f future, h habitual, c conditional.
- negated: 0 or 1.
- Output the rows only. No prose. No fences.
Input:
{TEXT}
2. Parse in Rust
use suma_ashru::{AshruRow, Recovery};
// Strict parse — 11 fields required
let row = AshruRow::parse("V|deploy|engineer|api||||staging|p|0|")?;
assert_eq!(row.action.as_deref(), Some("deploy"));
// Lossy recovery for malformed LLM outputs
if let Some(row) = Recovery::try_parse(maybe_bad_row) {
    ingest(row);
} else {
    log_dropped(maybe_bad_row);
}
Install: cargo add suma_ashru
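Going the other direction, emitting rows from your own code (fixtures, tests, non-LLM producers), inverts the same escaping rule. A minimal Python sketch; `build_ashru` is an illustrative helper, not part of the SDK:

```python
FIELD_COUNT = 11  # V, action, subject, object, instrument, recipient, source, location, tense, negated, attributes

def build_ashru(fields: list) -> str:
    """Join 11 field values into one ASHRU row, escaping literal pipes."""
    if len(fields) != FIELD_COUNT:
        raise ValueError(f"expected {FIELD_COUNT} fields, got {len(fields)}")
    return "|".join(
        # None becomes an empty field; backslashes first, then pipes
        ("" if f is None else str(f)).replace("\\", "\\\\").replace("|", "\\|")
        for f in fields
    )

row = build_ashru(["V", "deploy", "engineer", "api", None, None, None,
                   "staging", "p", "0", None])
# → "V|deploy|engineer|api||||staging|p|0|"
```

Escaping backslashes before pipes keeps the row round-trippable through the character-wise parser in step 3.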
3. Parse in Python
FIELDS = [
    "version", "action", "subject", "object", "instrument",
    "recipient", "source", "location", "tense", "negated", "attributes",
]

def parse_ashru(line: str) -> dict | None:
    # Split on unescaped pipes
    parts, buf, esc = [], [], False
    for c in line:
        if esc:
            buf.append(c); esc = False
        elif c == "\\":
            esc = True
        elif c == "|":
            parts.append("".join(buf)); buf = []
        else:
            buf.append(c)
    parts.append("".join(buf))
    if len(parts) != 11:
        return None
    return {k: (v or None) for k, v in zip(FIELDS, parts)}
4. Ingest into Neo4j
// One Cypher edge per ASHRU row
MERGE (s:Entity {name: $subject})
MERGE (o:Entity {name: $object})
MERGE (s)-[r:ACTION {
action: $action,
tense: $tense,
negated: $negated = "1",  // rows carry "0"/"1"; toBoolean("0") would yield null
location: $location,
instrument: $instrument
}]->(o)
RETURN r
5. Stream into Kafka
Each line is a complete record. No buffering required. Use the row
itself as the message body, or wrap with a thin header
(topic:partition:offset|ASHRU_ROW) if your downstream
consumer needs Kafka metadata at parse time.
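The thin-header variant takes only a few lines of Python. `wrap` and `unwrap` are illustrative names, and the producer call in the comment assumes the third-party kafka-python package:

```python
def wrap(topic: str, partition: int, offset: int, row: str) -> str:
    """Prefix a row with topic:partition:offset, separated by the first pipe."""
    return f"{topic}:{partition}:{offset}|{row}"

def unwrap(message: str) -> tuple[str, str]:
    """Split header from row at the first pipe; the header contains no pipes."""
    header, row = message.split("|", 1)
    return header, row

msg = wrap("facts", 0, 42, "V|deploy|engineer|api||||staging|p|0|")
# Producing with kafka-python would look roughly like:
#   from kafka import KafkaProducer
#   KafkaProducer(bootstrap_servers="localhost:9092").send("facts", msg.encode("utf-8"))
```

Splitting on the first pipe is safe because the header never contains one, and the row's own first field is always the literal `V`.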
The reference SDK lives at github.com/sumaproai/ashru. Contributions for additional language bindings (Go, TypeScript, Java) welcome.