Integrate

This page follows an ASHRU row from LLM emission to graph ingestion. The official reference implementation is the suma_ashru Rust crate; bindings for other languages are pure-protocol wrappers around the same 11-field contract.

1. Prompt the LLM

Give the model the 11-field shape and a single example. The model emits one row per fact, one row per line.

You are extracting facts. Output one ASHRU row per fact, one row per line.

Schema (11 fields, pipe-delimited):
  V | action | subject | object | instrument | recipient | source | location | tense | negated | attributes

Example:
  V|deploy|engineer|api||||staging|p|0|

Rules:
  - Empty fields are sequential pipes (||).
  - Escape literal pipes inside values as \|.
  - tense: p past, n present, f future, h habitual, c conditional.
  - negated: 0 or 1.
  - Output the rows only. No prose. No fences.

Input:
{TEXT}
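
Filling the template and cleaning the reply can be sketched in Python as follows. PROMPT_TEMPLATE, build_prompt, and extract_rows are hypothetical names, not part of any ASHRU SDK, and the call to the model itself is left out because it depends on your provider. The sketch assumes that a model may still emit blank lines or code fences despite the instructions, and simply drops them.

```python
# Hypothetical helpers for illustration; not part of the reference SDK.
PROMPT_TEMPLATE = """You are extracting facts. Output one ASHRU row per fact, one row per line.

Schema (11 fields, pipe-delimited):
  V | action | subject | object | instrument | recipient | source | location | tense | negated | attributes

Example:
  V|deploy|engineer|api||||staging|p|0|

Rules:
  - Empty fields are sequential pipes (||).
  - Escape literal pipes inside values as \\|.
  - tense: p past, n present, f future, h habitual, c conditional.
  - negated: 0 or 1.
  - Output the rows only. No prose. No fences.

Input:
{TEXT}
"""

FENCE = "`" * 3  # a Markdown code fence marker


def build_prompt(text: str) -> str:
    # str.replace is used instead of str.format so braces in the input are left alone.
    return PROMPT_TEMPLATE.replace("{TEXT}", text)


def extract_rows(reply: str) -> list[str]:
    """Return candidate rows from a model reply, one per non-empty line."""
    rows = []
    for line in reply.splitlines():
        line = line.strip()
        # Skip blank lines and any fences the model added anyway.
        if not line or line.startswith(FENCE):
            continue
        rows.append(line)
    return rows
```

Each string returned by extract_rows is still unvalidated; it is only a candidate row to hand to a parser.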

2. Parse in Rust

use suma_ashru::{AshruRow, Recovery};

// Strict parse: exactly 11 fields required.
// The `?` assumes an enclosing function that returns a compatible Result.
let row = AshruRow::parse("V|deploy|engineer|api||||staging|p|0|")?;
assert_eq!(row.action.as_deref(), Some("deploy"));

// Lossy recovery for malformed LLM output.
// `maybe_bad_row`, `ingest`, and `log_dropped` are placeholders for your own code.
if let Some(row) = Recovery::try_parse(maybe_bad_row) {
    ingest(row);
} else {
    log_dropped(maybe_bad_row);
}

Install: cargo add suma_ashru

3. Parse in Python

FIELDS = [
    "version", "action", "subject", "object", "instrument",
    "recipient", "source", "location", "tense", "negated", "attributes",
]

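def parse_ashru(line: str) -> dict | None:
    """Parse one ASHRU row; return None unless it has exactly 11 fields."""
    parts, buf, esc = [], [], False
    # Split on unescaped pipes; a backslash escapes the next character.
    for c in line.rstrip("\r\n"):
        if esc:
            buf.append(c)
            esc = False
        elif c == "\\":
            esc = True
        elif c == "|":
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(c)
    if esc:
        # Keep a dangling trailing backslash instead of silently dropping it.
        buf.append("\\")
    parts.append("".join(buf))
    if len(parts) != len(FIELDS):
        return None
    # Empty fields become None.
    return {k: (v or None) for k, v in zip(FIELDS, parts)}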
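
Going the other way, from a dict back to a row, needs only the escaping rule from the prompt. The sketch below uses a hypothetical format_ashru helper that is not part of the reference SDK. It assumes the same field order as FIELDS and escapes backslashes as well as pipes, so that a parser which treats a backslash as an escape character reads every value back unchanged.

```python
FIELDS = [
    "version", "action", "subject", "object", "instrument",
    "recipient", "source", "location", "tense", "negated", "attributes",
]


def escape(value: str) -> str:
    # Escape backslashes first so the ones added for pipes are not doubled.
    return value.replace("\\", "\\\\").replace("|", "\\|")


def format_ashru(row: dict) -> str:
    """Serialize a dict keyed by FIELDS into one pipe-delimited row.

    Values are assumed to be strings or None; missing keys become empty fields.
    """
    return "|".join(escape(row.get(k) or "") for k in FIELDS)
```

Serializing the fields of the example row reproduces it exactly, including the trailing pipe for the empty attributes field.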

4. Ingest into Neo4j

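// One Cypher edge per ASHRU row.
// MERGE rejects null property values, so skip rows that lack an action,
// subject, or object, and coalesce the optional fields to empty strings.
// negated is compared as text because toBoolean("0") returns null.
MERGE (s:Entity {name: $subject})
MERGE (o:Entity {name: $object})
MERGE (s)-[r:ACTION {
    action: $action,
    tense: coalesce($tense, ''),
    negated: coalesce(toString($negated), '0') IN ['1', 'true'],
    location: coalesce($location, ''),
    instrument: coalesce($instrument, '')
}]->(o)
RETURN r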

5. Stream into Kafka

Each line is a complete record, so no buffering is required. Use the row itself as the message body, or wrap it in a thin header (topic:partition:offset|ASHRU_ROW) if your downstream consumer needs Kafka metadata at parse time.
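
The header variant can be sketched in Python without any Kafka client. wrap_row and unwrap_row are hypothetical helpers; the sketch assumes the header ends at the first pipe and that topic names contain neither pipes nor colons, which holds for legal Kafka topic names.

```python
def wrap_row(topic: str, partition: int, offset: int, row: str) -> str:
    """Prefix a row with topic:partition:offset and a pipe separator."""
    return f"{topic}:{partition}:{offset}|{row}"


def unwrap_row(message: str) -> tuple[str, int, int, str]:
    """Split a wrapped message back into (topic, partition, offset, row)."""
    # The header ends at the first pipe; everything after it is the untouched row.
    header, row = message.split("|", 1)
    topic, partition, offset = header.rsplit(":", 2)
    return topic, int(partition), int(offset), row
```

Because the split happens only at the first pipe, the row that comes back out is byte-for-byte the row that went in, escapes included.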

The reference SDK lives at github.com/sumaproai/ashru. Contributions for additional language bindings (Go, TypeScript, Java) are welcome.