01/03
Find the constraint
A model problem may be a data problem. A dashboard problem may be a question problem. A strategy problem may be an avoided tradeoff. I like getting to the layer where the work actually changes.
// initialising …
// session ok
[ signature interaction ]
Three authored Pretext bands orbit the same constraint. Evidence keeps receipts, judgment names the tradeoff, and execution stays usable after the handoff.
Drag the amber core. The prose does not follow it; each lane parts, tightens, and breathes like field material around the thing that actually matters.
01/03
A model problem may be a data problem. A dashboard problem may be a question problem. A strategy problem may be an avoided tradeoff. I like getting to the layer where the work actually changes.
02/03
If a system makes a claim, routes a case, blocks an action, or changes a workflow, there should be something to inspect afterwards. Otherwise people are guessing politely.
03/03
Complexity has to earn its place. Sometimes the right move is a model, sometimes cleaner data, sometimes a better decision loop, and sometimes not adding AI at all.
// the layer where the problem lives
I care about the layer where the problem actually lives.
The visible failure might be the model, the data, the workflow, the metric, the incentive, or the decision nobody wants to make. The work is finding the real constraint, then making the answer simple enough to use.
The first answer is useful, but it is often incomplete.
Do not fix a data problem with a prompt.
Do not fix an ownership problem with a dashboard.
Do not fix a judgment problem by adding more automation.
Evals should answer product questions, not decorate dashboards.
Find the real constraint, then make the output simple enough to use.
[ flagship note ]
The first answer is often useful, but incomplete. The real constraint may be the model, the data, the workflow, the metric, the incentive, or the decision nobody wants to make.
A single benchmark score does not tell you what part of the model is doing the work. This gives multilingual model builders a more inspectable target.
Useful when you want to audit whether a multilingual encoder is carrying meaning across languages or just getting good at language-specific residue.
Teams keep needing small, boring behavior guarantees from large models. This treats those guarantees as something you can isolate, ship, and audit.
A small local-first library for keeping AI decisions replayable. Useful when someone asks, later, 'why did the system do that?' and hand-waving is not enough.
A data-quality control layer for AI datasets. It keeps schemas, access rules, and pipeline steps boringly explicit before anyone tries to put a model on top.
A self-hosted cost monitor that catches weird spend before the monthly report ruins someone's morning.
An offline privacy gate for data pipelines. It catches sensitive values, redacts them, and packages the result so teams can prove what happened.
// after the demo, in the weather
I care about the moment when an AI idea stops being exciting and starts becoming operational. What should it be trusted to do? Where should it fail closed? What evidence would make a decision defensible later?
That usually sits between engineering, product, and delivery. I like clear questions, small loops, and work that leaves the system less mysterious than it was before.