cat ~/research/index.md
research
This is not here as a trophy case. It is a record of the questions I keep coming back to: what breaks, what can be measured, what should be audited, and where human judgment still has to carry the weight.
- PAPERS: 20
- YEAR: 2026
- MODE: full list, not an archive
- RELATED: → /projects
════════════════════════════════════════════════════════════════════════════
№ 01
2026
I looked inside multilingual encoders and compared their internal signals against multilingual fMRI data. The interesting part: the shared cross-language alignment seems to come more from attention-side computations than from the feed-forward layers the usual 'semantic FFN' story points to.
// why i care
A single benchmark score does not tell you what part of the model is doing the work. This gives multilingual model builders a more inspectable target.
authors: Ali Uyar
multilingual-nlp · brain-alignment · mechanistic-analysis
№ 02
2026
This paper separates shared multilingual structure from language-specific leftovers in XLM-R and NLLB-200 using naturalistic fMRI data. Less 'does it work?', more 'what kind of representation is it using?'.
// why i care
Useful when you want to audit whether a multilingual encoder is carrying meaning across languages or just getting good at fitting language-specific residue.
authors: Ali Uyar
multilingual-nlp · fmri-encoding · representation-analysis
№ 03
2026
A method for adding operational behaviors like strict JSON or exact quotation on top of a frozen model without retraining the whole thing. The goal is to make output contracts reusable and testable.
// why i care
Teams keep needing small, boring behavior guarantees from large models. This treats those guarantees as something you can isolate, ship, and audit.
authors: Ali Uyar
frozen-models · adapter-residuals · auditable-evaluation
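To make "testable output contract" in № 03 concrete, here is a minimal sketch of what two such checks could look like on the evaluation side. It is not the paper's method: the function names `satisfies_json_contract` and `is_exact_quotation` are hypothetical, and "strict JSON" is assumed here to mean "the whole output parses as one JSON object with the required keys".

```python
import json


def satisfies_json_contract(output: str, required_keys: set[str]) -> bool:
    """Return True if the whole output is one JSON object with all required keys.

    Hypothetical helper: any chatter around the JSON makes parsing fail,
    which is the point of a strict contract.
    """
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and required_keys.issubset(parsed)


def is_exact_quotation(quote: str, source: str) -> bool:
    """Return True if a non-empty quote appears verbatim in the source text."""
    return bool(quote) and quote in source
```

Checks like these are deterministic, so the same contract can be rerun unchanged against any model or adapter that claims to satisfy it.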
№ 04
2026
A recovery method for poisoned agent memory. Instead of wiping everything, it tracks where bad state came from, revokes descendants, and replays only the parts that need repair.
// why i care
Agent memory is useful until it carries attacker garbage forward. This gives operators a cleaner option than 'delete everything and hope'.
authors: Ali Uyar
prompt-injection-recovery · agent-memory · provenance
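The recovery idea in № 04 (track where bad state came from, revoke descendants, replay only what needs repair) can be sketched as a walk over a provenance graph. This is an illustration under assumptions, not the paper's implementation: the `parents` map, the entry names, and both function names are hypothetical.

```python
from collections import deque


def revoke_descendants(parents: dict[str, list[str]], poisoned: set[str]) -> set[str]:
    """Return the poisoned entries plus every entry derived from them."""
    # Invert "entry -> its sources" into "source -> entries derived from it".
    children: dict[str, list[str]] = {}
    for entry, sources in parents.items():
        for src in sources:
            children.setdefault(src, []).append(entry)

    # Breadth-first walk from the poisoned roots down the derivation edges.
    revoked = set(poisoned)
    queue = deque(poisoned)
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in revoked:
                revoked.add(child)
                queue.append(child)
    return revoked


def replay_candidates(parents: dict[str, list[str]], poisoned: set[str]) -> list[str]:
    """Derived entries to rebuild from clean inputs; the roots stay dropped."""
    return sorted(revoke_descendants(parents, poisoned) - poisoned)


# Hypothetical agent memory: each entry lists the entries it was derived from.
memory = {
    "web_page": [],
    "user_note": [],
    "summary": ["web_page"],
    "plan": ["summary", "user_note"],
    "todo": ["user_note"],
}
revoked = revoke_descendants(memory, {"web_page"})
to_replay = replay_candidates(memory, {"web_page"})
```

In this toy memory, `todo` never touched the poisoned page, so it survives untouched, while `summary` and `plan` are revoked and queued for replay.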
№ 05
2026
A locked testbed for moving function-calling behavior between same-family models. Sparse transfer worked, but the dense module still won the main metric, which is exactly the kind of inconvenient result worth knowing.
// why i care
It gives model-editing claims a deterministic check instead of a nice story and a few cherry-picked examples.
authors: Ali Uyar
function-calling · deterministic-eval · model-editing
№ 06
2026
A paired diffing study of Gemma post-training. It recovered some capability signal, but the clean small-mask story did not really survive contact with held-out evaluation.
// why i care
A useful reminder that 'we found the circuit' and 'we found something that generalizes' are not the same sentence.
authors: Ali Uyar
instruction-tuning · sparse-interventions · held-out-eval
№ 07
2026
A cheap probe pass is used to initialize routing for multi-token prediction on a frozen backbone. The aim is better speculative drafting without making the adaptation recipe bigger and messier.
// why i care
If you can get a better route from a small offline probe, you should not need to unfreeze half the model just to test the idea.
authors: Ali Uyar
frozen-backbone · multi-token-prediction · speculative-decoding
№ 08
2026
This paper tests whether LLM judges follow the actual criterion or just react to how the criterion is worded. Accuracy can look fine while the judge is still path-dependent in a way you would not want in production.
// why i care
Good judge pipelines need to survive paraphrase and counterfactual checks. Otherwise the reviewer is grading the wording, not the work.
authors: Ali Uyar
llm-evaluation · judge-validity · reliability
№ 09
2026
A probe study asking whether reasoning models carry an internal signal for 'I have enough evidence now.' That signal appears separable from just getting the final answer right.
// why i care
For retrieval and agent systems, knowing when the model has enough evidence may matter more than another confident-looking answer.
authors: Ali Uyar
interpretability · multi-hop-qa · hidden-state-probes
№ 10
2026
A locked analysis of matched ORF and CRISPR Cell Painting profiles. The structure is real, but fragile, and the retrieval utility disappears under stricter checks.
// why i care
Negative results like this are useful. They tell you which biological claims still hold when analytic flexibility is taken away.
authors: Ali Uyar
locked-pipeline · robustness · negative-result
№ 11
2026
An audit of hidden-state verifiers that separates 'the final answer was right' from 'the process was locally valid.' Those are easy to blur and dangerous to confuse.
// why i care
If a verifier is just reading outcome signals, it should not be sold as process verification. This paper gives teams a way to check that.
authors: Ali Uyar
hidden-state-verifiers · process-verification · counterfactual-audit
№ 12
2026
A held-out agent-safety study that finds the first unsafe step, rewrites it safely, and trains on that contrast. The catch: preference training is limited by what the base model can even produce on-policy.
// why i care
It is a practical warning for agent safety work: better labels do not magically fix a weak behavioral horizon.
authors: Ali Uyar
agent-safety · preference-optimization · held-out-evaluation
№ 13
2026
A full-scope negative result for a locked causal-abstraction acceptance rule. Across the tested settings, no proposed abstraction class earned certification.
// why i care
This is the kind of failure report interpretability needs more of: clear rules, no post-hoc rescue, and boundaries on what can be claimed.
authors: Ali Uyar
mechanistic-interpretability · acceptance-criteria · null-calibration
№ 14
2026
A reproducible way to corrupt, diagnose, and repair KV-cache failures using distillation. The goal is to turn weird serving instability into something you can actually test.
// why i care
Serving failures are easier to fix when they are repeatable. This makes that repeatability part of the workflow.
authors: Ali Uyar
kv-cache · distillation · robustness
№ 15
2026
A repair workflow for frozen models that finds counterexamples, patches the behavior, and leaves behind replayable certificates.
// why i care
Model repair should not be a vibes-based sign-off. This turns it into something teams can rerun and inspect.
authors: Ali Uyar
llm-reliability · reproducibility · certification
№ 16
2026
An audit method for checking whether interpretability claims are actually identifiable from the evidence. Pretty plots are not enough.
// why i care
It raises the bar for interpretability claims by asking what can be independently checked.
authors: Ali Uyar
interpretability · auditing · safety
№ 17
2026
A compute-matched pipeline that separates short-term adaptation from durable improvement. The boring question matters: did we improve the model, or just help it survive the test?
// why i care
It helps teams avoid mistaking temporary adaptation for a real tuning win.
authors: Ali Uyar
test-time-adaptation · distillation · evaluation
№ 18
2026
A controlled study of activation cascades under gain changes. It looks for instability patterns in the model before they show up as weird behavior downstream.
// why i care
If a model becomes brittle under small scaling changes, you want to find out in a lab, not from users.
authors: Ali Uyar
mechanistic-analysis · llm · stability
№ 19
2026
A deterministic audit for cases where semantically identical prompt formatting still changes the answer. Newline roulette is not a reliability strategy.
// why i care
It gives teams evidence about whether formatting changes are harmless or quietly changing outcomes.
authors: Ali Uyar
tokenization · evaluation · auditability
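A rough sketch of the kind of check № 19 describes: feed a model several formatting variants of one prompt and group them by the answer that comes back. Everything here is an assumption for illustration. The variant set, `audit_formatting`, and `toy_model` are hypothetical, and treating these whitespace changes as meaning-preserving is itself a judgment call the audit has to state up front.

```python
from typing import Callable


def formatting_variants(prompt: str) -> list[str]:
    """Whitespace-only variants assumed (for this sketch) to preserve meaning."""
    return [
        prompt,
        prompt + "\n",                 # trailing newline
        prompt + " ",                  # trailing space
        "\n" + prompt,                 # leading newline
        prompt.replace("\n", "\n\n"),  # doubled line breaks
    ]


def audit_formatting(model: Callable[[str], str], prompt: str) -> dict[str, list[str]]:
    """Group the variants by the answer the model returns for each one."""
    by_answer: dict[str, list[str]] = {}
    for variant in formatting_variants(prompt):
        by_answer.setdefault(model(variant), []).append(variant)
    return by_answer


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in that is sensitive to a trailing newline."""
    return "B" if prompt.endswith("\n") else "A"


report = audit_formatting(toy_model, "Q: 2+2?\nA:")
is_stable = len(report) == 1  # more than one answer group means formatting leaked
```

With a deterministic model call (fixed seed, greedy decoding), the report is reproducible, and the offending variants are listed by answer rather than hidden inside an aggregate score.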
№ 20
2026
A practical reliability report for AI systems: deterministic checks, evidence gates, and implementation notes for teams that have to ship the thing, not just demo it.
// why i care
It turns the reliability philosophy into something a delivery team can actually adopt.
authors: Ali Uyar
technical-report · reliability · systems