ls /research/
──────────────────────────────────────────────────────────────────────
RESEARCH OPERATIONS
The archive catalogues failures. The research arm tries to build the things that catch them — benchmarks, defenses, reproducers. This page tracks the operations currently in flight.
active: 001completed: 000archived: 000planned: 000
ps -eo op,status | grep ACTIVE
// ACTIVE
// in-flight, accepting contributors
┌┐└┘
[OP-001]STATUS:ACTIVE
[codename: sentry]
SentryBench
Defense-first framework for LLM backdoor evaluation, reproducible benchmarking, and modular defenses.
─── ABSTRACT ────────────────────────────
Backdoor research is fragmented — different datasets, triggers, training recipes, and metrics make results hard to compare. SentryBench provides a standard experiment contract: unified JSONL data schema, modular defenses with a fit/apply/evaluate interface, reproducible pipelines (seeds, configs, artifact hashes), and one-command reports (ASR / utility / stealth / robustness). Targets modern adapter workflows: LoRA, QLoRA, merging, instruction tuning.
// KEYWORDS
backdoordefensebenchmarkLoRAQLoRAreproducibility
─── METADATA ────────────────────────────
- started
- 2026-02
- last commit
- 2026-02-28
- lang
- Python
- license
- MIT
- stars
- ★ 0
- forks
- ⑂ 0
─── LEAD ────────────────────────────────
ls /research/completed/
// COMPLETED
// shipped, results published
// nothing here yet. the first operation lands when it lands.
cat /research/.queue
// PLANNED
// next up; scope still loose
// queue is forming. propose one via [research-proposal] issue on the org.