Research Insights
Short, timely research insights and perspectives from the Kamiwaza Agentic Intelligence Research team, published more frequently than our full papers. Research insights preview our latest findings before they are aggregated and summarized in future paper releases.
Latest Insights
A 9B Model Just Crashed the Big Leagues
JV Roig · March 5, 2026
Qwen3.5-9B scores 88.1% on our KAMI agentic benchmark — a bracket previously reserved for 70B+ dense models, 200B+ MoEs, and flagship cloud APIs. The small model revolution isn't coming. It's here.
Hallucination Resistance Holds at 64K and 128K Context
JV Roig · February 18, 2026
We pushed our LoRA-finetuned Granite 4.0 Micro from 32K to 64K and 128K context — 4-16x longer than training. Hallucination resistance held (92% → 88% → 87%). Extraction didn't. The "don't fabricate" lesson is durable; finding needles in bigger haystacks is not.
Can We Reduce LLM Hallucinations for Enterprise Use? RIKER+LoRA Says Yes
JV Roig · February 15, 2026
Using RIKER + LoRA SFT on IBM Granite 4.0 Micro with just ~1,100 lease contract examples boosted accuracy from 32% to 80% — and the hallucination resistance transferred to document types the model never saw during training.
Qwen3 Next 80B: The Long-Context Champion You Haven't Heard Of
JV Roig · January 28, 2026
Our RIKER benchmark testing reveals Qwen3 Next 80B-A3B as the top performer at 200K context, beating models 6x its size while using only 3B active parameters. A deep dive into what makes this model special for long-context knowledge retrieval.
Related Resources
- RIKER Paper - Full methodology for long-context knowledge retrieval evaluation
- KAMI Leaderboard - Live rankings for agentic AI performance
- Main Blog - Articles on agentic computing, orchestration and AI platform development