Insights Generator: Systematic Corpus-Level Trace Diagnostics for LLM Agents

Paper Overview

Diagnosing failures in LLM agents remains largely manual. Practitioners inspect a small subset of execution traces, form ad-hoc hypotheses, and iterate. This process misses patterns that only emerge across trace populat…

Canonical Key

arxiv:2605.21347

Merged Sources

arXiv

Authors

Akshay Manglik, Apaar Shanker, Kaustubh Deshpande, Jason Qin, Yash Maurya, Veronica Chatrath, Vijay S. Kalmath, Levi Lentz, Yuan, Xue

Categories

cs.AI, cs.LG, cs.SE

Tags

Evaluation / Data / Methods

Topics

LLM / Benchmark

First Seen

2026-05-21 13:14:24 (UTC+08:00)

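Personal Feedback

Attach why you flagged this paper, and what you plan to do with it next, directly to the canonical detail page.

There is no personal feedback yet; you can add it with the local feedback CLI.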

Feedback Actions

Copy the canonical key or the local CLI command to quickly add this paper to your personal feedback state file.

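Action Reminder Status

This section records which action reasons this paper has recently triggered, which helps explain why no reminder was issued again today.

No action reminders have been recorded yet.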

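Sources and External Links

The paper's canonical entry point at each source is shown first, followed by the current abstract page and the PDF.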

arXiv PDF

Hit History

Review, by archive time, which feeds this paper appeared in; links to each day's digest artifacts are retained.

LM

2026-05-21

2026-05-21 13:14:24 (Asia/Shanghai)

Diagnosing failures in LLM agents remains largely manual. Practitioners inspect a small subset of execution traces, form ad-hoc hypotheses, and iterate. This process misses patter…

Score 144 · title matched "LLM"; title matched "agent"; summary matched "RAG"

Markdown JSON Corresponding Feed Page

Related Recommendations

LatentRAG: Latent Reasoning and Retrieval for Efficient Agentic RAG

Benchmarking LLM Agents on Meta-Analysis Articles from Nature Portfolio

ComplexMCP: Evaluation of LLM Agents in Dynamic, Interdependent, and Large-Scale Tool Sandbox

Agentic Retrieval-Augmented Generation for Financial Document Question Answering

Self-Induced Outcome Potential: Turn-Level Credit Assignment for Agents without Verifiers

Which Defense Closes Which Threat? Attributing OWASP-LLM-Top-10 Coverage and Its Brittleness Under Paraphrasing