Feed Subscription

Terminal and SWE Agents: Pinned Subscription Page

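Suited to long-term tracking of a single research direction. This page summarizes the feed's performance over the last 7 days and 30 days, and keeps each day's original hit entries and digest links.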
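As a rough illustration of the 7-day / 30-day summary, the sketch below sums daily hit counts over a trailing window. The counts are the per-day hit totals listed further down this page. The function name `window_total`, the inclusive window definition, and the as-of date 2026-06-28 (the last date shown in the recent-trend section) are assumptions for illustration, not the feed's actual implementation.

```python
from datetime import date, timedelta

# Per-day hit totals as listed on this page (days not listed are treated as 0 hits).
DAILY_HITS = {
    date(2026, 6, 26): 7, date(2026, 6, 25): 2, date(2026, 6, 24): 5,
    date(2026, 6, 23): 1, date(2026, 6, 19): 3, date(2026, 6, 18): 1,
    date(2026, 6, 17): 5, date(2026, 6, 16): 3, date(2026, 6, 12): 2,
    date(2026, 6, 11): 3, date(2026, 6, 10): 3, date(2026, 6, 9): 3,
    date(2026, 6, 5): 10, date(2026, 6, 4): 6, date(2026, 6, 3): 9,
    date(2026, 6, 2): 1, date(2026, 5, 29): 2, date(2026, 5, 28): 1,
}


def window_total(hits, as_of, days):
    """Sum hits over the `days`-day window ending on `as_of`, inclusive (assumed definition)."""
    start = as_of - timedelta(days=days - 1)
    return sum(count for day, count in hits.items() if start <= day <= as_of)


if __name__ == "__main__":
    as_of = date(2026, 6, 28)  # assumed "today" for this snapshot
    print("7-day hits:", window_total(DAILY_HITS, as_of, 7))
    print("30-day hits:", window_total(DAILY_HITS, as_of, 30))
```

A different window convention (for example, excluding the as-of day) would shift these totals, so treat the numbers as illustrative.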


Recent Trend

Terminal and SWE Agents has no new matching papers today.

[Trend chart: 2026-06-15 to 2026-06-28, one point per day]

Hit History

Review this feed's matching papers day by day. Each day's digest is kept as raw Markdown / JSON artifacts.

2026-06-26

7 hits · generated 2026-06-26 13:16:53 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 7 papers

Smaller Models, Unexpected Costs: Trade-offs in LLM Quantization for Automated Program Repair [Evaluation / Method]: Large Language Models (LLMs) are powerful tools and have been increasingly adopted for complex software engineering tasks…

Smaller Models, Unexpected Costs: Trade-offs in LLM Quantization for Automated Program Repair · Score 108
title matched "program repair"; title matched "automated program repair"; has PDF
Original source
To Run or Not to Run: Analyzing the Cost-Effectiveness of Code Execution in LLM-Based Program Repair · Score 83
title matched "program repair"; summary matched "SWE-bench"; has PDF
Original source
How Much Static Structure Do Code Agents Need? A Study of Deterministic Anchoring · Score 65
title matched "code agent"; has PDF; has rich summary
Original source
A Deterministic Control Plane for LLM Coding Agents · Score 64
title matched "coding agent"; has PDF; has rich summary
Original source
NOVA: A Verification-Aware Agent Harness for Architecture Evolution in Industrial Recommender Systems · Score 47
summary matched "coding agent"; has PDF; has rich summary
Original source

2026-06-25

2 hits · generated 2026-06-25 13:11:21 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 2 papers

Unlocking Model Potentials Through Adaptive Multi-Agent Scaffolding for Efficient Issue Resolution [Evaluation / Application / Method]: Resolving issues with ambiguous and incomplete descriptions, particularly concerning complex bugs, requi…

Unlocking Model Potentials Through Adaptive Multi-Agent Scaffolding for Efficient Issue Resolution · Score 78
title matched "issue resolution"; summary matched "SWE-bench"; has PDF
Original source
Evaluating LLMs on Real-World Software Performance Optimization · Score 38
summary matched "repository-level"; has PDF; has rich summary
Original source

2026-06-24

5 hits · generated 2026-06-24 13:06:49 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 5 papers

SHERLOC: Structured Diagnostic Localization for Code Repair Agents [Method]: LLM agents solve repository-level coding tasks through multi-turn tool use, but utilize half their budget on locating faults before editing. Dedic…

SHERLOC: Structured Diagnostic Localization for Code Repair Agents · Score 105
title matched "code repair"; summary matched "SWE-bench"; summary matched "repository-level"
Original source
NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers? · Score 65
title matched "coding agent"; has PDF; has rich summary
Original source
Bayesian control for coding agents · Score 64
title matched "coding agent"; has PDF; has rich summary
Original source
Detecting AI Coding Agents in Open Source: A Validated Multi-Method Census of 180 Million Repositories · Score 63
title matched "coding agent"; has PDF; has rich summary
Original source
LemonHarness Technical Report · Score 39
summary matched "Terminal-Bench"; has PDF; has rich summary
Original source

2026-06-23

1 hit · generated 2026-06-23 13:10:02 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 1 paper

Tmax: A simple recipe for terminal agents [Evaluation / Data / Application / Method]: Terminal-using agents have quickly become the most popular downstream application of language models (LMs). Despite their prevalence, relatively little acad…

Tmax: A simple recipe for terminal agents · Score 84
title matched "terminal agent"; summary matched "Terminal-Bench"; has PDF
Original source

2026-06-19

3 hits · generated 2026-06-19 14:26:15 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 3 papers

Probe-and-Refine Tuning of Repository Guidance for Coding Agents [Application / Method]: LLM-based coding agents need higher-level operational knowledge about a repository (which files house which subsystems, how to run the test sui…

Probe-and-Refine Tuning of Repository Guidance for Coding Agents · Score 87
title matched "coding agent"; summary matched "SWE-bench"; has PDF
Original source
Phoenix: Safe GitHub Issue Resolution via Multi-Agent LLMs · Score 83
title matched "issue resolution"; summary matched "SWE-bench"; has PDF
Original source
N-Version Programming with Coding Agents · Score 63
title matched "coding agent"; has PDF; has rich summary
Original source

2026-06-18

1 hit · generated 2026-06-18 14:03:08 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 1 paper

Data Intelligence Agents: Interpreting, Modeling, and Querying Enterprise Data via Autonomous Coding Agents [Evaluation / Application / Method]: Production data integration is bottlenecked by repeated, lossy handoffs between data owners, en…

Data Intelligence Agents: Interpreting, Modeling, and Querying Enterprise Data via Autonomous Coding Agents · Score 69
title matched "coding agent"; has PDF; has rich summary
Original source

2026-06-17

5 hits · generated 2026-06-17 14:22:19 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 5 papers

All Smoke, No Alarm: Oracle Signals in Agent-Authored Test Code [Method]: Software practitioners increasingly use AI coding agents that generate test code alongside production code in open source pull requests (PRs). Recent…

All Smoke, No Alarm: Oracle Signals in Agent-Authored Test Code · Score 46
summary matched "coding agent"; has PDF; has rich summary
Original source
LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling · Score 44
summary matched "SWE-bench"; has PDF; has rich summary
Original source
VoidPadding: Let [VOID] Handle Padding in Masked Diffusion Language Models so that [EOS] Can Focus on Semantic Termination · Score 44
summary matched "code generation benchmark"; has PDF; has rich summary
Original source
GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine? · Score 42
summary matched "coding agent"; has PDF; has rich summary
Original source
Position: Coding Benchmarks Are Misaligned with Agentic Software Engineering · Score 40
summary matched "coding agent"; has PDF; has rich summary
Original source

2026-06-16

3 hits · generated 2026-06-16 14:38:43 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 3 papers

Agent trajectories as programs: fingerprinting and programming coding-agent behavior [Evaluation / Data / Application / Method]: Benchmark scores tell you what an agent got right; they do not tell you how it got there. In this work, we introd…

Agent trajectories as programs: fingerprinting and programming coding-agent behavior · Score 64
summary matched "SWE-bench"; summary matched "coding agent"; has PDF
Original source
Towards LLM Accelerated Rapid Reviews for Software Tool Discovery -- Case for Log Anomaly Detection · Score 44
summary matched "coding agent"; has PDF; has rich summary
Original source
No Resource, No Benchmarks, No Problem? Evaluating and Improving LLMs for Code Generation in No-Resource Languages · Score 44
summary matched "code generation benchmark"; has PDF; has rich summary
Original source

2026-06-12

2 hits · generated 2026-06-12 13:55:02 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 2 papers

Understanding the Rejection of Fixes Generated by Agentic Pull Requests -- Insights from the AIDev Dataset [Data / Method]: AI coding agents are increasingly used to generate pull requests (PRs) that propose code fixes in sof…

Understanding the Rejection of Fixes Generated by Agentic Pull Requests -- Insights from the AIDev Dataset · Score 57
summary matched "coding agent"; has DOI; has PDF
Original source
Recursive Agent Harnesses · Score 47
summary matched "coding agent"; has PDF; has rich summary
Original source

2026-06-11

3 hits · generated 2026-06-11 13:59:12 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 3 papers

PROJECTMEM: A Local-First, Event-Sourced Memory and Judgment Layer for AI Coding Agents [Application / Method]: AI coding assistants now support a growing share of software work, from quick scripts to production applications. Yet th…

PROJECTMEM: A Local-First, Event-Sourced Memory and Judgment Layer for AI Coding Agents · Score 69
title matched "coding agent"; has PDF; has rich summary
Original source
Exploration Structure in LLM Agents for Multi-File Change Localization · Score 59
summary matched "SWE-bench"; summary matched "SWE bench"; has PDF
Original source
Agents All the Way Down; A Methodology for Building Custom AI Agents from Substrate to Production · Score 39
summary matched "code agent"; has PDF; has rich summary
Original source

2026-06-10

3 hits · generated 2026-06-10 13:25:04 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 3 papers

Frontier Coding Agents Use Metaprogramming to Adapt to Unfamiliar Programming Languages [Evaluation / Method]: LLM-based coding agents are usually evaluated in familiar software settings: mainstream languages, common libraries, and…

Frontier Coding Agents Use Metaprogramming to Adapt to Unfamiliar Programming Languages · Score 103
title matched "coding agent"; summary matched "Terminal-Bench"; summary matched "SWE-bench"
Original source
AutoPDE: Reliable Agentic PDE Solving via Explicitly Represented Solver Strategies · Score 60
summary matched "coding agent"; summary matched "code agent"; has PDF
Original source
DeNovoSWE: Scaling Long-Horizon Environments for Generating Entire Repositories from Scratch · Score 60
summary matched "code agent"; summary matched "bug fixing"; has PDF
Original source

2026-06-09

3 hits · generated 2026-06-09 13:12:49 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 3 papers

SIGA: Self-Evolving Coding-Agent Adapters for Scientific Simulation [Method]: Advanced scientific simulators expose specialized input languages that turn simulation goals into executable configurations, but learning them ca…

SIGA: Self-Evolving Coding-Agent Adapters for Scientific Simulation · Score 48
summary matched "coding agent"; has PDF; has rich summary
Original source
From 0-to-1 to 1-to-N: Reproducible Engineering Evidence for MetaAI Recursive Self-Design · Score 46
summary matched "SWE-bench"; has PDF; has rich summary
Original source
Self-Harness: Harnesses That Improve Themselves · Score 44
summary matched "Terminal-Bench"; has PDF; has rich summary
Original source

2026-06-05

10 hits · generated 2026-06-05 13:25:00 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 10 papers

ADK Arena: Evaluating Agent Development Kits via LLM-as-a-Developer [Evaluation / Method]: The rapid proliferation of Agent Development Kits (ADKs), SDK-level frameworks for building LLM-powered autonomous agents, has outpaced any…

ADK Arena: Evaluating Agent Development Kits via LLM-as-a-Developer · Score 94
summary matched "Terminal-Bench"; summary matched "SWE-bench"; summary matched "coding agent"
Original source
Asuka-Bench: Benchmarking Code Agents on Underspecified User Intent and Multi-Round Refinement · Score 80
title matched "code agent"; has PDF; has rich summary
Original source
Knowledge Matters: Injecting Project and Testing Knowledge into LLM-based Unit Test Generation · Score 80
title matched "test generation"; has PDF; has rich summary
Original source
SmellBench: Towards Fine-Grained Evaluation of Code Agents on Refactoring Tasks · Score 80
title matched "code agent"; has PDF; has rich summary
Original source
From Failed Trajectories to Reliable LLM Agents: Diagnosing and Repairing Harness Flaws · Score 76
summary matched "Terminal-Bench"; summary matched "SWE-bench"; has PDF
Original source

2026-06-04

6 hits · generated 2026-06-04 14:02:06 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 6 papers

Latent Anchor-Driven Test Generation for Deep Neural Networks [Data / Application / Method]: Deep Neural Networks (DNNs) are increasingly being deployed in security-critical and safety-sensitive applications, which makes rigorous test…

Latent Anchor-Driven Test Generation for Deep Neural Networks · Score 79
title matched "test generation"; has PDF; has rich summary
Original source
Can Generalist Agents Automate Data Curation? · Score 57
summary matched "coding agent"; has PDF; has rich summary
Original source
Not All Errors Are Equal: Consequence-Aware Reasoning Compute Allocation · Score 57
summary matched "SWE-bench"; has PDF; has rich summary
Original source
The Meta-Agent Challenge: Are Current Agents Capable of Autonomous Agent Development? · Score 57
summary matched "code agent"; has PDF; has rich summary
Original source
The Saturation Trap and the Subjectivity of Intervention Timing: Why Affect-Based Triggers and LLM Judges Fail to Time Interventions on Autonomous Agents · Score 57
summary matched "SWE-bench"; has PDF; has rich summary
Original source

2026-06-03

9 hits · generated 2026-06-03 14:09:56 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 9 papers

What Makes Interaction Trajectories Effective for Training Terminal Agents? [Method]: Stronger code agents are commonly assumed to be superior teachers for post-training, yet this assumption remains poorly disentangled from…

What Makes Interaction Trajectories Effective for Training Terminal Agents? · Score 115
title matched "terminal agent"; summary matched "Terminal-Bench"; summary matched "code agent"
Original source
Cross-Lingual Token Arbitrage: Optimizing Code Agent Context Windows via Local LLM Preprocessing · Score 97
title matched "code agent"; summary matched "coding agent"; has PDF
Original source
Dependency-Guided Repository-Level C-to-Rust Translation with Reinforcement Alignment · Score 97
title matched "repository-level"; summary matched "repository level"; has PDF
Original source
Handoff Debt: The Rediscovery Cost When Coding Agents Take Over Interrupted Tasks · Score 79
title matched "coding agent"; has PDF; has rich summary
Original source
VulnAgent-R2: Evidence-Calibrated Multi-Agent Auditing for Repository-Level Vulnerability Detection · Score 79
title matched "repository-level"; has PDF; has rich summary
Original source

2026-06-02

1 hit · generated 2026-06-02 13:56:35 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 1 paper

SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction [Evaluation / Application / Method]: Agent skills occupy a privileged position in the agent workflow, as agents are expected to implicitly follow and execute them, re…

SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction · Score 47
summary matched "coding agent"; has PDF; has rich summary
Original source

2026-05-29

2 hits · generated 2026-05-29 13:18:32 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 2 papers

Physics Is All You Need? A Case Study in Physicist-Supervised AI Development of Scientific Software [Application / Method]: Are AI agents tools, co-authors, or researchers? We present a quantified case study ($N=1$): a physicist sup…

Physics Is All You Need? A Case Study in Physicist-Supervised AI Development of Scientific Software · Score 48
summary matched "coding agent"; has PDF; has rich summary
Original source
Discovering Cooperative Pipelines: Autoresearch for Sequential Social Dilemmas · Score 45
summary matched "coding agent"; has PDF; has rich summary
Original source

2026-05-28

1 hit · generated 2026-05-28 13:15:52 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 1 paper

Calibrating Conservatism for Scalable Oversight [Method]: Agentic AI systems capable of autonomous planning and extended environmental interaction pose a fundamental control problem: how can humans maintain meaningful…

Calibrating Conservatism for Scalable Oversight · Score 48
summary matched "SWE-bench"; has PDF; has rich summary
Original source

2026-05-22

3 hits · generated 2026-05-22 13:08:19 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 3 papers

"Refactoring Runaway": Understanding and Mitigating Tangled Refactorings in Coding Agents for Issue Resolution [Method]: Recent advances in coding agents have shown remarkable progress in software issue resolution. In pract…

"Refactoring Runaway": Understanding and Mitigating Tangled Refactorings in Coding Agents for Issue Resolution · Score 125
title matched "coding agent"; title matched "issue resolution"; summary matched "SWE-bench"
Original source
TerminalWorld: Benchmarking Agents on Real-World Terminal Tasks · Score 45
summary matched "Terminal-Bench"; has PDF; has rich summary
Original source
Why Are Agentic Pull Requests Merged or Rejected? An Empirical Study · Score 45
summary matched "coding agent"; has PDF; has rich summary
Original source

2026-05-21

1 hit · generated 2026-05-21 13:14:24 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 1 paper

SpecBench: Measuring Reward Hacking in Long-Horizon Coding Agents [Evaluation / Method]: As long-horizon coding agents produce more code than any developer can review, oversight collapses onto a single surface: the automated test s…

SpecBench: Measuring Reward Hacking in Long-Horizon Coding Agents · Score 69
title matched "coding agent"; has PDF; has rich summary
Original source

2026-05-20

5 hits · generated 2026-05-20 13:10:58 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 5 papers

Does Code Cleanliness Affect Coding Agents? A Controlled Minimal-Pair Study [Evaluation / Application / Method]: As autonomous coding agents see rapid adoption, their evaluation has primarily focused on task completion rates holding the tar…

Does Code Cleanliness Affect Coding Agents? A Controlled Minimal-Pair Study · Score 80
title matched "coding agent"; has PDF; has rich summary
Original source
PEEK: Context Map as an Orientation Cache for Long-Context LLM Agents · Score 58
summary matched "coding agent"; has PDF; has rich summary
Original source
RoadmapBench: Evaluating Long-Horizon Agentic Software Development Across Version Upgrades · Score 58
summary matched "coding agent"; has PDF; has rich summary
Original source
The Growing Pains of Frontier Models: When Leaderboards Stop Separating and What to Measure Next · Score 58
summary matched "SWE-bench"; has PDF; has rich summary
Original source
Toward Training Superintelligent Software Agents through Self-Play SWE-RL · Score 58
summary matched "SWE-bench"; has PDF; has rich summary
Original source

2026-05-19

3 hits · generated 2026-05-19 13:08:04 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 3 papers

Same Signal, Different Semantics: A Cross-Framework Behavioral Analysis of Software Engineering Agents [Application / Method]: Behavioral studies of LLM-based software engineering agents extract operational rules about which traject…

Same Signal, Different Semantics: A Cross-Framework Behavioral Analysis of Software Engineering Agents · Score 83
title matched "software engineering agent"; summary matched "SWE-bench"; has PDF
Original source
SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution · Score 62
summary matched "Terminal-Bench"; summary matched "SWE-bench"; has PDF
Original source
Reversa: A Reverse Documentation Engineering Framework for Converting Legacy Software into Operational Specifications for AI Agents · Score 48
summary matched "coding agent"; has PDF; has rich summary
Original source

2026-05-15

6 hits · generated 2026-05-15 14:57:29 (Asia/Shanghai)

Markdown JSON

Terminal and SWE Agents: 6 papers

CRANE: Constrained Reasoning Injection for Code Agents via Nullspace Editing [Evaluation / Method]: Code agents must both reason over long-horizon repository state and obey strict tool-use protocols. In paired Instruct/Thinking che…

CRANE: Constrained Reasoning Injection for Code Agents via Nullspace Editing · Score 115
title matched "code agent"; summary matched "Terminal-Bench"; summary matched "SWE-bench"
Original source
Remember Your Trace: Memory-Guided Long-Horizon Agentic Framework for Consistent and Hierarchical Repository-Level Code Documentation · Score 97
title matched "repository-level"; summary matched "coding agent"; has PDF
Original source
SWE-Chain: Benchmarking Coding Agents on Chained Release-Level Package Upgrades · Score 97
title matched "coding agent"; summary matched "issue resolution"; has PDF
Original source
Documentation-Guided Agentic Codebase Migration from C to Rust · Score 75
summary matched "coding agent"; summary matched "repository-level"; has PDF
Original source
Comparing Developer and LLM Biases in Code Evaluation · Score 57
summary matched "code editing"; has PDF; has rich summary
Original source
