Feed Subscription

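Agent Runtime Security Pinned Subscription Page

Suited to tracking a single research direction over the long term. The page summarizes this feed's performance over the last 7 days and 30 days, and keeps each day's raw hit entries and digest links.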
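As a rough illustration of the 7-day / 30-day rollup described above, the sketch below sums daily hit counts over a trailing window. The function name `window_hits`, the dict-of-dates input, and the inclusive window ending on a chosen day are assumptions made for this example; the page does not document how its summary is actually computed. The sample counts are the daily hit totals from 2026-06-16 to 2026-06-26 listed in this page's hit history.

```python
from datetime import date, timedelta

# Daily hit counts copied from this page's hit history
# (days with no digest entry are simply absent).
DAILY_HITS = {
    date(2026, 6, 16): 5,
    date(2026, 6, 17): 3,
    date(2026, 6, 18): 1,
    date(2026, 6, 19): 4,
    date(2026, 6, 23): 3,
    date(2026, 6, 24): 6,
    date(2026, 6, 25): 3,
    date(2026, 6, 26): 4,
}


def window_hits(daily_hits, end, days):
    """Sum hits over the `days`-day window ending on `end` (inclusive).

    Hypothetical helper: the real page may define its windows differently.
    """
    start = end - timedelta(days=days - 1)
    return sum(n for d, n in daily_hits.items() if start <= d <= end)


print(window_hits(DAILY_HITS, date(2026, 6, 28), 7))  # 16
```

A 30-day figure would come from the same call with `days=30` and a dict that also holds the earlier dates.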


Recent Trend

Agent Runtime Security has no new hits today.

[Trend chart; date axis from 2026-06-15 to 2026-06-28]

Hit History

Review this feed's hits day by day; each day's digest is kept as raw Markdown / JSON artifacts.

2026-06-26

4 hits · generated 2026-06-26 13:16:53 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 4 papers

"Jailbreaking for the Average Jane: Choosing Optimal Jailbreaks via Bandit Algorithms for Automatically Enhanced Queries" [Evaluation / Application / Method]: With a profusion of jailbreaks for LLMs now widely known, a growing concern is that…

Jailbreaking for the Average Jane: Choosing Optimal Jailbreaks via Bandit Algorithms for Automatically Enhanced Queries · Score 64
title matched "jailbreak"; has PDF; has rich summary
Original source
Do Safety Guardrails Need to Reason? LeanGuard: A Fast and Light Approach for Robust Moderation · Score 59
title matched "guardrail"; has PDF; has rich summary
Original source
AgentX: Towards Agent-Driven Self-Iteration of Industrial Recommender Systems · Score 41
summary matched "guardrail"; has PDF; has rich summary
Original source
MIRROR: Novelty-Constrained Memory-Guided MCTS Red-Teaming for Agentic RAG · Score 40
summary matched "prompt injection"; has PDF; has rich summary
Original source

2026-06-25

3 hits · generated 2026-06-25 13:11:21 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 3 papers

"How Reliable Is Your Jailbreak Judge? Calibration and Adversarial Robustness of Automated ASR Scoring" [Method]: Almost every paper on LLM jailbreaks and prompt injection reports an attack-success rate (ASR), and that number…

How Reliable Is Your Jailbreak Judge? Calibration and Adversarial Robustness of Automated ASR Scoring · Score 78
title matched "jailbreak"; summary matched "prompt injection"; has PDF
Original source
The Unfireable Safety Kernel: Execution-Time AI Alignment for AI Agents and Other Escapable AI Systems · Score 48
summary matched "guardrail"; has PDF; has rich summary
Original source
AI Snitches Get Glitches: Towards Evading Agentic Surveillance · Score 44
summary matched "prompt injection"; has PDF; has rich summary
Original source

2026-06-24

6 hits · generated 2026-06-24 13:06:49 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 6 papers

"Burnyard: Future of Malware Analysis" [Method]: Malware analysis is a critical aspect of modern cybersecurity. The prevailing industry practice, sandboxing, involves executing suspicious binaries within isol…; "LLMs Prompted…

Burnyard: Future of Malware Analysis · Score 47
summary matched "sandboxing"; has PDF; has rich summary
Original source
LLMs Prompted for Legal Context Object More: Overrefusal from Small On-Premises LLMs in Criminal Legal Context · Score 44
summary matched "jailbreak"; has PDF; has rich summary
Original source
Red-Teaming the Agentic Red-Team · Score 43
summary matched "guardrail"; has PDF; has rich summary
Original source
PHANTOM: A Large-Scale Dataset of Multimodal Adversarial Attacks for Vision-Language Models · Score 41
summary matched "guardrail"; has PDF; has rich summary
Original source
Securing LLM-Agent Long-Term Memory Against Poisoning: Non-Malleable, Origin-Bound Authority with Machine-Checked Guarantees · Score 39
summary matched "data exfiltration"; has PDF; has rich summary
Original source

2026-06-23

3 hits · generated 2026-06-23 13:10:02 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 3 papers

"Capable but Careless: Do Computer-Use Agents Follow Contextual Integrity?" [Evaluation / Application / Method]: Computer-use agents (CUAs) now act on a user's behalf across personal applications such as email, calendars, and to-do lists. Thi…

Capable but Careless: Do Computer-Use Agents Follow Contextual Integrity? · Score 64
title matched "computer-use agent"; has PDF; has rich summary
Original source
TROPT: An Open Framework for Unifying and Advancing Discrete Text Optimization · Score 46
summary matched "jailbreak"; has PDF; has rich summary
Original source
GIF: Locally Sound Geometric Information Flow Control for LLMs · Score 43
summary matched "prompt injection"; has PDF; has rich summary
Original source

2026-06-19

4 hits · generated 2026-06-19 14:26:15 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 4 papers

"What Do Safety-Aligned LLMs Learn From Mixed Compliance Demonstrations?" [Method]: Prior work has shown that in-context demonstrations can jailbreak language models, but it remains unclear how models interpret different type…

What Do Safety-Aligned LLMs Learn From Mixed Compliance Demonstrations? · Score 46
summary matched "jailbreak"; has PDF; has rich summary
Original source
Analyzing Defensive Misdirection Against Model-Guided Automated Attacks on Agentic AI Systems · Score 46
summary matched "jailbreak"; has PDF; has rich summary
Original source
RACL: Reasoning-Agent Control Layers for Continuous Metaheuristic Learning · Score 41
summary matched "guardrail"; has PDF; has rich summary
Original source
Beyond Static Endpoints: Tool Programs as an Interface for Flexible Agentic Web Services · Score 39
summary matched "sandboxing"; has PDF; has rich summary
Original source

2026-06-18

1 hit · generated 2026-06-18 14:03:08 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 1 paper

"CodeSentinel: A Three-Layer Defense Against Indirect Prompt Injection in Code Contexts" [Method]: Code large language models increasingly retrieve external code context from repositories, documentation, issue threads, and co…

CodeSentinel: A Three-Layer Defense Against Indirect Prompt Injection in Code Contexts · Score 108
title matched "prompt injection"; title matched "indirect prompt injection"; has PDF
Original source

2026-06-17

3 hits · generated 2026-06-17 14:22:19 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 3 papers

"Seeing Is Not Screening: Multimodal Hidden Instruction Attacks on Agent Skill Scanners" [Application / Method]: Agent skills are emerging as an important attack surface in LLM-based systems. Through an empirical study of existing ski…

Seeing Is Not Screening: Multimodal Hidden Instruction Attacks on Agent Skill Scanners · Score 47
summary matched "privilege escalation"; has PDF; has rich summary
Original source
A Red-Team Study of Anthropic Fable 5 & Opus 4.8 Models · Score 47
summary matched "jailbreak"; has PDF; has rich summary
Original source
PreAct: Computer-Using Agents that Get Faster on Repeated Tasks · Score 43
summary matched "guardrail"; has PDF; has rich summary
Original source

2026-06-16

5 hits · generated 2026-06-16 14:38:43 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 5 papers

"Automated jailbreak attack targeting multiple defense strategies" [Evaluation / Method]: Large language models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks. However, their safety remains a critical c…

Automated jailbreak attack targeting multiple defense strategies · Score 65
title matched "jailbreak"; has PDF; has rich summary
Original source
MyPCBench: A Benchmark for Personally Intelligent Computer-Use Agents · Score 65
title matched "computer-use agent"; has PDF; has rich summary
Original source
DoubtProbe: Black-Box Jailbreak Defense via Structural Verification and Semantic Auditing · Score 61
title matched "jailbreak"; has PDF; has rich summary
Original source
KVEraser: Learning to Steer KV Cache for Efficient Localized Context Erasing · Score 47
summary matched "prompt injection"; has PDF; has rich summary
Original source
Adaptive and Explicit safe: Triggering Latent Safety Awareness in Large Reasoning Models · Score 44
summary matched "jailbreak"; has PDF; has rich summary
Original source

2026-06-12

5 hits · generated 2026-06-12 13:55:02 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 5 papers

"Neuro-Symbolic Agents for Regulated Process Automation: Challenges and Research Agenda" [Application / Method]: LLM-based agents are entering regulated industries where they automate judgment intensive quality management processes. W…

Neuro-Symbolic Agents for Regulated Process Automation: Challenges and Research Agenda · Score 44
summary matched "guardrail"; has PDF; has rich summary
Original source
ComAct: Reframing Professional Software Manipulation via COM-as-Action Paradigm · Score 41
summary matched "computer-use agent"; has PDF; has rich summary
Original source
Getting Better at Working With You: Compiling User Corrections into Runtime Enforcement for Coding Agents · Score 40
summary matched "agent runtime"; has PDF; has rich summary
Original source
No Hidden Prompts Needed! You Can Game AI Peer Review with Presentation-Only Revisions · Score 38
summary matched "prompt injection"; has PDF; has rich summary
Original source
Nous: An Attempt to Extract and Inject the Cognition Behind Prediction-Market Behavior · Score 38
summary matched "prompt injection"; has PDF; has rich summary
Original source

2026-06-11

4 hits · generated 2026-06-11 13:59:12 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 4 papers

"Grammar-Constrained Decoding Can Jailbreak LLMs into Generating Malicious Code" [Evaluation / Method]: Large Language Models (LLMs) are increasingly used for code generation, raising concerns that they may be misused to produce mali…

Grammar-Constrained Decoding Can Jailbreak LLMs into Generating Malicious Code · Score 60
title matched "jailbreak"; has PDF; has rich summary
Original source
OCELOT: Inference-Leakage Budgets for Privacy-Preserving LLM Agents · Score 47
summary matched "jailbreak"; has PDF; has rich summary
Original source
Online Shift Detection and Conformal Adaptation for Deployed Safety Classifiers · Score 41
summary matched "jailbreak"; has PDF; has rich summary
Original source
External Experience Serving in Production LLM Systems: A Deployment-Oriented Study of Quality-Cost Trade-offs · Score 38
summary matched "prompt injection"; has PDF; has rich summary
Original source

2026-06-10

7 hits · generated 2026-06-10 13:25:04 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 7 papers

"Toward Secure LLM Agents: Threat Surfaces, Attacks, Defenses, and Evaluation" [Evaluation / Application / Method]: Large language model (LLM) agents are rapidly moving from conversational interfaces to software components that plan, invoke t…

Toward Secure LLM Agents: Threat Surfaces, Attacks, Defenses, and Evaluation · Score 78
summary matched "agent security"; summary matched "LLM agent security"; summary matched "prompt injection"
Original source
Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields · Score 68
title matched "computer-use agent"; has PDF; has rich summary
Original source
Data Journalist Agent: Transforming Data into Verifiable Multimodal Stories · Score 48
summary matched "computer-use agent"; has PDF; has rich summary
Original source
It Takes One to Bias Them All: Breaking Bad with One-Shot GRPO · Score 45
summary matched "guardrail"; has PDF; has rich summary
Original source
Training LLMs to Enforce Multi-Level Instruction Hierarchies via Gravity-Weighted Direct Preference Optimization · Score 44
summary matched "prompt injection"; has PDF; has rich summary
Original source

2026-06-09

4 hits · generated 2026-06-09 13:12:49 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 4 papers

"WeaveBench: A Long-Horizon, Real-World Benchmark for Computer-Use Agents with Hybrid Interfaces" [Evaluation / Method]: Computer-use agents (CUAs) increasingly operate in runtimes that combine visual desktop control, command-line ex…

WeaveBench: A Long-Horizon, Real-World Benchmark for Computer-Use Agents with Hybrid Interfaces · Score 83
title matched "computer-use agent"; summary matched "agent runtime"; has PDF
Original source
Brain-Prompt Injection: A Route-Safety Audit for BCI-LLM Agents · Score 63
title matched "prompt injection"; has PDF; has rich summary
Original source
What the Eyes See, the LLMs Miss: Exploiting Human Perception for Adversarial Text Attacks · Score 47
summary matched "guardrail"; has PDF; has rich summary
Original source
PRISM: Recovering Instruction Sets from Language Model Activations · Score 45
summary matched "prompt injection"; has PDF; has rich summary
Original source

2026-06-05

5 hits · generated 2026-06-05 13:25:00 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 5 papers

"GuardNet: Ensemble Strategies of Shallow Neural Networks for Robust Prompt Injection and Jailbreak Detection" [Evaluation / Data / Application / Method]: Large Language Models (LLMs) have transformed natural language processing, but they remai…

GuardNet: Ensemble Strategies of Shallow Neural Networks for Robust Prompt Injection and Jailbreak Detection · Score 138
title matched "prompt injection"; title matched "jailbreak"; summary matched "guardrail"
Original source
From Risk Classification to Action Plan Remediation: A Guardrail Feedback Driven Framework for LLM Agents · Score 80
title matched "guardrail"; has PDF; has rich summary
Original source
Safety Paradox: How Enhanced Safety Awareness Leaves LLMs Vulnerable to Posterior Attack · Score 76
summary matched "jailbreak"; summary matched "guardrail"; has PDF
Original source
Beyond Similarity: Trustworthy Memory Search for Personal AI Agents · Score 58
summary matched "jailbreak"; has PDF; has rich summary
Original source
The Granularity Gap: A Multi-Dimensional Longitudinal Audit of Sycophancy in Gemini Models · Score 58
summary matched "guardrail"; has PDF; has rich summary
Original source

2026-06-04

6 hits · generated 2026-06-04 14:02:06 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 6 papers

"MaskForge: Structure-Aware Adaptive Attacks for Jailbreaking Diffusion Large Language Models" [Evaluation / Application / Method]: Diffusion large language models (dLLMs) generate text by iteratively denoising partially masked sequences unde…

MaskForge: Structure-Aware Adaptive Attacks for Jailbreaking Diffusion Large Language Models · Score 79
title matched "jailbreak"; has PDF; has rich summary
Original source
What If Prompt Injection Never Left? Exploring Cross-Session Stored Prompt Injection in Agentic Systems · Score 79
title matched "prompt injection"; has PDF; has rich summary
Original source
Caught in the Act(ivation): Toward Pre-Output and Multi-Turn Detection of Credential Exfiltration by LLM Agents · Score 75
summary matched "prompt injection"; summary matched "indirect prompt injection"; has PDF
Original source
AgentJet: A Flexible Swarm Training Framework for Agentic Reinforcement Learning · Score 57
summary matched "agent runtime"; has PDF; has rich summary
Original source
From Untrusted Input to Trusted Memory: A Systematic Study of Memory Poisoning Attacks in LLM Agents · Score 57
summary matched "prompt injection"; has PDF; has rich summary
Original source

2026-06-03

8 hits · generated 2026-06-03 14:09:56 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 8 papers

"D-Judge: Disrupting Multi-Turn Jailbreaks using Semantics-Preserving Output Rewriting" [Evaluation / Data / Method]: Multi-turn jailbreak attacks pose a growing threat to large language model (LLM) safety because they exploit feedback…

D-Judge: Disrupting Multi-Turn Jailbreaks using Semantics-Preserving Output Rewriting · Score 79
title matched "jailbreak"; has PDF; has rich summary
Original source
MedCUA-Bench: A Screenshot-Only Benchmark for Clinical Computer-Use Agents · Score 79
title matched "computer-use agent"; has PDF; has rich summary
Original source
MultiTurnPSB: Evaluating Multi-Turn Jailbreak Attacks and Classifier-Based Defenses for Medical AI Safety · Score 79
title matched "jailbreak"; has PDF; has rich summary
Original source
From Control Boundary to Insurance Claim: Reconstructing AI-Mediated Losses Through the CER Framework · Score 75
summary matched "prompt injection"; summary matched "malicious tool"; has PDF
Original source
Acceptance-Test-Driven Evaluation Protocols for Business-Centric LLM Systems · Score 57
summary matched "guardrail"; has PDF; has rich summary
Original source

2026-06-02

6 hits · generated 2026-06-02 13:56:35 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 6 papers

"Jailbreaking Multimodal Large Language Models using Multi-Clip Video" [Data / Application / Method]: As multimodal large language models (MLLMs) have advanced to process video inputs, concerns have emerged about their potential for mal…

Jailbreaking Multimodal Large Language Models using Multi-Clip Video · Score 63
title matched "jailbreak"; has PDF; has rich summary
Original source
SentGuard: Sentence-Level Streaming Guardrails for Large Language Models · Score 62
title matched "guardrail"; has PDF; has rich summary
Original source
AgentRedBench: Dynamic Redteaming and Integration-Aware Defense for LLM Agents over SaaS Integrations · Score 61
summary matched "prompt injection"; summary matched "indirect prompt injection"; has PDF
Original source
SafeMCP: Proactive Power Regulation for LLM Agent Defense via Environment-Grounded Look-Ahead Reasoning · Score 61
title matched "agent defense"; has PDF; has rich summary
Original source
SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents · Score 44
summary matched "agent security"; has PDF; has rich summary
Original source

2026-05-29

4 hits · generated 2026-05-29 13:18:32 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 4 papers

"Provably Secure Agent Guardrail" [Evaluation / Application / Method]: As large language models transition from bounded generative engines to agents with expansive execution privileges, AI going out of control precipitates a funda…; "Robust an…

Provably Secure Agent Guardrail · Score 120
title matched "secure agent"; title matched "guardrail"; has PDF
Original source
Robust and Efficient Guardrails with Latent Reasoning · Score 80
title matched "guardrail"; has PDF; has rich summary
Original source
AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security · Score 58
summary matched "guardrail"; has PDF; has rich summary
Original source
Beyond Attack Success Rate: Temporal Logit Observability for LLM Safety Failures · Score 58
summary matched "jailbreak"; has PDF; has rich summary
Original source

2026-05-28

5 hits · generated 2026-05-28 13:15:52 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 5 papers

"Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents" [Data / Method]: Computer-use agents (CUAs) have recently made substantial progress, but deploying a separate large expert for each software…

Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents · Score 70
title matched "computer-use agent"; has PDF; has rich summary
Original source
Code as a Weapon: A Consensus-Labeled Prompt Bank for Measuring Coding-Model Compliance with Malicious-Code Requests · Score 47
summary matched "jailbreak"; has PDF; has rich summary
Original source
The Ethics of LLM Sandbox and Persona Dynamics · Score 46
summary matched "guardrail"; has PDF; has rich summary
Original source
LACUNA: Safe Agents as Recursive Program Holes · Score 46
summary matched "prompt injection"; has PDF; has rich summary
Original source
Technical Report: Exploring the Emerging Threats of the Agent Skill Ecosystem · Score 45
summary matched "data exfiltration"; has PDF; has rich summary
Original source

2026-05-27

7 hits · generated 2026-05-27 13:23:19 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 7 papers

"EviACT: An Evidence-to-Action Framework for Agentic Program Repair" [Evaluation / Method]: LLM-based agents have moved automated program repair (APR) from fixed-context patch generation to interactive repository-level repair. Howeve…

EviACT: An Evidence-to-Action Framework for Agentic Program Repair · Score 122
summary matched "guardrail"; has PDF; has rich summary
Original source
Governed Evolution of Agent Runtimes through Executable Operational Cognition · Score 70
title matched "agent runtime"; has PDF; has rich summary
Original source
Prompt Injection Detection is Regime-Dependent: A Deployment-Aware Evaluation with Interpretable Structural Signals · Score 65
title matched "prompt injection"; has PDF; has rich summary
Original source
BAIT: Boundary-Guided Disclosure Escalation via Self-Conditioned Reasoning · Score 45
summary matched "jailbreak"; has PDF; has rich summary
Original source
AlbanianLLMSafety: A Safety Evaluation Dataset for Large Language Models in Albanian · Score 43
summary matched "guardrail"; has PDF; has rich summary
Original source

2026-05-26

3 hits · generated 2026-05-26 13:09:24 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 3 papers

"CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents" [Evaluation / Data / Application / Method]: Reinforcement learning with verifiable rewards (RLVR) has driven breakthroughs in domains such as math, tool-use,…

CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents · Score 62
title matched "computer-use agent"; has PDF; has rich summary
Original source
How Agentic AI Coding Assistants Become the Attacker's Shell · Score 44
summary matched "prompt injection"; has PDF; has rich summary
Original source
AgentHijack: Benchmarking Computer Use Agent Robustness to Common Environment Corruptions · Score 41
summary matched "computer-use agent"; has PDF; has rich summary
Original source

2026-05-22

3 hits · generated 2026-05-22 13:08:19 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 3 papers

"DeltaBox: Scaling Stateful AI Agents with Millisecond-Level Sandbox Checkpoint/Rollback" [Evaluation / Application / Method]: LLM-powered AI agents require high-frequency state exploration (e.g., test-time tree search and reinforcement learn…

DeltaBox: Scaling Stateful AI Agents with Millisecond-Level Sandbox Checkpoint/Rollback · Score 48
summary matched "agent sandbox"; has PDF; has rich summary
Original source
HarnessAPI: A Skill-First Framework for Unified Streaming APIs and MCP Tools · Score 47
summary matched "agent runtime"; has PDF; has rich summary
Original source
Contractual Skills: A GovernSpec Design Framework for Enterprise AI Agents · Score 46
summary matched "guardrail"; has PDF; has rich summary
Original source

2026-05-21

1 hit · generated 2026-05-21 13:14:24 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 1 paper

"Agent JIT Compilation for Latency-Optimizing Web Agent Planning and Scheduling" [Application / Method]: Computer-use agents (CUA) automate tasks specified with natural language such as "order the cheapest item from Taco Bell" by gene…

Agent JIT Compilation for Latency-Optimizing Web Agent Planning and Scheduling · Score 48
summary matched "computer-use agent"; has PDF; has rich summary
Original source

2026-05-20

7 hits · generated 2026-05-20 13:10:58 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 7 papers

"Attention-Guided Reward for Reinforcement Learning-based Jailbreak against Large Reasoning Models" [Evaluation / Method]: Large Reasoning Models (LRMs) have demonstrated remarkable capabilities in solving complex problems by generat…

Attention-Guided Reward for Reinforcement Learning-based Jailbreak against Large Reasoning Models · Score 80
title matched "jailbreak"; has PDF; has rich summary
Original source
OpenComputer: Verifiable Software Worlds for Computer-Use Agents · Score 80
title matched "computer-use agent"; has PDF; has rich summary
Original source
Robotics-Inspired Guardrails for Foundation Models in Socially Sensitive Domains · Score 80
title matched "guardrail"; has PDF; has rich summary
Original source
A Methodology for Selecting and Composing Runtime Architecture Patterns for Production LLM Agents · Score 58
summary matched "agent runtime"; has PDF; has rich summary
Original source
Formal Skill: Programmable Runtime Skills for Efficient and Accurate LLM Agents · Score 58
summary matched "policy enforcement"; has PDF; has rich summary
Original source

2026-05-19

4 hits · generated 2026-05-19 13:08:04 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 4 papers

"An Empirical Study of Privacy Leakage Chains via Prompt Injection in Black-Box Chatbot Environments" [Method]: LLM-based chatbot agents increasingly process user requests by combining natural-language reasoning with external…

An Empirical Study of Privacy Leakage Chains via Prompt Injection in Black-Box Chatbot Environments · Score 98
title matched "prompt injection"; summary matched "indirect prompt injection"; summary matched "jailbreak"
Original source
Multilingual jailbreaking of LLMs using low-resource languages · Score 82
title matched "jailbreak"; summary matched "guardrail"; has PDF
Original source
Overeager Coding Agents: Measuring Out-of-Scope Actions on Benign Tasks · Score 68
summary matched "prompt injection"; has PDF; has rich summary
Original source
Acoustic Interference: A New Paradigm Weaponizing Acoustic Latent Semantic for Universal Jailbreak against Large Audio Language Models · Score 63
title matched "jailbreak"; has PDF; has rich summary
Original source

2026-05-14

2 hits · generated 2026-05-14 12:52:54 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 2 papers

"Sleeper Channels and Provenance Gates: Persistent Prompt Injection in Always-on Autonomous AI Agents" [Evaluation / Application / Method]: Always-on AI agents (OpenClaw, Hermes Agent) run as a single persistent process under the owner's iden…

Sleeper Channels and Provenance Gates: Persistent Prompt Injection in Always-on Autonomous AI Agents · Score 66
title matched "prompt injection"; has PDF; has rich summary
Original source
LLM-Based Persuasion Enables Guardrail Override in Frontier LLMs · Score 63
title matched "guardrail"; has PDF; has rich summary
Original source

2026-05-13

2 hits · generated 2026-05-13 12:54:34 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 2 papers

"Metaphor Is Not All Attention Needs" [Application / Method]: Large language models are increasingly deployed in safety-critical applications, where their ability to resist harmful instructions is essential. Although post…; "A microser…

Metaphor Is Not All Attention Needs · Score 44
summary matched "jailbreak"; has PDF; has rich summary
Original source
A microservices-based endpoint monitoring platform with predictive NLP models for real-time security and hate-speech risk alerting · Score 42
summary matched "data exfiltration"; has PDF; has rich summary
Original source

2026-05-12

6 hits · generated 2026-05-12 12:42:08 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 6 papers

"Break the Brake, Not the Wheel: Untargeted Jailbreak via Entropy Maximization" [Evaluation / Method]: Recent studies show that gradient-based universal image jailbreaks on vision-language models (VLMs) exhibit little or no cross-mod…

Break the Brake, Not the Wheel: Untargeted Jailbreak via Entropy Maximization · Score 69
title matched "jailbreak"; has PDF; has rich summary
Original source
Intrinsic Guardrails: How Semantic Geometry of Personality Interacts with Emergent Misalignment in LLMs · Score 67
title matched "guardrail"; has PDF; has rich summary
Original source
Re-Triggering Safeguards within LLMs for Jailbreak Detection · Score 67
title matched "jailbreak"; has PDF; has rich summary
Original source
Guaranteed Jailbreaking Defense via Disrupt-and-Rectify Smoothing · Score 67
title matched "jailbreak"; has PDF; has rich summary
Original source
RUBEN: Rule-Based Explanations for Retrieval-Augmented LLM Systems · Score 48
summary matched "prompt injection"; has PDF; has rich summary
Original source

2026-05-08

1 hit · generated 2026-05-08 14:15:32 (Asia/Shanghai)

Markdown JSON

Agent Runtime Security · 1 paper

"Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation" [Evaluation / Application / Method]: Self-hosted computer-use agents (SHCUAs), such as OpenClaw, combine natural-language interaction with direct acce…

Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation · Score 102
title matched "computer-use agent"; summary matched "prompt injection"; summary matched "indirect prompt injection"
Original source
