<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>in-context learning Topic Archive</title>
<link>in-context-learning.html</link>
<description>Long-term tracking RSS feed for the keyword in-context learning, aggregating all previously matched papers.</description>
<language>en</language>
<lastBuildDate>Sun, 28 Jun 2026 05:24:06 +0000</lastBuildDate>
<item>
<title>MedGuards: Multi-Agent System for Reliable Medical Error Detection and Correction</title>
<link>../papers/arxiv-924e9f45b440.html</link>
<guid>https://arxiv.org/abs/2606.25651v1#2026-06-25#in-context-learning</guid>
<pubDate>Thu, 25 Jun 2026 13:11:21 +0800</pubDate>
<description>As Large Language Models (LLMs) are increasingly deployed in healthcare settings, accurate error detection and correction in generated or existing text becomes critical, as even minor mistakes can pose risks to patient safety. Existing methods for error detection and correction, including automated checks and heuristic-based approaches, do not generalize well across unseen datasets. In this paper, we propose MedGuards as a medical safety guardrail, which is a new framework that treats medical e…</description>
</item>
<item>
<title>Pigeonholing: Bad prompts hurt models to collapse and make mistakes</title>
<link>../papers/arxiv-112c872ebf06.html</link>
<guid>https://arxiv.org/abs/2606.24267v1#2026-06-24#in-context-learning</guid>
<pubDate>Wed, 24 Jun 2026 13:06:49 +0800</pubDate>
<description>While in-context learning is generally shown to be effective in Large Language Models (LLMs), bad contexts can cause performance degradation and mode collapse, a phenomenon we call &quot;pigeonholing.&quot; Unintentionally bad contexts can happen without malicious jailbreaking intents: For example, a user asks the model to justify an incorrect math theorem or fails to correct the model&#x27;s buggy code. Specifically, we investigate &quot;pigeonholing&quot; in two scenarios: (1) when the user suggests a solution,…</description>
</item>
<item>
<title>Navigating Unreliable Parametric and Contextual Knowledge: Explicit Knowledge Conflict Resolution for LLM Inference</title>
<link>../papers/arxiv-1b4902e41aec.html</link>
<guid>https://arxiv.org/abs/2606.20245v1#2026-06-19#in-context-learning</guid>
<pubDate>Fri, 19 Jun 2026 14:26:15 +0800</pubDate>
<description>Large language models (LLMs) have achieved strong performance across a wide range of language-based tasks by leveraging both extensive parametric knowledge and in-context learning ability, enabling them to incorporate external information provided in the input prompt. However, the integration of external knowledge can introduce conflicts, not only between the model&#x27;s internal parametric knowledge and the external information, but also among multiple pieces of external contexts. Existing approac…</description>
</item>
<item>
<title>What Do Safety-Aligned LLMs Learn From Mixed Compliance Demonstrations?</title>
<link>../papers/arxiv-d2c45c3b54a7.html</link>
<guid>https://arxiv.org/abs/2606.20508v1#2026-06-19#in-context-learning</guid>
<pubDate>Fri, 19 Jun 2026 14:26:15 +0800</pubDate>
<description>Prior work has shown that in-context demonstrations can jailbreak language models, but it remains unclear how models interpret different types of compliance demonstrations. We study this by mixing benign compliance demonstrations (non-harmful request, helpful response) with harmful compliance demonstrations (harmful request, helpful response) and testing three hypotheses about how demonstration composition drives harmful compliance. Across four models, we find that benign and harmful demonstrat…</description>
</item>
<item>
<title>Querying an astronomical database using large language models: the ALeRCE text-to-SQL system</title>
<link>../papers/arxiv-8a813f327a5a.html</link>
<guid>https://arxiv.org/abs/2606.18108v1#2026-06-17#in-context-learning</guid>
<pubDate>Wed, 17 Jun 2026 14:22:19 +0800</pubDate>
<description>We develop a text-to-SQL (structured query language) system based on large language models (LLMs) using in-context learning and apply it to the Automatic Learning for the Rapid Classification of Events (ALeRCE) astronomical database. ALeRCE is a community broker for the Zwicky Transient Facility and the Vera C. Rubin Observatory. The system enables users to query the database in natural language (NL) and generates executable SQL queries. To develop and evaluate the system, we constructed a data…</description>
</item>
<item>
<title>Caliper: Probing Lexical Anchors versus Causal Structure in LLMs</title>
<link>../papers/arxiv-ccfc01d31332.html</link>
<guid>https://arxiv.org/abs/2606.04915#2026-06-04#in-context-learning</guid>
<pubDate>Thu, 04 Jun 2026 14:02:06 +0800</pubDate>
<description>Large language models reach 50 to 70% accuracy on causal reasoning benchmarks such as CLadder, but it is unclear whether this reflects structural reasoning or lexical pattern matching. We introduce Caliper, a controlled perturbation that replaces semantic variable names with placeholder tokens while preserving the causal graph and probabilistic specification of each question. Across nine instruction-tuned LLMs from 3.8B to 671B and three causal reasoning benchmarks, lexical anonymization yields…</description>
</item>
<item>
<title>Reasoning over Grammar: Can Synthetic Linguistic Reasoning Traces Enhance Low-Resource Machine Translation?</title>
<link>../papers/arxiv-98760774739b.html</link>
<guid>https://arxiv.org/abs/2606.03782#2026-06-03#in-context-learning</guid>
<pubDate>Wed, 03 Jun 2026 14:09:56 +0800</pubDate>
<description>Large language models (LLMs) offer a promising approach to machine translation (MT) for extremely low-resource languages by incorporating linguistic resources through in-context learning. However, LLMs often struggle to apply grammatical information effectively during translation. Inspired by recent progress in chain-of-thought reasoning, we investigate whether low-resource MT can benefit from structured intermediate steps of linguistic analysis and grammatical reasoning. We propose a pipeline…</description>
</item>
<item>
<title>BioTool: A Comprehensive Tool-Calling Dataset for Enhancing Biomedical Capabilities of Large Language Models</title>
<link>../papers/arxiv-33f9027d56b4.html</link>
<guid>https://arxiv.org/abs/2605.05758#2026-05-08#in-context-learning</guid>
<pubDate>Fri, 08 May 2026 14:15:32 +0800</pubDate>
<description>Despite the success of large language models (LLMs) on general-purpose tasks, their performance in highly specialized domains such as biomedicine remains unsatisfactory. A key limitation is the inability of LLMs to effectively leverage biomedical tools, which clinical experts and biomedical researchers rely on extensively in daily workflows. While recent general-domain tool-calling datasets have substantially improved the capabilities of LLM agents, existing efforts in the biomedical domain lar…</description>
</item>
<item>
<title>From Image to Pixels: towards Fine-Grained Medical Vision-Language Models</title>
<link>../papers/doi-71303bb82f13.html</link>
<guid>https://pubmed.ncbi.nlm.nih.gov/41989909/#2026-04-17#in-context-learning</guid>
<pubDate>Fri, 17 Apr 2026 11:39:21 +0800</pubDate>
<description>Multimodal large language models (MLLMs) offer immense potential for biomedical AI, yet current applications remain limited to coarse-grained image understanding and basic textual queries, falling short of the fine-grained reasoning required in clinical contexts. In this work, we present a comprehensive solution spanning data, model, and training innovations to advance pixel-level multimodal intelligence in biomedicine. First, we construct MeCoVQA, a new visual-language benchmark that spans eigh…</description>
</item>
</channel>
</rss>
