Canonical Paper

Anthropogenic Regional Adaptation in Multimodal Vision-Language Model

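This is the canonical, archived detail page for the paper. It aggregates hits from multiple sources, the paper's appearance history, and related recommendations.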

Relevance

86 (current score), 1 merged source

History span

1 active date, 1 feed

Archive records

1 archive appearance, 1 merged source

Topics and tags

2 topic terms, 2 tags

Paper overview

While the field of vision-language (VL) has achieved remarkable success in integrating visual and textual information across multiple languages and domains, there is still no dedicated framework for assessing human-cent…

Canonical key

arxiv:2604.11490

Merged sources

arXiv

Authors

Samuel Cahyawijaya, Peerat Limkonchotiwat, Tack Hwa Wong, Hitesh Laxmichand Patel, Amit Agarwal, Manuel Antonio Rufino, Carlos Rafael Catalan, Muhammad Reza Qorib, Vicky Feliren, Holy Lovenia, Aye Hninn Khine, Frederikus Hudi, David Anugraha, Alham Fikri Aji, Romrawin Chumpu, Viet-Thanh Pham, Minghan Wang, Mohamed Fazli Imam, Ruochen Zhang, Joseph Marvin Imperial, Do Xuan Long, Musa Izzanardi Wijanarko, Joel Ruben Antony Moniz, Patrick Amadeus Irawan, Hanif Muhammad Zhafran, Isaiah Flores, Ira Salsabila, Jun Kevin, Jostin Jerico Rosal, Patricia Nicole Monderin, Kun Kerdthaisong, Ahmad Mustafid, My Chiffon Nguyen, Natchapon Jongwiriyanurak, Siva Worajitwannakul, Haochen Li, Adrian Xuan Wei Lim, Bin Wang, Muhammad Ravi Shulthan Habibi, Lynnette Hui Xian Ng, Mithil Bangera, Yeshil Bangera, Priyaranjan Pattnayak, Dun Li Chan, Sherissa Caren Djuniwar, Hee Ming Shan

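Categories

cs.AI, cs.CL, cs.CV

Tags

Evaluation / Methods

Topic terms

Language Model / Multimodal

First seen

2026-04-14 11:37:06 (UTC+08:00)

Last seen

2026-04-14 11:37:06 (UTC+08:00)

Coverage span

1 active date / 1 feed / 1 archive appearance

Feedback status

Not set

Next step

Not set

Handle by

Not set

Snooze until

Not set

Review interval

Not set

Personal notes

Not set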

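Match reasons

title matched "multimodal"; summary matched "diffusion"; has PDF

Latest action reminder

Not recorded

Personal feedback

Attach why you flagged this paper and what you plan to do with it next directly to the canonical detail page.

There is no personal feedback yet. You can add it with the local feedback CLI.

Feedback actions

Copy the canonical key or the local CLI command to quickly add this paper to your personal feedback state file.

Action reminder status

This section records which action reasons this paper has recently triggered, which helps explain why no reminder was raised again today.

No action reminders have been recorded yet.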

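Sources and external links

The canonical entry for this paper on each source is shown first, followed by the current abstract page and the PDF.

Match history

Review, by archive time, which feeds the paper appeared in. Links to each day's digest output are retained.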

Vision

2026-04-14

2026-04-14 11:37:06 (Asia/Shanghai)

While the field of vision-language (VL) has achieved remarkable success in integrating visual and textual information across multiple languages and domains, there is still no dedi…

Score 86 · title matched "multimodal"; summary matched "diffusion"; has PDF

Related recommendations

Lightweight rule-based recommendations derived from shared topics, tags, and configured keywords.

Related

Multimodal large language models in brain tumor imaging: clinical applications and future perspectives.

Shared topics: Language Model / Multimodal; shared tags: Methods; shared keywords: alignment / multimodal / language model

Score 109 · PubMed

Related

Character Beyond Speech: Leveraging Role-Playing Evaluation in Audio Large Language Models via Reinforcement Learning

Shared topics: Language Model; shared tags: Evaluation / Methods; shared keywords: alignment / multimodal / language model

Score 120 · arXiv

Related

Bridging the Modality Gap in Medical Vision-Language Models: A Hybrid Contrastive-Optimal Transport Framework for Enhanced Cross-Modal Alignment.

Shared topics: Language Model; shared tags: Evaluation / Methods; shared keywords: alignment / multimodal / language model

Score 107 · PubMed

Related

MILU: a consensus ensemble benchmark for multimodal medical imaging lecture understanding.

Shared topics: Language Model; shared tags: Evaluation / Methods; shared keywords: alignment / multimodal / language model

Score 82 · PubMed

Related

EgoMotion: Hierarchical Reasoning and Diffusion for Egocentric Vision-Language Motion Generation

Shared topics: Language Model; shared tags: Evaluation / Methods; shared keywords: diffusion / multimodal / language model

Score 77 · arXiv

Related

Revisiting Change VQA in Remote Sensing with Structured and Native Multimodal Qwen Models

Shared topics: Language Model; shared tags: Evaluation / Methods; shared keywords: alignment / multimodal / language model

Score 70 · arXiv
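
The recommendations above are described as rule-based, using overlap in topics, tags, and configured keywords. The page does not say how these overlaps are combined, so the following is only a minimal sketch of one way such a rule could work. The `Paper` class, the `WEIGHTS` values, and the `recommend` function are all hypothetical, and the resulting numbers are not meant to reproduce the scores shown on this page.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical weights: the page does not state how overlaps are weighted.
WEIGHTS = {"topics": 3, "tags": 2, "keywords": 1}


@dataclass(frozen=True)
class Paper:
    """Illustrative record; field names are assumptions, not the tool's schema."""
    title: str
    topics: FrozenSet[str] = frozenset()
    tags: FrozenSet[str] = frozenset()
    keywords: FrozenSet[str] = frozenset()


def shared(current: Paper, candidate: Paper) -> Dict[str, FrozenSet[str]]:
    """Return the topics, tags, and keywords that both papers have in common."""
    return {
        "topics": current.topics & candidate.topics,
        "tags": current.tags & candidate.tags,
        "keywords": current.keywords & candidate.keywords,
    }


def score(current: Paper, candidate: Paper) -> int:
    """Weighted count of shared items across the three fields."""
    overlap = shared(current, candidate)
    return sum(WEIGHTS[name] * len(items) for name, items in overlap.items())


def recommend(current: Paper, candidates: List[Paper], top_n: int = 5) -> List[Tuple[int, Paper]]:
    """Rank candidates by overlap score, dropping those with no overlap."""
    scored = [(score(current, c), c) for c in candidates]
    scored = [(s, c) for s, c in scored if s > 0]
    # Highest score first; ties broken by title for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1].title))
    return scored[:top_n]


if __name__ == "__main__":
    # Field values are taken from the shared-item lines on this page;
    # the candidate's full topic, tag, and keyword sets are not known.
    current = Paper(
        title="Anthropogenic Regional Adaptation in Multimodal Vision-Language Model",
        topics=frozenset({"Language Model", "Multimodal"}),
        tags=frozenset({"Evaluation", "Methods"}),
        keywords=frozenset({"alignment", "multimodal", "language model", "diffusion"}),
    )
    candidate = Paper(
        title="EgoMotion: Hierarchical Reasoning and Diffusion for Egocentric Vision-Language Motion Generation",
        topics=frozenset({"Language Model"}),
        tags=frozenset({"Evaluation", "Methods"}),
        keywords=frozenset({"diffusion", "multimodal", "language model"}),
    )
    for points, paper in recommend(current, [candidate]):
        print(points, paper.title, shared(current, paper))
```

Keeping the overlap sets separate from the score makes it easy to render a "shared topics / shared tags / shared keywords" line next to each recommendation, as this page does.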