Announcement (2026-04-22): WILA-PopQA

New paper at the KG-LLM Workshop (LREC 2026): A Wikidata-Based Framework to Measure Cross-Lingual Bias in Multilingual Large Language Models. We introduce WILA-PopQA, a popularity-matched multilingual benchmark across 9 languages, and disentangle three factors that multilingual probing benchmarks usually confound: the language of the question, the language of the entity, and entity popularity. Across 12 open-weight LLMs, the language of the question turns out to be the dominant factor, and matching it to the entity’s language does not reliably improve factual recall.