DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research Paper • 2511.19399 • Published 23 days ago • 57
SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature Paper • 2406.07835 • Published Jun 10, 2024 • 2
SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models Paper • 2510.09541 • Published Oct 10 • 15
Text or Pixels? It Takes Half: On the Token Efficiency of Visual Text Inputs in Multimodal LLMs Paper • 2510.18279 • Published Oct 21 • 4
Large Language Models Discriminate Against Speakers of German Dialects Paper • 2509.13835 • Published Sep 17 • 7
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge Paper • 1803.05457 • Published Mar 14, 2018 • 3
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering Paper • 1809.02789 • Published Sep 8, 2018
From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project Paper • 1909.01958 • Published Sep 4, 2019
Probing Natural Language Inference Models through Semantic Fragments Paper • 1909.07521 • Published Sep 16, 2019
QASC: A Dataset for Question Answering via Sentence Composition Paper • 1910.11473 • Published Oct 25, 2019
What Does My QA Model Know? Devising Controlled Probes using Expert Knowledge Paper • 1912.13337 • Published Dec 31, 2019
UnifiedQA: Crossing Format Boundaries With a Single QA System Paper • 2005.00700 • Published May 2, 2020
Attentiveness to Answer Choices Doesn't Always Entail High QA Accuracy Paper • 2305.14596 • Published May 24, 2023 • 1