CRMArena Leaderboard

CRMArena is a novel benchmark for assessing LLM agents on realistic customer service tasks in professional environments. Designed with CRM experts, it offers nine challenging tasks across three personas (service agent, analyst, and manager), set in a simulated organization built from 16 interrelated industrial objects. The benchmark invites the community to improve AI agents' function calling and understanding of work tasks, and to demonstrate tangible business value in a realistic Salesforce Org.

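Results by agentic framework and CRMArena task (columns NCR through BRI):

| Model | Agentic Framework | NCR | HTU | TCU | NED | PVI | KQA | TII | MTA | BRI | Overall ⬆️ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| claude-3.5-sonnet | Function Calling | 60.8 | 68.5 | 66.9 | 34.6 | 24.6 | 39.2 | 99.2 | 84.6 | 74.8 | 64.3 |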