Gemini’s Rise: Dominating the LMSYS Leaderboard

Gemini’s latest release (gemini-exp-1206) has taken the top spot on the LMSYS Leaderboard, last updated on 05 Dec 2024. For most of 2023 and three quarters of 2024, that leaderboard was dominated by OpenAI’s GPT models. Kudos and congratulations to OpenAI for that run, but it does look like their dominance is coming to an end. Even if that is a premature assumption, it is fair to say that the two frontier research labs (Google DeepMind and OpenAI) are wrestling hard with each other to stay at the top. In the latest iteration, Gemini has taken full control, as seen in the following two graphics.

Other observations that I would like to highlight are:

  1. The top spots are occupied by proprietary models, evenly distributed between Google and OpenAI, with a couple of spots for xAI and Anthropic. The Llama 3.1 community models are outside the top 10, and Gemma is even further behind. This is a little concerning for those wanting to pick an open-weight model for their use case.
  2. Anthropic models rank lower for coding despite their strength in coding tasks (at least anecdotally). Is this leaderboard ranking a true reflection of their ability, or is prompt engineering playing a role here?
  3. Many traditional powerhouses in the enterprise analytics space, such as Microsoft, Snowflake and Databricks, still do not have a native model to challenge the dominance of Google, OpenAI and Anthropic.
  4. Last week, Amazon announced their native models. It will be interesting to see how they fare when they join the Arena.