I've benchmarked it on the Extended NYT Connections benchmark (https://github.com/lechmazur/nyt-connections/). It scores 20.0, compared to 10.0 for Haiku 3.5, 19.2 for Sonnet 3.7, 26.6 for Sonnet 4.0, and 46.1 for Sonnet 4.5.
You could try Wordle on it, but in my experience all of them are pretty bad at it. They're not smart enough to pick up on the colours when they're represented as letters. Surprisingly, the only one that was actually good was Qwen.
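
For anyone wondering what "colours represented as letters" can look like in practice, here is a minimal sketch. The G/Y/B encoding (green, yellow, black/grey) and the function name `wordle_feedback` are assumptions for illustration; the exact format used in the tests above isn't specified.

```python
from collections import Counter


def wordle_feedback(guess: str, answer: str) -> str:
    """Return Wordle-style feedback as letters (hypothetical encoding).

    G = right letter, right spot (green)
    Y = right letter, wrong spot (yellow)
    B = letter not available in the answer (black/grey)
    """
    guess, answer = guess.lower(), answer.lower()
    if len(guess) != len(answer):
        raise ValueError("guess and answer must have the same length")

    result = ["B"] * len(guess)

    # Count answer letters that were NOT matched exactly; only these
    # can still earn a yellow, which handles repeated letters correctly.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)

    # First pass: mark exact matches as green.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"

    # Second pass: mark misplaced letters as yellow while supply lasts.
    for i, g in enumerate(guess):
        if result[i] != "G" and remaining[g] > 0:
            result[i] = "Y"
            remaining[g] -= 1

    return "".join(result)


if __name__ == "__main__":
    print(wordle_feedback("speed", "abide"))
```

The model would then get each guess back as a string like this and has to map every letter of the feedback to the letter at the same position in its guess, which seems to be where most of them fall over.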