You're making an even less charitable set of assumptions:
1). I'm incompetent enough to ignore publicly available table benchmarks.
2). I'm incompetent enough to never look at poor quality data.
3). I'm incompetent enough to not create a validation dataset for all models that were available.
Needless to say, you're wrong on all three.
My rate is $400 per hour plus taxes if you want me to walk you through each point and explain why VLMs like Gemini fail spectacularly and unpredictably when left to their own devices.