The best way I've found to "benchmark" Untangle is a unit test for a web app. There are tests out there that probe for all sorts of things. I just had my web dev buddy configure some unit tests against a web server of his that I host, and put an Untangle box between the test unit and the server.
Let that run for an hour and you get a solid idea of what an Untangle server will do when barraged by a TON of HTTP sessions.
Because at the end of the day it isn't enough to generate raw network sessions; you need to generate sessions that engage the UVM in a meaningful way.
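For anyone who wants to roll their own version of that setup, here's a minimal sketch of the kind of HTTP barrage I'm describing: full GET requests (not just TCP connects), so the UVM actually has request and response content to scan. It assumes Python with the `requests` library installed; the target URL, worker count, and duration are placeholders you'd swap for your own test server behind the Untangle box.

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests

TARGET_URL = "http://testserver.example.com/"   # server behind the Untangle box (placeholder)
WORKERS = 50                                    # concurrent clients (placeholder)
DURATION = 3600                                 # run for an hour, per the post

def worker(deadline):
    ok, errors = 0, 0
    while time.time() < deadline:
        try:
            # A complete HTTP GET, so the UVM has real content to inspect,
            # not just an empty socket open/close.
            r = requests.get(TARGET_URL, timeout=10)
            r.raise_for_status()
            ok += 1
        except requests.RequestException:
            errors += 1
    return ok, errors

def main():
    deadline = time.time() + DURATION
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        results = list(pool.map(worker, [deadline] * WORKERS))
    total_ok = sum(r[0] for r in results)
    total_err = sum(r[1] for r in results)
    print(f"completed sessions: {total_ok}, errors: {total_err}")

if __name__ == "__main__":
    main()
```

Session counts and error rates from a run like this are only as meaningful as the hardware generating them, which is part of the single-machine limit I mention below.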
To make a real benchmark (something I've been poking at for several years), we need a standardized test that will barrage HTTP, HTTPS, and SMTP simultaneously. There are also limits to how much you can do with a single machine, so in the end the numbers end up anecdotal.
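As a rough sketch of what I mean by hitting multiple protocols at once, something like the following mixes plain HTTP, HTTPS, and SMTP traffic through the gateway from one box. Again, all hostnames, addresses, and message details are hypothetical placeholders, and a single machine will cap out well before the gateway does.

```python
import smtplib
from concurrent.futures import ThreadPoolExecutor
from email.message import EmailMessage

import requests

HTTP_URL = "http://testserver.example.com/"      # placeholder
HTTPS_URL = "https://testserver.example.com/"    # placeholder
SMTP_HOST = "mail.example.com"                   # placeholder mail sink

def http_hit(url):
    # One full HTTP or HTTPS request; errors are ignored so the load keeps going.
    try:
        requests.get(url, timeout=10)
    except requests.RequestException:
        pass

def smtp_hit():
    # One small SMTP delivery, enough to exercise the mail-scanning path.
    msg = EmailMessage()
    msg["From"] = "bench@example.com"
    msg["To"] = "sink@example.com"
    msg["Subject"] = "benchmark message"
    msg.set_content("x" * 1024)
    try:
        with smtplib.SMTP(SMTP_HOST, 25, timeout=10) as s:
            s.send_message(msg)
    except (smtplib.SMTPException, OSError):
        pass

def main(rounds=1000):
    # Interleave all three protocols so they hit the gateway at the same time.
    with ThreadPoolExecutor(max_workers=30) as pool:
        for _ in range(rounds):
            pool.submit(http_hit, HTTP_URL)
            pool.submit(http_hit, HTTPS_URL)
            pool.submit(smtp_hit)

if __name__ == "__main__":
    main()
```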
The model numbers on my servers weren't arbitrarily chosen. They do match what I expect to be the maximum productive user count for each device. However, I will admit that there is far too much fairy magic in those numbers. They aren't nearly as well defined as I wanted them to be. I know they work because of the installed base, not because I knew up front that they would work.
It's a best guess game, and I hate it.