Here are results from tests I ran on various systems. The tests, while not truly scientific, gave each FTP server the chance to serve the same anonymous FTP hierarchy as its competitors.
The tests were done by running a program I call the FTP Monkey, which opened FTP connections and simulated a real live user. The Monkey would wander the FTP filesystem and download files. At each command prompt, the Monkey would try to download a file 50% of the time, do a directory listing 15% of the time, change directories 20% of the time, do a print-working-directory 5% of the time, and close the connection the remaining 10% of the time. Between commands, the Monkey would sleep 0 to 3 seconds.
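The Monkey program itself isn't reproduced here, but the following Python sketch shows the same kind of weighted-command loop. The host name and the anonymous login are assumptions, and downloaded data is simply thrown away.

    import ftplib
    import random
    import time

    # A rough sketch of a Monkey-like command loop; not the actual FTP Monkey.
    COMMAND_WEIGHTS = [
        ("download", 50),   # try to download a file
        ("list",     15),   # directory listing
        ("cwd",      20),   # directory change
        ("pwd",       5),   # print working directory
        ("quit",     10),   # close the connection
    ]

    def pick_command():
        # Weighted random choice matching the percentages described above.
        r = random.uniform(0, 100)
        total = 0
        for name, weight in COMMAND_WEIGHTS:
            total += weight
            if r <= total:
                return name
        return "quit"

    def run_monkey(host="ftp.example.com"):
        ftp = ftplib.FTP(host)
        ftp.login()                          # anonymous login
        while True:
            cmd = pick_command()
            try:
                if cmd == "download":
                    names = ftp.nlst()
                    if names:
                        # Retrieve a random entry and discard the data.
                        ftp.retrbinary("RETR " + random.choice(names),
                                       lambda block: None)
                elif cmd == "list":
                    ftp.retrlines("LIST", lambda line: None)
                elif cmd == "cwd":
                    names = ftp.nlst()
                    ftp.cwd(random.choice(names) if names else "/")
                elif cmd == "pwd":
                    ftp.pwd()
                else:                        # quit
                    ftp.quit()
                    return
            except ftplib.error_perm:
                pass                         # e.g. tried to RETR a directory
            time.sleep(random.uniform(0, 3))  # sleep 0 to 3 seconds between commands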
Each Monkey was an attempt to simulate an experienced user. (In actuality, the Monkey was probably more aggressive than a real human user or even an automated FTP client.) So, to simulate 50 heavy users, 50 Monkey processes were run.
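Using a sketch like the one above, 50 heavy users could be approximated by starting 50 such processes, reusing the hypothetical run_monkey() function:

    from multiprocessing import Process

    # Hypothetical launcher: one Monkey process per simulated heavy user.
    if __name__ == "__main__":
        monkeys = [Process(target=run_monkey) for _ in range(50)]
        for p in monkeys:
            p.start()
        for p in monkeys:
            p.join()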
During the test, the system performance monitoring tool "sar" was run to watch the system's "idle-time," the percentage of time the system was not busy running something else. As a general rule, if a system is idle 10% of the time or less, it is overloaded. (On Linux, a script was used to compute the idle-time from the /proc/stat kernel file.)
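The actual script isn't shown, but the idea is to sample the aggregate "cpu" line in /proc/stat twice and compute the idle percentage from the difference, which is roughly what sar reports as %idle. A minimal Linux-only sketch:

    import time

    def cpu_idle_percent(interval=5):
        # Sample the first "cpu" line of /proc/stat twice and compare deltas.
        def snapshot():
            with open("/proc/stat") as f:
                fields = [int(x) for x in f.readline().split()[1:]]
            return sum(fields), fields[3]    # (total jiffies, idle jiffies)
        total1, idle1 = snapshot()
        time.sleep(interval)
        total2, idle2 = snapshot()
        return 100.0 * (idle2 - idle1) / (total2 - total1)

    print("idle: %.1f%%" % cpu_idle_percent())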
The system's load average was also logged during testing. Interpretation of that value varies from system to system, but as a rule of thumb, if the load average stays over 15.0, I consider the system overloaded; if it stays between 3.0 and 6.0, I consider it moderately busy.
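For reference, the load average is easy to log programmatically as well; this small sketch applies the thresholds above (os.getloadavg() is available on most Unix-like systems):

    import os

    # Read the 1-, 5-, and 15-minute load averages and apply the rule of thumb.
    one, five, fifteen = os.getloadavg()
    print("load averages: %.2f %.2f %.2f" % (one, five, fifteen))
    if one > 15.0:
        print("rule of thumb: overloaded")
    elif 3.0 <= one <= 6.0:
        print("rule of thumb: moderately busy")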
After the testing, I analyzed the throughput, breaking the numbers into categories to see how each server fared with different types of requests. The first two categories are directory listings: a simple NLST (/bin/ls -1) and a full LIST (/bin/ls -l). The rest are file transfers, separated into tiny files (less than 10 kilobytes), small files (between 10 and 32 kilobytes), medium-sized files (between 32 and 128 kilobytes), and large files (greater than 128 kilobytes).
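The size categories are straightforward to express in code; a hypothetical helper like the following could be used to bucket each logged transfer before computing per-category throughput:

    def size_category(nbytes):
        # Bucket a transfer into the size categories described above.
        if nbytes < 10 * 1024:
            return "tiny (under 10 kilobytes)"
        elif nbytes <= 32 * 1024:
            return "small (10 to 32 kilobytes)"
        elif nbytes <= 128 * 1024:
            return "medium (32 to 128 kilobytes)"
        else:
            return "large (over 128 kilobytes)"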