For PS2, I can only think of a handful of games with good polygon models, and these are lacking in overall polygon count. So allow me to revise my statements, with actual numbers:
Raw FLOPS (translates to vertex, 32-bit only) output:
Xbox (twin vertex shaders), GC (fixed hardware T&L), PS2 (Emotion Engine)
Xbox (10 flops * 2 * 233 MHz) = 4.660 GFLOPS (32-bit, programmable)
GC (w/o lighting) = 3.726 GFLOPS (32-bit ops, fixed)
GC (w/ lighting) = 9.4 GFLOPS (32-bit & 20-bit ops, fixed)
PS2 (VU1) = 3.08 GFLOPS (32-bit, fully programmable)
PS2 (VU1/VU0/CPU FP) = 6.2 GFLOPS (32-bit, fully programmable)
http://www.segatech.com/gamecube/overview/
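As a quick sanity check on the arithmetic, here is a minimal sketch of how a peak figure like the Xbox one falls out of "flops per clock times clock speed". The function name `peak_gflops` is just something I made up for illustration, and the only inputs are the numbers quoted in the list above.

```python
def peak_gflops(flops_per_clock, clock_mhz):
    """Peak GFLOPS = floating-point ops issued per clock * clock rate (MHz) / 1000."""
    return flops_per_clock * clock_mhz / 1000.0


# Xbox vertex shaders: 10 flops * 2 shaders per clock at 233 MHz (from the list above).
xbox_vertex = peak_gflops(10 * 2, 233)

# The other figures are quoted as-is from the list above, not derived here.
quoted = {
    "Xbox (vertex shaders)": xbox_vertex,
    "GC (w/o lighting)": 3.726,
    "GC (w/ lighting)": 9.4,
    "PS2 (VU1)": 3.08,
    "PS2 (VU1/VU0/CPU FP)": 6.2,
}

# Print from highest to lowest peak figure.
for name, gflops in sorted(quoted.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {gflops:.3f} GFLOPS")
```

These are peak numbers only; nothing here models real in-game throughput.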
But this may be inaccurate, since it excludes the non-programmable XGPU hardware. Let's try total GFLOPS (minus pixel shaders, and not including the CPU for Xbox and GC):
Ranking (raw, peak, vertex GFLOPS):
Xbox (21.6 GFLOPS - 2.932 GFLOPS (CPU) - 7.456 GFLOPS (pixel shaders, 24-bit)) = 11.2 GFLOPS (32-bit and other, programmable & non-programmable)
Gamecube = 9.4 GFLOPS (32-bit and other, non-programmable)
PS2 = 6.2 GFLOPS (32-bit, fully programmable)
I got the 21.6 GFLOPS figure from a book (Opening the Xbox); it is the Xbox's total system power. For the pixel shaders I used (3 vector + 1 scalar) * 2 (madd) * 4 shaders. The rest should be the vertex shaders and related hardware. These comparisons are without any nifty optimizations, of course (such as early z-checks). With its CPU (lighting, animation), Gamecube is at 11.3 GFLOPS, making Xbox and GC almost exactly equal in polygon performance when the XCPU (which isn't contributing to T&L) is left out. I'd say GC is better though, because of the aforementioned early z-check, which it probably has, and the fast z-clear (Xbox might have fast z-clear as well).
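To make the subtraction easier to follow, here is a rough sketch of the Xbox breakdown. One assumption to flag: I'm taking the pixel shaders to run at the same 233 MHz used for the vertex shaders, which is what makes the per-clock count land on the 7.456 GFLOPS figure. The variable names are just for illustration.

```python
XBOX_GPU_CLOCK_MHZ = 233  # assumption: pixel shaders run at the vertex shader clock

# (3 vector + 1 scalar) * 2 (madd) * 4 shaders = flops per clock
pixel_flops_per_clock = (3 + 1) * 2 * 4
pixel_gflops = pixel_flops_per_clock * XBOX_GPU_CLOCK_MHZ / 1000

total_system_gflops = 21.6  # quoted total system figure
cpu_gflops = 2.932          # XCPU share, not contributing to T&L

# What is left over for the vertex shaders and related hardware.
remaining = total_system_gflops - cpu_gflops - pixel_gflops

# Gamecube: the quoted 11.3 total minus the 9.4 T&L figure is the implied CPU share.
gc_cpu_implied = 11.3 - 9.4

print(f"Pixel shaders: {pixel_gflops:.3f} GFLOPS")
print(f"Xbox vertex and related: {remaining:.1f} GFLOPS")
print(f"GC implied CPU share: {gc_cpu_implied:.1f} GFLOPS")
```

The leftover figure is only as good as the 21.6 GFLOPS total it starts from.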
So the ranking (polygon output):
Gamecube (11)
Xbox (11)
PS2 (6)
Xbox and Gamecube are tied, but when Xbox is overloaded with shader effects, like you said, Gamecube wins. With high scene complexity (requiring lots of z-culling), Gamecube also wins. Actual PS2 optimization is limited to VU1, which is at 3.08 GFLOPS. The Xbox DirectX configuration can slow things down quite a bit unless push-buffers are used. I've seen PS2 use good tessellation algorithms. Early z-checks on the PS2 are hard, but continuous LOD isn't, and neither are other software optimizations. GC and PS2 will not have the texture resolution on their polygons that Xbox will have.
Ranking (polygons, ingame performance, out of 10):
Gamecube (10)
Xbox (6/7, with pushbuffers)
PS2 (3/4, with VU1 only, assuming good optimization)
Where would N64 be though?
Reality Co-Processor: 4 32-bit ops * 2 (madd) * 62.5 MHz = 500 MFLOPS
Right on the money, as Silicon Graphics told the press that N64's coprocessor could do 500 MIPS.
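The N64 number is the same kind of calculation, sketched below with the figures from the line above. The variable names are mine, and the conversion to GFLOPS is only there so it can be compared with the other consoles.

```python
# Reality Co-Processor: 4 32-bit ops * 2 (madd) per clock at 62.5 MHz.
ops_per_clock = 4 * 2
clock_mhz = 62.5

n64_mflops = ops_per_clock * clock_mhz  # ops per clock * MHz gives MFLOPS
n64_gflops = n64_mflops / 1000          # same figure on the GFLOPS scale

print(f"N64 RCP: {n64_mflops:.0f} MFLOPS ({n64_gflops} GFLOPS)")
```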
Ranking (polygons, ingame performance (moderately complex*), out of 10):
Gamecube (10)
Xbox (6/7, with pushbuffers, moderate pixel shader utilization)
PS2 (3/4, with VU1 only, assuming good optimization)
N64 (.5)
Ranking (with N64 and older systems; polygons, in-game performance, moderately complex; N64 is the exception, with no known performance inhibitors; out of 10):
Gamecube (10)
Xbox (6/7, with pushbuffers, moderate pixel shader utilization)
PS2 (3/4, with VU1 only, assuming good optimization - partial VU0 and CPU optimization)
Dreamcast (1.4, from the SH4's 1.4 GFLOPS capacity)
N64 (.5)
PSX/Saturn (<.5)
*complex - many layers of interaction in the 3D scene
I'm not sure about Saturn and PSX; I can't find any FLOPS figures for them. If you would like to learn more about these systems, you can find plenty of information on any major search engine by searching for the names of each system's GPU and CPU.