
Nvidia Fermi GTX 4xx, GTX2xx (+ others) Users for Benchmark WANTED


61 replies to this topic

#1
mitch_de

    InsanelyMacaholic

  • Retired
  • 2,896 posts
  • Gender:Male
  • Location:Stuttgart / Germany
EDIT:
The download link for the newest SLG (SmallLuxGPU) version is always at macupdate.com.
EDIT 30.07. PerFinal V171_3
http://rapidshare.co...uxGPU171_V3.zip

http://www.macupdate...632/smallluxgpu

Needed: all NVIDIA cards >= 8800.
Select the luxball (standard scene), run the Benchmark GPU only modes with 2, 3 and 4 GPU threads, and post your kSamples/sec for those GPU only modes.
My GPU only results (8800GTX) are shown in the screenshot.
A GTX 260 or better will perform much faster, a 9400M much slower.


EDIT: After a while I found that the GRASS OpenCL demo is also a good OpenCL benchmark.
I get 54 FPS with a 9600GT.




#2
machinist

    InsanelyMac Protégé

  • Members
  • 86 posts
Running 10.6.3

Intriguingly this test divides the workload across the cores of the 9800 GX2, and uses both G92 chips in concert.

The Cinebench 11.5 OpenGL test yields 26.17 fps in 10.6.3 and 34.32 fps with Win7 (64-bit).

Openglviewer produced lower scores in 10.6.3 than the ~3200+ fps scores with 10.6.2. It reports it is only using 16 compute units.

I would note OpenGL 3.0 was only at 65% with 10.6.2 while it's at 91% with 10.6.3.




#3
mitch_de

Thanks!
Could you please try the new 1.5.2 version? Its new benchmark GPU mode reports speed as a time in seconds, which is easier to compare.
The 8800GTX needs 28 sec, the 9400M 156 sec.

#4
machinist


"Thanks!
Could you please try the new 1.5.2 version? Its new benchmark GPU mode reports speed as a time in seconds, which is easier to compare.
The 8800GTX needs 28 sec, the 9400M 156 sec."

And the 9800 GX2 needs 17.8 seconds.

Small matters: the title bar and pull-down menu are in German (Deutsch), and I had to guess to go to macupdate to download the program, as you neglected to link to it here.

That aside, this is becoming an interesting little utility.




#5
wetzel

    InsanelyMac Protégé

  • Members
  • 41 posts
  • Gender:Male
  • Location:Amherst, MA
MSI GTX260 192 core on 10.6.3 using NVenabler.

With 2 threads I got 668K/sec, with 3 threads 678K/sec (average after 128 samples).

I used version 1.5.3 and "benchmark midrange CPU" resulted in 16.9 seconds, highend benchmark in 31.2 seconds.

Hope this helps with whatever you're doing.

#6
mitch_de

Thanks!
Perhaps a GTX 285 or 2x GTX 260 user can get closer to the ATI 4850 (High Benchmark: 17 sec) or ATI 4870 (15 sec)?
The GTX 260, at around 29 sec in High (my 8800GTX = 59 sec, 9600GT = 80 sec), is the fastest GTX GPU so far, but still far from the shader-unit speed of the 48xx.
The shader clock may also give a small speed boost: some GTX 260s showed 1348 MHz, some 1408 MHz in the benchmark mode result window!

Thanks for the test with the multi-GPU 9800 GX2 card!
Could you perhaps use the newer SLG 1.5.4 in High Benchmark mode? It takes twice as many seconds, because High mode does exactly double the work. The reason is to shrink the share of the measured time taken by OpenCL overhead: compiling the OpenCL code on the fly always costs about 0.5-1.0 sec, depending on the CPU.
http://www.macupdate...632/smallluxgpu
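To illustrate why doubling the work makes the fixed OpenCL compile time matter less, here is a minimal sketch. Only the roughly 1.0 sec overhead comes from the post; the 10 and 20 sec render times and the function name `overhead_share` are hypothetical, picked just to show the effect.

```python
def overhead_share(work_sec, overhead_sec):
    """Return the fraction of the total measured time spent on fixed overhead."""
    return overhead_sec / (work_sec + overhead_sec)


# Upper end of the 0.5-1.0 sec compile overhead mentioned in the post.
OVERHEAD_SEC = 1.0

# Hypothetical pure render times: the second run does exactly double the work.
for label, work_sec in (("single work", 10.0), ("double work", 20.0)):
    share = overhead_share(work_sec, OVERHEAD_SEC)
    print(f"{label}: {share:.1%} of the measured time is overhead")
```

With these assumed numbers the overhead share drops from roughly 9% to roughly 5%, so the reported seconds reflect the GPU more and the compile step less.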

It would also be interesting if you ran a GPU only test with the sponza scene, which is new and puts a heavy load on the GPU.
I get an average of 16 kSamples/sec (GPU only, 3 threads, sponza) with my 8800GTX. Your two GPUs, shown in the help screen, should reach at least 29 kSamples/sec.
Let the sponza scene run for a while, at least until the sample count goes from 0 to 16 or 32, to get a stable average.

EDIT: I got results from an iMac 27" with ATI 4850M: 21 sec in High Benchmark mode. Slower than the 4870 (15 sec), but still faster than the GTX 260.
The shader speed (lots of units) of the ATI 48xx can't be beaten by the older GeForces.
But Fermi will do it, I am sure.

Of course, overall gaming speed doesn't differ as much as OpenCL speed!
The ATI 4870 is not 4 times faster than the 8800GTX when running a game!




#7
mitch_de

Updated to 1.5.5.
Added an Ultra high-end benchmark mode!
8800 GTX = 101 sec
GTX 285 (Mac) = 44.7 sec

As before, the ATI 48xx cards (even the mobile iMac 4850M) will outperform that :)

#8
machinist

"Thanks for the multi GPU card 9800X2 test !
Could you perhaps use the newer SLG 1.5.4 in High Benchmark mode? It takes twice as many seconds, because High mode does exactly double the work. The reason is to shrink the share of the measured time taken by OpenCL overhead: compiling the OpenCL code on the fly always costs about 0.5-1.0 sec, depending on the CPU.

It would also be interesting if you ran a GPU only test with the sponza scene, which is new and puts a heavy load on the GPU.
I get an average of 16 kSamples/sec (GPU only, 3 threads, sponza) with my 8800GTX. Your two GPUs, shown in the help screen, should reach at least 29 kSamples/sec.
Let the sponza scene run for a while, at least until the sample count goes from 0 to 16 or 32, to get a stable average."



Newer slg in High Benchmark Mode = 36.7 secs.

Ultrahighend Benchmark Mode = 53.7 secs.

Sponza scene with 48 samples, 3 threads, GPU only = 35k samples/sec.

(Using version 1.5.5)

#9
mitch_de

Thanks !

Barefeats (Rob) now uses SmallLuxGPU as a benchmark alongside Geekbench and Cinebench 11.5 :)

http://www.barefeats.com/mbpp18.html

#10
cfhuk

    InsanelyMac Protégé

  • Donators
  • 43 posts
  • Gender:Male
  • Location:That place in Lancashire Ghandi visited
Cheers.

Benched GTX 260 on its own before I eventually work out how to stick the second one in.

Midrange GPU - 16 seconds
High End GPU - 25 seconds
UltraHybrid Sponza - 22 seconds

#11
mitch_de

Thanks!
Could you also compare High Hybrid vs High CPU only, and Ultra Hybrid vs Ultra CPU only? Use the entries in the middle section of the screen, not the CPU only one on the right. The newest version, 1.5.7, is needed.

http://www.macupdate...632/smallluxgpu

High Hybrid vs High CPU only on my 8800GTX = 16 sec vs 31 sec. The GPU gives a good boost here: the run is about twice as fast, so roughly half the time is saved (with a faster CPU and the same GPU, the saving would be a smaller percentage).
Ultra Hybrid vs Ultra CPU only shows a much smaller GPU boost ("only" 20% time saving), because C2D CPUs are already at or near full load with the CPU tasks and can't feed the GPU with data fast enough.
So CPUs with 4 or more real (not virtual) cores will get a higher boost percentage in Ultra Hybrid as well, though not as big a boost as in High Hybrid.
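For anyone who wants to put their own Hybrid vs CPU only times on the same scale, here is a small sketch of the arithmetic. The function names are made up for illustration; the 31 sec and 16 sec inputs are the High CPU only and High Hybrid times from this post. Note that "about twice as fast" and "about half the time saved" describe the same result.

```python
def speedup(cpu_only_sec, hybrid_sec):
    """How many times faster the hybrid run is than the CPU only run."""
    return cpu_only_sec / hybrid_sec


def time_saved_percent(cpu_only_sec, hybrid_sec):
    """Percentage of the CPU only time that the hybrid run saves."""
    return 100.0 * (cpu_only_sec - hybrid_sec) / cpu_only_sec


# High CPU only vs High Hybrid on the 8800GTX system from the post.
cpu_only_sec, hybrid_sec = 31.0, 16.0
print(f"speedup: {speedup(cpu_only_sec, hybrid_sec):.2f}x")
print(f"time saved: {time_saved_percent(cpu_only_sec, hybrid_sec):.1f}%")  # 48.4%
```

A negative `time_saved_percent` would mean Hybrid is slower than CPU only.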

#12
mitch_de

SLG updated to 1.5.8!
Benchmark times can't be compared with those from older versions: some benchmarks have significantly different settings, and therefore different times, than before.

#13
mitch_de

The Ultra High GPU only result was a bug.
1.6.0 is now available!
I added OpenCL Pixel Filter benchmarks and cleaned up the GUI.
All GPU only benchmarks now sit beside the CPU only and Hybrid ones and use the same settings. Before, the GPU only benchmarks had their own settings, different from Hybrid and CPU only.
Now it is clearer and should be bug-free.
Ready to collect reference results again (they will stay valid for the next versions).

Attached: Pixel Filter MegaSamples/sec of the 8800GTX and Ultra GPU only (the 4870 will perform much faster, but no longer in 1.6 sec :rolleyes: )




#14
mitch_de

Wow, the ATI 4870 gets really fast MegaSamples/sec in the new Pixel Filter benchmark!
Are there any GTX 2xx users here who can get a bit closer than my old 8800GTX?!




#15
machinist

I may be getting anomalous results with the OpenCL benchmark test using version 1.6.2.

The 9800GX2 is only processing at two-thirds the speed of your 8800GTX, yet is a third faster in the Ultrahigh GPU only Benchmark?

Attached File  Screen_shot_2010_05_15_at_6.13.54_AM.png   1.35MB   27 downloads

#16
mitch_de

The 8800GTX is much faster than the 8800GT. Compared against an 8800GT, the 9800 GX2 would look better :)
The 9800 GX2 can't get near 2x 8800GTX.
Also, the CPU may be "too slow" to feed both OpenCL GPUs fast enough.
Try High End CPU only vs Hybrid: you may see a bigger advantage over my 8800GTX High End values.

I also got GT120 results (MacPro 2009):
Attached File  Ulta_GT120.jpg   103.02KB   33 downloads

Ultra GPU only: 280 sec, so don't worry about the 9800 GX2 ;)
You can even see here that OpenCL with very fast CPUs (MacPro 2009) and a slow GPU is the worst case: Hybrid is even slower than CPU only.
The OpenCL overhead in Hybrid makes slow GPUs useless alongside very fast CPUs (4+ cores).
But most of us will NOT have such a combination of 2x Xeon + GT120, I hope ;)

PS: I also got ATI 5870 (Win) OpenCL Pixelfilter values !

AddSample[FILTER_NONE] Benchmark
[CypressPixel][Samples/sec 1669.42M]

AddSample[FILTER_PREVIEW] Benchmark
[CypressPixel][Samples/sec 369.56M]

AddSample[FILTER_GAUSSIAN] Benchmark
[CypressPixel][Samples/sec 217.81M]

#17
machinist


8800GTX is much faster than 8800GT...


Mitch:

Thanks for the reply, but I guess I wasn't quite clear. It's the OpenCL Pixelfilter test that produces results that appear inconsistent or anomalous. In all the other tests the 9800GX2 predictably "bests" the 8800GTX. In the 30 sec Pixelfilter run, the 9800GX2 processes only two thirds of what the 8800GTX does in the same time. It is as if the Pixelfilter test does not use both cores of the 9800GX2. This may be a bug?

#18
mitch_de

Ah, now I understand. I will ask the benchpixel devs whether it also uses all GPUs.
What is certain is that benchpixel uses the VRAM much more heavily and more often than the raytracing benchmarks do. I don't know whether older dual-GPU cards slow down under concurrent VRAM access (read/write), which would reduce the overall VRAM speed of a dual-GPU card compared with a single-GPU card.
For a closer look, start benchpixel in Terminal and post the output; there we can see how many GPU devices are used. Compare the device info with mine.

8800GTX
Device 0,1 = cpu cores
Device 2 = GPU (single 8800GTX)


mitch:~ ami$ /Users/ami/Desktop/benchpixel
LuxRays Simple PixelDevice Benchmark v0.1alpha7dev
Usage (easy mode): /Users/ami/Desktop/benchpixel
OpenCL Platform 0: Apple
Device 0 NativeThread name: NativeThread-000
Device 1 NativeThread name: NativeThread-001
Device 2 OpenCL name: GeForce 8800 GTX
Device 2 OpenCL type: GPU
Device 2 OpenCL units: 16
Device 2 OpenCL max allocable memory: 192MBytes
Device 3 OpenCL name: Intel® Core™2 Duo CPU E7300 @ 2.66GHz
Device 3 OpenCL type: CPU
Device 3 OpenCL units: 2
Device 3 OpenCL max allocable memory: 1024MBytes
Selected pixel device: GeForce 8800 GTXCreating 1 pixel device(s)
Allocating pixel device 0: GeForce 8800 GTX (Type = OPENCL)
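
To compare device lists without reading them line by line, the saved terminal output can be filtered for GPU entries. This is only a rough sketch: the line format is assumed from the output shown above ("Device N OpenCL name: ..." and "Device N OpenCL type: ..."), the helper name `list_gpus` is made up, and the sample text is a shortened copy of that output.

```python
import re

# Matches only the "name" and "type" lines of OpenCL devices, e.g.
# "Device 2 OpenCL name: GeForce 8800 GTX" or "Device 2 OpenCL type: GPU".
DEVICE_LINE = re.compile(r"^Device (\d+) OpenCL (name|type): (.+)$")


def list_gpus(log_text):
    """Return the names of all devices reported with OpenCL type GPU."""
    devices = {}
    for line in log_text.splitlines():
        match = DEVICE_LINE.match(line.strip())
        if match:
            index = int(match.group(1))
            devices.setdefault(index, {})[match.group(2)] = match.group(3).strip()
    return [
        info.get("name", "?")
        for _, info in sorted(devices.items())
        if info.get("type") == "GPU"
    ]


sample = """\
Device 0 NativeThread name: NativeThread-000
Device 2 OpenCL name: GeForce 8800 GTX
Device 2 OpenCL type: GPU
Device 2 OpenCL units: 16
Device 3 OpenCL type: CPU
"""
print(list_gpus(sample))  # ['GeForce 8800 GTX']
```

On a dual-GPU card that exposes both chips, the list should contain two entries.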




#19
machinist


Ah, i now understand. I will ask the benchpixel devs if that is also using all gpus...


It appears the test is using both GPUs and all memory. The 9800 GX2 does better than the 8800GTX in every other test. It may be a bug in the card design that shows up only with this test, or it could be a bug in the test. In WinWorld I've run many tests on the 9800 GX2 while considering overclocking its BIOS. Watching processor temps and GPU usage, I have noticed that some benchmark and stress programs do not actually use both GPUs, though they see both. Has this test been run on other dual-GPU cards or multi-card setups?

Let me know how it goes. I am curious.


Attached File  terminal_pixel.rtf   2.6KB   7 downloads

#20
mitch_de

Yep, benchpixel uses both GPUs.
Since it also uses 4 CPU threads instead of 2 (quad-core vs C2D), the problem may be that the CPU can't feed the GPUs fast enough, or it may be an L2 cache difference! My C2D has 3 MB L2 = 1.5 MB per core.
Does your CPU have 4 MB or 6 MB for 4 cores (1 MB or 1.5 MB per core)?
Because the test does a lot of RAM transfers (image filtering!), the L2 cache may be heavily used as well: the more L2, the better.




