
OpenCL Oceanwave Bench and (new) CompuBench CL


mitch_de

367 posts in this topic


OpenCL Info: LuxMark shows OpenCL 1.2, but that is not the device version (which is 1.1). Instead it shows the OpenCL platform version, which is always 1.2 on OS X, no matter which GPU you use. I have already discussed that in the LuxMark dev thread. They will fix it in a future version by using the device version instead of the platform version.

Platform version means the highest OpenCL version the platform (the software driver) can handle, independent of the GPU hardware features.
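
To make the platform vs. device distinction concrete, here is a minimal sketch in Python. It only parses version strings and picks the lower one; in a real program the strings would come from clGetPlatformInfo (CL_PLATFORM_VERSION) and clGetDeviceInfo (CL_DEVICE_VERSION). The example strings and the helper names are made up for illustration, not taken from LuxMark or OpenCLInfo.

```python
import re


def parse_opencl_version(version_string):
    """Extract (major, minor) from a string such as 'OpenCL 1.2 (vendor info)'."""
    match = re.match(r"OpenCL (\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"not an OpenCL version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def usable_version(platform_version, device_version):
    """Assumption: a program can rely only on the lower of the two versions."""
    return min(parse_opencl_version(platform_version),
               parse_opencl_version(device_version))


# Hypothetical strings: driver reports 1.2, GPU hardware reports 1.1
platform = "OpenCL 1.2 (example driver build)"
device = "OpenCL 1.1"
print("usable OpenCL version: %d.%d" % usable_version(platform, device))
```

So a tool that prints only the platform string would show 1.2 here even though the GPU itself is a 1.1 device.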

 

OpenCL bandwidth console vs. OceanWave GPU/VRAM bandwidth result: Thanks, you found a small cosmetic bug. The VRAM value was truncated on the left if it was > 99.9 GB/s.

So the leading 1 was not shown and the result looked much lower :)

Don't worry about the OpenCL compiler warnings, they don't matter, and the code for OpenCLInfo and OpenCL OceanWave comes from Apple ;)

 

UPDATED to version 1.5.1 (download in first post)

- fixed truncation of VRAM speed if > 99999 MB/s

- reformatted to show GB/s instead of MB/s, which is easier to read than large MB/s values
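
A rough Python sketch of how such a truncation can happen, and how printing GB/s avoids it. The fixed-width cut is only a guess at the mechanism, not the actual OceanWave code, and 1 GB/s = 1000 MB/s is assumed here.

```python
def format_fixed_width(mb_per_s):
    """Hypothetical buggy variant: keep only the last 7 characters."""
    text = f"{mb_per_s:.1f}"
    # Values of 100000.0 MB/s and above need 8 characters,
    # so the leading digit is cut off on the left.
    return text[-7:] + " MB/s"


def format_gb_per_s(mb_per_s):
    """Show GB/s with no width limit (assumes 1 GB/s = 1000 MB/s)."""
    return f"{mb_per_s / 1000:.2f} GB/s"


for value in (65785.4, 140000.0):
    print(format_fixed_width(value), "|", format_gb_per_s(value))
```

With 140000.0 MB/s the first variant prints 40000.0 MB/s, which is the kind of "missing leading 1" described above.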

Bildschirmfoto 2013-03-02 um 13.49.38.jpg


OK, not my best FPS ever, but the device to device bandwidth is awesome :) Guess 90GB/s is not max then. Still don't get why CPU would be higher device version than GPU though.

OpenCL OceanWave & bandwidth Benchmark V1.5.1.jpg


Yep, I googled a bit about VRAM bandwidth: it can be much higher than my guessed 90 GB/s. Maybe the most modern GPUs like the Titan or AMD 78xx/79xx have enhanced caches (similar to L1/L2 CPU caches) that make VRAM access faster, at least for small memory transfers. But the OpenCL bandwidth test transfers large blocks (3 MB parts, many times), so GPU caches will not help much.

Great to see that the truncated 1 of 140 is gone :)

PS: CPUs with an integrated GPU, like the Intel HD 4000, and no discrete GPU may give interesting bandwidth results. Some of the values may be much faster than over normal PCIe x8 / x16 lanes. I don't know how, or how fast, the GPU part is connected to the CPU / bus. But OpenCL speed (here FPS) will of course be slow.


Two Mac results from Rob (barefeats) - http://www.barefeats.com/index.html

2012 iMac

OS X 10.8.2 Intel® Core™ i7-3770 CPU @ 3.40GHz 3400 MHz

GPU GeForce GTX 680MX 0 MHz 283.2 fps

Bandwidths: device>host: 8090.6 MB/s, host>device: 5351.9 MB/s, device>device: 65785.4 MB/s

 

2012 Retina MacBook Pro 15"

OS X 10.8.2 Intel® Core(™) i7-3820QM CPU @ 2.70GHz 2700 MHz

GPU GeForce GT 650M 0 MHz 145.3 fps

Bandwidths: device>host: 6092.7 MB/s, host>device: 5847.6 MB/s, device>device: 35775.6 MB/s

 

Another real Mac result (from someone else)

2010 Mac Pro 6-core

OS X 10.8.2 Intel® Xeon® CPU W3680 @ 3.33GHz 3330 MHz

GPU GeForce GTX 285 1476 MHz 130.7 fps

Bandwidths: device>host: 4615.4 MB/s, host>device: 5569.4 MB/s, device>device: 76426.2 MB/s

 

Mac Pro 6-core (2010)

OS X 10.8.2 Intel® Xeon® CPU W3680 @ 3.33GHz 3330 MHz

GPU Quadro 4000 950 MHz 218.7 fps

Bandwidths: device>host: 6191.4 MB/s, host>device: 5520.8 MB/s, device>device: 53334.2 MB/s


Hi, can anyone tell me which patch is needed (for 10.8.2) to activate OpenCL on a GTX 680? Rob (barefeats) is asking for his GTX 680 in a Mac Pro. EDIT: I found this:

sudo perl -p -i.old -e '$c+=s/\x8b\x81\x1c\x0c\x00\x00\xeb\x06\x8b\x81\x20\x0c\x00\x00/\xb8\x02\x00\x00\x00\x90\xeb\x06\xb8\x00\x00\x00\x00\x90/; END { printf "%s: %d substitution%s made.\n",($c==1 ? "Success" : "Error"),$c,(!$c || $c>1 ? "s" : ""); $?=($c!=1); } ' /System/Library/Extensions/GeForceGLDriver.bundle/Contents/MacOS/libclh.dylib

Does that patch also work with the native 10.8.2 Apple drivers, or must Rob (GTX 680 in a Mac Pro) install the Nvidia 10.8.2 drivers first?

EDIT: That patch worked for Rob :)

That worked.

I get 1094.6 FPS max for this EVGA GTX 680 Classified

CPU > PCIe > GPU = 5.35 GB/s

GPU > PCIe > CPU = 6.22 GB/s

GPU > GPU VRAM speed = 86.48 GB/s

Rob

 

The other GTX 680 (Zotac) has significantly more VRAM bandwidth. Perhaps that is the reason it gets more FPS than the EVGA GTX 680?


You don't need the patch (for Fermi/Kepler) again. It is only for enabling OpenCL, NOT for speeding it up.

I don't know why it is only 482.4 fps.

First run it again and check whether the realtime fps counter (bottom line of the rendering window) is roughly the same as the max result.

That is to make sure the first character was not truncated when showing the fps, like 482 vs 1482 fps.

But I don't think it is a cosmetic (truncation) problem, because another GTX 680 user has 1200 fps shown in the result window.

Perhaps an AGPM setting problem?

 

 

NEW VERSION V 1.6.1 is out!

DL first post

 

Bildschirmfoto 2013-03-07 um 19.32.58.jpg


Yep, the faster the GPU, the smaller the effect of the added AA load on the GPU when rendering the waves. My 9600 GT gets 10% less fps with AA ON.

Slower gpus ( getting

 

You may check whether AA ON / OFF has more effect when running in fullscreen: press the f key quickly after pressing the Bench button.

I think AA ON will now reduce fps by more than the 1-2% seen before (fixed window).

 

PS: The bench automatically checks (via the GLUT framework) whether AA is available. If not, it uses AA OFF.

So in case of driver bugs, or if GLUT does not support AMD devices correctly, the AA OFF / ON switch has no effect.

 

If your fullscreen AA ON / OFF results are also nearly the same, it may be that GLUT can't handle AMD AA settings correctly.

 

Normally you can also see the difference between AA ON / OFF in the smoother rendering of the FPS counter text in the rendered window.

In my case it looks significantly different. If it looks identical, then AMD can't handle AA in this bench.

Screenshots: Nvidia GPU, AA ON vs AA OFF

Bildschirmfoto 2013-03-07 um 16.41.32.jpg

Bildschirmfoto 2013-03-07 um 16.41.06.jpg


It would be interesting if AMD users checked whether AA ON vs AA OFF gives a different (smoother / multisampled) FPS counter info line in the rendered window. That would show that AA is enabled on AMD too.

Whether the fps difference stays very small (less than 1-2%) even between fullscreen (f key) AA ON and fullscreen AA OFF is also a check of whether AA is working.

I ask because one AMD user gets nearly the same fps with AA ON/OFF in the normal (windowed) bench. That can be OK on very fast GPUs, but the difference should be larger than 1-2%, which is the variation you also get between multiple runs with the same settings.


Here's my new 670. I've been running the updated 10.8.2 build (12C2034) that came with the Retina Macs since it came out so I have newer nVidia drivers (304.10.20f04) than stock 10.8.2. I tried updating to the "newer retail" ones (304.00.05f02) but the system stopped loading before the login window (just a dead stop, no KP). Had to run the driver restore package from my install disk to get back in. This is with no AGPM. I also used the same AGPM edit I had for my GTX 460 (just edited the device id) and the results were pretty much the same.

 

screenshot20130307at225.png

Edited by Riley Freeman

What was the difference between the two runs? Same system?

I see the CPU wasn't clocked at maximum (other power management functions may also be idling).

Try watching the CPU load / CPU MHz (iStat or similar) alongside the bench. The CPU / chipset should wake up, otherwise the GPU can't work as fast as possible.

GPU load % during the bench is very high. CPU load is much lower.

 

Multisampling on AMD: works. An AMD user with a 5770 got different fps with AA ON/OFF.

fullscreen AA OFF: 122 fps vs AA ON: 112 fps,

windowed AA ON: 139 fps vs AA OFF: 140 fps. Nearly no difference, because the 5770 (and similar GPUs) is fast enough to handle AA with OpenGL without slowing down the OpenCL tasks.

So the faster the GPU, the smaller the effect of AA. At least in windowed mode it can be none.

Slower GPUs, like my 9600 GT or others in the range of 20-100 fps windowed with AA OFF, will lose some fps.

 

Conclusion: AA OFF puts more focus on OpenCL speed. The slower the GPU, the more fps are lost with AA ON.

So the default AA OFF changes nothing for fast high-end GPUs but gives more valid OpenCL performance values for low-end to mid-range GPUs.

 

Old fps values (AA was ON) can be compared to new ones with AA OFF if the GPU reached at least 150 fps in the older versions.
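
As a quick check of the 5770 numbers above, here is a small Python sketch that computes the fps loss from AA and compares it with the 1-2% run-to-run variation mentioned earlier in the thread. The function name and the 2% noise threshold are my own choices for illustration.

```python
def aa_fps_loss_percent(fps_aa_off, fps_aa_on):
    """Percentage of fps lost when switching AA from OFF to ON."""
    return (fps_aa_off - fps_aa_on) / fps_aa_off * 100.0


# Assumed noise level: differences up to about 2% also occur between identical runs
NOISE_PERCENT = 2.0

results = {
    "fullscreen": (122.0, 112.0),  # 5770, AA OFF vs AA ON
    "windowed": (140.0, 139.0),
}

for mode, (off, on) in results.items():
    loss = aa_fps_loss_percent(off, on)
    verdict = "above noise" if loss > NOISE_PERCENT else "within noise"
    print(f"{mode}: {loss:.1f}% loss ({verdict})")
```

The fullscreen loss is roughly 8%, clearly above the noise, while the windowed loss is under 1% and cannot be told apart from normal variation between runs.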



 

 

hello :)

 

 

There are 2 different CPUs: the first is an FX 6100, actually clocked at 3.3 GHz (Clover), and the second is a Phenom X4 960T (3.0 GHz) unlocked to x6 and running at 3.4 GHz, not what the test reports.

the graphics card is an HD 4850 1024 HD 4870 and not as shown in the test.

This is to demonstrate that the operation of the graphics card under ML 10.8 reveals quite an anomaly.

The 112 fps are consistent for this graphics card.


OK, different CPUs. But that should not have such a big effect, especially not on the VRAM bandwidth, which is nearly independent of the CPU.

Could it be that the timebase is wrong on one system? That is, 1 sec (all benches need correct time) is really only 0.5 sec or 2 sec because the FSB clock or something else is wrong?

Geekbench would then also show much bigger speed differences than the CPU speed differences alone explain, since it also needs correct time.
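
To show what a wrong timebase would do to the numbers, here is a tiny Python sketch. It is plain arithmetic, not a measurement: clock_ratio is a hypothetical value for how many seconds the system clock counts per real second.

```python
def reported_fps(true_fps, clock_ratio):
    """fps a benchmark would report if the system clock is off.

    clock_ratio: seconds counted by the system clock per real second
    (hypothetical parameter; 1.0 means the clock is correct).
    """
    if clock_ratio <= 0:
        raise ValueError("clock_ratio must be positive")
    # frames rendered = true_fps * real_seconds
    # time measured   = real_seconds * clock_ratio
    return true_fps / clock_ratio


for ratio in (0.5, 1.0, 2.0):
    print(f"clock ratio {ratio}: {reported_fps(112.0, ratio):.1f} fps reported")
```

With a clock that counts only 0.5 sec per real second, a card that really does 112 fps would be reported at twice that, and bandwidth values would be off by the same factor.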


For the record, OpenCL works for me with no patching.

10.8.2, latest Nvidia drivers, EVGA vanilla GTX 660 2GB.

 

I can confirm this too. I had the libclh.dylib patched for my GTX460. After upgrading to the 670 I had to do an additional patch on the OpenCL framework because the card has over 2GB RAM. I've reverted libclh to the unpatched copy and OpenCL still works fine.
