
OpenCL Oceanwave Bench and (new) CompuBench CL


mitch_de

367 posts in this topic


Thanks for commenting on whether the OpenCL patch is needed or not. That can be helpful for others. So only older Fermi cards like the 4xx and 5xx need the patch? Do the 650...680 work without the patch? Or only some of the 6xx GPU types?


Thanks, the AMD OpenCL drivers also got updated, so feel free to check for differences. But as with every bench, only a difference of more than 3% after the update is significant: a 1-3% fps difference also happens between two runs on the same system.
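A minimal sketch of that rule of thumb, in case someone wants to compare results before and after a driver update. The function names and the 3% default are only my illustration of the paragraph above, not part of the bench:

```python
def relative_diff_percent(fps_before: float, fps_after: float) -> float:
    """Absolute fps change, as a percentage of the 'before' result."""
    return abs(fps_after - fps_before) * 100.0 / fps_before


def is_significant(fps_before: float, fps_after: float,
                   noise_percent: float = 3.0) -> bool:
    """True only if the change is larger than the run-to-run noise.

    noise_percent=3.0 is the assumed upper bound on the difference
    between two runs on the same system.
    """
    return relative_diff_percent(fps_before, fps_after) > noise_percent
```

For example, going from 100 fps to 102 fps would count as noise, while 100 fps to 110 fps would count as a real change.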

PS: The new bench version has AA OFF by default, while older versions use AA ON. That doesn't matter for fast GPUs (which had well over 100 fps in the old version), but slower cards get up to 10% less fps with AA ON (default in the old version) vs AA OFF (default in the new version).


Thanks, we see that AA doesn't matter when running this bench on fast GPUs. This OpenCL code seems to perform a bit faster on Nvidia high-end than on AMD high-end. But that doesn't matter much: other OpenCL benches, with different OpenCL code, will rank the high-end GPUs differently.

At least these are great steps forward compared to the AMD 4870 & AMD 5870 Mac in terms of OpenCL (here and in Luxmark).

How does the AMD 7950 perform in OpenCL? I guess at around 85%-90% of the speed of the 7970?

 

Time to bring the new Sapphire 7950 Mac Edition to the market (for real Mac Pros). I hope they will sell the card normally and not only through Apple in their shop.


Just updated my Apple Mac Pro to OS X 10.8.3, which allowed me to install an NVIDIA Quadro K5000. I figured the hackintosh community would like to see results on Apple hardware.

 

Here are the results:

ScreenShot_2013_03_15_at_5_11_22PM.png


Thanks for the 4870 result. We see that compared to the 4890 result posted a few posts above, the PCIe transfer speed is much higher.

Something is wrong with the PCIe slot and/or the card of the 4890 user, because he reaches only AGP speed (the first two bandwidth results are very low).

In general he will not "see" or "feel" that major bottleneck, because it only shows up when huge or very frequent data transfers happen: in games when textures for a new scene get updated or VRAM runs nearly out of memory (90% VRAM used), or when OpenCL transfers huge amounts of data. In OceanWave the data transfer isn't high, so the PCIe bandwidth doesn't matter for slower cards.

I guess that the 4890 GPU (with > 2 GB/s) will have more FPS drops (lower minimum FPS) and some brief freezes in the Valley OpenGL bench than the 4870 when running at higher resolution with high quality settings.
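To put the bandwidth point into numbers, here is a rough sketch. The two bandwidth constants are theoretical peak figures (AGP 8x about 2.1 GB/s, PCIe 2.0 x16 about 8 GB/s per direction), not values measured by the bench, and the 512 MB buffer is a made-up example size. Real transfers will be slower than this:

```python
# Assumed theoretical peak bandwidths in GB/s (not measured values).
AGP_8X_GB_PER_S = 2.1
PCIE2_X16_GB_PER_S = 8.0


def transfer_ms(megabytes: float, gb_per_s: float) -> float:
    """Ideal time in milliseconds to move `megabytes` over the bus.

    Uses 1 GB = 1000 MB, so MB / (GB/s) comes out directly in ms.
    """
    return megabytes / gb_per_s


# Hypothetical 512 MB upload (e.g. a large texture set or OpenCL buffer).
slow = transfer_ms(512, AGP_8X_GB_PER_S)
fast = transfer_ms(512, PCIE2_X16_GB_PER_S)
print(f"AGP-like speed: {slow:.0f} ms, PCIe 2.0 x16: {fast:.0f} ms")
```

A one-off transfer like this is hardly noticeable, but if it happens in the middle of a frame it is long enough to show up as a stutter, which would fit the lower minimum FPS guessed above.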


Hello, I've got a GTX 650 running together with the HD 4000. OpenCL works and the device was found in the log.

But when I start the test I get the standard error.

I don't know, could it be that the app uses the wrong device?


OceanWave may run into problems with more than one OpenCL GPU. Try Luxmark instead; there you should be able to select which GPUs are benched.

Can you try disabling the HD 4000 (in the BIOS?) and run again, to check that there isn't a general problem with Wave and your GPU/system?


2v8sh0x.png, 23ru73d.png, 2a9utl3.png, 347z5zc.png

 

What I did notice is with multisampling on my results are all over the place (50 fps difference between lowest and highest values).

However without multisampling or fullscreen it's almost constant with 5 fps difference at most...

Interestingly full screen with multisampling on is 50 fps difference again... but stable this time.

So it must be the windowing which gives unreliable results.

Again this setup has been pretty stable for 2 years and over the past 1 or so benches have become really stable for this card.


 

 

 


 

Which exact AMD 79xx do you have? The AMD driver mostly reports only the type group (79xx), not the exact type number.

 

It's normal that MIN and MAX fps differ and aren't "stable". The interesting value is MAX. MIN depends on the CPU and other things that may reduce fps in the first 1-2 seconds of running; the MIN fps always occurs in this startup phase. After 5 seconds at most, the current and MAX values are stable.
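If you log fps yourself, a sketch like this shows the effect of ignoring the startup phase. The function name, the 2 second warm-up and the sample numbers are all made up for illustration; the bench itself does not export samples this way:

```python
def stable_min_max(samples, warmup_seconds=2.0):
    """Return (min_fps, max_fps) ignoring samples from the startup phase.

    samples: list of (seconds_since_start, fps) tuples.
    warmup_seconds: assumed length of the startup phase to skip.
    """
    steady = [fps for t, fps in samples if t >= warmup_seconds]
    if not steady:
        raise ValueError("no samples after the warm-up period")
    return min(steady), max(steady)


# Made-up example: low fps only at the very start of the run.
example = [(0.5, 80.0), (1.5, 120.0), (3.0, 298.0), (6.0, 300.0)]
print(stable_min_max(example))
```

With the startup samples included, MIN would be 80 fps; without them the MIN/MAX spread is only 2 fps.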

 

That fullscreen MIN/MAX shows less difference is also normal, because fps is much lower. It depends on screen size. Which one do you have? In this version I don't report the screen size next to the fullscreen comment. I will add the screen size in the next version.

Running the bench in fullscreen has many disadvantages for benchmarking OpenCL:

1. Much more of the GPU's OpenGL power is used to render the result = much less GPU power left for OpenCL. Even the small window uses some of the GPU's OpenGL power.

At least slow to midrange GPUs drop fps a lot in fullscreen at >= 1600x1080 vs 500x500 windowed. The OpenGL part, which renders the waves, uses 7 times more GPU power in that example!

2. Screen size matters a lot: the same GPU at 1400x900 and at 1900x1200 shows very different fps.

3. So fullscreen is interesting but not very usable as an OpenCL bench.
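The "7 times" figure in point 1 matches the ratio of pixel counts. A small sketch, assuming the OpenGL rendering cost scales roughly with the number of pixels (a simplification; `pixel_ratio` is just an illustrative helper):

```python
def pixel_ratio(fullscreen, window):
    """Ratio of pixel counts between two (width, height) sizes."""
    full_w, full_h = fullscreen
    win_w, win_h = window
    return (full_w * full_h) / (win_w * win_h)


# 1600x1080 fullscreen vs the 500x500 window from point 1.
print(pixel_ratio((1600, 1080), (500, 500)))  # 6.912, i.e. about 7x

# The two screen sizes from point 2, compared with each other.
print(round(pixel_ratio((1900, 1200), (1400, 900)), 2))
```

So two users with the same GPU but different displays render a very different number of pixels in fullscreen, which is why those results are hard to compare.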

 

Also, the next version will add something AMD-specific: if the AMD driver reports no GPU number, only the XYZ type, I will let the user select the fitting GPU number for that type, like

7950, 7970, ...


It's a HIS 6870. I haven't decided on upgrading yet, as I haven't gotten my brother's 7870 working with sleep yet.

 

But without multisampling max and also min(!) is stable, keep getting the exact same results within 3 FPS.

However with multisampling on, I get greatly different results for both min and max.

So far other benches have proven more reliable for me, that's the only thing I wanted to communicate to begin with :P


Try that:

sudo perl -p -i.old -e '$c+=s/\x8b\x81\x1c\x0c\x00\x00\xeb\x06\x8b\x81\x20\x0c\x00\x00/\xb8\x02\x00\x00\x00\x90\xeb\x06\xb8\x00\x00\x00\x00\x90/; END { printf "%s: %d substitution%s made.\n",($c==1 ? "Success" : "Error"),$c,(!$c || $c>1 ? "s" : ""); $?=($c!=1); } ' /System/Library/Extensions/GeForceGLDriver.bundle/Contents/MacOS/libclh.dylib
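Before running the command, you could check how many substitutions it would make without touching the file. This is only a sketch: the two byte strings are copied from the perl command above, while the function names are made up for illustration. It mirrors perl's behaviour of reading the file in newline-separated chunks and replacing at most one match per chunk:

```python
from pathlib import Path

# Left-hand side of the s/// in the perl command (bytes searched for).
SEARCH = bytes.fromhex("8b 81 1c 0c 00 00 eb 06 8b 81 20 0c 00 00")
# Right-hand side of the s/// (bytes written instead); same length.
REPLACE = bytes.fromhex("b8 02 00 00 00 90 eb 06 b8 00 00 00 00 90")


def count_substitutions(data: bytes) -> int:
    """Count newline-delimited chunks containing SEARCH.

    This matches what $c counts in the perl one-liner, because -p
    processes one chunk at a time and s/// without /g replaces only once.
    """
    return sum(1 for chunk in data.split(b"\n") if SEARCH in chunk)


def dry_run(path: str) -> str:
    """Report what the patch would do, without modifying the file."""
    count = count_substitutions(Path(path).read_bytes())
    status = "Success" if count == 1 else "Error"
    return f"{status}: {count} substitution(s) would be made."
```

Calling `dry_run()` with the libclh.dylib path from the command should report exactly one substitution on an unpatched driver; zero would mean the pattern is not present (already patched, or a different driver version).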



 

 

 

Thanks man, great city Stuttgart, been there last summer to do a gig ! :)

 

About OS X: sadly, a clean install of 10.8.3 causes freezes for me due to an NVDA OpenGL timeout, while 10.8.2 works perfectly (except VDA acceleration)... I guess I will wait for new Nvidia drivers...


Thanks for submitting lots of new benchmark results. They will be included/updated in the next version.

Now it's clear: AA does work, with no significant fps effect if the GPU reaches at least 150 fps in windowed mode. AA may have a little more effect in fullscreen mode, even on fast GPUs (300+ fps windowed).

 

The next version will have: fullscreen @ X x Y size info, and some AMD GPUs will be named by their major type number, like 79xx, together with their (already shown) AMD GPU name, Tahiti, ...


Yep, but as always with benches: the speed of each GPU type (AMD/Nvidia and their different GPU chips) depends on which code each GPU type "likes" more or less.

To get an overall picture of a GPU's speed, we must use many benches!

But it seems that Luxmark OpenCL results rank AMD much higher. At least the AMD 79xx runs very well compared to the GTX 6xx.

Also, GTX cards differ a lot in OpenCL speed (hardware design differences: focus on gaming speed vs GPU computing speed). The GTX 680 is much slower than the older GTX 580, for example (same for 670 vs 570)!

 

 

luxmark.jpg

PS: These aren't fps; the results are points ;) The GeForce Titan runs with early beta drivers, and OpenCL tasks sometimes crash, so the Titan results are low / not optimized.

But the Titan will of course stay a little behind the AMD 7970, even once debugged and optimized.

Info: the main Luxmark dev is non-commercial and develops on an AMD system (Windows). He started developing Luxmark (and Luxrender) with an AMD 5870 and now has an AMD 7970. He will not and can't buy an expensive GeForce to optimize the OpenCL code for Nvidia as well, as he already did (with OpenCL profiling) for AMD.

So this shows that besides OpenCL driver speed and GPU hardware speed, the software, i.e. the OpenCL source code, also matters. Source code optimization for a particular GPU series (AMD or Nvidia, which differ in the details of GPU computing usage and optimization) can gain up to around 20% performance.

 

It may be possible that, because Nvidia started years earlier than AMD with CUDA, Nvidia's OpenCL implementation benefits from past CUDA development.

ATI had its own ATI Stream (similar to CUDA), but it was never used much.

 

CUDA (like ATI Stream in the past) can be more optimized and can have more features than OpenCL, because there are far fewer GPU types to support.

So GPU computing professionals (research, universities, military, ...) stay with CUDA and will not use OpenCL, at least not in the next 1-2 years.

CUDA has one more advantage: Nvidia developed CUDA libraries for a wide range of uses. They are highly optimized, and developers, e.g. for weather simulation, have to write much less code of their own than with OpenCL. Development is much faster and there are far fewer bugs when using those ready-to-use, optimized CUDA libraries.


What surprises me is that nVidia is consistently beating ATI in this topic. Perhaps with stable ATI 7xxx drivers and more ATI 7xxx users stepping in, this changes a little.

 

All the best!

I beg to differ. OpenCL on 7970 can't be beat for single GPU. Of course this can change any day and Titan may be the one to do it soon. But with their back and forth battle, we all benefit by having GPU hardware always getting better at much faster rate than we see for CPU, where AMD still makes competition for intel, but with strategy of providing better value instead of pushing the limits of how powerful a CPU could possibly be. If AMD/intel had same level of competition as AMD/Nvidia, there would be plenty of new +$1000 CPUs always released by both, and lucky people who could afford to buy them to make us jealous :)
