Full Version: OpenCL Benchmark - CPU vs GPU / DO NOT USE ANYMORE !
mitch_de
The download link is at the end of this post.
- Mac OS X 10.6 - Snow Leopard ONLY ! (will not run in 10.5 / 10.4....)


VERY OLD THREAD!
Please use the main OpenCL thread now:
New MAIN OpenCL Thread


STLVNUB
Here's mine, mitch...
CODE
Last login: Tue Aug 25 19:34:34 on console
...........................................................
...................OpenCL Bench V 0.1 by mitch.............
.......C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec.......
....... .......
........My test code (simple adds) is cpu friedly..........
.more gpu friedly+complexer code (raytracing/video encod.).
....may give much more speed advantage - at least on C2Ds..
...........................................................
CL_DEVICE_NAME: Intel® Core™2 Duo CPU E8200 @ 2.66GHz
CL_DEVICE_VENDOR: Intel
Now computing - please be patient....
time used: 33.682335
Number of elements computed: 2097152
CL_DEVICE_NAME: GeForce 9800 GT
CL_DEVICE_VENDOR: NVIDIA
Now computing - please be patient....
time used: 2.639566
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)
mitch_de
Thanks! I hope everyone gets smilies :) as the validation result!
PS: I can't test the dual-GPU case myself, but all cards should be benchmarked. I hope those with 2 GPUs (like the MacBook Pro) don't run into an error.
macwanabe
CL_DEVICE_NAME: Intel® Core™2 Quad CPU Q6600 @ 2.40GHz
CL_DEVICE_VENDOR: Intel
Now computing - please be patient....
time used: 15.900080
Number of elements computed: 2097152
CL_DEVICE_NAME: GeForce 8800 GT
CL_DEVICE_VENDOR: NVIDIA
Now computing - please be patient....
time used: 2.618529
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)
Sherry Haibara
Seems something isn't working here:

...........................................................
...................OpenCL Bench V 0.1 by mitch.............
.......C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec.......
....... .......
........My test code (simple adds) is cpu friedly..........
.more gpu friedly+complexer code (raytracing/video encod.).
....may give much more speed advantage - at least on C2Ds..
...........................................................
CL_DEVICE_NAME: Intel® Core™2 Duo CPU P8700 @ 2.53GHz
CL_DEVICE_VENDOR: Intel
Now computing - please be patient....
time used: 37.822647
Number of elements computed: 2097152
CL_DEVICE_NAME: GeForce 9400M
CL_DEVICE_VENDOR: NVIDIA
Now computing - please be patient....
time used: 12.428713
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:( Validate results test - results compute on gpu <> compute cpu

Sherry Haibara

EDIT: Second run:
...........................................................
...................OpenCL Bench V 0.1 by mitch.............
.......C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec.......
....... .......
........My test code (simple adds) is cpu friedly..........
.more gpu friedly+complexer code (raytracing/video encod.).
....may give much more speed advantage - at least on C2Ds..
...........................................................
CL_DEVICE_NAME: Intel® Core™2 Duo CPU P8700 @ 2.53GHz
CL_DEVICE_VENDOR: Intel
Now computing - please be patient....
time used: 37.613495
Number of elements computed: 2097152
CL_DEVICE_NAME: GeForce 9400M
CL_DEVICE_VENDOR: NVIDIA
Now computing - please be patient....
time used: 15.683911
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)


By the way, am I supposed to run this with no applications open?
morfy
...........................................................
...................OpenCL Bench V 0.1 by mitch.............
.......C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec.......
....... .......
........My test code (simple adds) is cpu friedly..........
.more gpu friedly+complexer code (raytracing/video encod.).
....may give much more speed advantage - at least on C2Ds..
...........................................................
CL_DEVICE_NAME: Pentium® Dual-Core CPU E5200 @ 2.50GHz (overclock 3.11ghz)
CL_DEVICE_VENDOR: Intel
Now computing - please be patient....
time used: 28.961924
Number of elements computed: 2097152
CL_DEVICE_NAME: GeForce 8800 GT
CL_DEVICE_VENDOR: NVIDIA
Now computing - please be patient....
time used: 2.580805
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)
logout
cparm
CL_DEVICE_NAME: Intel® Core™2 Duo CPU E8500 @ 3.16GHz
CL_DEVICE_VENDOR: Intel
Now computing - please be patient....
time used: 28.509935
Number of elements computed: 2097152
CL_DEVICE_NAME: GeForce 8800 GT
CL_DEVICE_VENDOR: NVIDIA
Now computing - please be patient....
time used: 2.507916
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)

I suppose the benchmark stats are the values in red, but what does "valid results GPU=CPU" really mean? mitch, can you explain?
johan
CL_DEVICE_NAME: Intel® Core™2 Quad CPU @ 2.40GHz
CL_DEVICE_VENDOR: Intel
Now computing - please be patient....
time used: 15.142966
Number of elements computed: 2097152
CL_DEVICE_NAME: GeForce 8800 GTX
CL_DEVICE_VENDOR: NVIDIA
Now computing - please be patient....
time used: 1.761477
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)
uuid
QUOTE (cparm @ Aug 25 2009, 02:26 PM) *
I suppose the benchmark stats are the values in red, but what does "valid results GPU=CPU" really mean? mitch, can you explain?



I guess that he does some benchmark computations in the gpu and in the cpu and then compares whether they gave the same result (as a number). It seems that in some cases, either because of lacking float precision or due to some flipped bit or whatnot, the results differ.
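
To make the float-precision guess concrete: single-precision addition is not associative, so two devices that add the same numbers in a different order can end up with different sums. The sketch below is plain C and is not taken from mitch's tool; the function names are made up for illustration.

```c
#include <assert.h>

/* Sum three floats left to right: (a + b) + c.
 * The volatile temporaries force rounding to single precision after each
 * step, so the result does not depend on wider internal registers. */
float sum_left(float a, float b, float c)
{
    volatile float t = a + b;
    volatile float r = t + c;
    return r;
}

/* Same three values, grouped the other way: a + (b + c). */
float sum_right(float a, float b, float c)
{
    volatile float t = b + c;
    volatile float r = a + t;
    return r;
}

/* Worked example: near 1e8 a float can only represent multiples of 8,
 * so -1e8 + 1 rounds back to -1e8 and the 1 is lost in one grouping. */
void demo_grouping(void)
{
    assert(sum_left(1e8f, -1e8f, 1.0f) == 1.0f);
    assert(sum_right(1e8f, -1e8f, 1.0f) == 0.0f);
}
```

Whether this effect is what caused the failed validation on the 9400M above is not known; it only shows that a bit-exact GPU=CPU comparison can fail without any hardware fault.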

Also, another question to mitch: does this implementation of opencl use the cpu alongside the gpu? I thought I read somewhere that opencl was a rather generic abstraction platform where cpu cores are treated as just another computational unit. (That would mean that the gpu scores are a bit too fast to be real).

PS. Thanks for making the tool!!
miketress
Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 9600 GT
Device 0 is an: GPU with max. 1625 MHz and 64 units/cores
Now computing - please be patient....
time used: 0.753 seconds

OpenCL Device # 1 = Intel® Core™ i7 CPU 920 @ 2.67GHz
Device 1 is an: CPU with max. 3800 MHz and 8 units/cores
Now computing - please be patient....
time used: 3.137 seconds

EDIT: updated to v025
mitch_de
Updated to V015. Hopefully this fixes the output for more than one GPU.
Same speed (a variation of 2-5% between runs is normal, of course).

To question 1:
The GPU=CPU validation means:
the results the GPU has computed are compared with what the result should be.
For example, 1+1 should be 2, not 2.1 or 3 ;)

To question 2:
Both benchmarks, CPU and GPU, are run through OpenCL.
I only validate the results with "normal" CPU code.
It seems that OpenCL (running on the CPU if there is no GPU) does a good job!
The i7 920 runs really fast!!!
Maybe a real Mac Pro 2009 with 2 x Xeon "i7" will be faster on the CPU than on the GPU, at least with a GT120 (the default GPU).
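
A minimal sketch of such a check in plain C, assuming the kernel does element-wise adds (the banner only says "simple adds"; the tool's real validation code is not shown in this thread). The names `reference_add` and `count_mismatches` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Plain CPU reference: out[i] = a[i] + b[i]. */
void reference_add(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* Count elements where the device result differs from the reference by
 * more than tol. tol = 0 demands exact equality. */
size_t count_mismatches(const float *device, const float *reference,
                        size_t n, float tol)
{
    assert(tol >= 0.0f);
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        float d = device[i] - reference[i];
        if (d < 0.0f)
            d = -d;
        /* Written as !(d <= tol) so that a NaN difference counts as bad. */
        if (!(d <= tol))
            bad++;
    }
    return bad;
}
```

A result of 0 mismatches would correspond to the "test passed" line; any other count would correspond to the "results compute on gpu <> compute cpu" line.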

Hope we can see some ATI cards here.
And of course some GeForce GTX 285s! :D

cparm
QUOTE (mitch_de @ Aug 25 2009, 07:32 PM) *
For example, 1+1 should be 2, not 2.1 or 3 ;)


Thank you for that precision, I always thought that 1+1 was equal to 4 :D

Edit:

The latest version works too.

CODE
CL_DEVICE_NAME: Intel(R) Core(TM)2 Duo CPU     E8500  @ 3.16GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 3166 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 2
Now computing - please be patient....
time used: 28.503862
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 8800 GT .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1650 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 112
Now computing - please be patient....
time used: 2.525435
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)

mitch_de
"thank you for that precision, I always thought that 1+1 was equal to 4 "
Yes, but that only happens on Windows.
reinstaller
CODE
....CL_DEVICE_NAME: Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 3096 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 2
Now computing - please be patient....
time used: 29.940746
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 9800 GTX/9800 GTX+ .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1836 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 128
Now computing - please be patient....
time used: 2.056581
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)
blackosx
Hi mitch. Nice tool :)
CODE
...........................................................
.................. OpenCL Bench V 0.15 by mitch ...........
...... C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec ......
....... .......
........My test code (simple adds) is cpu friedly..........
.more gpu friedly+complexer code (raytracing/video encod.).
... may give much more speed advantage - at least on C2Ds .
...........................................................

....CL_DEVICE_NAME: Intel® Core™2 Duo CPU E7300 @ 2.66GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 2666 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 2
Now computing - please be patient....
time used: 39.562576
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 8800 GT .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1650 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 112
Now computing - please be patient....
time used: 2.386418
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)
catboy
...........................................................
.................. OpenCL Bench V 0.15 by mitch ...........
...... C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec ......
....... .......
........My test code (simple adds) is cpu friedly..........
.more gpu friedly+complexer code (raytracing/video encod.).
... may give much more speed advantage - at least on C2Ds .
...........................................................

....CL_DEVICE_NAME: Intel® Core™2 CPU 6600 @ 2.40GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 2400 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 2
Now computing - please be patient....
time used: 38.881557
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 9800 GT .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1715 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 112
Now computing - please be patient....
time used: 2.566827
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)
Beerkex'd
Weird, V015 doesn't work here, this is the only output I get:

dyld: unknown required load command 0x80000022
Trace/BPT trap

10.5.8 vanilla, Core 2 Duo E8500, 9800GTX+ with latest drivers from Nvidia, NVEnabler.kext.

/Edit

Doh!

Failed the Snow Leopard test!!

proengin
Here is my "updated" score from SL.

...........................................................
.................. OpenCL Bench V 0.15 by mitch ...........
...... C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec ......
....... .......
........My test code (simple adds) is cpu friedly..........
.more gpu friedly+complexer code (raytracing/video encod.).
... may give much more speed advantage - at least on C2Ds .
...........................................................

....CL_DEVICE_NAME: Intel® Core™ i7 CPU 920 @ 2.67GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 4280 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 8
Now computing - please be patient....
time used: 3.834852
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce GTX 285 .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1584 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 240
Now computing - please be patient....
time used: 0.861248
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)

This program seems to multi-thread very well according to SL's CPU Usage monitor.
cparm
QUOTE (Beerkex'd @ Aug 26 2009, 03:22 AM) *
Weird, V015 doesn't work here, this is the only output I get:

dyld: unknown required load command 0x80000022
Trace/BPT trap

10.5.8 vanilla, Core 2 Duo E8500, 9800GTX+ with latest drivers from Nvidia, NVEnabler.kext.


This tool is for 10.6 only.
miketress
Mitch,

I'm running a 9600 GT 512 MB (like you), not the 8800 GTX you wrote in your post.
mitch_de
QUOTE (proengin @ Aug 26 2009, 03:31 AM) *
This program seems to multi-thread very well according to SL's CPU Usage monitor.

Thanks for that detail!
I think the under-the-hood changes in 10.6 will make much better use of multiple cores than 10.5, even without special source code changes. But I think the source needs to be recompiled with the newest Xcode against the 10.6 developer frameworks.

Also, even though the app itself is really small (< 100 KB), it uses a lot of RAM (up to 60 MB!) and accesses it heavily.
So system bus speed and RAM speed may also be measured (as part of the CPU time!).
So DDR3 triple channel vs. DDR3 dual channel (2 modules of the same size) vs. DDR2, as well as RAM latency timings and RAM MHz, will give different CPU times. GPU time should not be affected as much by RAM / system bus speed.
ugokind
...................OpenCL Bench V 0.1 by mitch.............
.......C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec.......
....... .......
........My test code (simple adds) is cpu friedly..........
.more gpu friedly+complexer code (raytracing/video encod.).
....may give much more speed advantage - at least on C2Ds..
...........................................................
CL_DEVICE_NAME: Intel® Core™2 Duo CPU P7350 @ 2.00GHz
CL_DEVICE_VENDOR: Intel
Now computing - please be patient....
time used: 110.848793
Number of elements computed: 2097152
CL_DEVICE_NAME: GeForce 9600M GT
CL_DEVICE_VENDOR: NVIDIA
Now computing - please be patient....
time used: 19.561712
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)

mitch_de
Thanks.
Would you please run it again with V015? It also shows the GPU MHz and GPU units (cores).
(I have removed the old V010 download link now. No changes to the timed code, only new output formatting plus GPU MHz / units shown.)
I would also recommend running the tool twice and checking whether there are big differences. If so, run it a third time and average the times. Close all other apps before running it, especially if you have 2 GB of RAM or less.
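
The "run it twice, compare, then average" advice can be sketched in a few lines of C. This is only an illustration; the helper names are hypothetical and nothing like this is known to be part of the tool.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n timings in seconds. */
double mean_time(const double *t, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += t[i];
    return sum / (double)n;
}

/* Spread between the slowest and the fastest run, relative to the
 * fastest one: (max - min) / min. 0.05 means the runs differ by 5 %. */
double relative_spread(const double *t, size_t n)
{
    assert(n > 0);
    double lo = t[0], hi = t[0];
    for (size_t i = 1; i < n; i++) {
        if (t[i] < lo) lo = t[i];
        if (t[i] > hi) hi = t[i];
    }
    return (hi - lo) / lo;
}
```

For example, the two 9400M runs posted earlier in the thread (12.428713 s and 15.683911 s) have a relative spread of roughly 26 %, far above the 2-5 % called normal above, so a third run would be worthwhile there.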

For mobile users:
Check whether switching between power supply and battery changes the times, and likewise the speed / battery-saving settings (Energy preferences). At least original MacBooks / MacBook Pros throttle the CPU / GPU in different situations (power supply = less speed, I think; energy-saving settings may also change GPU/CPU throttling).

For desktop users:
If you use VoodooPower (SpeedStep), please mention that in your post. Geekbench and Xbench results are also a bit lower and vary more between runs when using VoodooPower (SpeedStep).
ugokind
ok here you are

.................. OpenCL Bench V 0.15 by mitch ...........
...... C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec ......
....... .......
........My test code (simple adds) is cpu friedly..........
.more gpu friedly+complexer code (raytracing/video encod.).
... may give much more speed advantage - at least on C2Ds .
...........................................................

....CL_DEVICE_NAME: Intel® Core™2 Duo CPU P7350 @ 2.00GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 2000 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 2
Now computing - please be patient....
time used: 116.803825
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 9600M GT .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1250 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 32
Now computing - please be patient....
time used: 19.378469
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)
Embio
....CL_DEVICE_NAME: Intel® Core™2 Quad CPU Q6600 @ 2.40GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 3600 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 4
Now computing - please be patient....
time used: 15.900147
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 8800 GTS .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1300 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 96
Now computing - please be patient....
time used: 2.111204
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)


Yes that is a highly overclocked GTS - the fan is on 85% minimum :-)
elitee
....CL_DEVICE_NAME: Intel® Core™2 Duo CPU E8400 @ 3.00GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 3000 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 2
Now computing - please be patient....
time used: 36.671032
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce GTX 260 .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1242 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 216
Now computing - please be patient....
time used: 1.314976
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)
tmongkol
It didn't work on my ATI HD4870: once it tried to compute on the GPU, I could only move the mouse and had to press the reset button.

Also, see http://netkas.org/?p=164
RaZZe
mitch_de
QUOTE (tmongkol @ Aug 26 2009, 04:21 PM) *
It didn't work on my ATI HD4870: once it tried to compute on the GPU, I could only move the mouse and had to press the reset button.

Also, see http://netkas.org/?p=164


Have you used the latest V020, which added lots of error-handling code?
Please note down any reported errors / error messages.
music-anderson
My test
*****



Last login: Wed Aug 26 16:57:16 on console
/Users/peterdavidanderson/Desktop/OpenCLBench_as_terminal_tool/OpenCL2_Bench_V020 ; exit;
noname:~ peterdavidanderson$ /Users/peterdavidanderson/Desktop/OpenCLBench_as_terminal_tool/OpenCL2_Bench_V020 ; exit;
...........................................................
.................. OpenCL Bench V 0.15 by mitch ...........
...... C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec ......
....... .......
........My test code (simple adds) is cpu friedly..........
.more gpu friedly+complexer code (raytracing/video encod.).
... may give much more speed advantage - at least on C2Ds .
...........................................................

....CL_DEVICE_NAME: Intel® Xeon® CPU 5150 @ 2.66GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 2660 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 4
Now computing - please be patient....
time used: 16.817684
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 8800 GT .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1500 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 112
Now computing - please be patient....
time used: 2.608059
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)
logout

[Process completed]
nofearl
cpu + 2xgpu

...........................................................
.................. OpenCL Bench V 0.15 by mitch ...........
...... C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec ......
....... .......
........My test code (simple adds) is cpu friedly..........
.more gpu friedly+complexer code (raytracing/video encod.).
... may give much more speed advantage - at least on C2Ds .
...........................................................

....CL_DEVICE_NAME: Intel® Core™2 Quad CPU @ 2.40GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 2400 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 4
Now computing - please be patient....
time used: 28.956915
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 9600 GT .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1750 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 64
Now computing - please be patient....
time used: 2.694709
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 9600 GT .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1750 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 64
Now computing - please be patient....
time used: 2.797374
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)
logout
mitch_de
THANKS!
Question: Do you have 2 identical GPUs?
Also (it will not change the times), always use the latest build, which is V020.
It has a lot of error-handling code for ATI users (NVIDIA cards seem to run without errors so far :) )

cmf
mbp/late 2008 result:
CODE
....CL_DEVICE_NAME: Intel(R) Core(TM)2 Duo CPU     P8600  @ 2.40GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 2400 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 2
Now computing - please be patient....
time used: 56.190952
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 9600M GT .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1250 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 32
Now computing - please be patient....
time used: 10.169043
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 9600M GT .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1250 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 32
Now computing - please be patient....
time used: 10.120525
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)


Seems to be a bug: you are testing the same GPU twice (or just printing out the info of the first GPU device twice?). The second GPU should be a 9400M.
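
One way such a symptom can arise, shown as a hedged sketch only: the loop runs once per GPU but always reads the first entry of the device list. The actual cause in the tool is not known from this thread, and `device_t`, `pick_buggy` and `pick_fixed` are hypothetical stand-ins (a real tool would hold `cl_device_id` values from the OpenCL runtime).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device handle. */
typedef struct {
    const char *name;
} device_t;

/* Suspected bug pattern: called once per GPU with loop counter i, but
 * always reads entry 0, so the first GPU is reported every time. */
const char *pick_buggy(const device_t *gpus, size_t count, size_t i)
{
    assert(i < count);
    return gpus[0].name;
}

/* Fix: index the device list with the loop counter. */
const char *pick_fixed(const device_t *gpus, size_t count, size_t i)
{
    assert(i < count);
    return gpus[i].name;
}
```

With a list holding a 9600M GT and a 9400M, the buggy variant would name the 9600M GT for both iterations, which matches the output above.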

QUOTE (mitch_de @ Aug 26 2009, 09:39 AM) *
QUOTE (proengin @ Aug 26 2009, 03:31 AM) *

This program seems to multi-thread very well according to SL's CPU Usage monitor.

Thanks for that detail!
I think the under-the-hood changes in 10.6 will make much better use of multiple cores than 10.5, even without special source code changes. But I think the source needs to be recompiled with the newest Xcode against the 10.6 developer frameworks.

That's an OpenCL feature, or rather the purpose of OpenCL ;) : scaling a small program/kernel well to many cores, be it CPU or GPU.
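
The idea behind that scaling can be sketched in plain C: when every element is independent, the index range can simply be cut into one slice per core. This is only an illustration of the data-parallel split; the OpenCL runtime does its own scheduling, and `range_t` and `chunk_for` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open index range [begin, end) handled by one worker. */
typedef struct {
    size_t begin;
    size_t end;
} range_t;

/* Split n independent elements across `workers` workers as evenly as
 * possible; the first (n % workers) workers get one extra element. */
range_t chunk_for(size_t n, size_t workers, size_t k)
{
    assert(workers > 0 && k < workers);
    size_t base = n / workers;
    size_t extra = n % workers;
    range_t r;
    r.begin = k * base + (k < extra ? k : extra);
    r.end = r.begin + base + (k < extra ? 1 : 0);
    return r;
}
```

With the 2097152 elements reported by the tool and 8 compute units, each worker would get 262144 elements under this scheme.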
morfy
Updated result from OpenCL Bench V020.

CODE
...........................................................
.................. OpenCL Bench V 0.15 by mitch ...........
...... C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec ......
.......                                             .......
........My test code (simple adds) is cpu friedly..........
.more gpu friedly+complexer code (raytracing/video encod.).
... may give much more speed advantage - at least on C2Ds .
...........................................................

....CL_DEVICE_NAME: Pentium(R) Dual-Core  CPU      E5200  @ 2.50GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 3129 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 2
Now computing - please be patient....
time used: 28.777699
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 8800 GT .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1600 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 112
Now computing - please be patient....
time used: 2.618950
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)

netkas
./OpenCL2_Bench_V020
...........................................................
.................. OpenCL Bench V 0.15 by mitch ...........
...... C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec ......
....... .......
........My test code (simple adds) is cpu friedly..........
.more gpu friedly+complexer code (raytracing/video encod.).
... may give much more speed advantage - at least on C2Ds .
...........................................................

....CL_DEVICE_NAME: Intel® Core™2 Quad CPU Q9450 @ 2.66GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 3072 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 4
Now computing - please be patient....
time used: 14.658403
Number of elements computed: 2097152

....CL_DEVICE_NAME: Radeon HD 4870 .....
CL_DEVICE_VENDOR: AMD
CL_DEVICE_MAX_CLOCK_FREQUENCY: 750 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 4
Now computing - please be patient....
///here the GUI freezes immediately
time used: 27.399342
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)


x2000.kexts dumps ** GPU Debug Info ** to dmesg

Maybe it's too many loop iterations and therefore too much memory used by the arrays. I decreased the loop count to 1000 (edited the source inside the binary), and now there is no crash.

mitch_de
QUOTE (cmf @ Aug 26 2009, 09:26 PM) *
Seems to be a bug: you are testing the same GPU twice (or just printing out the info of the first GPU device twice?). The second GPU should be a 9400M.


Thanks.
I will fix that bug soon.
A workaround for that bug:
Please post your result again after disabling the 9600M GT (so that the 9400M is the only GPU).
The 10-second result is for the 9600M GT; the 9400M will run slower.
tommix1968
This is my result:
CODE
...........................................................
.................. OpenCL Bench V 0.15 by mitch ...........
...... C2D 3GHz = 30 sec vs Nvidia 9600GT = 3.10 sec ......
.......                                             .......
........My test code (simple adds) is cpu friedly..........
.more gpu friedly+complexer code (raytracing/video encod.).
... may give much more speed advantage - at least on C2Ds .
...........................................................

....CL_DEVICE_NAME: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz .....
CL_DEVICE_VENDOR: Intel
CL_DEVICE_MAX_CLOCK_FREQUENCY: 2836 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 4
Now computing - please be patient....
time used: 15.836717
Number of elements computed: 2097152

....CL_DEVICE_NAME: GeForce 9600 GT .....
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1625 MHz
CL_DEVICE_MAX_COMPUTE_UNITS: 64
Now computing - please be patient....
time used: 2.700367
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)

mitch_de
QUOTE (netkas @ Aug 26 2009, 09:56 PM) *
x2000.kexts dumps ** GPU Debug Info ** to dmesg

Maybe it's too many loop iterations and therefore too much memory used by the arrays. I decreased the loop count to 1000 (edited the source inside the binary), and now there is no crash.



Thanks, I also thought about this memory problem.
I will compile an ATI_debug version soon and post it below the other download link.
Of course, if I decrease the loop count from 5000 to 1000, the times of very fast GPUs like the GTX 285 will also drop from 0.8 to 0.0xy ;)
I am working on another solution, which does more complex work but not in such a huge loop.
netkas
Something like

for(i=0;i<5;i++)
for(loop....

It should be enough to add just one line (and one for int i;)
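
A runnable guess at what the fragment above is getting at: keep the total work the same, but wrap a shortened inner loop in a small outer loop, so that each pass is short. The inner loop body is elided in the post, so the integer add below is an assumption, as are the names `run_pass` and `run_batched`.

```c
#include <assert.h>

/* One pass: add `step` to the accumulator `iters` times.
 * Stands in for one short run of the (elided) inner loop. */
long run_pass(long acc, long step, int iters)
{
    for (int loop = 0; loop < iters; loop++)
        acc += step;
    return acc;
}

/* Outer loop as suggested: `passes` short passes instead of one long one. */
long run_batched(long step, int passes, int iters_per_pass)
{
    assert(passes > 0 && iters_per_pass > 0);
    long acc = 0;
    for (int i = 0; i < passes; i++)
        acc = run_pass(acc, step, iters_per_pass);
    return acc;
}
```

With integer adds, 5 passes of 1000 iterations give exactly the same result as 1 pass of 5000 iterations, so the benchmark would keep its total workload while each individual pass stays short.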
vidkidd
With Version: OpenCL2_Bench_V020

Application still hangs on 4870, MacPro 1,1 3.0ghz

CPU is calculated at 15 seconds.
Application crashes at GPU Please Wait.

Thx,
Vidkidd
nofearl
QUOTE (mitch_de @ Aug 26 2009, 07:19 PM) *
THANKS!
Question: Do you have 2 identical GPUs?
Also (it will not change the times), always use the latest build, which is V020.
It has a lot of error-handling code for ATI users (NVIDIA cards seem to run without errors so far :) )


Yep, 2 Palit NVIDIA 9600 GT cards in 2 PCIe x16 slots.
El.Pilote
Hi all here is mine :



grue
Doesn't seem to test my setup correctly.

CL_DEVICE_NAME: Intel® Xeon® CPU X5365 @ 3.00GHz
CL_DEVICE_VENDOR: Intel
Now computing - please be patient....
time used: 7.710562
Number of elements computed: 2097152
CL_DEVICE_NAME: GeForce 8800 GT
CL_DEVICE_VENDOR: NVIDIA
Now computing - please be patient....
time used: 2.492461
Number of elements computed: 2097152
CL_DEVICE_NAME: GeForce 8800 GT
CL_DEVICE_VENDOR: NVIDIA
Now computing - please be patient....
time used: 2.489143
Number of elements computed: 2097152
Now checking if results are valid - please be patient....
:) Validate results test passed - GPU=CPU :)



I have an 8800GT in Slot 3 working as the helper card to a GTX260. Looks like it's testing the 8800GT twice.
mitch_de
QUOTE (grue @ Aug 27 2009, 02:19 AM) *
Doesn't seem to test my setup correctly.


I have an 8800GT in Slot 3 working as the helper card to a GTX260. Looks like it's testing the 8800GT twice.



NEW VERSION is out: download V025. Lots of changes (and hopefully fixed ATI and multi-GPU tests).
grue
BINGO


Number of OpenCL devices found: 3
OpenCL Device # 0 = GeForce 8800 GT
Device 0 is an: GPU with max. 1500 MHz and 112 units/cores
Now computing - please be patient....
time used: 0.683 seconds

OpenCL Device # 1 = GeForce GTX 260
Device 1 is an: GPU with max. 1400 MHz and 216 units/cores
Now computing - please be patient....
time used: 0.365 seconds

OpenCL Device # 2 = Intel® Xeon® CPU X5365 @ 3.00GHz
Device 2 is an: CPU with max. 3000 MHz and 8 units/cores
Now computing - please be patient....
time used: 3.094 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
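
As a rough way to read a run like this one, the sketch below parses the "OpenCL Device" and "time used" lines and works out how much faster each device was than the CPU in the same run. The function names are made up for illustration and are not part of the benchmark tool. It assumes the V025 output format shown above, and it assumes device names are unique within a run (two identical cards would collapse into one entry).

```python
import re

# Output lines copied from the V025 run above (three devices).
LOG = """\
OpenCL Device # 0 = GeForce 8800 GT
time used: 0.683 seconds
OpenCL Device # 1 = GeForce GTX 260
time used: 0.365 seconds
OpenCL Device # 2 = Intel® Xeon® CPU X5365 @ 3.00GHz
time used: 3.094 seconds
"""


def parse_times(log):
    """Return (device_name, seconds) pairs in the order they appear."""
    results = []
    name = None
    for raw in log.splitlines():
        line = raw.strip()
        m = re.match(r"OpenCL Device # \d+ = (.+)", line)
        if m:
            name = m.group(1).strip()
            continue
        m = re.match(r"time used:\s*([\d.]+)", line)
        if m and name is not None:
            results.append((name, float(m.group(1))))
            name = None
    return results


def speedups(results, baseline_name):
    """Speedup of every device relative to the device called baseline_name."""
    base = dict(results)[baseline_name]
    return {name: base / secs for name, secs in results}


if __name__ == "__main__":
    times = parse_times(LOG)
    cpu_name = times[-1][0]  # the CPU is listed last in this particular run
    for name, factor in speedups(times, cpu_name).items():
        print(f"{name}: {factor:.1f}x")
```

For this run that works out to roughly 4.5x for the 8800 GT and 8.5x for the GTX 260 relative to the Xeon.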
mitch_de
QUOTE (grue @ Aug 27 2009, 02:49 AM) *
BINGO


Number of OpenCL devices found: 3
OpenCL Device # 0 = GeForce 8800 GT
Device 0 is an: GPU with max. 1500 MHz and 112 units/cores
Now computing - please be patient....
time used: 0.683 seconds

OpenCL Device # 1 = GeForce GTX 260
Device 1 is an: GPU with max. 1400 MHz and 216 units/cores
Now computing - please be patient....
time used: 0.365 seconds

OpenCL Device # 2 = Intel® Xeon® CPU X5365 @ 3.00GHz
Device 2 is an: CPU with max. 3000 MHz and 8 units/cores
Now computing - please be patient....
time used: 3.094 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)


YEAH!
I hope ATI users won't get a frozen system anymore. Their GPUs got overloaded by the old code; the NVIDIA ones did not ;)

Remember: the "time used" results of V025 can't be compared 1:1 with those of the older versions.
That's because of the code changes for ATI users ;)
proengin
Here are my scores from V0.25 script:

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GTX 285
Device 0 is an: GPU with max. 1584 MHz and 240 units/cores
Now computing - please be patient....
time used: 0.231 seconds

OpenCL Device # 1 = Intel® Core™ i7 CPU 920 @ 2.67GHz
Device 1 is an: CPU with max. 4280 MHz and 8 units/cores
Now computing - please be patient....
time used: 1.296 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
gzfelix
CODE
...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 3
OpenCL Device # 0 = Radeon HD 4870
Device 0 is an: GPU with max. 750 MHz and 4 units/cores
Now computing - please be patient....
time used:  4.126 seconds

OpenCL Device # 1 = GeForce GT 120
Device 1 is an: GPU with max. 1400 MHz and 32 units/cores
Error: clBuildProgram for device # 1
ERROR NUMBER = -11
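
For reference, in the standard OpenCL headers the value -11 is CL_BUILD_PROGRAM_FAILURE, meaning the kernel source failed to build for that device. The small lookup below is only a sketch: the table lists a handful of codes from cl.h, and the helper names are hypothetical, not taken from the benchmark's source (which is not shown in this thread).

```python
import re

# A few error codes with the numeric values defined in the OpenCL header cl.h.
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -11: "CL_BUILD_PROGRAM_FAILURE",
}


def error_name(code):
    """Translate a numeric OpenCL error code into its symbolic name."""
    return OPENCL_ERRORS.get(code, f"unknown OpenCL error ({code})")


def explain_line(line):
    """Pick the number out of an 'ERROR NUMBER = N' line and name it.

    Returns None if the line does not match that (assumed) format.
    """
    m = re.search(r"ERROR NUMBER\s*=\s*(-?\d+)", line)
    if m is None:
        return None
    return error_name(int(m.group(1)))


if __name__ == "__main__":
    print(explain_line("ERROR NUMBER = -11"))
```

In host code, the actual compiler message behind a build failure can usually be fetched with clGetProgramBuildInfo and CL_PROGRAM_BUILD_LOG, which would show why the kernel did not build on the GT 120.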
vidkidd
This just shows how SAD the ATI Drivers currently are!!! OUCH!!!!


_tool-1/OpenCL2_Bench_V025 ; exit;
...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = Radeon HD 4870
Device 0 is an: GPU with max. 750 MHz and 4 units/cores
Now computing - please be patient....
time used: 4.065 seconds

OpenCL Device # 1 = Intel® Xeon® CPU 5160 @ 3.00GHz
Device 1 is an: CPU with max. 3000 MHz and 4 units/cores
Now computing - please be patient....
time used: 6.079 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout

[Process completed]
real3x
CODE
tool-1/OpenCL2_Bench_V025; exit;
...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GTX 260
Device 0 is an: GPU with max. 1242 MHz and 192 units/cores
Now computing - please be patient....
time used:  0.357 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU     E8400  @ 3.00GHz
Device 1 is an: CPU with max. 3600 MHz and 2 units/cores
Now computing - please be patient....
time used: 10.433 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout