
CUDA-Z Info+Bench (Nvidia only) - updated Oct 2012


74 replies to this topic

#61
myrorym

    InsanelyMac Protégé

  • Members
  • PipPip
  • 79 posts
  • Gender:Male
10.6.7 : CUDA 4.0.17 : Driver 256.02.05.f1

> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
> Compute 2.0 CUDA device: [GeForce GTX 470]
61440 bodies, total time for 10 iterations: 1393.516 ms
= 27.089 billion interactions per second
= 541.777 single-precision GFLOP/s at 20 flops per interaction
[nbody] test results...
PASSED
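The throughput lines in the log above can be re-derived from the body count, iteration count, and total time. This is a minimal sketch, assuming the benchmark counts every body against every body (n * n interactions per iteration), which is consistent with the printed figures; the function name is made up for illustration.

```python
def nbody_throughput(bodies, iterations, total_ms, flops_per_interaction=20):
    """Return (interactions per second, GFLOP/s) for an all-pairs n-body run."""
    seconds = total_ms / 1000.0
    # Assumption: all-pairs simulation, so n * n interactions per iteration.
    interactions = bodies * bodies * iterations
    per_second = interactions / seconds
    gflops = per_second * flops_per_interaction / 1e9
    return per_second, gflops


# Values taken from the GTX 470 log above.
per_second, gflops = nbody_throughput(61440, 10, 1393.516)
print(f"{per_second / 1e9:.3f} billion interactions per second")
print(f"{gflops:.3f} single-precision GFLOP/s")
```

The same function can be used to sanity-check other nbody logs, as long as the flops-per-interaction constant matches what the log reports.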

I'll soon test 10.6.8 or 10.7 with updated drivers.

TY

#62
meroy

    InsanelyMac Protégé

  • Members
  • Pip
  • 46 posts
Hi all,

Just wanted to chime in to say that I finally got sleep working on my system with the GTX 295 card.

http://www.insanelym...p...t&p=1717189

Note: The culprit was the device-type. Each GPU needs a different value: the one with the display connections is set to NVDA,Parent and the other to NVDA,GeForce.

#63
mitch_de

    InsanelyMacaholic

  • Local Moderators
  • 2,879 posts
  • Gender:Male
  • Location:Stuttgart / Germany
CUDA speed should depend only on the CUDA driver version; the OS X version (OpenGL version) should not make any difference in speed.

#64
meroy

    InsanelyMac Protégé

  • Members
  • Pip
  • 46 posts
One can even benchmark double-precision via the N-Body CUDA demo. GTX 4xx owners should see much higher double-precision performance than GTX 2xx cards deliver.

Here are my results:

./nbody -fp64 -n=30720 -benchmark

GTX 295 OC'ed:

-- One GPU: 66.966 double-precision GFLOP/s
-- Two GPUs via -numdevices=2: 128.672 double-precision GFLOP/s

GTX 295 (standard clocks):

-- One GPU: 55.010 double-precision GFLOP/s
-- Two GPUs via -numdevices=2: 106.478 double-precision GFLOP/s

This is where a GTX 4xx can shine over a GTX 2xx variant.

Try -n=15360 if -n=30720 reports unspecified launch failure.
Not all cards support double-precision.
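To put the one-GPU and two-GPU figures above side by side, here is a small sketch that computes the speedup and the scaling efficiency from -numdevices=2. The helper name is hypothetical; the numbers are the double-precision GFLOP/s values quoted in this post.

```python
def scaling(one_gpu_gflops, multi_gpu_gflops, devices=2):
    """Return (speedup, efficiency) of a multi-GPU run relative to one GPU."""
    speedup = multi_gpu_gflops / one_gpu_gflops
    efficiency = speedup / devices  # 1.0 would be perfect linear scaling
    return speedup, efficiency


# Double-precision GFLOP/s from the post above: (one GPU, two GPUs).
results = {
    "GTX 295 OC'ed": (66.966, 128.672),
    "GTX 295 (standard clocks)": (55.010, 106.478),
}

for name, (one, two) in results.items():
    speedup, efficiency = scaling(one, two)
    print(f"{name}: {speedup:.2f}x speedup, {efficiency:.1%} efficiency")
```

Both configurations come out a little under 2x, so the second GPU is nearly fully used in this benchmark.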

#65
meroy

    InsanelyMac Protégé

  • Members
  • Pip
  • 46 posts
Lion will be out soon, and folks will be able to benchmark the GTX 5xx series.

The following is from my Windows 7 box running a pair of GTX 560s. They came factory OC'd at 900/1800/2004 (4008) and 1.012 volts. However, I under-clocked and under-volted them to 855/1710/2100 (4200) and 0.987 volts.

Single-Precision: ./nbody -n=61440 -benchmark

-- One GPU: 548.804 single-precision GFLOP/s
-- Two GPUs via -numdevices=2: 1068.541 single-precision GFLOP/s

Double-Precision: ./nbody -fp64 -n=30720 -benchmark

-- One GPU: 89.702 double-precision GFLOP/s
-- Two GPUs via -numdevices=2: 166.125 double-precision GFLOP/s


I hope NVIDIA will one day make a single-PCB card containing two GTX 560s, for a good balance of compute power and power consumption.

#66
hiphopboy

    InsanelyMac Protégé

  • Members
  • PipPip
  • 89 posts
Hoping for 4.0.0.20 today, with support for the final Lion release.

#67
Wayang-NT

    InsanelyMac Geek

  • Members
  • PipPipPip
  • 132 posts
  • Gender:Male
new CUDA 4.0.21 ....


#68
hiphopboy

    InsanelyMac Protégé

  • Members
  • PipPip
  • 89 posts
Oh! Thanks, Wayang! I've been waiting a week for this :)

#69
Lord_Jeremy

    InsanelyMac Sage

  • Members
  • PipPipPipPipPipPip
  • 380 posts
When I run nbody I get the following:
dyld: Library not loaded: @rpath/libcudart.dylib
  Referenced from: /Users/lord_jeremy/Applications/./nbody
  Reason: image not found
I've got the nVidia CUDA package 4.1.25 installed and CUDA-Z shows benchmark info so I presume it's functioning correctly. Any thoughts?

#70
Gringo Vermelho

    The Jan Bird fix

  • Supervisors
  • 6,034 posts
  • Gender:Male
  • Location:Brazil
Not sure, I think you have to install the CUDA SDK or tools (or whatever) as well in order to use nbody. Everything is available on the CUDA download page.
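A dyld "image not found" error for @rpath/libcudart.dylib usually means the loader cannot find the CUDA runtime library in any directory it searches. A quick way to check is to look for the file yourself. The sketch below is only an illustration: /usr/local/cuda/lib is assumed to be the toolkit's install location (the usual default on OS X, but your system may differ), and the helper names are hypothetical.

```python
import os

# Assumed default toolkit library directory; adjust if CUDA lives elsewhere.
CANDIDATE_DIRS = ["/usr/local/cuda/lib"]


def search_dirs(env=os.environ):
    """Directories from DYLD_LIBRARY_PATH first, then the assumed defaults."""
    extra = env.get("DYLD_LIBRARY_PATH", "")
    return [p for p in extra.split(":") if p] + CANDIDATE_DIRS


def find_library(name, directories):
    """Return the first directory that contains `name`, or None."""
    for directory in directories:
        if os.path.isfile(os.path.join(directory, name)):
            return directory
    return None


found = find_library("libcudart.dylib", search_dirs())
if found:
    print(f"libcudart.dylib found in {found}")
    print("If nbody still fails, adding that directory to DYLD_LIBRARY_PATH may help.")
else:
    print("libcudart.dylib not found; the CUDA toolkit is probably not installed.")
```

If the file is missing everywhere, installing the toolkit from the CUDA download page is the likely fix, as suggested above.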

#71
Lord_Jeremy

    InsanelyMac Sage

  • Members
  • PipPipPipPipPipPip
  • 380 posts
Yep, that was it. Thanks!

#72
Cavendish Qi

    InsanelyMac Protégé

  • Members
  • Pip
  • 24 posts
  • Gender:Male
Thanks for the info:
10.8, NVidia GT 540M, 1G, CUDA Driver 5.0.17

CUDA-Z results: http://www.tonymacx8...elp-needed.html
nbody result:
https://gist.github.com/3278149
https://gist.github.com/3278246

#73
mitch_de

    InsanelyMacaholic

  • Local Moderators
  • 2,879 posts
  • Gender:Male
  • Location:Stuttgart / Germany
NEW (beta) Sep-2012 Version available!

http://sourceforge.n...es/cuda-z/Beta/

Select the .dmg to download the OS X version.




#74
RobertX

    InSanelyMac Maverick

  • Members
  • PipPipPipPipPipPipPip
  • 531 posts
  • Gender:Not Telling
...ahhh, in comes the smell of something sweet and new... :rolleyes:
Attached files: Core.png (95.41KB), Memory.png (77.48KB), Performance.png (78.96KB)

#75
mitch_de

    InsanelyMacaholic

  • Local Moderators
  • 2,879 posts
  • Gender:Male
  • Location:Stuttgart / Germany
Performance tab: Device to Device speed shows the VRAM speed.
In your case (GT 520) that value is poor because of the narrow VRAM bus of the GT 520 (also GT 420, GT 620) and other cards that use 64/128 bit instead of 256/384 bit.
Host to Device and Device to Host values (VRAM copies over PCI-E) are mostly limited by PCI-E speed and are far lower than the transfer speed of the on-board VRAM.
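To make the bus-width point concrete, here is a rough sketch of theoretical peak bandwidth: bus width in bytes times the effective memory transfer rate. The 1800 MT/s rate is a made-up example value, not the spec of any card in this thread, and the PCI-E figure assumes a PCIe 2.0 x16 slot (5 GT/s per lane with 8b/10b encoding). Measured CUDA-Z numbers will be lower than these peaks.

```python
def vram_bandwidth_gbs(bus_width_bits, effective_rate_mts):
    """Theoretical peak VRAM bandwidth in GB/s."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_rate_mts * 1e6 / 1e9


def pcie2_bandwidth_gbs(lanes=16):
    """Theoretical PCIe 2.0 bandwidth per direction in GB/s (assumed generation)."""
    # 5 GT/s per lane, 8b/10b encoding (8 data bits per 10 sent), 8 bits per byte.
    return 5.0 * lanes * (8 / 10) / 8


RATE_MTS = 1800  # hypothetical effective memory rate, for illustration only

for bits in (64, 128, 256, 384):
    print(f"{bits:3d}-bit bus: {vram_bandwidth_gbs(bits, RATE_MTS):6.1f} GB/s")

print(f"PCIe 2.0 x16: {pcie2_bandwidth_gbs():6.1f} GB/s per direction")
```

At the same memory clock a 256-bit bus has four times the peak bandwidth of a 64-bit bus, which is why Device to Device looks weak on the narrow-bus cards while Host to Device stays capped by the slot.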




