
CUDA-Z Info+Bench (Nvidia only) - updated Dec 2015


mitch_de

86 posts in this topic


Hi,
some OS X apps already use CUDA (it must be installed separately, Nvidia GPUs only!!), such as Squeeze 7, Mathematica, and Toast 11 (for H.264 encoding export).

I found a great tool, CUDA-Z, which shows a lot of information and also includes some small benchmarks (VRAM speed, PCI-E to VRAM speed, ...).

EDIT: New beta version 0.11.259 available!
Added a download link and new screenshots of my 9600GT on 10.8.2.
The new version shows more details about CUDA and the GPU card.
The GPU GHz and VRAM GHz readings are fixed maximum values (as with OpenCL info) and can't be used to check AGPM.


http://sourceforge.n...es/cuda-z/Beta/

Bildschirmfoto 2012-10-04 um 21.35.05.jpg

Bildschirmfoto 2012-10-04 um 21.43.50.jpg


Unlike OpenCL, CUDA should also run on GTX 4xx cards under OS X 10.6.x (with the newest CUDA drivers). It would be nice to see some basic CUDA benchmark values for a GTX 4xx on SL: gigaflops and memory copy speeds / VRAM speed (device-to-device speed).


For mitch.

 

1GB ASUS ENGTX460 on 10.6.7, latest Quadro 4000 drivers, CUDA driver and SDK installed.

 

 
CUDA Device Query (Runtime API) version (CUDART static linking)

There is 1 device supporting CUDA

Device 0: "GeForce GTX 460"
 CUDA Driver Version / Runtime Version		  4.0 / 4.0
 CUDA Capability Major/Minor version number:	2.1
 Total amount of global memory:				 1024 MBytes (1073414144 bytes)
 ( 7) Multiprocessors x (48) CUDA Cores/MP:	 336 CUDA Cores
 GPU Clock Speed:							   1.35 GHz
 Memory Clock rate:							 1800.00 Mhz
 Memory Bus Width:							  256-bit
 L2 Cache Size:								 524288 bytes
 Max Texture Dimension Size (x,y,z)			 1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048)
 Max Layered Texture Size (dim) x layers		1D=(16384) x 2048, 2D=(16384,16384) x 2048
 Total amount of constant memory:			   65536 bytes
 Total amount of shared memory per block:	   49152 bytes
 Total number of registers available per block: 32768
 Warp size:									 32
 Maximum number of threads per block:		   1024
 Maximum sizes of each dimension of a block:	1024 x 1024 x 64
 Maximum sizes of each dimension of a grid:	 65535 x 65535 x 65535
 Maximum memory pitch:						  2147483647 bytes
 Texture alignment:							 512 bytes
 Concurrent copy and execution:				 Yes with 1 copy engine(s)
 Run time limit on kernels:					 Yes
 Integrated GPU sharing Host Memory:			No
 Support host page-locked memory mapping:	   Yes
 Concurrent kernel execution:				   Yes
 Alignment requirement for Surfaces:			Yes
 Device has ECC support enabled:				No
 Device is using TCC driver mode:			   No
 Device supports Unified Addressing (UVA):	  No
 Device PCI Bus ID / PCI location ID:		   1 / 0
 Compute Mode:
 < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4.0, CUDA Runtime Version = 4.0, NumDevs = 1, Device = GeForce GTX 460
[./deviceQuery] test results...
PASSED

 

./bandwidthTest Starting...

Running on...

Device 0: GeForce GTX 460
Quick Mode

Host to Device Bandwidth, 1 Device(s), Paged memory
  Transfer Size (Bytes)	Bandwidth(MB/s)
  33554432			2516.6

Device to Host Bandwidth, 1 Device(s), Paged memory
  Transfer Size (Bytes)	Bandwidth(MB/s)
  33554432			2199.2

Device to Device Bandwidth, 1 Device(s)
  Transfer Size (Bytes)	Bandwidth(MB/s)
  33554432			58782.2

[./bandwidthTest] test results...
PASSED
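
As a rough cross-check of the numbers above, the theoretical peaks can be worked out from the deviceQuery fields. This is only a sketch under stated assumptions: 2 single-precision FLOPs per CUDA core per clock (one fused multiply-add) and two data transfers per reported memory clock cycle. The function names are made up for illustration; they are not part of the CUDA SDK.

```python
# Rough theoretical peaks from deviceQuery fields (a sketch, not vendor-verified).
# Assumptions: 2 FLOPs per CUDA core per clock (one fused multiply-add),
# and two data transfers per reported memory clock cycle (double data rate).

def peak_gflops(cuda_cores, shader_clock_ghz, flops_per_clock=2):
    """Peak single-precision GFLOPS = cores * clock (GHz) * FLOPs per clock."""
    return cuda_cores * shader_clock_ghz * flops_per_clock

def peak_bandwidth_gbs(bus_width_bits, mem_clock_mhz, transfers_per_clock=2):
    """Peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mem_clock_mhz * 1e6 * transfers_per_clock / 1e9

# Values copied from the GTX 460 deviceQuery / bandwidthTest output above.
gtx460_gflops = peak_gflops(336, 1.35)
gtx460_bw = peak_bandwidth_gbs(256, 1800.0)
measured_d2d_gbs = 58782.2 / 1000  # bandwidthTest reports MB/s

print(f"peak compute:   {gtx460_gflops:.1f} GFLOPS")
print(f"peak bandwidth: {gtx460_bw:.1f} GB/s")
print(f"device-to-device copy reached {measured_d2d_gbs / gtx460_bw:.0%} of that peak")
```

Swapping in the cores, clocks and bus width of another card gives its peaks the same way, so the values people post here can be compared against what the hardware could do on paper.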


Your GTX 460 is clocked higher than mine - compare:

GTX460.jpg

Your extra 80 MHz = nice g1g4fl0pz boost.

 

And look at your pageable memory copy, that's crazy...it's more than twice as fast as mine!

 

You: Core i7, memory controller is part of the CPU, DDR3

 

Me: Core 2 Duo, P45 chipset, DDR2


*lmao* yeah that's depressing alright.

 

I just kept scrolling down and down and down.. until my E8500 finally appeared waaaaay down at the end. ;)

 

The fastest CPU on that list that would work in my motherboard is the Intel Core 2 Extreme X9750 - interestingly, its clock frequency is 3.16 GHz just like my E8500, but it scores in the low 5000s - almost twice as much as the E8500. I wonder if there's more to it than the two extra cores.

 

I can't find it for sale online, which is probably for the best, I'm sure it costs a million dollah..


Hello, I need some help building deviceQuery.

Xcode, the CUDA driver, the toolkit and the tools are installed.

In Terminal I cd to /Developer/GPU\ Computing and run make.

 

 

bash-3.2# make
make -C src/alignedTypes/ 
make -C src/asyncAPI/ 
make -C src/bandwidthTest/ 
make -C src/bicubicTexture/ 
make -C src/bilateralFilter/ 
make -C src/binomialOptions/ 
make -C src/BlackScholes/ 
make -C src/boxFilter/ 
make -C src/clock/ 
make -C src/concurrentKernels/ 
make -C src/convolutionFFT2D/ 
make -C src/convolutionSeparable/ 
make -C src/convolutionTexture/ 
make -C src/cppIntegration/ 
make -C src/dct8x8/ 
make -C src/deviceQuery/ 
ld: can't open output file for writing: ../../bin/darwin/release/deviceQuery, errno=21
collect2: ld returned 1 exit status
make[2]: *** [../../bin/darwin/release/deviceQuery] Error 1
make[1]: *** [src/deviceQuery/Makefile.ph_build] Error 2
make: *** [all] Error 2

 

CUDA-Z and Pyrit work with the CUDA driver, but I can't get deviceQuery to build.

 

Could you please help?
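
A note on the ld error in the make output: errno 21 is EISDIR ("Is a directory"), so ld is apparently being asked to write the binary to a path that already exists as a folder. Below is a minimal sketch for checking that. The helper name is made up for illustration, and the idea that a stray folder is sitting at the output path is only a guess from the error message.

```python
import errno
import os

# errno=21 is taken from the ld error above; it maps to EISDIR.
code = 21
name = errno.errorcode.get(code, "unknown")
print(code, name, os.strerror(code))

def output_path_is_directory(path):
    """True if the linker's output path already exists as a directory."""
    return os.path.isdir(path)

# Path copied from the ld error; it is relative to src/deviceQuery/.
target = "../../bin/darwin/release/deviceQuery"
if output_path_is_directory(target):
    print(f"{target} is a directory; ld cannot write the binary there")
```

Run it from inside src/deviceQuery/ so the relative path resolves the same way it does for make.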


Solved: I reinstalled everything and "make" works fine now ;)

 

bash-3.2# ./deviceQuery
[./deviceQuery] starting...
./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

There is 1 device supporting CUDA

Device 0: "GeForce GTS 450"
 CUDA Driver Version / Runtime Version		  4.0 / 4.0
 CUDA Capability Major/Minor version number:	2.1
 Total amount of global memory:				 1024 MBytes (1073283072 bytes)
 ( 4) Multiprocessors x (48) CUDA Cores/MP:	 192 CUDA Cores
 GPU Clock Speed:							   1.57 GHz
 Memory Clock rate:							 1804.00 Mhz
 Memory Bus Width:							  128-bit
 L2 Cache Size:								 262144 bytes
 Max Texture Dimension Size (x,y,z)			 1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048)
 Max Layered Texture Size (dim) x layers		1D=(16384) x 2048, 2D=(16384,16384) x 2048
 Total amount of constant memory:			   65536 bytes
 Total amount of shared memory per block:	   49152 bytes
 Total number of registers available per block: 32768
 Warp size:									 32
 Maximum number of threads per block:		   1024
 Maximum sizes of each dimension of a block:	1024 x 1024 x 64
 Maximum sizes of each dimension of a grid:	 65535 x 65535 x 65535
 Maximum memory pitch:						  2147483647 bytes
 Texture alignment:							 512 bytes
 Concurrent copy and execution:				 Yes with 2 copy engine(s)
 Run time limit on kernels:					 Yes
 Integrated GPU sharing Host Memory:			No
 Support host page-locked memory mapping:	   Yes
 Concurrent kernel execution:				   Yes
 Alignment requirement for Surfaces:			Yes
 Device has ECC support enabled:				No
 Device is using TCC driver mode:			   No
 Device supports Unified Addressing (UVA):	  No
 Device PCI Bus ID / PCI location ID:		   1 / 0
 Compute Mode:
 < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4.0, CUDA Runtime Version = 4.0, NumDevs = 1, Device = GeForce GTS 450
[./deviceQuery] test results...
PASSED

Press ENTER to exit...

bash-3.2#
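
For anyone who wants to compare cards without reading the whole dump, the interesting fields can be pulled out of the deviceQuery text with a few regular expressions. This is a sketch only: the patterns and the function name are hypothetical and written against the output format shown above, so other SDK versions may need different patterns.

```python
import re

# A few lines copied from the GTS 450 deviceQuery output above.
sample = """\
Device 0: "GeForce GTS 450"
 ( 4) Multiprocessors x (48) CUDA Cores/MP:     192 CUDA Cores
 GPU Clock Speed:                               1.57 GHz
 Memory Clock rate:                             1804.00 Mhz
 Memory Bus Width:                              128-bit
"""

# Hypothetical patterns, written against this output format only.
PATTERNS = {
    "name": r'Device \d+: "([^"]+)"',
    "cuda_cores": r":\s*(\d+) CUDA Cores",
    "gpu_clock_ghz": r"GPU Clock Speed:\s*([\d.]+) GHz",
    "mem_clock_mhz": r"Memory Clock rate:\s*([\d.]+) Mhz",
    "bus_width_bits": r"Memory Bus Width:\s*(\d+)-bit",
}

def parse_device_query(text):
    """Return the fields found in deviceQuery text; missing ones are skipped."""
    info = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            info[key] = match.group(1)
    return info

info = parse_device_query(sample)
for key, value in info.items():
    print(f"{key}: {value}")
```

The values come back as strings, so convert them with int() or float() before doing any arithmetic on them.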


Yep, there's a readme somewhere that says you must delete folders from any previous installation - and install everything in a specific order.

 

Hi,

Does CUDA-Z for Mac OS X run on Lion?

 

Readme:

 

Replace the existing open64/lib/gfec file in your CUDA Toolkit 3.2

installation with this new version if you are experiencing this issue.

 

Note that this issue does not apply to platforms other than MacOS, and

it was not present in CUDA Toolkit 3.1 or earlier.

 

The readme has no explanation of the installation order.

The only instruction is to replace the "open64/lib/gfec file in your CUDA Toolkit 3.2" with the one from CUDA Toolkit 3.1.

Does it work if we do the replacement?

 

 

Thanks.


It does not work on my Lion preview 2 (CUDA driver 4.0.13rc).

Yep, that Mac beta tool can't "see" CUDA on OS X Lion. That's because of the CUDA detection check in the tool's code, not a general problem.

I'm sure those people will fix it in the coming months.
