Metal Particles (as demo/bench), new Nbody-Metal (demo/bench)


mitch_de


To change the particle count (in steps of 500000) in my build (first post), use the left or right key on the keyboard, not a left/right mouse button click.

 

AMD Radeon HD6450

OSX 10.11.1

ParticlesMetalDemo does not work

but this one works with 50fps

Particles_OpenGL_ES_2_Demo

 

If my build, ParticlesMetalDemo, doesn't work, Metal seems to be unavailable on your system.

If Metal works, my build shows "Rendering using METAL".

The second build, Particles_OpenGL_ES_2_Demo, uses OpenGL ES (= OpenGL for embedded devices like phones, which is not OS X OpenGL!) if Metal is not available, and then shows "Rendering using OpenGL ES".

 

On my system, the build that can do both Metal and OpenGL ES crashes.

For us on desktop OS X, FPS results from iPhone/iOS OpenGL ES are not very interesting, because OS X uses OpenGL or Metal, not OpenGL ES.

I think Metal is and will be very useful for mobile devices, but on OS X it will, if ever, be useful only for some games, and only if they support Metal.

Because Metal on OS X is currently limited to a few supported GPUs AND is only at the beginning (on OS X), Metal is more of an academic thing for now.

It is like OpenCL, which Apple advertised to normal users a long time ago and which even today is rarely used in real apps.

Some reasons: faster CPUs today, buggy OpenCL drivers (= risk for the dev of getting angry customers), and complex (really useful) OpenCL code is hard to write.

CUDA is used more often, at least on Windows. CUDA-accelerated apps for OS X are limited to Macs with Nvidia GPUs, which means too small a group of real Mac users with fast, non-mobile Nvidia GPUs.

Perhaps Metal will be used more in the future, but as said, the faster the CPU, the smaller the advantage of developing for Metal.


Virtual Mac OS X El Capitan with GPU passthrough (GTX 770; see http://www.insanelymac.com/forum/topic/309087-insanely-fast-virtual-mac-qemu-ovmf-clover-and-native-graphics/ for more details, if interested):

 

1000000: 30

2000000: 20

3000000: 12
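
One rough way to compare these three results is to multiply particle count by FPS, which gives particles drawn per second. The snippet below is only an illustrative sketch: the helper names are made up for this example, and the only data used are the three count/FPS pairs listed above.

```python
# Results quoted above: particle count -> frames per second.
results = {1_000_000: 30, 2_000_000: 20, 3_000_000: 12}


def particles_per_second(count, fps):
    """Hypothetical helper: particles drawn per second at a given FPS."""
    return count * fps


def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


for count, fps in results.items():
    rate = particles_per_second(count, fps)
    print(f"{count:>9} particles: {fps:>2} FPS, "
          f"{frame_time_ms(fps):5.1f} ms/frame, "
          f"{rate / 1e6:.0f} M particles/s")
```

The throughput comes out at 30, 40 and 36 million particles per second, so it stays in the same range while the frame time grows with the particle count.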

 

Same results as you with an EVGA GTX 770 4 GB.

 

Thanks mitch_de, this is a great app!!


Results on 10.11.1 using a GTX 960 2 GB (Inno3D GTX 960 iChill Ultra), with mitch_de's build:

1000000
tld.jpg

2000000
sld.jpg

3000000
rld.jpg

But while I'm running this test, my GPU load is not maxed out: it stays below 20% even with 3000000 particles, although I expected rendering this animation to push the GPU to its limit. VRAM usage is not affected.




 


Trying with fantomas' build (Particles OpenGL ES 2 Demo).

vld.jpg

I was also curious how Metal uses system resources, so I fired up Activity Monitor:
xld.jpg

1. I ran this test for about 5 minutes. Is Metal really using the GPU? My CPU load is about 85% and the GPU load is low. I ran this test for about 20 minutes.
2. The animation is quite smooth, but sometimes it becomes laggy and then continues smoothly. The demo app shows 11-14 fps, but the iStat Menus GPU monitor shows 13-60 fps.


Yep, huge CPU load, same for me. In pause mode (left mouse click) the CPU load goes down but the GPU load rises.

I think generating that many particles is CPU-bound. If they are already generated ("pause mode") and only have to be spun, GPU load for me is 70% (normal mode = 20%).

So this source is interesting as a demo / check whether Metal works, but without changes to the source code it is not really a Metal bench.

For me there is also very high main memory usage, up to 980 MB after using 3 million particles. VRAM usage always stays at 40% (of 1 GB).


I would now call it a Metal DEMO. For a bench, the CPU load is much too high.

Yep, I think an on-CPU GPU benefits from the very fast / shared CPU-GPU connection, at least if the tasks are not very complex and don't use much VRAM.

I am not sure, but after a first reading, Metal seems to be limited to 60 FPS in general. For games that is OK; for benching (as opposed to a "DEMO"!) it may be a problem.


Hi all,

after finding out that the DEMO really is more a demo than a bench (CPU-limited), I found another, NEW Metal particles app.

 

Download: see the first page.

 

The screenshot shows that NOW the GPU is under full load (99% GPU load vs 25% with the "old" Metal DEMO; even the GUI gets slow while the app is running) and the CPU load is low :) = GOOD as a bench.

post-110586-0-77166100-1447095882.jpeg

 

6 FPS on my low-end GT 740 GPU. GTX cards should now be much faster than my GPU, compared to the FPS difference seen in the DEMO.

It takes some seconds of running before the FPS is shown, so be patient :)

 

post-110586-0-48977300-1447095026_thumb.jpg


43-48fps

Intel Iris 5100 on i7-4558U

 

As I said, I think Metal is heavily optimized for Intel IGPs...

Bildschirmfoto 2015-11-09 um 21.25.44.jpg

Bildschirmfoto 2015-11-09 um 21.26.47.jpg

 

Cheers :-)

Yep, the top and first goal of Metal is use on embedded GPU systems: iPad, iPhone & Co.

Very fast CPUs AND fast discrete GPUs don't really need Metal and will keep using normal OpenGL in the near future as well.

Last but not least: Apple will first (or only!) optimize Metal for the GPU types THEY use / build in. It would be a lot of work to also optimize Metal for our wide range of hackintosh GPUs.


I looked deeper into the source code and the Metal details and found out that the source was optimized for embedded GPUs, which have no fast VRAM of their own like discrete GPUs do.

 

I built a new test version; it now runs at least 6 times faster thanks to the discrete VRAM usage option.

But because of the 60 FPS Metal limit, the optimized 4 Mill version runs too fast :whistle: - even on my low-end GT 740 the FPS rose from 6 to 36, only by setting private (GPU VRAM) memory usage. GTX cards would run into the 60 FPS limit!

So I had to use a many times bigger particle count (33 Mill! vs 4 Mill in the first version) to keep GPUs that are even a little faster than mine from running into the 60 FPS limit.
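
The effect of the 60 FPS limit on a benchmark can be illustrated with a little arithmetic. The sketch below uses the GT 740 figures reported in this post (36 FPS with 4 Mill particles, 10 FPS with 33 Mill) and assumes, purely for illustration, that FPS scales linearly with GPU speed. That assumption and the function names are not from the thread.

```python
FPS_CAP = 60  # the 60 FPS limit described in the post

# GT 740 baselines quoted in this post for the optimized build.
GT740_FPS = {"4 Mill": 36, "33 Mill": 10}


def shown_fps(baseline_fps, speedup, cap=FPS_CAP):
    """FPS a GPU `speedup` times faster than the GT 740 would show.

    Assumption (not from the thread): FPS scales linearly with GPU speed.
    """
    return min(cap, baseline_fps * speedup)


def max_measurable_speedup(baseline_fps, cap=FPS_CAP):
    """Largest speedup over the GT 740 that does not exceed the cap."""
    return cap / baseline_fps


for version, fps in GT740_FPS.items():
    limit = round(max_measurable_speedup(fps), 2)
    print(f"{version}: cap reached at {limit}x GT 740 speed")
    for speedup in (1, 2, 4, 8):
        print(f"  {speedup}x -> {shown_fps(fps, speedup)} FPS")
```

Under that assumption, the 4 Mill version cannot tell apart GPUs that are more than about 1.7 times faster than the GT 740, while the 33 Mill version leaves headroom up to 6 times.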

 

Please try the new version (it shows a particle count of 33 Mill vs 4 Mill in the first version).

New: 33 Mill + optimized for discrete VRAM usage

 

Intel GPUs may fail now (no discrete VRAM; perhaps Metal disables / fixes that on the fly).

 

Nvidia GT 740, 33 Mill version = 10 FPS

 

Download the 33 Mill test version: now in the first post. I added "33 Mill Version" to the app window title (for screenshots).


Intel Iris 5100 (i7-4558U)

 

33 Mill version: Constant 5 FPS

 

EDIT: New version? I redownloaded and now get 5 FPS; before I got only 3 ^^

 

post-735125-0-33748500-1447113890_thumb.jpg

 

Cheers :-)


WOW - solid 60 FPS on GTX 980 now - everything butter smooth :XD
