cmf

OpenCL fix for non-GF100/GF110 cards (aka CC/SM 2.1+)

140 posts in this topic


better late than never: *updated first post for 10.8.3+ with the info posted by robertx.


for the people who are interested, the replacement code does this:


movl $2, %eax    (B8 02 00 00 00)
nop              (90)
jmp +6           (EB 06, short jump over the next 6 bytes)
movl $0, %eax    (B8 00 00 00 00)
nop              (90)

so you should be able to set this to a different CC by replacing $MAJOR and $MINOR in this sequence:
B8 $MAJOR 00 00 00 90 EB 06 B8 $MINOR 00 00 00 90
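The byte template above can be filled in mechanically. A minimal Python sketch, assuming both values fit in a single byte; the helper name `make_cc_patch` is made up for illustration and is not part of any driver or tool:

```python
def make_cc_patch(major: int, minor: int) -> bytes:
    """Build the 14-byte replacement sequence for a given compute capability.

    Hypothetical helper: it only fills $MAJOR and $MINOR into the template
    B8 $MAJOR 00 00 00 90 EB 06 B8 $MINOR 00 00 00 90 from the post.
    """
    for name, value in (("major", major), ("minor", minor)):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{name} must fit in one byte, got {value}")

    def mov_eax(imm: int) -> bytes:
        # movl $imm, %eax with a 32-bit little-endian immediate
        return bytes([0xB8, imm, 0x00, 0x00, 0x00])

    nop = b"\x90"
    jmp_plus_6 = b"\xEB\x06"  # short jump over the next 6 bytes
    return mov_eax(major) + nop + jmp_plus_6 + mov_eax(minor) + nop


if __name__ == "__main__":
    # Print the sequence for CC 2.0 in the same notation as the post.
    print(make_cc_patch(2, 0).hex(" ").upper())
```

The result has the same length as the bytes it replaces (14), which is what lets it be swapped into the binary without shifting anything else.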


if any of the gtx titan / gk110 folks read this, please test whether this makes opencl work on those devices (it sets the CC to 3.0, which is what gk104 devices use):
B8 03 00 00 00 90 EB 06 B8 00 00 00 00 90

update: this does not work!


I don't use the Web driver. My solution for the NVDA channel timeout was to install CUDA 5.0.59 and freezefix. I have gone five days without an NVDA channel timeout with


this fix is again valid for the stock apple drivers from 10.8.5

backup original, and in Terminal type:

 

sudo perl -pi -e '$c+=s/\x8b\x81\x1c\x0c\x00\x00\xeb\x06\x8b\x81\x20\x0c\x00\x00/\xb8\x02\x00\x00\x00\x90\xeb\x06\xb8\x00\x00\x00\x00\x90/; END { printf "%s: %d substitution%s made.\n",($c==1 ? "Success" : "Error"),$c,(!$c || $c>1 ? "s" : ""); $?=($c!=1); }' /System/Library/Extensions/GeForceGLDriver.bundle/Contents/MacOS/libclh.dylib

reboot not necessary
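For anyone who wants to see what the perl one-liner does without touching a real file, here is a rough in-memory Python sketch. It is an approximation, not a drop-in replacement: perl -p works line by line and replaces the first match per line, while this version counts and replaces every non-overlapping match in the buffer. The names `ORIGINAL`, `PATCHED` and `patch_once` are made up for illustration; the byte strings are copied from the command above.

```python
# Byte sequences copied from the perl command in the post.
ORIGINAL = bytes.fromhex("8b811c0c0000eb068b81200c0000")
PATCHED = bytes.fromhex("b80200000090eb06b80000000090")


def patch_once(data: bytes, old: bytes = ORIGINAL, new: bytes = PATCHED):
    """Return (patched_data, count, ok) for an in-memory buffer.

    ok is True only when exactly one match was found, which mirrors the
    "Success"/"Error" report printed by the perl END block.
    """
    if len(old) != len(new):
        # A binary patch must not change the file size or any offsets.
        raise ValueError("old and new sequences must have the same length")
    count = data.count(old)
    patched = data.replace(old, new)
    return patched, count, count == 1


if __name__ == "__main__":
    # Synthetic buffer standing in for the contents of libclh.dylib.
    sample = b"\x00" * 4 + ORIGINAL + b"\xff" * 4
    _, count, ok = patch_once(sample)
    label = "Success" if ok else "Error"
    plural = "" if count == 1 else "s"
    print(f"{label}: {count} substitution{plural} made.")
```

Reading the real file and writing it back (after a backup) is left out on purpose.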

 

:smoke:


This doesn't work for me.  10.8.5 with a Titan.  Apps using OpenCL just crash.  Won't even start actually.

...i don't believe the Titan is a fermi card...so this fix may not apply... :smoke:


This entire thread is for non-Fermi cards, no?  I thought GF is Fermi and GK is Kepler.  And the thread title says "non-GF100/GF110 cards".


sorry...brain-freeze here...try this slightly modified script in terminal

sudo perl -pi -e '$c+=s/\x8b\x81\x1c\x0c\x00\x00\xeb\x06\x8b\x81\x20\x0c\x00\x00/\xb8\x03\x00\x00\x00\x90\xeb\x06\xb8\x00\x00\x00\x00\x90/; END { printf "%s: %d substitution%s made.\n",($c==1 ? "Success" : "Error"),$c,(!$c || $c>1 ? "s" : ""); $?=($c!=1); }' /System/Library/Extensions/GeForceGLDriver.bundle/Contents/MacOS/libclh.dylib

if any of the gtx titan / gk110 folks read this, please test (it sets the CC to 3.0, which is what gk104 devices use)


I just tried it and still not working.  When I run OCLInfo it shows OpenCL version, but any apps that use OpenCL either give errors or completely crash out when trying to run them.


ok, again i apologize for confusing you...hopefully someone with your card will chime in with a fix or work-around for this issue... :smoke: maybe the nvidia retail drivers for 10.8.5 will improve results (when they are released)


http://www.insanelymac.com/forum/user/42821-cmf/ posted:

it's basically the same as the GeForceGLDriver binary patch, but at the user level (and w/o modifying the binary, obviously).
i.e. if the binary patch isn't applied, opencl will only function if "CL_ENABLE_SM2_DEVICE" is defined/set in the user's environment vars (the default set can be changed through the .profile file).
so, if apple ever decides to add an additional "is device opencl capable?" check or shuffle the code around, so that the binary patch doesn't work any more (which i'm surprised it did in 10.7.3), opencl should still work (to some extent*) using the "CL_ENABLE_SM2_DEVICE" define.

downside and (*): this really only works if a program is started by the user and the program doesn't ignore/overwrite the cl define or does some other weird stuff (like luxmark does ...). some system services/programs that use opencl and are started by another user also won't be able to use the opencl device (just open activity monitor and look which processes aren't owned by your user).


and ftr and the people that are interested:
the GeForceGLDriver binary patch changes the "cmp eax, 2" to a "cmp eax, 3" (note: eax at this point contains the major version of the nvidia compute capability (cc/sm), which is 2 for fermi gpus), so the subsequent jump condition will evaluate to true (instead of false, b/c 2 is not less than 2, but 2 < 3!) and execution continues @loc_8F014446. if the binary patch has not been applied (it still says "cmp eax, 2"), it will fall through and check if "CL_ENABLE_SM2_DEVICE" is defined in the user's env vars. if so, it will also continue @loc_8F014446. if not, the device will be "destroyed" and "declared" not opencl capable.
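The decision flow described above can be modelled in a few lines of Python. This is only a sketch of the logic as the post describes it, not a transcription of the disassembly; the function name `device_is_opencl_capable` and the `threshold` parameter are made up for illustration, and "defined" is assumed to mean that the variable is present with any value.

```python
import os


def device_is_opencl_capable(cc_major, threshold=2, env=None):
    """Model of the capability check described in the post.

    threshold=2 stands for the unpatched "cmp eax, 2",
    threshold=3 for the patched "cmp eax, 3".
    """
    env = os.environ if env is None else env
    if cc_major < threshold:
        # Jump taken: the device is accepted without further checks.
        return True
    # Fall through: accepted only if the override variable is defined.
    return "CL_ENABLE_SM2_DEVICE" in env


if __name__ == "__main__":
    fermi_major = 2
    print("unpatched, no env var:", device_is_opencl_capable(fermi_major, 2, {}))
    print("patched, no env var:  ", device_is_opencl_capable(fermi_major, 3, {}))
    print("unpatched, env var:   ",
          device_is_opencl_capable(fermi_major, 2, {"CL_ENABLE_SM2_DEVICE": "1"}))
```

Passing `env` explicitly makes the point from the post visible: a process started by another user has a different environment, so the override may simply not be there.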

Sorry, this is all new to me, so I don't really understand.  After running the sudo perl command and the echo "export CL_ENABLE_SM2_DEVICE=1" >> ~/.profile command, here is the output I get when I run oclinfo.  It seems to say OpenCL is supported, but nothing that uses OpenCL works.  I've tried Premiere Pro CC, LuxMark, OpenCL Oceanwave, and the new beta builds of RedCine-X Pro, and none of them work.

 

1 OpenCL platform found!

 
[Platform 0]
Name: Apple
Vendor: Apple
Version: OpenCL 1.2 (Apr 25 2013 18:32:06)
Profile: FULL_PROFILE
Extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event
 
 
[OpenCL-only Context]
2 OpenCL devices found!
 
[Device 0]
Name: Intel® Core i7-3770K CPU @ 3.50GHz
Vendor: Intel
Type: CPU 
Device Version: OpenCL 1.2 
Driver Version: 1.1
Compute Units: 8
Work Group Size: 1024
Clock: 3500 MHz
Global Memory (Total): 32768 MB
Global Memory (Host): 32768 MB
Global Memory (PCIe): 0 MB
Local Memory: 32 KB
Cache Size: 0.0625 KB
Cache Line Size: 8388608 Bytes
Available: Yes
Double-Precision: Yes
Extensions: 
cl_APPLE_SetMemObjectDestructor
cl_APPLE_ContextLoggingFunctions
cl_APPLE_clut
cl_APPLE_query_kernel_names
cl_APPLE_gl_sharing
cl_khr_gl_event
cl_khr_fp64
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_khr_byte_addressable_store
cl_khr_int64_base_atomics
cl_khr_int64_extended_atomics
cl_khr_3d_image_writes
cl_APPLE_fp64_basic_ops
cl_APPLE_fixed_alpha_channel_orders
cl_APPLE_biased_fixed_point_image_formats
 
[Device 1]
Name: GeForce GTX TITAN
Vendor: NVIDIA
Type: GPU 
Device Version: OpenCL 1.1 
Driver Version: 8.16.74 310.40.00.10f02
Compute Units: 14
Work Group Size: 1024
Clock: 928 MHz
Global Memory: 6144 MB
Local Memory: 48 KB
Cache Size: 0 KB
Cache Line Size: 0 Bytes
Available: Yes
Double-Precision: Yes
Extensions: 
cl_APPLE_SetMemObjectDestructor
cl_APPLE_ContextLoggingFunctions
cl_APPLE_clut
cl_APPLE_query_kernel_names
cl_APPLE_gl_sharing
cl_khr_gl_event
cl_khr_byte_addressable_store
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_APPLE_fp64_basic_ops
cl_khr_fp64
cl_khr_3d_image_writes
 
[shared OpenCL+OpenGL Context]
2 OpenCL devices found!
 
[Device 0]
Name: GeForce GTX TITAN
Vendor: NVIDIA
Type: GPU 
Device Version: OpenCL 1.1 
Driver Version: 8.16.74 310.40.00.10f02
Compute Units: 14
Work Group Size: 1024
Clock: 928 MHz
Global Memory: 6144 MB
Local Memory: 48 KB
Cache Size: 0 KB
Cache Line Size: 0 Bytes
Available: Yes
Double-Precision: Yes
Extensions: 
cl_APPLE_SetMemObjectDestructor
cl_APPLE_ContextLoggingFunctions
cl_APPLE_clut
cl_APPLE_query_kernel_names
cl_APPLE_gl_sharing
cl_khr_gl_event
cl_khr_byte_addressable_store
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_APPLE_fp64_basic_ops
cl_khr_fp64
cl_khr_3d_image_writes
 
[Device 1]
Name: Intel® Core i7-3770K CPU @ 3.50GHz
Vendor: Intel
Type: CPU 
Device Version: OpenCL 1.2 
Driver Version: 1.1
Compute Units: 8
Work Group Size: 1024
Clock: 3500 MHz
Global Memory (Total): 32768 MB
Global Memory (Host): 32768 MB
Global Memory (PCIe): 0 MB
Local Memory: 32 KB
Cache Size: 0.0625 KB
Cache Line Size: 8388608 Bytes
Available: Yes
Double-Precision: Yes
Extensions: 
cl_APPLE_SetMemObjectDestructor
cl_APPLE_ContextLoggingFunctions
cl_APPLE_clut
cl_APPLE_query_kernel_names
cl_APPLE_gl_sharing
cl_khr_gl_event
cl_khr_fp64
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_khr_byte_addressable_store
cl_khr_int64_base_atomics
cl_khr_int64_extended_atomics
cl_khr_3d_image_writes
cl_APPLE_fp64_basic_ops
cl_APPLE_fixed_alpha_channel_orders
cl_APPLE_biased_fixed_point_image_formats
 
logout
 
[Process completed]


This thread applies to fermi cards only, where non-GF100/GF110 means any GF10x/GF11x card where x != 0 ;)

 

For GK110 based cards (Titan and 780), I'd highly recommend updating to 10.9 - I doubt we'll ever see full and proper OpenCL support for these in 10.8.

In 10.9 OpenCL is working OOTB on these cards. And since I haven't mentioned it anywhere yet: 780 OpenCL support has been added in some earlier DP.


How do Fermi cards work in 10.9 (450 GTS here)?


for the new nvidia "Web" drivers (10.8.5) this again works for me
(always backup first) then, in Terminal, type:
sudo perl -pi -e '$c+=s/\x8b\x81\x1c\x0c\x00\x00\xeb\x06\x8b\x81\x20\x0c\x00\x00/\xb8\x02\x00\x00\x00\x90\xeb\x06\xb8\x00\x00\x00\x00\x90/; END { printf "%s: %d substitution%s made.\n",($c==1 ? "Success" : "Error"),$c,(!$c || $c>1 ? "s" : ""); $?=($c!=1); }' /System/Library/Extensions/GeForceGLDriverWeb.bundle/Contents/MacOS/libclh.dylib
:smoke:
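Since the same byte pattern shows up in both the stock and the Web driver, it can help to check what state a copy of libclh.dylib is in before or after patching. A small Python sketch, working on a bytes buffer only; `patch_state` is a made-up helper, and the "patched" match is a heuristic that could in principle also hit unrelated bytes in a large binary:

```python
import re

# Original sequence targeted by the perl command in the post.
ORIGINAL = bytes.fromhex("8b811c0c0000eb068b81200c0000")

# Replacement template: B8 $MAJOR 00 00 00 90 EB 06 B8 $MINOR 00 00 00 90
PATCHED_RE = re.compile(
    rb"\xb8(.)\x00\x00\x00\x90\xeb\x06\xb8(.)\x00\x00\x00\x90", re.DOTALL
)


def patch_state(data: bytes) -> str:
    """Classify a buffer as 'unpatched', 'patched (CC x.y)' or 'unknown'."""
    if ORIGINAL in data:
        return "unpatched"
    match = PATCHED_RE.search(data)
    if match:
        major = match.group(1)[0]
        minor = match.group(2)[0]
        return f"patched (CC {major}.{minor})"
    return "unknown"


if __name__ == "__main__":
    # Synthetic buffers standing in for file contents.
    print(patch_state(b"\x00" + ORIGINAL))
    print(patch_state(bytes.fromhex("b80200000090eb06b80000000090")))
```

"unknown" would be the expected answer for a driver build where the code has been shuffled around and neither sequence is present.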


For me something strange is going on: in 10.9 the 560 Ti works out of the box, but the benchmark results are really low compared to the 10.8.5 benchmark...

I also fixed my freezing problem (FERMI FREEZE).

/EDIT

Nope, the FREEZE came back... it is just less frequent.


Also on 10.9 now. Haven't got a freeze yet (EVGA GeForce 450 GTS 1GB), hope it stays this way. :)

What I do find strange is that OS X never clocks down my GPU anymore. This is because of the new nVidia drivers that were also in 10.8.3 and later.

I already added my card to AGPM.kext (see my post), and the GUI is smooth. But again, no clocking down = high power usage.


Yewwo :)

 

I'm stuck in the same boat as many others here - I have an EVGA GTX 780 w/ ACX Cooler and the fixes here aren't working in the retail 10.9 drivers. It seems that the 10.9 drivers are a regression and are actually older than the stock 10.8.5 drivers, yet somehow modified with GK110 code. The DP versions of Mavericks seem to have worked for people here, but the release version has regressions and none of the fixes here work.

 

Has anybody gotten 780 cards working in Mavericks w/ OpenCL? There is a lot more that uses OpenCL now, especially Preview when viewing anything other than text/pdf, and having the apps just crash outright because of OpenCL (or Apple's pathetic implementation of it) is a total productivity killer.

 

Any help would be greatly appreciated. :)


Info: Since 10.9 final, my GT 440 doesn't need the OpenCL patch anymore - it runs OpenCL with the unpatched driver. Before (in older DPs) the patch was needed.


This fix was never meant for any GK1xx cards and might even result in problems, because sm_2x (fermi) != sm_3x (kepler). Although the compiled binaries should generally be upwards compatible, there might be other things that get messed up because of the different sm version.

 

That being said, this fix isn't even necessary any more in 10.9 (even for sm_21 cards), as I've already written in the first post (the compiler in 10.9 can properly produce and handle up to sm_35 code now - that's GK110).

 

Concerning the GTX 780: I have no issues besides the general OpenGL slowness of the driver (really hoping for R330 drivers in 10.9.1 or 10.9.2 soon ...). On the other hand, OpenCL is actually pretty fast and you probably won't see any performance increases in future drivers as it is already at the max - at least for well written programs. It scales pretty much perfectly for my code (6x over a GT650M).

So, I don't know what you did wrong, but it should just work OOTB in 10.9.


Unfortunately even a vanilla install with only FakeSMC.kext and HWSensors installed results in OpenCL crashing immediately on any app that uses it. Boot to safe mode and OpenCL based apps don't crash anymore. And since the only driver that is disabled in Safe Mode for the nVidia GPUs is GeForce.kext, that puts the onus squarely on Apple's driver. Unfortunately without that driver loaded I get no acceleration either and the entire UI is choppy to say the least.

 

Mavericks' drivers are older than ML 10.8.5's stock drivers. Why the DP versions of Mavericks had awesome drivers and the release got drivers that are a year old is what's making me scratch my head.
