Note: This still applies for 10.7.4 and 10.8! No longer needed for 10.9!
Good news, everyone!
After I bought a GTX 560 Ti, I noticed a few odd things about the OpenCL support of this card.
The card reports support for all kinds of features, but it can't actually use them: kernels fail to compile with errors like "requires .target sm_12 or higher", even though this is an sm_21 capable card. So I started digging, and from the looks of it, Apple's OpenCL compiler only (directly) supports cards up to sm_20 (Quadro 4000, GTX 480/470/580/570). For anything higher, it falls back to sm_10 or sm_11.
The solution: let's just pretend we have a 2.0 card
So, open up a hex editor of your liking and do this:
open /System/Library/Extensions/GeForceGLDriver.bundle/Contents/MacOS/libclh.dylib (as root or with sudo)
on 10.7.x and <=10.8.2:
find: 8B 87 1C 0C 00 00 89 06 8B 87 20 0C 00 00 89 02
replace by: 31 C0 FF C0 FF C0 89 06 31 C0 89 02 90 90 90 90
on 10.8.3+ (as mentioned here):
find: 8B 81 1C 0C 00 00 EB 06 8B 81 20 0C 00 00
replace by: B8 02 00 00 00 90 EB 06 B8 00 00 00 00 90
reboot is not required, but recommended
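If you'd rather not do the search and replace by hand, the steps above can be sketched in Python. This is only a rough sketch, not a tested patcher for the real driver: the byte patterns are the ones from this post, but the function names (patch_bytes, patch_file), the ".orig" backup name and the rule of refusing to patch unless a pattern occurs exactly once are my own assumptions.

```python
from pathlib import Path
import shutil

# Find/replace byte patterns, copied from the post above.
PATCHES = {
    "10.7.x to 10.8.2": (
        bytes.fromhex("8B 87 1C 0C 00 00 89 06 8B 87 20 0C 00 00 89 02"),
        bytes.fromhex("31 C0 FF C0 FF C0 89 06 31 C0 89 02 90 90 90 90"),
    ),
    "10.8.3+": (
        bytes.fromhex("8B 81 1C 0C 00 00 EB 06 8B 81 20 0C 00 00"),
        bytes.fromhex("B8 02 00 00 00 90 EB 06 B8 00 00 00 00 90"),
    ),
}


def patch_bytes(data):
    """Return (patched_data, label) for the first pattern that matches.

    Assumption (not from the post): the pattern must occur exactly once,
    otherwise we refuse to touch anything.
    """
    for label, (find, replace) in PATCHES.items():
        # An in-place binary patch must never change the file size.
        assert len(find) == len(replace)
        count = data.count(find)
        if count == 1:
            return data.replace(find, replace), label
        if count > 1:
            raise ValueError(f"pattern for {label} found {count} times, refusing to patch")
    raise ValueError("no known pattern found (already patched or unsupported version?)")


def patch_file(path):
    """Patch the file in place, keeping an untouched copy next to it."""
    path = Path(path)
    patched, label = patch_bytes(path.read_bytes())
    backup = path.with_name(path.name + ".orig")  # hypothetical backup name
    shutil.copy2(path, backup)
    path.write_bytes(patched)
    return label
```

To use it, call patch_file() with the libclh.dylib path from above, as root or with sudo. Trying it on a copy of the file first is probably a good idea.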
What this basically does is replace the dynamic compute capability (cc) device info in clhDeviceComputeCapability with a hardcoded 2.0 value. Note that this is x64 only for the moment (which is what most people have been using since 10.7). I will add x86 support at a later point.
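For the curious, here is how the 10.7.x/10.8.2 bytes decode as x86-64 instructions, written as a small Python sketch that "runs" both listings. The mnemonics are a straight decode of the bytes. Reading the first store (through rsi) as the major version and the second (through rdx) as the minor version is my interpretation, based on the patched code writing 2 and then 0. The run() helper is made up for illustration and only understands the few instruction forms listed here.

```python
# (hex bytes, x86-64 mnemonic) pairs for the 10.7.x / <=10.8.2 patch.
ORIGINAL = [
    ("8B 87 1C 0C 00 00", "mov eax, [rdi+0xC1C]"),  # load value from the device struct
    ("89 06",             "mov [rsi], eax"),        # store through first out-pointer
    ("8B 87 20 0C 00 00", "mov eax, [rdi+0xC20]"),  # load the next value
    ("89 02",             "mov [rdx], eax"),        # store through second out-pointer
]
PATCHED = [
    ("31 C0", "xor eax, eax"),    # eax = 0
    ("FF C0", "inc eax"),         # eax = 1
    ("FF C0", "inc eax"),         # eax = 2
    ("89 06", "mov [rsi], eax"),  # first store now always gets 2
    ("31 C0", "xor eax, eax"),    # eax = 0
    ("89 02", "mov [rdx], eax"),  # second store now always gets 0
    ("90", "nop"), ("90", "nop"), ("90", "nop"), ("90", "nop"),  # padding
]


def size(listing):
    """Total number of bytes in a listing."""
    return sum(len(bytes.fromhex(hexbytes)) for hexbytes, _ in listing)


def run(listing):
    """Toy interpreter: track eax and record what each store writes."""
    eax, stores = None, {}
    for _, asm in listing:
        if asm == "xor eax, eax":
            eax = 0
        elif asm == "inc eax":
            eax += 1
        elif asm.startswith("mov eax, "):
            eax = asm.split(", ")[1]           # keep memory operands symbolic
        elif asm.startswith("mov ["):
            stores[asm[5:asm.index("]")]] = eax
        # nop: nothing to do
    return stores


print(size(ORIGINAL), size(PATCHED))  # both listings are 16 bytes
print(run(ORIGINAL))  # {'rsi': '[rdi+0xC1C]', 'rdx': '[rdi+0xC20]'}
print(run(PATCHED))   # {'rsi': 2, 'rdx': 0}
```

The 10.8.3+ patch does the same thing with different instructions: B8 02 00 00 00 is mov eax, 2 and B8 00 00 00 00 is mov eax, 0, each followed by a nop to fill the six bytes of the original load.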
Also, if you have another NVIDIA card installed that is not sm_20 capable, this will (probably) break OpenCL support for it.
Now, everything that did work before should still be working ...
[Device 0]
Name: GeForce GTX 560 Ti
Vendor: NVIDIA
Type: GPU
Device Version: OpenCL 1.1
Driver Version: CLH 1.0
Compute Units: 16
Work Group Size: 1024
Clock: 0 MHz
Global Memory: 1024 MB
Local Memory: 48 KB
Cache Size: 0 Bytes
Cache Line Size: 0 Bytes
Available: Yes
Double-Precision: No
Extensions (12):
  cl_APPLE_ContextLoggingFunctions
  cl_APPLE_SetMemObjectDestructor
  cl_APPLE_clut
  cl_APPLE_fp64_basic_ops
  cl_APPLE_gl_sharing
  cl_APPLE_query_kernel_names
  cl_khr_byte_addressable_store
  cl_khr_gl_event
  cl_khr_global_int32_base_atomics
  cl_khr_global_int32_extended_atomics
  cl_khr_local_int32_base_atomics
  cl_khr_local_int32_extended_atomics
... but programs that use some advanced OpenCL features (e.g. LuxMark) should now work too:
[Attached screenshot: Screen_Shot_2011_08_24_at_1.44.15_AM.png]