
OpenCL fix for non-GF100/GF110 cards (aka CC/SM 2.1+)


cmf

138 posts in this topic


Hi cmf

 

First thanks for a great thread!

 

I installed 10.7.2 from the combo update (newest build) and edited GeForceGLDriver.bundle version 7.12.9 with the following:

 

GeForceGLDriver.bundle/Contents/MacOS/GeForceGLDriver:

78e883f8 02 7c11

replaced by:

78e883f8 03 7c11

 

GeForceGLDriver.bundle/Contents/MacOS/libclh.dylib:

8B 87 1C 0C 00 00 89 06 8B 87 20 0C 00 00 89 02

replaced by:

31 C0 FF C0 FF C0 89 06 31 C0 89 02 90 90 90 90
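For anyone who would rather script the edit than use a hex editor, here is a minimal Python sketch of the same find-and-replace. The helper names `patch_bytes` and `patch_file` are made up for illustration, and the sketch assumes the pattern is present in the file unchanged; it refuses to write anything if the pattern is missing, and it keeps a `.orig` backup.

```python
from pathlib import Path

# Byte strings copied from the post above (libclh.dylib patch).
LIBCLH_FIND = "8B 87 1C 0C 00 00 89 06 8B 87 20 0C 00 00 89 02"
LIBCLH_REPLACE = "31 C0 FF C0 FF C0 89 06 31 C0 89 02 90 90 90 90"


def patch_bytes(data: bytes, find_hex: str, replace_hex: str) -> bytes:
    """Replace every occurrence of one byte pattern with another of equal length."""
    find = bytes.fromhex(find_hex)
    replace = bytes.fromhex(replace_hex)
    if len(find) != len(replace):
        raise ValueError("patterns must have the same length")
    if data.count(find) == 0:
        raise ValueError("pattern not found (other driver version, or already patched?)")
    return data.replace(find, replace)


def patch_file(path: str, find_hex: str, replace_hex: str) -> None:
    """Patch a file in place, keeping a backup copy next to it (hypothetical helper)."""
    target = Path(path)
    original = target.read_bytes()
    patched = patch_bytes(original, find_hex, replace_hex)
    Path(path + ".orig").write_bytes(original)  # backup before touching the file
    target.write_bytes(patched)
```

Usage would be something like `patch_file("libclh.dylib", LIBCLH_FIND, LIBCLH_REPLACE)` on a copy of the file, run with enough privileges to write it.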

 

So everything should be fine with my "GeForceGLDriver.bundle", I guess?

 

The problem is I can't get 1920*1080 with QE working with the new "NVDAGF100Hal.kext version: 7.12.9".

I have to go back to "NVDAGF100Hal.kext version: 7.10.8" to get full resolution and QE etc. working... ;)

 

What version of NVDAGF100Hal.kext are you using for your GF 560 Ti? Do I need to change something in the new one?

 

Please help me and I will send you my firstborn kid :D

Or, if it's easier, can you share the files you're using, as we have the same card etc.?


So everything should be fine with my "GeForceGLDriver.bundle", I guess?

yes.

The problem is I can't get 1920*1080 with QE working with the new "NVDAGF100Hal.kext version: 7.12.9".

I have to go back to "NVDAGF100Hal.kext version: 7.10.8" to get full resolution and QE etc. working... :wallbash:

 

What version of NVDAGF100Hal.kext are you using for your GF 560 Ti? Do I need to change something in the new one?

NVDAGF100Hal 7.12.9 270.05.10f03 ... you did add the device id, right? "0x120010de&0xffc0ffff"
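To see why that entry covers the card, here is a small Python sketch of how a `value&mask` match entry can be evaluated. It assumes the usual IOPCIPrimaryMatch layout (device ID in the upper 16 bits, vendor ID in the lower 16 bits, compared only where the mask has 1-bits); the function names are made up, and the device ID 0x1200 used in the usage note is simply read off the entry itself.

```python
def parse_entry(entry: str):
    """Split '0x120010de&0xffc0ffff' into (value, mask); no mask means exact match."""
    value, _, mask = entry.partition("&")
    return int(value, 16), (int(mask, 16) if mask else 0xFFFFFFFF)


def entry_matches(entry: str, vendor_id: int, device_id: int) -> bool:
    """Return True if the PCI vendor/device pair matches the entry under its mask."""
    value, mask = parse_entry(entry)
    combined = (device_id << 16) | vendor_id
    return (combined & mask) == (value & mask)
```

With the mask 0xffc0ffff the low six bits of the device ID are ignored, so `entry_matches("0x120010de&0xffc0ffff", 0x10DE, 0x1200)` is true for device IDs 0x1200 through 0x123f and false from 0x1240 on (that range is covered by the separate 0x124010de entry).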


EDIT: It's working now, think i had some permission problems. Thanks for the help once again!

 

Tried to run LuxMark and I get: RUNTIME ERROR: Unable to find any appropiate IntersectionDevice

Don't know if I missed editing something? Sorry for being a noob btw :wallbash:

 

I made the changes below (posted before, but here they are again):

 

GeForceGLDriver.bundle/Contents/MacOS/GeForceGLDriver:

78e883f8 02 7c11

replaced by:

78e883f8 03 7c11

 

GeForceGLDriver.bundle/Contents/MacOS/libclh.dylib:

8B 87 1C 0C 00 00 89 06 8B 87 20 0C 00 00 89 02

replaced by:

31 C0 FF C0 FF C0 89 06 31 C0 89 02 90 90 90 90

 

Changed the device ID in "NVDAGF100Hal.kext" Info.plist

 

<key>IOPCIPrimaryMatch</key>

<string>

0x06c010de&0xffe0ffff

0x0dc010de&0xffc0ffff

0x0e2010de&0xffe0ffff

0x0ee010de&0xffe0ffff

0x0f0010de&0xffc0ffff

0x104010de&0xffc0ffff

0x124010de&0xffc0ffff

0x120010de&0xffc0ffff

</string>
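If you want to script the Info.plist change, a rough Python sketch using the standard `plistlib` module could look like this. It assumes the match strings live under `IOKitPersonalities` in the kext's Info.plist and are whitespace-separated; `add_match_entry` and `patch_plist` are made-up helper names, not anything shipped with the driver.

```python
import plistlib


def add_match_entry(info: dict, entry: str) -> bool:
    """Append entry to every IOPCIPrimaryMatch string; return True if anything changed."""
    changed = False
    for personality in info.get("IOKitPersonalities", {}).values():
        current = personality.get("IOPCIPrimaryMatch")
        if current is None:
            continue
        entries = current.split()
        if entry not in entries:
            entries.append(entry)
            personality["IOPCIPrimaryMatch"] = " ".join(entries)
            changed = True
    return changed


def patch_plist(path: str, entry: str) -> None:
    """Load an Info.plist, add the entry, and write it back only if it changed."""
    with open(path, "rb") as handle:
        info = plistlib.load(handle)
    if add_match_entry(info, entry):
        with open(path, "wb") as handle:
            plistlib.dump(info, handle)
```

After an edit like this the kext still needs correct ownership and permissions, which is probably what the "permission problems" mentioned above came down to.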


Hi again ;-)

 

LuxMark is running fine now!

 

I tried to run OpenCL ocean_wave but it fails on launch with the message below. Have you got it working with your 560ti?

 

Connecting to NVIDIA GeForce GTX 560 Ti...

Error opening file compute_kernels.cl

Segmentation fault: 11


Hi,

this type of error (error opening file ...) happens when you try to start the app by simply double-clicking it. That does NOT work because it's a command-line app.

1. Start the Terminal

2. Type cd followed by a SPACE, drag & drop the folder that contains the app and its files into the Terminal, and press ENTER

3. Drag & drop the app into the Terminal and press ENTER
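The reason the `cd` step matters can be shown with a few lines of Python. This is only an illustration, assuming the demo opens `compute_kernels.cl` by a relative path (which the error message suggests): a relative file name is resolved against the current working directory, and a double-clicked app does not start in the folder it lives in. `can_open_from` is a made-up helper.

```python
import os
import tempfile
from pathlib import Path


def can_open_from(workdir: str, name: str = "compute_kernels.cl") -> bool:
    """Return True if the relative file name resolves when running from workdir."""
    previous = os.getcwd()
    os.chdir(workdir)
    try:
        return Path(name).is_file()
    finally:
        os.chdir(previous)  # always restore the original working directory


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as app_dir, tempfile.TemporaryDirectory() as elsewhere:
        Path(app_dir, "compute_kernels.cl").write_text("// kernel source")
        print(can_open_from(app_dir))    # started from the app folder
        print(can_open_from(elsewhere))  # started from somewhere else
```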


@cmf,

 

Do I need to apply both OpenCL patches: the Netkas patch for "GeForceGLDriver" and the one from this thread for "libclh.dylib"?

 

If so, do I need to apply both the x86 and x86_64 patches for 10.7.2, or just the x86_64 one? Can I apply both in case I boot the 32-bit kernel?

 

I left a comment for you over at the Netkas thread.

 

I have an Asus G74SX with an Nvidia GTX 560M graphics card, and with the 10.7.1 update I applied only the Netkas hack. No issues back then: QE/CL worked 100% with the latest Chameleon trunk, but now:

 

After the 10.7.2 update my desktop kept freezing with the latest Chameleon trunk (installed using Chameleon Wizard). The freezing went away as soon as I installed the latest version of Chimera, v1.5.4 r1394. That makes no sense. Why does Chimera support this card while the latest release of Chameleon now freezes my desktop?

 

Here are the kernel messages I got when it was freezing under Chameleon r1657.

 

Oct 25 14:16:29 osxfr33ks-MacBook-Pro kernel[0]: NVDA(OpenGL): Channel exception! exception type = 0x26 = FECS Err: Watchdog Timeout

Oct 25 14:16:47 osxfr33ks-MacBook-Pro kernel[0]: NVDA(OpenGL): Channel timeout!

Oct 25 14:16:47 osxfr33ks-MacBook-Pro kernel[0]: IOVendorGLContext::ReportGPURestart

Oct 25 14:16:47 osxfr33ks-MacBook-Pro kernel[0]: 0000006e

Oct 25 14:16:47 osxfr33ks-MacBook-Pro kernel[0]: 00080000 00000000 00000000 00000000

Oct 25 14:16:47 osxfr33ks-MacBook-Pro kernel[0]: 00000000 00000000 00000000 0000000b

Oct 25 14:16:47 osxfr33ks-MacBook-Pro kernel[0]: 00000000 00000000 00000081

Oct 25 14:16:47 osxfr33ks-MacBook-Pro kernel[0]: 00000000 00000000

Oct 25 14:16:47 osxfr33ks-MacBook-Pro kernel[0]: NVDA(OpenGL): Channel exception! exception type = 0xd = GR: SW Notify Error

Oct 25 14:16:47 osxfr33ks-MacBook-Pro kernel[0]: 0000006e

Oct 25 14:16:47 osxfr33ks-MacBook-Pro kernel[0]: 00200000 00009097 00000000 00000000

Oct 25 14:16:47 osxfr33ks-MacBook-Pro kernel[0]: 00000000 0000130c 00000201 0000000b

Oct 25 14:16:47 osxfr33ks-MacBook-Pro kernel[0]: 00000000 00000000 00000000

Oct 25 14:16:47 osxfr33ks-MacBook-Pro kernel[0]: 00000000 00000000

Oct 25 14:16:47 osxfr33ks-MacBook-Pro kernel[0]: NVDA(OpenGL): Channel exception! exception type = 0xd = GR: SW Notify Error

 

Anyhow, everything is working with Chimera and the patch from this thread. I just need to know if I should also apply the newly modified Netkas patch for 10.7.2 for both x86 and x86_64, in case I ever boot in 32-bit mode, which I sometimes do to run my Sprint Satellite card and its software, which only runs in 32-bit.

 

Which one of these 3 patches do I use for the GTX 560M?

 

sm 1.3:

31 C0 FF C0 89 06 FF C0 FF C0 89 02 90 90 90 90

 

sm 1.2:

31 C0 FF C0 89 06 FF C0 89 02 90 90 90 90 90 90

 

sm 1.1:

31 C0 FF C0 89 06 89 02 90 90 90 90 90 90 90 90

 

Thanks

 

 

EDITED A COUPLE OF MINUTES LATER:

 

I am not getting the water wave effect when adding widgets in Dashboard anymore, even with "Show Dashboard as a space" unchecked. Anyone else?


I have a 560M and I get the waves (I'm on 10.7.2 with the latest Chimera). I do get the channel exception error and a lockup, though, if I browse around in the App Store. If I delete AppleGraphicsPowerManagement.kext, it doesn't hard lock up, but it seems to restart the desktop and go to the login screen. It happens far less often, though.

 

I don't think I have "libclh.dylib" patched, but I'll look.


OK, I see the wave now. I forgot to close the "more widgets" window, which stays open; once I close that window and add more widgets, I see the effect.

 

I hex edited both files, not just the GLDriver but libclh.dylib as well. My question still remains: which one of the sm strings do I use, sm 1.1, sm 1.2 or sm 1.3?

 

Also, for the 10.7.2 GLDriver, should I hex edit x86_64 alone, or both x86 and x86_64 in case I ever boot in 32-bit mode? Would it create a problem if I do both hex edits in the GLDriver binary?
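One way to see which architecture slices a binary actually contains (and where each one starts, so you know which slice a hex match falls into) is to read the universal-binary header. Below is a minimal Python sketch, assuming GeForceGLDriver is a standard 32-bit "fat" Mach-O file; the function name `fat_slices` is made up, and this only lists slices, it does not patch anything.

```python
import struct

FAT_MAGIC = 0xCAFEBABE                          # universal binary header magic (big-endian)
CPU_NAMES = {7: "i386", 0x01000007: "x86_64"}   # Mach-O CPU type constants


def fat_slices(data: bytes):
    """Return (arch name, file offset, size) for each slice in a fat Mach-O header."""
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a universal (fat) Mach-O file")
    slices = []
    for index in range(count):
        # Each fat_arch record: cputype, cpusubtype, offset, size, align (5 x uint32)
        cputype, _subtype, offset, size, _align = struct.unpack_from(">5I", data, 8 + 20 * index)
        slices.append((CPU_NAMES.get(cputype, hex(cputype)), offset, size))
    return slices
```

If a found pattern's offset lies between a slice's offset and offset + size, the match belongs to that architecture.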


I hex edited both files, not just the GLDriver but libclh.dylib as well. My question still remains: which one of the sm strings do I use, sm 1.1, sm 1.2 or sm 1.3?

neither. use the one from the first post.

 

and yes, the netkas opencl fix is always needed to make it work in the first place.


@cmf,

 

Thanks for getting back with me. I just noticed that the one from the first post is different from the three you posted later. What are the others for?

 

Thanks

the one from the first post sets it to ptx 2.0, the others set it to ptx 1.x. the latter is only required if you have another non-fermi card installed.
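The difference between the four replacement strings can be checked by decoding them. Here is a tiny Python sketch that "runs" only the handful of x86 opcodes used in these patches (`xor eax, eax`, `inc eax`, two stores of eax, `nop`). That the first store receives the major version and the second the minor version is an assumption inferred from the explanation above, not something read from the driver source.

```python
def decode_patch(patch_hex: str):
    """Emulate the patch bytes and return the (major, minor) values they store."""
    code = bytes.fromhex(patch_hex)
    eax = 0
    stored = {}
    i = 0
    while i < len(code):
        op = code[i:i + 2]
        if op == b"\x31\xc0":      # xor eax, eax  -> eax = 0
            eax = 0
            i += 2
        elif op == b"\xff\xc0":    # inc eax
            eax += 1
            i += 2
        elif op == b"\x89\x06":    # mov [rsi], eax (assumed: major version)
            stored["major"] = eax
            i += 2
        elif op == b"\x89\x02":    # mov [rdx], eax (assumed: minor version)
            stored["minor"] = eax
            i += 2
        elif code[i] == 0x90:      # nop (padding)
            i += 1
        else:
            raise ValueError("unexpected byte at offset %d" % i)
    return stored["major"], stored["minor"]
```

Decoded this way, the string from the first post yields (2, 0), and the three later strings yield (1, 3), (1, 2) and (1, 1), which matches the 2.0 versus 1.x split described above.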


Hi guys, with Ocinfo I got the following info:

What can I do to make the LuxMark benchmark work? The error in the benchmark is: OpenCL ERROR: clBuildProgram(-11)

 

[Device 1]

Name: GeForce GTX 550 Ti

Vendor: NVIDIA

Type: GPU

Device Version: OpenCL 1.1

Driver Version: CLH 1.0

Compute Units: 4

Work Group Size: 1024

Clock: 0 MHz

Global Memory: 1024 MB

Local Memory: 48 KB

Cache Size: 0 KB

Cache Line Size: 0 Bytes

Available: Yes

Double-Precision: No

Extensions:

cl_APPLE_SetMemObjectDestructor

cl_APPLE_ContextLoggingFunctions

cl_APPLE_clut

cl_APPLE_query_kernel_names

cl_APPLE_gl_sharing

cl_khr_gl_event

cl_khr_byte_addressable_store

cl_khr_global_int32_base_atomics

cl_khr_global_int32_extended_atomics

cl_khr_local_int32_base_atomics

cl_khr_local_int32_extended_atomics

cl_APPLE_fp64_basic_ops


  • 2 months later...

EDIT: It's working now, think i had some permission problems. Thanks for the help once again!

 

Tried to run LuxMark and I get: RUNTIME ERROR: Unable to find any appropiate IntersectionDevice

Don't know if I missed editing something? Sorry for being a noob btw ;)

 

I made the changes below (posted before, but here they are again):

 

GeForceGLDriver.bundle/Contents/MacOS/GeForceGLDriver:

78e883f8 02 7c11

replaced by:

78e883f8 03 7c11

 

GeForceGLDriver.bundle/Contents/MacOS/libclh.dylib:

8B 87 1C 0C 00 00 89 06 8B 87 20 0C 00 00 89 02

replaced by:

31 C0 FF C0 FF C0 89 06 31 C0 89 02 90 90 90 90

 

Changed the device ID in "NVDAGF100Hal.kext" Info.plist

 

<key>IOPCIPrimaryMatch</key>

<string>

0x06c010de&0xffe0ffff

0x0dc010de&0xffc0ffff

0x0e2010de&0xffe0ffff

0x0ee010de&0xffe0ffff

0x0f0010de&0xffc0ffff

0x104010de&0xffc0ffff

0x124010de&0xffc0ffff

0x120010de&0xffc0ffff

</string>

 

 

It's easy to mod the dylib, but the binary? The hex string is not visible anywhere!

I have the GTX 460M.

I tried Chimera and the newer Chameleon 1784.

Edited NVDAGF100Hal: OK

Edited the dylib: OK

 

and OpenCL never runs.

With NVDAGF100Hal injected but no mods in the bundle:

- with Chimera it's no good: the laptop boots without part of ACPI and fails to recognize a lot of hardware (including Wi-Fi!)

- with Chameleon it starts fine with QE/CI, but benchmarks are lower and there are the usual random freezes

 

In the past I tried overwriting and mixing and matching older revisions of the Nvidia drivers. No luck.

2 weeks ago I got some instructions in this thread, but I suppose the road is longer than I expected.


  • 2 weeks later...

Always an error in LuxMark, even though other OpenCL benchmarks run!

Why?

 

2012-02-01 22:19:40 - [RenderConfig] scene.file = scenes/sala/sala.scn
2012-02-01 22:19:42 - [LuxRays] OpenCL Platform 0: Apple
2012-02-01 22:19:42 - [LuxRays] Device 0 NativeThread name: NativeThread-000
2012-02-01 22:19:42 - [LuxRays] Device 1 NativeThread name: NativeThread-001
2012-02-01 22:19:42 - [LuxRays] Device 2 NativeThread name: NativeThread-002
2012-02-01 22:19:42 - [LuxRays] Device 3 NativeThread name: NativeThread-003
2012-02-01 22:19:42 - [LuxRays] Device 4 NativeThread name: NativeThread-004
2012-02-01 22:19:42 - [LuxRays] Device 5 NativeThread name: NativeThread-005
2012-02-01 22:19:42 - [LuxRays] Device 6 NativeThread name: NativeThread-006
2012-02-01 22:19:42 - [LuxRays] Device 7 NativeThread name: NativeThread-007
2012-02-01 22:19:42 - [LuxRays] Device 8 OpenCL name: Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz
2012-02-01 22:19:42 - [LuxRays] Device 8 OpenCL type: CPU
2012-02-01 22:19:42 - [LuxRays] Device 8 OpenCL units: 8
2012-02-01 22:19:42 - [LuxRays] Device 8 OpenCL max allocable memory: 6144MBytes
2012-02-01 22:19:42 - RUNTIME ERROR: No OpenCL device selected or available


I'm now getting these same errors after updating to 10.7.3, whereas with 10.7.2 there were no errors.

 

Are the hex edits different after updating to 10.7.3, or are they at the same location with the same string within the bundled files?

 

I'm thinking something has changed in 10.7.3. My Cinebench score dropped from 45.22 to 34.96. I had used a script to perform this update, which worked on 10.7.2, but does not on 10.7.3. Many other users are reporting a 25% drop in their benchmark scores. Could OpenCL 2.0 support be the culprit?


echo "export CL_ENABLE_SM2_DEVICE=1" >> ~/.profile

seems to be absolutely necessary now. it won't get recognized otherwise, even with binary patches.

 

Are the hex edits different after updating to 10.7.3, or are they at the same location with the same string within the bundled files?

not sure about the locations, but the hex strings are still the same as in 10.7.2.

 

 

looking into luxmark right now, will report back when i figured out whose fault it is ;)


hello, GTS450 working

 

GeForceGLDriver.bundle/Contents/MacOS/GeForceGLDriver:

78e883f8 02 7c11

replaced by:

78e883f8 03 7c11

 

GeForceGLDriver.bundle/Contents/MacOS/libclh.dylib:

8B 87 1C 0C 00 00 89 06 8B 87 20 0C 00 00 89 02

replaced by:

31 C0 FF C0 FF C0 89 06 31 C0 89 02 90 90 90 90

 

Insert the device ID in "NVDAGF100Hal.kext" Info.plist

 

<key>IOPCIPrimaryMatch</key>

<string>

0x06c010de&0xffe0ffff

0x0dc010de&0xffc0ffff

0x0e2010de&0xffe0ffff

0x0ee010de&0xffe0ffff

0x0f0010de&0xffc0ffff

0x104010de&0xffc0ffff

0x124010de&0xffc0ffff

0x0dc410de&0xffc0ffff

</string>

 

but BOINC (http://boinc.berkeley.edu/) won't recognize OpenCL like it did before in 10.7.2


I'm not sure if this has anything to do with the OpenCL 2.0 fix or not, but I just verified that I am not getting QE/CI with 10.7.3. I tried adding a widget to the Dashboard, and I no longer get the ripple effect I previously did.


@cmf,

 

Same hex code in 10.7.2 as in 10.7.3 for libclh.dylib, but there is a change in the GeForceGLDriver, see below.

From the Netkas thread:

 

 

Find

EB A8 83 F8 02 7C 15

replace 02 with 03 to get

EB A8 83 F8 03 7C 15

Find

78 E8 83 F8 02 7C 11

replace 02 with 03 to get

78 E8 83 F8 03 7C 11
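If a hex editor claims a pattern is not there, a quick independent check is to search the raw bytes yourself. A minimal Python sketch (the helper name `find_all` is made up) that lists every file offset where a hex pattern occurs:

```python
# The two "find" patterns quoted above for the 10.7.3 GeForceGLDriver.
PATTERNS = ["EB A8 83 F8 02 7C 15", "78 E8 83 F8 02 7C 11"]


def find_all(data: bytes, pattern_hex: str) -> list:
    """Return every offset at which the hex pattern occurs in data."""
    pattern = bytes.fromhex(pattern_hex)
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        start = data.find(pattern, start + 1)
    return offsets


def report(path: str) -> None:
    """Print the offsets of both patterns in the given file."""
    with open(path, "rb") as handle:
        data = handle.read()
    for pattern in PATTERNS:
        print(pattern, "->", [hex(offset) for offset in find_all(data, pattern)])
```

An empty list for a pattern means it really is absent from that file (or the file is already patched); a non-empty list gives the offsets to jump to in the hex editor.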

 

I cannot find the second set "78 E8 83 F8 02 7C 11"

 

@cmf

 

What exactly is this command for or what will it do?

 

 

echo "export CL_ENABLE_SM2_DEVICE=1" >> ~/.profile

 

 

 

Thanks

 

 

EDITED A FEW MINUTES LATER:

 

My fault: I was using CrossOver with the Windows hex editor HxD, and for some reason it did not find the second hex string. I ran a Mac hex editor and it found it.

 

Sorry!!

 

I still have the question about the export CL_ENABLE_SM2_DEVICE=1


What exactly is this command for or what will it do?

 

echo "export CL_ENABLE_SM2_DEVICE=1" >> ~/.profile

it's basically the same as the GeForceGLDriver binary patch, but at the user level (and w/o modifying the binary, obviously).

i.e. if the binary patch isn't applied, opencl will only function if "CL_ENABLE_SM2_DEVICE" is defined/set in the users environment vars (default set can be changed through the .profile file).

so, if apple ever decides to add an additional "is device opencl capable?" check or shuffle the code around, so that the binary patch doesn't work any more (which i'm surprised it did in 10.7.3), opencl should still work (to some extent*) using the "CL_ENABLE_SM2_DEVICE" define.

 

downside and (*): this really only works if a program is started by the user and the program doesn't ignore/overwrite the cl define or does some other weird stuff (like luxmark does ...). some system services/programs that use opencl and are started by another user also won't be able to use the opencl device (just open activity monitor and look which processes aren't owned by your user).
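The inheritance limitation can be illustrated with a short Python sketch: a child process only sees the variable if the environment it was started with contains it. Nothing here is driver-specific; `child_sees` is a made-up helper that launches a second Python interpreter and reports what that child finds.

```python
import os
import subprocess
import sys

# One-liner executed by the child process: print the variable or "unset".
CHECK = "import os; print(os.environ.get('CL_ENABLE_SM2_DEVICE', 'unset'))"


def child_sees(env: dict) -> str:
    """Start a child process with the given environment and return what it sees."""
    result = subprocess.run(
        [sys.executable, "-c", CHECK],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    base = {k: v for k, v in os.environ.items() if k != "CL_ENABLE_SM2_DEVICE"}
    print(child_sees(base))                                      # started without the variable
    print(child_sees(dict(base, CL_ENABLE_SM2_DEVICE="1")))      # started from a shell that exported it
```

A service launched by another user never passes through your `.profile`, so it behaves like the first call.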

 

 

and ftr and the people that are interested:

the GeForceGLDriver binary patch changes the "cmp eax, 2" to a "cmp eax, 3" (note: eax at this point contains the major version of the nvidia compute capability (cc/sm), which is 2 for fermi gpus), so the subsequent jump condition will evaluate to true (instead of false, b/c 2 is not less than 2, but 2 < 3!) and continue @loc_8F014446. if the binary patch has not been applied (it still says "cmp eax, 2"), it will continue and check if "CL_ENABLE_SM2_DEVICE" is defined in the users env vars. if so, it will also continue @loc_8F014446. if not, the device will be "destroyed" and be "declared" not opencl capable.
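As a rough model of the check described above (a sketch of the described control flow, not decompiled driver code), the logic can be written in a few lines of Python. `cmp_immediate` stands for the 02 or 03 byte in the patched instruction, and the function name is made up.

```python
def device_is_opencl_capable(cc_major: int, cmp_immediate: int, env: dict) -> bool:
    """Model of: cmp eax, imm / jl accept / else look for CL_ENABLE_SM2_DEVICE."""
    if cc_major < cmp_immediate:           # "cmp eax, imm" followed by "jl": jump taken
        return True                        # continue with normal device setup
    # Jump not taken: the device survives only if the variable is defined.
    return "CL_ENABLE_SM2_DEVICE" in env


if __name__ == "__main__":
    fermi = 2  # major compute capability of Fermi GPUs, per the post above
    print(device_is_opencl_capable(fermi, 2, {}))                              # unpatched, no variable
    print(device_is_opencl_capable(fermi, 3, {}))                              # binary patch applied
    print(device_is_opencl_capable(fermi, 2, {"CL_ENABLE_SM2_DEVICE": "1"}))   # variable only
```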
