
OpenCL fix for non-GF100/GF110 cards (aka CC/SM 2.1+)


cmf

138 posts in this topic


Hi, I have a Lenovo G460 running 10.8 with the following configuration:

Intel Core i5-520M

4 GB RAM

500 GB HDD

NVIDIA GeForce 310M

I have successfully installed 10.8 and it boots without any problems, but the system hangs a couple of minutes after boot. I can move the mouse, but everything else is frozen. In the system log I get this:

NVDA(OpenGL): Channel exception! exception type = 0xd = GR: SW Notify Error

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: IOVendorGLContext::ReportGPURestart

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: 0000006e

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: 00200000 00008597 00000474 00000040

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: 0000047e 000017b4 00000001 0000000a

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: 00000000 00000000 00000002

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: 00000040 00000000

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: NVDA(OpenGL): Channel exception! exception type = 0xd = GR: SW Notify Error

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: 0000006e

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: 00200000 00008597 00000474 00000010

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: 0000047e 00000dfc 0000002b 0000000a

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: 00000000 00000000 00000403

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: 00000010 00000000

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: NVDA(OpenGL): Channel exception! exception type = 0xd = GR: SW Notify Error

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: 0000006e

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: 00200000 00008597 00000474 00000010

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: 0000047e 00000dfc 0000006e 0000000a

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: 00000000 00000000 00000403

Jul 18 20:20:49 Sankets-MacBook-Pro kernel[0]: 00000010 00000000

How do I solve this? In 10.7.4 the GPU worked just fine. Thanks in advance.

Link to comment
Share on other sites

  • 3 weeks later...

Hello all, I have a request for anyone who has Final Cut Pro X installed and uses this kind of video card: can you check whether FCPX actually uses the OpenCL enabled by this method when rendering video? Some people have reported that, even after following this procedure, FCP renders with the CPU only and the GPU is not used.

Can someone please test and post here?

I know some people on the tonymac forum have posted that they see the same thing with a system temperature monitor: the GPU temperature stays the same during a long render and only the CPU temperature rises, which led them to conclude that the enabled OpenCL does not work for Final Cut.

Thanks in advance.

Link to comment
Share on other sites

  • 3 weeks later...

from oclinfo:

1 OpenCL platform found!
[Platform 0]
Name:   Apple
Vendor:  Apple
Version:  OpenCL 1.2 (Jun 20 2012 14:18:19)
Profile:  FULL_PROFILE
Extensions:	cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event

[OpenCL-only Context]
2 OpenCL devices found!
[Device 0]
Name:	Intel® Core™2 Duo CPU	 E6550  @ 2.33GHz
Vendor:   Intel
Type:	CPU
Device Version:  OpenCL 1.2
Driver Version:  1.1
Compute Units:   2
Work Group Size:  1024
Clock:	2327 MHz
Global Memory (Total):  8192 MB
Global Memory (Host):  8192 MB
Global Memory (PCIe):  0 MB
Local Memory:   32 KB
Cache Size:   0.0625 KB
Cache Line Size:  4194304 Bytes
Available:   Yes
Double-Precision:  Yes
Extensions:
cl_APPLE_SetMemObjectDestructor
cl_APPLE_ContextLoggingFunctions
cl_APPLE_clut
cl_APPLE_query_kernel_names
cl_APPLE_gl_sharing
cl_khr_gl_event
cl_khr_fp64
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_khr_byte_addressable_store
cl_khr_int64_base_atomics
cl_khr_int64_extended_atomics
cl_khr_3d_image_writes
cl_APPLE_fp64_basic_ops
cl_APPLE_fixed_alpha_channel_orders
cl_APPLE_biased_fixed_point_image_formats
[Device 1]
Name:	GeForce GT 520
Vendor:   NVIDIA
Type:	GPU
Device Version:  OpenCL 1.1
Driver Version:  CLH 1.0
Compute Units:   1
Work Group Size:  1024
Clock:	1620 MHz
Global Memory:   1024 MB
Local Memory:   48 KB
Cache Size:   0 KB
Cache Line Size:  0 Bytes
Available:   Yes
Double-Precision:  No
Extensions:
cl_APPLE_SetMemObjectDestructor
cl_APPLE_ContextLoggingFunctions
cl_APPLE_clut
cl_APPLE_query_kernel_names
cl_APPLE_gl_sharing
cl_khr_gl_event
cl_khr_byte_addressable_store
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_APPLE_fp64_basic_ops
[shared OpenCL+OpenGL Context]
2 OpenCL devices found!
[Device 0]
Name:	GeForce GT 520
Vendor:   NVIDIA
Type:	GPU
Device Version:  OpenCL 1.1
Driver Version:  CLH 1.0
Compute Units:   1
Work Group Size:  1024
Clock:	1620 MHz
Global Memory:   1024 MB
Local Memory:   48 KB
Cache Size:   0 KB
Cache Line Size:  0 Bytes
Available:   Yes
Double-Precision:  No
Extensions:
cl_APPLE_SetMemObjectDestructor
cl_APPLE_ContextLoggingFunctions
cl_APPLE_clut
cl_APPLE_query_kernel_names
cl_APPLE_gl_sharing
cl_khr_gl_event
cl_khr_byte_addressable_store
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_APPLE_fp64_basic_ops
[Device 1]
Name:	Intel® Core™2 Duo CPU	 E6550  @ 2.33GHz
Vendor:   Intel
Type:	CPU
Device Version:  OpenCL 1.2
Driver Version:  1.1
Compute Units:   2
Work Group Size:  1024
Clock:	2327 MHz
Global Memory (Total):  8192 MB
Global Memory (Host):  8192 MB
Global Memory (PCIe):  0 MB
Local Memory:   32 KB
Cache Size:   0.0625 KB
Cache Line Size:  4194304 Bytes
Available:   Yes
Double-Precision:  Yes
Extensions:
cl_APPLE_SetMemObjectDestructor
cl_APPLE_ContextLoggingFunctions
cl_APPLE_clut
cl_APPLE_query_kernel_names
cl_APPLE_gl_sharing
cl_khr_gl_event
cl_khr_fp64
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_khr_byte_addressable_store
cl_khr_int64_base_atomics
cl_khr_int64_extended_atomics
cl_khr_3d_image_writes
cl_APPLE_fp64_basic_ops
cl_APPLE_fixed_alpha_channel_orders
cl_APPLE_biased_fixed_point_image_formats
logout
[Process completed]

...so i have something... :whistle: ...just not sure what

 

and from displacement, run from terminal

Downloads/Release/displacement
----------------------------------------------------------------------
Setting up Graphics...
----------------------------------------------------------------------
Creating Shadow FrameBuffer...
Creating Jitter Texture...
Loading Light Probe "stpeters_probe.pfm"
Creating Light Probe Texture (1500 x 1500)....
----------------------------------------------------------------------
Filling Sphere 1040384 bytes 65024 elements (127 x 512) => (127 x 512)
Loading Shader Program "fresnel.vert"...
Loading Shader Program "fresnel.frag"...
Loading Shader Program "phong.vert"...
Loading Shader Program "phong.frag"...
Loading Shader Program "skybox.vert"...
Loading Shader Program "skybox.frag"...
----------------------------------------------------------------------
Setting up Compute...
----------------------------------------------------------------------
Using active OpenGL context...
----------------------------------------------------------------------
Connecting to NVIDIA GeForce GT 520...
----------------------------------------------------------------------
Allocating buffers on compute device...
----------------------------------------------------------------------
Loading kernel source from file 'displacement_kernel.cl'...
----------------------------------------------------------------------
Building compute program...
OpenCL Build Warning : Compiler build log:
<program source>:107:5: warning: no previous prototype for function 'mod'
int mod(int x, int a)
^
<program source>:116:7: warning: no previous prototype for function 'mix1d'
float mix1d(float a, float b, float t)
  ^
<program source>:124:8: warning: no previous prototype for function 'mix2d'
float2 mix2d(float2 a, float2 b, float t)
   ^
<program source>:132:8: warning: no previous prototype for function 'mix3d'
float4 mix3d(float4 a, float4 b, float t)
   ^
<program source>:140:7: warning: no previous prototype for function 'smooth'
float smooth(float t)
  ^
<program source>:145:5: warning: no previous prototype for function 'lattice3d'
int lattice3d(int4 i)
^
<program source>:150:7: warning: no previous prototype for function 'gradient3d'
float gradient3d(int4 i, float4 v)
  ^
<program source>:157:8: warning: no previous prototype for function 'normalized'
float4 normalized(float4 v)
   ^
<program source>:166:7: warning: no previous prototype for function 'gradient_noise3d'
float gradient_noise3d(float4 position)
  ^
<program source>:214:7: warning: no previous prototype for function 'ridgedmultifractal3d'
float ridgedmultifractal3d(
  ^
<program source>:223:8: warning: unused variable 'remainder'
	float remainder = 0.0f;
		  ^
<program source>:224:8: warning: unused variable 'sample'
	float sample = 0.0f;  
		  ^
<program source>:252:8: warning: no previous prototype for function 'cross3'
float4 cross3(float4 va, float4 vb)
   ^
<program source>:280:14: warning: comparison of integers of different signs: 'int' and 'uint' (aka 'unsigned int')
if(index >= count)
   ~~~~~ ^  ~~~~~
<program source>:283:10: warning: unused variable 'di'
int2 di = (int2)(tx, ty);
	 ^
<program source>:275:9: warning: unused variable 'ix'
int ix = (int) dimx;
	^

Break on OpenCLWarningBreak to debug.
Creating kernel 'displace'...
Maximum Workgroup Size '768'
----------------------------------------------------------------------
Starting event loop...
----------------------------------------------------------------------
Leslies-Mac-Pro:Release leslie$

 

 

a screen from displacement...

 

post-11772-0-43014200-1347491059_thumb.png

 

here's a screenshot of an OpenCL based screensaver...

 

post-11772-0-80737300-1347488254_thumb.png

Link to comment
Share on other sites

...think i've got the fix now...from "the other place"

 

"Anyway I just wanted to share this patch for libclh.dylib since the old one doesn't work anymore and it's quite frustrating that OpenCL isn't working out of the box so to speak. And it took some time to work it out since Nvidia anonymized the function names in the library.

 

Just copy and paste this into a terminal window (don't forget to back up libclh.dylib beforehand).

 

Code:

sudo perl -pi -e '$c+=s/\x8b\x81\x1c\x0c\x00\x00\xeb\x06\x8b\x81\x20\x0c\x00\x00/\xb8\x02\x00\x00\x00\x90\xeb\x06\xb8\x00\x00\x00\x00\x90/; END { printf "%s: %d substitution%s made.\n",($c==1 ? "Success" : "Error"),$c,(!$c || $c>1 ? "s" : ""); $?=($c!=1); }' /System/Library/Extensions/GeForceGLDriver.bundle/Contents/MacOS/libclh.dylib

Alternatively you can use a hex editor to search for:

8b 81 1c 0c 00 00 eb 06 8b 81 20 0c 00 00

 

and replace it by:

b8 02 00 00 00 90 eb 06 b8 00 00 00 00 90

 

There should be only one occurrence of this in the whole file. The perl script will tell you the number of substitutions it made.

 

Normally it should take effect immediately, so there's no need to reboot; it should also trigger a rebuild of the kernel cache.

If, like me, you're a cautious person, feel free to do those two things manually." (all credit to this code digger) credit removed
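For anyone who would rather not paste a one-liner as root, here is a minimal Python sketch of the same search-and-replace, using the byte patterns quoted above. It only writes when the pattern occurs exactly once, and it saves a copy of the file first. The function name `patch_file` and the `.bak` suffix are my own choices for illustration, not part of the original patch, and on the real driver file it would need to be run with sudo.

```python
import shutil
from pathlib import Path

# Byte patterns copied from the post above.
FIND = bytes.fromhex("8b 81 1c 0c 00 00 eb 06 8b 81 20 0c 00 00")
REPLACE = bytes.fromhex("b8 02 00 00 00 90 eb 06 b8 00 00 00 00 90")


def patch_file(path, find=FIND, replace=REPLACE):
    """Patch `path` in place if `find` occurs exactly once.

    Returns the number of occurrences found. The file is only
    modified (and a backup written) when that number is 1.
    """
    if len(find) != len(replace):
        raise ValueError("patterns must have the same length")
    target = Path(path)
    data = target.read_bytes()
    count = data.count(find)
    if count == 1:
        # Hypothetical backup name: the original file name plus ".bak".
        shutil.copy2(target, target.with_name(target.name + ".bak"))
        target.write_bytes(data.replace(find, replace))
    return count
```

A return value of 0 suggests the file is already patched or belongs to a different driver version; anything above 1 means the pattern is ambiguous and the file is left untouched.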

 

...using the new drivers from nvidia for 10.8.1 i can now run the LuxMark v2.0 benchmark ... :thumbsup_anim:

 

post-11772-0-87276400-1347596704_thumb.png post-11772-0-83474800-1347597133_thumb.png

Edited by robertx
Link to tonymac removed
Link to comment
Share on other sites

for the stock 10.8.2 drivers (8.0.61 295.30.20f02 version) this fix still works

 

"So, open up a hex editor of your liking and do this:

open /System/Library/Extensions/GeForceGLDriver.bundle/Contents/MacOS/libclh.dylib (as root or with sudo)

find: 8B 87 1C 0C 00 00 89 06 8B 87 20 0C 00 00 89 02

replace by: 31 C0 FF C0 FF C0 89 06 31 C0 89 02 90 90 90 90

save

reboot is not required, but recommended" from the first post... :thumbsup_anim:
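Since two different byte patterns are floating around in this thread, a read-only check can tell you which one (if any) is present in your copy of libclh.dylib before you change anything. This is only a sketch: the labels and the name `find_patterns` are made up for illustration, and the patterns are simply the two "find" sequences quoted in the posts above.

```python
# The two "find" sequences quoted in this thread.
HEX_EDITOR_PATTERN = bytes.fromhex(
    "8B 87 1C 0C 00 00 89 06 8B 87 20 0C 00 00 89 02"
)
PERL_PATTERN = bytes.fromhex("8b 81 1c 0c 00 00 eb 06 8b 81 20 0c 00 00")

# Labels are illustrative only.
KNOWN_PATTERNS = {
    "hex-editor pattern": HEX_EDITOR_PATTERN,
    "perl one-liner pattern": PERL_PATTERN,
}


def find_patterns(data):
    """Return {label: [offsets]} for each known pattern found in `data`."""
    hits = {}
    for label, pattern in KNOWN_PATTERNS.items():
        offsets = []
        start = data.find(pattern)
        while start != -1:
            offsets.append(start)
            start = data.find(pattern, start + 1)
        if offsets:
            hits[label] = offsets
    return hits
```

To use it, read the library with `open(path, "rb").read()` and pass the bytes in. An empty result means neither pattern matched, in which case neither patch from this thread will apply to that file as-is.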

Link to comment
Share on other sites

  • 3 weeks later...

After doing this, OpenCL doesn't work for me.

I'm on OS X 10.8.2 (12C60) and my graphics card is a Gigabyte GeForce GTX 560 Ti 1024 MB.

What am I doing wrong?

Greetings and thanks.

Link to comment
Share on other sites

Well, this script with the latest NVIDIA drivers got OpenCL back for me:

 

sudo perl -pi -e '$c+=s/\x8b\x81\x1c\x0c\x00\x00\xeb\x06\x8b\x81\x20\x0c\x00\x00/\xb8\x02\x00\x00\x00\x90\xeb\x06\xb8\x00\x00\x00\x00\x90/; END { printf "%s: %d substitution%s made.\n",($c==1 ? "Success" : "Error"),$c,(!$c || $c>1 ? "s" : ""); $?=($c!=1); }' /System/Library/Extensions/GeForceGLDriver.bundle/Contents/MacOS/libclh.dylib

Link to comment
Share on other sites

  • 5 months later...

Hi, this guide successfully enabled OpenCL for me :thumbsup_anim:

but when I click on the OceanWave test, the image has some glitches

(see the bottom of the sun)

Is this normal?

 

Screen Shot 2013-04-03 at 8.59.09 AM.png

 

No, the rendering is not correct. The waves are missing.

You can try a much newer OceanWave version (newer than 1.2), but I think the missing waves will stay, because the OpenCL code wasn't changed.

Such things may happen with some GPU sub-types that are not yet fully supported by the OpenCL drivers.

To be more exact: by the on-the-fly OpenCL compiler, which compiles the source for the GPU type found on the host system. Unlike CUDA code, which the developer has already compiled (for several NVIDIA GPU types), OpenCL code ships as source and is compiled later, at run time (on first run).

It doesn't matter much, because the complex OpenCL functions used in OceanWave, which fail on your GPU, may not be used in other OpenCL-accelerated apps.

Link to comment
Share on other sites

Thanks for the reply,

anyway, I redid the whole patch from the beginning

now the rendering flickers between screenshot #1 and #2

I think it's getting better?

 

post-1128849-0-03520000-1365038495_thumb.png

post-1128849-0-16576700-1365038510_thumb.png

Link to comment
Share on other sites

...think i've got the fix now...from "the other place"

 

"Anyway I just wanted to share this patch for libclh.dylib since the old one doesn't work anymore and it's quite frustrating that OpenCL isn't working out of the box so to speak. And it took some time to work it out since Nvidia anonymized the function names in the library.

 

Just copy and paste this into a terminal window (don't forget to back up libclh.dylib beforehand).

 

Code:

sudo perl -pi -e '$c+=s/\x8b\x81\x1c\x0c\x00\x00\xeb\x06\x8b\x81\x20\x0c\x00\x00/\xb8\x02\x00\x00\x00\x90\xeb\x06\xb8\x00\x00\x00\x00\x90/; END { printf "%s: %d substitution%s made.\n",($c==1 ? "Success" : "Error"),$c,(!$c || $c>1 ? "s" : ""); $?=($c!=1); }' /System/Library/Extensions/GeForceGLDriver.bundle/Contents/MacOS/libclh.dylib

 

 

Thank YOU man! It works fine for me. I'm using a G75VW RS 72 with 10.8.3 and a GTX 670M 3GB GDDR5

 

http://cl.ly/O6ro

Link to comment
Share on other sites

  • 2 weeks later...
  • 2 weeks later...
  • 2 weeks later...

for the new nvidia "Web" drivers this works for me

...in Terminal type:

sudo perl -pi -e '$c+=s/\x8b\x81\x1c\x0c\x00\x00\xeb\x06\x8b\x81\x20\x0c\x00\x00/\xb8\x02\x00\x00\x00\x90\xeb\x06\xb8\x00\x00\x00\x00\x90/; END { printf "%s: %d substitution%s made.\n",($c==1 ? "Success" : "Error"),$c,(!$c || $c>1 ? "s" : ""); $?=($c!=1); }' /System/Library/Extensions/GeForceGLDriverWeb.bundle/Contents/MacOS/libclh.dylib

:smoke:

 

edit: although the fix works, i now get this error in safari while attempting to view the iphone ad...

5/16/13 4:50:36.000 PM kernel[0]: NVDA(Video): Channel exception! exception type = 0x1f = Fifo: MMU Error

5/16/13 4:50:36.000 PM kernel[0]: NVDA(Video): Channel exception! exception type = 0x1f = Fifo: MMU Error

5/16/13 4:50:44.365 PM WebProcess[405]: VADriver: Channel timeout (client), ch = 2

5/16/13 4:50:56.000 PM kernel[0]: NVDA(Video): Channel timeout!

5/16/13 4:51:16.000 PM kernel[0]: NVDA(Video): Channel timeout!

5/16/13 4:51:24.809 PM WebProcess[405]: VADriver: Channel timeout (client), ch = 3

5/16/13 4:51:36.000 PM kernel[0]: NVDA(OpenGL): Channel timeout!

5/16/13 4:51:57.000 PM kernel[0]: NVDA(Video): Channel timeout!

5/16/13 4:52:17.000 PM kernel[0]: NVDA(OpenGL): Channel timeout!

...totally locks up my system...i'll keep hunting

 

...changed my smbios.plist to MacPro3,1 and no longer have the channel timeout freeze in Safari

Edited by robertx
Link to comment
Share on other sites

Safari is buggy for me too... I'm using Chrome as my browser now. No freezes, although I'm using the DP3 kexts.

Link to comment
Share on other sites

better late than never: *updated first post for 10.8.3+ with the info posted by robertx.


for the people who are interested, the replacement code does this:


movl $2, %eax
nop
jmp 6
movl $0, %eax
nop

so you should be able to set this to a different CC by replacing $MAJOR and $MINOR in this sequence:
B8 $MAJOR 00 00 00 90 EB 06 B8 $MINOR 00 00 00 90


if any of the GTX Titan / GK110 folks read this, please test whether this makes OpenCL work on those devices (it sets the CC to 3.0, which is used for GK104 devices):
B8 03 00 00 00 90 EB 06 B8 00 00 00 00 90

this does not work!
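To make the $MAJOR / $MINOR template above concrete, here is a small Python sketch that fills it in for a given compute capability. The helper name `cc_patch_bytes` is hypothetical, and it assumes each value fits in a single byte, as the template implies. As noted above, the 3.0 variant was reported not to work on GK110, so treat other values as experiments rather than known-good patches.

```python
def cc_patch_bytes(major, minor):
    """Fill in the template: B8 $MAJOR 00 00 00 90 EB 06 B8 $MINOR 00 00 00 90."""
    for name, value in (("major", major), ("minor", minor)):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{name} must fit in one byte, got {value}")
    return bytes([
        0xB8, major, 0x00, 0x00, 0x00,  # movl $MAJOR, %eax
        0x90,                           # nop
        0xEB, 0x06,                     # jmp 6
        0xB8, minor, 0x00, 0x00, 0x00,  # movl $MINOR, %eax
        0x90,                           # nop
    ])
```

For example, `cc_patch_bytes(2, 0)` gives the sequence used by the perl patch, and `cc_patch_bytes(3, 0)` gives the GK104-style sequence quoted above.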

Link to comment
Share on other sites
