
[Solved] Debugging intermittent sleep problems when using WEG boot-arg igfxfw=2 with HP EliteDesk 800 G4/G5 Mini


deeveedee

37 posts in this topic


This issue is solved with the solution posted here.

 

===========================================

 

This thread is intended to debug sleep problems when OC boot-arg igfxfw=2 is used on the HP EliteDesk 800 G4/G5 Mini as noted here.  I didn't want to clutter the original thread, since I suspect that this thread will be both a diagnostic and a learning thread for participants.  I have attached an EFI (based on OC 0.8.2) that includes changes to enable ACPI debugging.  The attached EFI uses Rehabman's ACPI Debugging utility, which you can read about here.

 

If you want to participate in this diagnostic effort to figure out why the igfxfw=2 boot-arg causes the sleep problem and/or you wish to figure out how to solve the problem (which may be as simple as @FredWst 's suggested fix which is to add a delay in _PTS as he noted here), you will need the following:

  • Modify the attached EFI config.plist with your own PlatformInfo > Generic attributes and DeviceProperties.  If you're booting from a SATA SSD, set Kernel > Quirks > ThirdPartyDrives = True.
  • Build your own ACPIDebug.kext and replace my ACPIDebug.kext in the attached EFI (EFI > OC > Kexts).  I built my ACPIDebug.kext using Xcode 13.4.1.  I signed the kext for local execution, so I don't think it will work for others (you can try it).  Follow Rehabman's kext build instructions here.
  • Boot with the attached/modified EFI
  • View the ACPI debug output using this command in terminal: 
     log show --predicate 'processID == 0 && eventMessage CONTAINS[c] "ACPIDebug"' --start 2022-07-05 --debug

     

 

Note that the addition of ACPI debugging will affect system performance.  You don't want to run with ACPI Debugging enabled unless you're debugging.  Note also that the ACPI Debugging affects timing, so the addition of debug statements to _PTS will introduce delays that may mask the very problem we're trying to find.

 

In order to enable ACPI debugging, I made the following changes to the attached EFI:


   *** The attached EFI is not intended for regular use and has been modified specifically for ACPI Debugging ***
   EFI/OC/Kexts

  •    Added Rehabman's ACPIDebug.kext

   EFI/OC/ACPI

  • Added Rehabman's SSDT-RMDT.aml
  • Added SSDT-PTS.aml with ACPI debugging for _PTS (paired with rename of _PTS to XPTS)

   EFI/OC/config.plist

  • ACPI > Add > SSDT-PTS (original EliteDesk 800 G4/G5 _PTS with ACPI Debugging)
  • ACPI > Add > SSDT-RMDT
  • ACPI > Patch > _PTS -> XPTS (pairs with SSDT-PTS)
  • Kernel > Add > ACPIDebug.kext

 

In order to debug _PTS, I implemented an ACPI patch (in config.plist) to rename the original _PTS to XPTS and created a new _PTS in SSDT-PTS.  SSDT-PTS is a copy of the original HP EliteDesk 800 G4 / G5 _PTS with ACPI debugging statements added.  With the debugging statements that I have configured, ACPI debug output (captured with the 'log show...' command above) is as follows:

 

Filtering the log data using "processIdentifier == 0 AND composedMessage CONTAINS[c] "ACPIDebug""
Skipping info messages, pass --info to include.
Timestamp                       Thread     Type        Activity             PID    TTL  
2022-07-05 12:54:14.445715-0400 0x2ae      Default     0x0                  0      0    kernel: (ACPIDebug) ACPIDebug: Version 0.1.4 starting on OS X Darwin 21.5.
2022-07-05 13:06:12.230168-0400 0x1a4a     Default     0x0                  0      0    kernel: (ACPIDebug) ACPIDebug: { "Entering _PTS.  Arg0: ", 0x3, }
2022-07-05 13:06:12.230255-0400 0x1a4a     Default     0x0                  0      0    kernel: (ACPIDebug) ACPIDebug: "Calling HPTS (Arg0)"
2022-07-05 13:06:12.230338-0400 0x1a4a     Default     0x0                  0      0    kernel: (ACPIDebug) ACPIDebug: "We are here #3"
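The raw log lines above are wide. As a rough sketch (not part of the original post), the filter below trims each line to its time stamp and the ACPIDebug message. It assumes the default `log show` column layout shown above; the function name `acpidebug_messages` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: trim `log show` output to "<time> <ACPIDebug message>".
# Assumes the default column layout shown above, where field 2 is the time.
acpidebug_messages() {
    awk '/\(ACPIDebug\) ACPIDebug: / {
        ts = $2                                    # e.g. 13:06:12.230255-0400
        sub(/^.*\(ACPIDebug\) ACPIDebug: /, "")    # drop everything before the message
        print ts, $0
    }'
}
```

To use it, pipe the 'log show ...' command from above into `acpidebug_messages`. Header lines and the "Filtering the log data" banner are dropped because they do not contain the "(ACPIDebug) ACPIDebug: " marker.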

 

OC0.8.2-EFI-r001-DEBUG.zip

Edited by deeveedee
Hi @deeveedee I would like to contribute my 2 cents as I saw your post in your thread (and replied accordingly).

 

Regarding igfxfw=2 it was advised to stop using it, rather recently, by the WhateverGreen developer himself. I found his quote when also investigating the impact (or not) on my Intel NUCs.

 

Despite not mentioning any sleep-related issues, @vit9696 wrote this here:

 

Quote

vit9696 commented on 9-13 Aug 2020
In the modern WhateverGreen igfxfw=2 is not benefitial, as host scheduler performance is fixed (by default). Closing this [ticket].

[...]

WEG 1.4.1 got new IGPU performance control support, which you can disable igfxnorspc=1, this is documented.

[...]

I am afraid there is some level of confusion here.

1. Apple GuC, enabled exclusively by igfxfw=2 boot argument and disabled by default is bugged, and will eventually lead to IGPU locking at high frequency. This thing is not planned to be investigated or fixed.
2. CPU scheduling performance was fixed by WEG 1.4.0 allowing the IGPU frequency to reach high values without GuC. This is enabled by default and can be disabled by igfxnorspc=1. We believe it works correctly.
3. Without igfxfw=2 and with igfxnorspc=1 WEG 1.4.1 should also work correctly, but the IGPU will not be able to reach high frequencies.
If you believe that anything of the three does not hold, please provide more detail.

 

Since I had found this comment some time ago, I also dropped using this boot argument... Just FYI. But while using it on my Intel NUC8 personally I did not have problems (with Catalina).


As @vit9696 wrote in that ticket reply among other things "Apple GuC, enabled exclusively by igfxfw=2 boot argument and disabled by default is bugged, and will eventually lead to IGPU locking at high frequency." so once I read this last year, I dropped the boot-argument completely.

 

This is very useful information hidden in some bug ticket reply, so it's a shame that this is not mentioned in the main WhateverGreen page on GitHub so that people stop using it. @vit9696 perhaps you can update the README.md file there?


I have done some further research on igfxfw=2, which forces loading of the Apple GuC firmware. Though there have been some isolated reports of people observing the iGPU stuck at a high clock after processing videos, the initial feedback has been that on iGPUs after Skylake, loading the Apple firmware improved iGPU performance by up to 2x. I have also seen some reports of configurations where macOS would outright not load, or would take several minutes to load, when forcing the Apple firmware to load. The unverified suspicion is that the bugs are due to Apple's firmware being made for Apple-specific hardware from Intel, so YMMV. The very large number of successful reports of use of the Apple GuC, though, is likely why @vit9696 is not removing it altogether from WEG. I myself have observed an improvement in performance and power management by loading it on my daily driver machines, which have been on nearly 24/7 for a couple of years.

On my machine, looking at the bootlog, this is what I am seeing without the apple firmware:

[Image: boot log without the Apple firmware]

 

And with the apple firmware loaded:

[Image: boot log with the Apple firmware loaded]

The boot time is a whopping 38ms longer... due to firmware loading.

 

I have also seen people with varying versions of the Apple firmware. I would not conclude that using the host preemptive scheduler is the way to go for every configuration and that the Apple firmware is guaranteed to be buggy because:

1. That's what a real mac runs so there are likely compatibility issues in corner cases using the host scheduler by not loading the apple GuC as well.

2. Obviously not everyone has observed bugs.

3. The version of the Apple GuC also needs to be taken into consideration as compatibility may vary.

 

There is also a discussion of improving the power management using the later implemented rps-control, which has been defaulted in WEG to on and then off, and which fixed video performance issues without the Apple GuC. The two appear to be mutually exclusive, and the GuC firmware patch would supersede the RPS control patch if both are enabled simultaneously.

Edited by rafale77

@rafale77 Based on your research, what are your recommended WhateverGreen boot-args related to Apple GuC?

 

I have not yet performed any testing with ACPI debug enabled (as per Post #1), but based on visual inspection of the sleep behavior with igfxfw=2, it appears that storage (both USB and the SATA SSD) goes offline before data writes are completed (resulting in a system freeze where the power LED remains illuminated and screens are black).  This would explain why adding a delay to _PTS (as done by FredWst) "fixes" the intermittent sleep problem when boot-args include igfxfw=2.  With the _PTS delay, storage sleep is delayed enough for the data write to complete.


For the specific hardware configuration (USB/SATA SSD), I would test in the following order of preference/priority:

 

1. igfxfw=2, igfxrpsc=0 with the _PTS delay. Look for the iGPU frequency being stuck after video playback (e.g., YouTube 4K).

2. igfxfw=0, igfxrpsc=1 and run the same tests without the _PTS delay.

 

Anything set to =0 can simply be deleted from the boot-args.

Having neither will lead to the iGPU frequency being capped (at 300MHz if I remember correctly) when processing videos.
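As a small illustration of the "anything =0 can be deleted" note, the sketch below drops igfxfw=0 and igfxrpsc=0 from a boot-args string and leaves everything else alone. It relies on the statement above that omitting these arguments is equivalent to setting them to 0; the function name and the sample string are made up.

```shell
#!/bin/sh
# Hypothetical helper: remove igfx arguments that are set to 0.
strip_zero_args() {
    out=""
    # Intentionally unquoted: split the boot-args string on whitespace.
    for arg in $1; do
        case "$arg" in
            igfxfw=0|igfxrpsc=0) ;;             # treated as equivalent to omitting it
            *) out="${out:+$out }$arg" ;;       # keep everything else, space-separated
        esac
    done
    printf '%s\n' "$out"
}

strip_zero_args "-v igfxfw=2 igfxrpsc=0"
```

With the sample string, only igfxrpsc=0 is removed, so test case 1 above reduces to igfxfw=2 plus whatever other boot-args are already in use.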

 

I run on an NVMe so I am not having this issue. My only problem with sleep is an occasional failure of watch unlock when waking up from sleep... a completely different problem altogether.


@deeveedee, just a thought: correct me if I am wrong, but I don't believe any of the Macs officially supported on Monterey is expected to boot from a SATA SSD. It wouldn't be a stretch to say that such a configuration is a corner case that Apple is not designing/testing for, and that you discovered one aspect of the loss of support for booting from a slow drive, which requires additional patching to fix, analogous to running on an AMD CPU or an nVidia GPU...


@rafale77 You're probably right.  I performed a few experiments and found that adding a delay Sleep(0x50) at the end of _PTS allows my G4 Mini with SATA SSD to reliably sleep when using boot-arg igfxfw=2.  A delay of Sleep(0x30) does not work.  If I add a comfortable delay margin with Sleep(0x100), the delay is still not noticeable.  I have added the delay conditional on Arg0 == 0x03 so that it only applies when the rig is sleeping.  I'll perform a few more tests and if all is good, I'll post the patch and invite others to test.


To those impacted by the igfxfw=2 sleep issue, please test this fix and report your findings.  Thank you.

 

Details

 

After some experimentation (and the hint from @FredWst mentioned in Post #1), I have a candidate fix for EliteDesk 800 G4 / G5 Minis booting/running macOS from SATA SSDs.  Because of the nature of this fix, if it works, the resulting EFI would become my baseline for all EliteDesk 800 G4 / G5 Minis (even those booting / running macOS from NVMe SSDs).  Those who don't need the fix won't even notice that the fix has been implemented.

 

Attached are two files: OC0.8.2-EFI-R002.zip and OC0.8.2-EFI-R002-CHANGES.zip.  OC0.8.2-EFI-R002-CHANGES.zip contains just the incremental changes needed to implement this fix for OpenCore 0.8.2.  OC0.8.2-EFI-R002.zip is the complete EFI for OC 0.8.2 which includes the changes.  The changes needed to implement this fix are as follows:

 

OC 0.8.2 EFI R002

  • EFI/OC/ACPI
    • Added SSDT-PTS.aml with new _PTS which adds delay after _PTS to fix issue with boot-arg igfxfw=2 (paired with rename of _PTS to XPTS)
  • EFI/OC/config.plist
    • ACPI > Add > SSDT-PTS
    • ACPI > Patch > _PTS -> XPTS (pairs with SSDT-PTS)
    • Restored boot-arg igfxfw=2 after adding delay after _PTS

 

During my experimentation with an EliteDesk 800 G4 Mini 65W (i5-8600, BIOS 02.19.00, 32GB RAM, SATA SSD), I found that adding a delay of Sleep(0x30) after _PTS was insufficient while adding a delay of Sleep(0x50) worked well.  Increasing the delay after _PTS to Sleep(0x100) adds some margin (to accommodate longer storage latency) and does not have a noticeable effect on behavior/performance.  My added delay is conditional on Arg0 == 0x03, so the delay is only applied when the hack sleeps (not for other _PTS states like shutdown).
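For reference, the ASL Sleep operator takes its argument in milliseconds, so the hexadecimal values above are small. A quick sketch of the conversion (the helper name `hex_to_ms` is made up for this example):

```shell
#!/bin/sh
# Convert a hexadecimal Sleep() argument to decimal milliseconds.
hex_to_ms() {
    printf '%d\n' "$1"
}

for d in 0x30 0x50 0x100; do
    echo "Sleep($d) = $(hex_to_ms "$d") ms"
done
```

This puts the failing delay at 48 ms, the working delay at 80 ms, and the delay with margin at 256 ms.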

 

The fix that I have implemented does the following:

  • ACPI patch to rename the original _PTS method to XPTS
  • ACPI add SSDT-PTS.aml which includes a new _PTS method that calls XPTS (the original, now renamed _PTS) followed by a delay Sleep(0x100).

To implement this fix in your EFI, do the following (mimic the changes in the attached files):

  • Add SSDT-PTS.aml to your OC/ACPI folder
  • Add the _PTS -> XPTS patch to ACPI > Patch in your OC config.plist
  • Add SSDT-PTS to ACPI > Add in your OC config.plist
  • Restore boot-arg igfxfw=2 to NVRAM > Add > 7C436110-AB2A-4BBB-A880-FE41995C9F82 > boot-args in your OC config.plist
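Once the four steps are done, a quick text-level sanity check of config.plist can catch a forgotten entry. The sketch below is a convenience built on assumptions, not part of the attached EFI: it only greps for the SSDT file name and the boot-arg, it assumes an XML-format config.plist, and it does not verify the _PTS -> XPTS patch (whose exact bytes are not shown in this thread). The function name `check_pts_fix` is made up.

```shell
#!/bin/sh
# Hypothetical helper: text-level check that config.plist mentions the fix.
check_pts_fix() {
    cfg=$1
    status=0
    for needle in "SSDT-PTS.aml" "igfxfw=2"; do
        if grep -q -F -- "$needle" "$cfg"; then
            echo "found:   $needle"
        else
            echo "missing: $needle"
            status=1
        fi
    done
    return $status
}
```

Call it with the path to your own config.plist, for example `check_pts_fix /Volumes/EFI/EFI/OC/config.plist` (the mount point is an assumption). A non-zero exit status means at least one of the two strings was not found.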

 

Please test and report your findings.  Thank you.

OC0.8.2-EFI-R002-CHANGES.zip OC0.8.2-EFI-R002.zip

Edited by deeveedee

@joevt I took a quick look at your patch and understand it to be relevant to wake.  The problem being investigated in this thread is sleep: specifically, with WEG boot-arg igfxfw=2, method _PTS appears to exit before sleep preparation is completed.  I have only observed this sleep problem when macOS boots/runs from a SATA SSD and have not observed it when macOS boots/runs from an NVMe SSD, so I suspect that the slower read/write/latency of the SATA SSD is to blame.  I don't fully understand the behavioral changes resulting from boot-arg igfxfw=2, so I don't know why there would be extra storage reads/writes caused by igfxfw=2 that do not complete before _PTS exits.  The problem appears to be "fixed" by adding a delay to the end of _PTS.  @FredWst pointed me in the direction for this fix which he had to implement on his Dell.  His solution placed the delay at the beginning of _PTS.

 

Sorry for the long answer.  I think your patch addresses a different wake-related issue than the sleep issue addressed in this thread.


@miliuco In your post here, you include both igfxfw=2 and rps-control=1.  Should we be setting both boot-args for UHD 630?

 

EDIT: These boot-args are processed in different WhateverGreen source files as shown below.  igfxfw=2 "force loading of Apple GuC firmware" and igfxrpsc=1 "enable RPS control patch (improves IGPU performance)".

WhateverGreen/kern_igfx.cpp

                if (supportsGuCFirmware && getKernelVersion() >= KernelVersion::HighSierra) {
                        if (!PE_parse_boot_argn("igfxfw", &fwLoadMode, sizeof(fwLoadMode)))
                                WIOKit::getOSDataValue<int32_t>(info->videoBuiltin, "igfxfw", fwLoadMode);
                        if (fwLoadMode == FW_AUTO)
                                fwLoadMode = info->firmwareVendor == DeviceInfo::FirmwareVendor::Apple ? FW_APPLE : FW_DISABLE;
                } else {
                        fwLoadMode = FW_APPLE; /* Do nothing, GuC is either unsupported due to low OS or Apple */
                }

WhateverGreen/kern_igfx_pm.cpp

void IGFX::RPSControlPatch::processKernel(KernelPatcher &patcher, DeviceInfo *info) {
        uint32_t rpsc = 0;
        if (PE_parse_boot_argn("igfxrpsc", &rpsc, sizeof(rpsc)) ||
                WIOKit::getOSDataValue(info->videoBuiltin, "rps-control", rpsc)) {
                enabled = rpsc > 0 && available;
                DBGLOG("weg", "RPS control patch overriden (%u) availabile %d", rpsc, available);
        }
}

 

EDIT2: I remember that I had posted what I thought was a WhateverGreen bug here.  This code is unchanged in the latest WhateverGreen.kext/kern_igfx.cpp (above).  Does anyone else think that the else { fwLoadMode = FW_APPLE; } is wrong in kern_igfx.cpp?  @joevt - is this a bug in WhateverGreen.kext?

Edited by deeveedee

@deeveedee

Yes, I have both (as DeviceProperties instead of boot-args) with the Intel UHD 630 (Coffee Lake R) in headless mode, iMac19,1 SMBIOS and an i7-9700 CPU. I read those WEG comments at some point and put the properties in the config file. macOS actually boots without them on my PC too, but it seems to work better with these 2.

 

I have also read that igfxfw=2 is deprecated by WEG, but I kept it. It is one of those things that I sometimes keep without the reason being 100% clear.

Edited by miliuco

@miliuco I have added boot-arg igfxrpsc=1 (same as your DeviceProperty rps-control) and will test.  As you implied, guidance/instruction regarding these boot-args is ambiguous at best.  Thank you.


Yup, not super well documented, as is the case for a large number of WEG options.

As I mentioned in another post, my own research and testing concluded that igfxfw=2 is the preferred way to go because that's what a real Mac uses. Some odd configurations or BIOSes may lead to problems, the most reported being the iGPU getting stuck at high frequency after a video processing event. I personally have never encountered any issues. @miliuco It is clearly not deprecated. There have been posts on GitHub from @vit9696 not recommending it anymore, due to isolated problem reports the team didn't want to look into, in favor of using rps-control. The code is still there and works perfectly fine. Though compatibility may be an issue for some and it is great that there is an alternative, I lean towards disagreeing with that recommendation because of possible unintended future consequences. I philosophically prefer to stay as close to a real Mac as possible when there is a choice.

When the Apple firmware is loaded using this option, RPS control (whether from the boot-arg or the device property) is not used even if enabled, unless the driver decides not to use the firmware power schedule. I have never observed that, but it doesn't mean it cannot happen. It doesn't hurt to have both enabled in case the driver changes, even though only one is being used at any given time. It was coded that way so as not to affect real Macs (some users apparently use WEG on real Macs to extend GPU compatibility). The RPS control of the iGPU power management is not macOS native and is a patch/workaround for situations where loading the firmware doesn't work. I only recently removed RPS control on my setup as I am trying to clean out unnecessary/unused patches.

 

 

 

Edited by rafale77

EDIT 3: I think that the "stuck" GFX frequencies I observed on the EliteDesk 800 G4 Mini may be only when using both igfxfw=2 and igfxrpsc=1 boot-args.  I am not seeing this "stuck" GFX frequency (as measured by Intel Power Gadget) with only igfxfw=2, but I need more testing and still would like to see test results from others.

 

EDIT 2: I am not seeing the "stuck" GFX frequencies reported by Intel Power Gadget when using boot-arg igfxfw=2 on an EliteDesk 800 G5 Mini 65W (i9-9900, UHD630).  There appear to be behavioral differences between the G4 and G5 Minis when using boot-arg igfxfw=2.  I'll leave my other comments below unchanged and welcome test results / observations from others.

 

EDIT 1: The GFX Frequency graph displayed by Intel Power Gadget is different with and without the igfx boot-args.  I'm not sure how to interpret this.  I'm inclined to think that this is a measurement problem and not an actual stuck frequency problem.   Details below...

 

I am continuing to test on an EliteDesk 800 G4 Mini 65W (i5-8600, UHD630, 32GB RAM, SATA SSD) and sleep / wake is working perfectly.  I am not sure that I know how to test the "iGPU stuck at high frequency" that has been mentioned by others.  Below is a Frequency graph from Intel Power Gadget while streaming a non-DRM video.  The video quality remains perfect and video playback appears fine, but Intel Power Gadget reports different GFX frequencies after sleep/wake.  The GFX frequency transitions in the graph below occur when resuming from sleep, but I don't observe any material GFX frequency changes while running (without sleeping).  Could the "stuck frequency" problems being reported simply be a problem with Intel Power Gadget's measurement and not a real problem?

 

Intel Power Gadget: Frequency (EliteDesk 800 G4 Mini 65W with igfxfw=2 and igfxrpsc=1)

[Screenshot: Intel Power Gadget frequency graph]

 

Intel Power Gadget: Frequency (EliteDesk 800 G4 Mini without igfxfw=2 and without igfxrpsc=1)

[Screenshot: Intel Power Gadget frequency graph]

 

Intel Power Gadget: Frequency (EliteDesk 800 G4 Mini with only igfxfw=2 (not igfxrpsc=1))

[Screenshot: Intel Power Gadget frequency graph]

 

System Specs:

  • HP EliteDesk 800 G4 Mini 65W (i5-8600, UHD630, 32GB RAM, SATA SSD, BIOS 02.19.00)
  • OpenCore 0.8.2 (baseline EFI here)
  • macOS Catalina 10.15.7 (happens to be what I have on this rig and is the only reason I'm not testing Big Sur or Monterey)
Edited by deeveedee

@miliuco I am beginning to suspect that using both rps-control=1 (igfxrpsc=1) and igfxfw=2 properties is causing the "stuck" GFX frequency (measured by Intel Power Gadget) on the EliteDesk 800 G4 Mini.  At the very least, I don't see an improvement when adding igfxrpsc=1 to a system that already employs igfxfw=2.

 

@rafale77 Thank you for continuing to monitor and respond to this thread.  Very helpful!

Edited by deeveedee

FYI - I am using boot-arg igfxrpsc=1 on my Kabylake-R (i5-8250u / UHD620) HackBookPro15,2.  Sleep/Wake works perfectly without any GFX frequency issues.  As reported in some of the discussions shared by rafale77, igfxfw=2 does not work with my Kabylake-R UHD620 hack.


I've noticed that I have igfxonln=1 as a device property, but according to the WEG README it must be a boot argument; force-online=1 is the correct device property (to force online status on all displays).

As I was using the AMD card as main GPU with the iGPU in headless mode, maybe this property has no effect and gives no error.

I'll try with the iGPU as main card and the AMD disabled, and report back.

 
