AptioMemoryFix


vit9696
595 posts in this topic

Hello again vit9696,

 

How are you?

I come to you with another issue related to AptioMemoryFix (latest rev):

 

When it is used on boards with the Z390 chipset (and, according to some other reports, also with the H370/B370/H310 chipsets), problems appear:

- it results in an OSX kernel panic on reboot/shutdown (you can see some reports, for example, in this topic).

- there also seems to be a problem with native NVRAM from within OSX, as a test variable saved from OSX with the 'nvram' command does NOT exist after reboot (Clover's last chosen OS is saved, though).

 

I just replaced my board with an AsRock Z390 ITX/ac (I previously had an AsRock Z370 ITX/ac, a really similar board), and these issues started appearing (everything was fine with AptioMemoryFix and Z370).

Now, in my case, OSX starts fine with AptioMemoryFix, but when I reboot or shut down, I don't even get to see the console screen with the panic information, and the system ends up either hanging or (uncleanly) rebooting. (I use -v, and I do see the console screen with AptioMemoryFix when OSX loads.)

 

When I add EmuVariableUefi (together with AptioMemoryFix, although I know that is not the intended use), shutdown/reboot work properly, and I also see the -v output when shutting down or rebooting.

 

I will gladly help with any further information or tests you may need.

 

Thanks.

 

Edited by Pene

Hi Pene,

 

I am aware of this issue, but as you can imagine I do not have the hardware to research it.

 

What happened here is another change in APTIO. In former versions there were 3 drivers to implement NVRAM: NvramDxe, NvramSmm, NvramSmi. In newer boards NvramSmi and NvramSmm became one driver. I believe we can try to research this to a certain level with a considerable amount of help from your side.

 

Frankly, I am not sure hardware NVRAM works on these boards in any OS at all, so the first step would be to:

1) Test NVRAM support in UEFI Linux, I believe they have a tool for this

2) Test NVRAM support in UEFI Windows, you will have to write some tool that elevates the privileges and utilises GetFirmwareEnvironmentVariableA/SetFirmwareEnvironmentVariableA APIs, see this tool for an example.

 

The "test" we want here is fairly simple. First, we want to try to write some variable to the Apple Boot Variable GUID: 7C436110-AB2A-4BBB-A880-FE41995C9F82, then read it, reboot, and read it again. If it all fails, we could try writing to the EFI Global Variable GUID: 8BE4DF61-93CA-11D2-AA0D-00E098032B8C, and check whether at least some GUIDs are available for writing.
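
On Linux, the write half of this test can go through efivarfs, where every variable is a file named Name-guid whose content is a 4-byte little-endian attribute word followed by the raw data. Below is a minimal sketch in C, assuming efivarfs is mounted at /sys/firmware/efi/efivars and the program runs as root. The helper names (efivar_path, efivar_payload, efivar_write) and the variable name TestVar are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* UEFI variable attribute bits (values from the UEFI specification). */
#define EFI_VARIABLE_NON_VOLATILE        0x00000001u
#define EFI_VARIABLE_BOOTSERVICE_ACCESS  0x00000002u
#define EFI_VARIABLE_RUNTIME_ACCESS      0x00000004u

/* Apple boot variable GUID, lower-case as efivarfs file names use it. */
#define APPLE_BOOT_GUID "7c436110-ab2a-4bbb-a880-fe41995c9f82"

/* Build "<dir>/<name>-<guid>". Returns 0 on success, -1 if it does not fit. */
int efivar_path(char *out, size_t out_size, const char *dir,
                const char *name, const char *guid)
{
    int n = snprintf(out, out_size, "%s/%s-%s", dir, name, guid);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Build the efivarfs file content: attributes (LE32) followed by the data.
 * Returns the payload size, or 0 if the output buffer is too small. */
size_t efivar_payload(uint8_t *out, size_t out_size, uint32_t attrs,
                      const void *data, size_t data_size)
{
    if (out_size < 4 || data_size > out_size - 4)
        return 0;
    out[0] = (uint8_t)(attrs & 0xFF);
    out[1] = (uint8_t)((attrs >> 8) & 0xFF);
    out[2] = (uint8_t)((attrs >> 16) & 0xFF);
    out[3] = (uint8_t)((attrs >> 24) & 0xFF);
    memcpy(out + 4, data, data_size);
    return 4 + data_size;
}

/* Write the whole payload with a single write() call. Returns 0 on success. */
int efivar_write(const char *path, const uint8_t *payload, size_t size)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    ssize_t written = write(fd, payload, size);
    int rc = close(fd);
    return (written == (ssize_t)size && rc == 0) ? 0 : -1;
}
```

Calling efivar_payload with all three attribute bits set (0x7) and four data bytes yields an 8-byte payload; after the reboot, reading the same file back and comparing the bytes past the first four completes the test.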

 

Next, the kernel panic on reboot is what really worries me. We know that macOS writes stuff to NVRAM during the reboot, and I would like to know whether disabling SetVariable, GetVariable, and GetNextVariableName "fixes" the problem with the reboot. Just like when we debugged it previously, write jmp ASM_PFX(RtShimsReturnInvalidParameter) into the relevant shims here for a test:

https://github.com/acidanthera/AptioFixPkg/blob/master/Platform/AptioMemoryFix/X64/AsmRtShims.nasm

 

Thirdly, regarding the panics on reboot, I would like to get as much information as possible. Please apply the kernel patch mentioned here: https://applelife.ru/posts/686953, which disables kext list printing in the panic log, and try photographing the kernel panic in -v keepsyms=1 debug=0x100 mode. We should hopefully get more data to start working with.

 

@Slice, unfortunately this one is new. These guys never learn, and they managed to bork things once again. So it is neither the issue we researched a year ago, nor the whitelist. It is just another extension of the borked code.

 

Vit

Edited by vit9696

16 hours ago, vit9696 said:

What happened here is another change in APTIO. In former versions there were 3 drivers to implement NVRAM: NvramDxe, NvramSmm, NvramSmi. In newer boards NvramSmi and NvramSmm became one driver. I believe we can try to research this to a certain level with a considerable amount of help from your side.

 

Hi again,

 

Thank you for your reply. Here are some results:

 

NVRAM is writable from Windows. I had used a Windows tool for writing to NVRAM (UEFIVAR) before, so I issued this command from Windows:

uefivar -G:"7C436110-AB2A-4BBB-A880-FE41995C9F82" -N:"TestVarWin" -WHEX:"01020304" -A:"NV"

EDIT: NVRAM is also writable from Linux:

# sudo -s
# printf "\x07\x00\x00\x00\x01\x02\x03\x04" > /sys/firmware/efi/efivars/TestVarLin-7c436110-ab2a-4bbb-a880-fe41995c9f82

And then, under OSX:

$ nvram -p | grep TestVar
TestVarWin	%01%02%03%04
TestVarLin	%01%02%03%04

 

For the second test, disabling only SetVariable in AptioMemoryFix solved the panic, and the system shuts down properly.

 

For the third test, I cannot post any result, because, as I mentioned, I do not get the console window on shutdown/restart when the panic occurs, so with the steps you mentioned, I just get a black screen forever, until I press the power button to turn off the PC.

 

More tests are welcome ;)

 

P.S. Hi Slice :) But apparently as Vit mentioned we are "lucky" with a new Aptio issue. Hopefully solvable.

 

Edited by Pene

Well, the random reboots are indeed not a separate issue; NVRAM variable saving is simply corrupting our memory once again. I believe that if you try to write a lot of variables from macOS, the operating system will eventually panic. That could have been somewhat useful if you were able to get the log dumps, but otherwise it is probably not beneficial.

 

Basically, what needs to be done here now is reverse-engineering NvramSmm/NvramDxe and comparing them against the previous versions, which we fortunately have sources for. I do not think I will have much time for that any time soon, so if you have time and are able to use IDA Pro/Hex-Rays, help will be welcome here.

 


1 hour ago, vit9696 said:

I believe, if you try to write a lot of variables from macOS the operating system will eventually panic. That could have been somewhat usable if you were able to get the log dumps, but otherwise it is probably not beneficial.

Well, I didn't manage to make macOS panic by a lot of variable writes. But then again... does OSX save these vars to EFI in runtime or does it store internally and then call SetVariable only on reboot/shutdown?

In any case, the lifetime of these added variables is just as long as the computer doesn't reboot. After reboot they are all gone.


Mmmm... kernel panic did not occur when writing to any GUID (with SetVariable enabled, of course). Only on reboot/shutdown I get the panic.

And when SetVariable is disabled, it seems I can still "write" to any GUID. All Info is present while in OSX, but gone after reboot.

 

As a side note, NVRAM write can work with AptioMemoryFix when not in OSX.

So, for example, if we return InvalidParameter only when we have a virtualized pointer, Clover's Last booted OS is saved to nvram, and OSX won't panic on reboot/shutdown:

global ASM_PFX(RtShimSetVariable)
ASM_PFX(RtShimSetVariable):
    ; For performance and simplicity do initial validation ourselves.
    test       rcx, rcx
    jz         ASM_PFX(RtShimsReturnInvalidParameter)     ; VariableName is NULL
    test       rdx, rdx
    jz         ASM_PFX(RtShimsReturnInvalidParameter)     ; VendorGuid is NULL
.INITIAL_VALIDATION_OVER:
    ; Once boot.efi virtualizes the pointers we should protect read-only
    ; variables from writing.
    mov        rax, qword [ASM_PFX(gGetVariableOverride)]
    test       rax, rax
    jnz        .SKIP_ACCESS_CHECK
    ; We have a virtualized pointer, so we also need to protect write-only
    ; variables from reading. Compare VendorGuid against gReadOnlyVariableGuid
    ; and return EFI_SECURITY_VIOLATION on equals.
    mov        rax, qword [rdx]
    cmp        qword [ASM_PFX(gReadOnlyVariableGuid)], rax
    ;jnz        .SKIP_ACCESS_CHECK                        ; original branch, disabled for this test
    jnz        ASM_PFX(RtShimsReturnInvalidParameter)     ; test change: reject the write once virtualized
    mov        rax, qword [rdx+8]
    cmp        qword [ASM_PFX(gReadOnlyVariableGuid)+8], rax
    jz         ASM_PFX(RtShimsReturnSecurityViolation)
.SKIP_ACCESS_CHECK:
    mov        rax, qword [ASM_PFX(gSetVariable)]
    jmp        short FiveArgsShim
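
For readers who do not follow NASM, the control flow of the modified shim can be modelled in plain C. This is only a sketch of one reading of the listing above: the enum, the struct, and the function name are hypothetical, the GUID is treated as two 64-bit halves because that is how the assembly compares it, and the read-only GUID is passed in as a parameter since its value is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* What the shim ends up doing with a SetVariable call. */
typedef enum {
    FORWARD_TO_FIRMWARE,        /* jump to the original gSetVariable   */
    RETURN_INVALID_PARAMETER,   /* RtShimsReturnInvalidParameter       */
    RETURN_SECURITY_VIOLATION   /* RtShimsReturnSecurityViolation      */
} ShimAction;

/* A GUID viewed as two qwords, matching the two cmp instructions. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} Guid128;

/* override_set models "gGetVariableOverride is non-zero", which the
 * listing treats as "pointers are not virtualized yet". */
ShimAction set_variable_shim_model(const void *variable_name,
                                   const Guid128 *vendor_guid,
                                   bool override_set,
                                   const Guid128 *read_only_guid)
{
    /* Initial validation: NULL name or NULL GUID is rejected. */
    if (variable_name == NULL || vendor_guid == NULL)
        return RETURN_INVALID_PARAMETER;

    /* Before virtualization the access check is skipped entirely. */
    if (override_set)
        return FORWARD_TO_FIRMWARE;

    /* Test change: when the first half does not match the read-only GUID,
     * reject the write (the original code forwarded it to firmware). */
    if (vendor_guid->lo != read_only_guid->lo)
        return RETURN_INVALID_PARAMETER;

    /* Both halves match: the variable is protected. */
    if (vendor_guid->hi == read_only_guid->hi)
        return RETURN_SECURITY_VIOLATION;

    /* First half matched, second did not: falls through to firmware. */
    return FORWARD_TO_FIRMWARE;
}
```

In this model, writes issued while the override is still set (for example, Clover saving its last booted OS) reach the firmware, while almost every write made after virtualization is rejected before it gets there.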

 

Edited by Pene

Yes, of course. I just mentioned it to show that the code in AptioMemoryFix basically works on the new Aptio.

With SetVariable completely disabled, on the other hand, Clover won't update its last booted OS when booting OSX.

 

Edited by Pene

10000 is about the lowest that loads. And it still panics.

And about the panic details - I guess it was mere luck that I managed to see it, as once again it freezes before I get to see it.

By the way, did you notice in the previous screenshot the multiple vm_map_delete: .... nothing at <address>? Is that normal?


Well, the guard prints at vm_map_delete are most likely work in progress on improving detection of unexpected vm_map situations. It is very unlikely that they have anything to do with us.

I believe without debug=0x100 you should be able to see the panic, but I suppose it could be the same.


I found out that what allowed me to view the panic before was actually the nv_disable=1 argument. So I now have a reliable method to see panics.

 

Here are some panics when using slto_us=10000:

 

IMG_6350.JPG

 

IMG_6354.JPG

 

IMG_6356.JPG

Edited by Pene

Good to have. Well, the second picture makes it very clear. The XNU kernel invokes the APTIO RuntimeServices SetVariable code, and then this code never returns.

 

What we have in SetVariable is the following code coming from NvramDxe. As far as I can tell, it has not changed at all since the source leak, and the code in the leaked sources is known to work.

 

 

UINTN GetVariableNameSize(IN CONST CHAR16 *String, IN UINTN MaxSize){
    CHAR16 *Str, *EndOfStr;
    ASSERT(String!=NULL);
    if (String==NULL) return 0;
    
    EndOfStr = (CHAR16*)((UINT8*)String + MaxSize);
    for(Str = String; Str < EndOfStr; Str++)
        if (!*Str) return (Str - String + 1)*sizeof(CHAR16);

    return MaxSize+1;
}

EFI_STATUS Communicate (UINTN MessageLength){
    UINTN CommSize;
    UINT64 Control; 
    EFI_STATUS Status;
    
    if (SmmCommProtocol==NULL) return EFI_UNSUPPORTED;
    if (   NvramSmmCommunicationBuffer == NULL 
        || NvramSmmCommunicationBufferPhysicalAddress == NULL
    ) return EFI_OUT_OF_RESOURCES;
    if (MessageLength > MaxMessageLength) return EFI_OUT_OF_RESOURCES;

    Control = NvramSmmCommunicationBuffer->Control;
    NvramSmmCommunicationBuffer->MessageLength = MessageLength;
    CommSize = CommunicationHeaderSize + MessageLength;
    Status = SmmCommProtocol->Communicate (SmmCommProtocol, NvramSmmCommunicationBufferPhysicalAddress, &CommSize);

    if (EFI_ERROR(Status)) return Status;
    if (NvramSmmCommunicationBuffer->Control == Control)
            return EFI_NO_RESPONSE;
    if ((NvramSmmCommunicationBuffer->Control & NVRAM_SMM_ERROR_BIT)!=0)
        Status = NVRAM_SMM_STATUS_TO_EFI_STATUS(NvramSmmCommunicationBuffer->Control);
    return Status;
}

EFI_STATUS DxeSetVariableSmmWrapper (
    IN CHAR16 *VariableName, IN EFI_GUID *VendorGuid,
    IN UINT32 Attributes, IN UINTN DataSize, IN VOID *Data
)
{
    EFI_STATUS Status;
    UINTN AvailableBufferSize, VariableNameSize;
    SMI_SET_VARIABLE_BUFFER *SetBuffer;

    if (NvramSmmCommunicationBuffer == NULL) return EFI_UNSUPPORTED;
    if (!VariableName || !VendorGuid || (DataSize && !Data))
        return EFI_INVALID_PARAMETER;
    
    AvailableBufferSize = MaxMessageLength - sizeof(SMI_SET_VARIABLE_BUFFER);
    VariableNameSize = GetVariableNameSize(VariableName, AvailableBufferSize);
    
    // If variable name or data is too large to fit into our buffer, it is also too large to fit
    // into NVRAM store.
    if (AvailableBufferSize < VariableNameSize) return EFI_OUT_OF_RESOURCES;
    AvailableBufferSize -= VariableNameSize;
    if (AvailableBufferSize < DataSize) return EFI_OUT_OF_RESOURCES;

    SetBuffer = (SMI_SET_VARIABLE_BUFFER *)&NvramSmmCommunicationBuffer->Control;
    SetBuffer->Control = NVRAM_SMM_COMMAND_SET_VARIABLE;
    SetBuffer->Attributes = Attributes;
    SetBuffer->DataSize = DataSize;
    SetBuffer->Guid = *VendorGuid;
    SetBuffer->VariableNameSize = VariableNameSize;
    MemCpy(SetBuffer+1, VariableName, VariableNameSize);
    MemCpy((UINT8*)(SetBuffer+1)+VariableNameSize, Data, DataSize);
    
    Status = Communicate( sizeof(SMI_SET_VARIABLE_BUFFER) + VariableNameSize + DataSize );

    return Status;
}

EFI_STATUS DxeSetVariableSafe(
    IN CHAR16 *VariableName, IN EFI_GUID *VendorGuid,
    IN UINT32 Attributes, IN UINTN DataSize, IN VOID *Data
)
{
    EFI_STATUS Status;

    BEGIN_CRITICAL_SECTION(NvramCs);
    if (NvramSmmIsActive)
        Status = DxeSetVariableSmmWrapper(
                     VariableName,VendorGuid,Attributes,DataSize,Data
                 );
    else
        Status = DxeSetVariableWrapper(
                     VariableName,VendorGuid,Attributes,DataSize,Data
                 );
    END_CRITICAL_SECTION(NvramCs);
    return Status;
}

 

 

The code relevant to SMM switching looks the same too, and the EFI_SMM_COMMUNICATION_PROTOCOL implementation is provided by EDK2. They still allocate the SMM communication buffer as EfiRuntimeServicesData, and still pass its address via the NvramMailbox NVRAM variable, so it should be guarded by AptioMemoryFix. As a result, I believe that the infinite loop happens somewhere on the way to NvramSmm (which now represents the former Smi and Smm code glued together). However, a brief check of the binary against the source shows that the SMI handler (NvramSmmCommunicationHandler, SetVariableSmmHandler) is pretty much the same too. This leaves us in an uneasy situation, where we do not know where to look for the problem.
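
One way to sanity-check the "should be guarded" assumption is to look up the communication buffer's address in the memory map and confirm that the whole buffer sits inside a region of type EfiRuntimeServicesData. The sketch below is a host-side model in C, not firmware code: MemRegion only mirrors the EFI_MEMORY_DESCRIPTOR fields that matter here, the helper names are hypothetical, and in a real driver the descriptors would come from GetMemoryMap() and the address from the NvramMailbox variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EFI_PAGE_SIZE 4096u

/* EFI_MEMORY_TYPE value of EfiRuntimeServicesData in the UEFI specification. */
enum { EFI_RUNTIME_SERVICES_DATA = 6 };

/* Simplified stand-in for EFI_MEMORY_DESCRIPTOR (hypothetical layout). */
typedef struct {
    uint32_t type;
    uint64_t physical_start;
    uint64_t number_of_pages;
} MemRegion;

/* Return the region that fully contains [addr, addr + size), or NULL. */
const MemRegion *find_region(const MemRegion *map, size_t count,
                             uint64_t addr, uint64_t size)
{
    for (size_t i = 0; i < count; i++) {
        uint64_t start = map[i].physical_start;
        uint64_t len = map[i].number_of_pages * EFI_PAGE_SIZE;
        /* Written without addr + size so the check cannot overflow. */
        if (addr >= start && size <= len && addr - start <= len - size)
            return &map[i];
    }
    return NULL;
}

/* 1 if the buffer lies entirely in runtime services data, otherwise 0. */
int buffer_is_runtime_data(const MemRegion *map, size_t count,
                           uint64_t addr, uint64_t size)
{
    const MemRegion *region = find_region(map, count, addr, size);
    return region != NULL && region->type == EFI_RUNTIME_SERVICES_DATA;
}
```

A buffer that straddles the end of a region, or that lands in a boot services region, fails the check; either result would suggest the buffer is not covered the way AptioMemoryFix expects.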

 

What I could suggest is writing an EFI runtime driver (by ripping off the known APTIO V source) that will reimplement communication with SMM:

1. Allocate a new communication buffer.

2. Check & overwrite the address of the old communication buffer in the MailBox variable.

3. Overwrite the EFI_RUNTIME_SERVICES variable functions with the APTIO code, but using the new communication buffer.

 

The above will put the complete path prior to the SMM code under our control. Afterwards we should be able to get this code fully functional on some working APTIO V system (e.g. Skylake or Kaby Lake), and then try it on the new problematic system. By changing the logic via the return codes, it should be easy to determine where the issue is: the DXE or the SMM driver. Beyond that, it may even help us understand whether the SMI handler exists at all.
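
The three steps and the return-code probing described above can be prototyped on the host before any firmware is touched. The following is a toy model in plain C, not an EDK2 driver: RuntimeTable, the status and probe values, probe_stage, and firmware_set_variable are all hypothetical stand-ins. A real runtime driver would additionally have to allocate the buffer as runtime memory, recompute the CRC32 in the runtime services table header after patching it, and convert its pointers when the OS switches to virtual addressing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t Status;

/* Hypothetical status values; they are NOT the real EFI_STATUS codes. */
enum {
    STATUS_SUCCESS          = 0,
    STATUS_OUT_OF_RESOURCES = 9,
    PROBE_REACHED_HOOK      = 0xA1,  /* our hook was entered             */
    PROBE_BUFFER_FILLED     = 0xA2   /* request was copied to new buffer */
};

typedef Status (*SetVariableFn)(const uint16_t *name, const void *data,
                                size_t size);

/* Stand-in for the variable part of EFI_RUNTIME_SERVICES. */
typedef struct {
    SetVariableFn set_variable;
} RuntimeTable;

uint8_t  new_comm_buffer[256];        /* step 1: the buffer we own            */
uint64_t mailbox_address;             /* step 2: stand-in for the mailbox var */
SetVariableFn original_set_variable;  /* saved vendor entry point             */
int probe_stage;                      /* 0 = full path, 1 or 2 = early return */

/* Stand-in for the vendor implementation that ends up in SMM. */
Status firmware_set_variable(const uint16_t *name, const void *data,
                             size_t size)
{
    (void)name; (void)data; (void)size;
    return STATUS_SUCCESS;
}

/* Step 3: replacement that uses the new buffer and can stop early. */
Status hooked_set_variable(const uint16_t *name, const void *data,
                           size_t size)
{
    if (probe_stage == 1)
        return PROBE_REACHED_HOOK;
    if (size > sizeof new_comm_buffer)
        return STATUS_OUT_OF_RESOURCES;
    memcpy(new_comm_buffer, data, size);
    if (probe_stage == 2)
        return PROBE_BUFFER_FILLED;
    return original_set_variable(name, data, size);
}

void install_hook(RuntimeTable *rt)
{
    mailbox_address = (uint64_t)(uintptr_t)new_comm_buffer;
    original_set_variable = rt->set_variable;
    rt->set_variable = hooked_set_variable;
}
```

Raising probe_stage one step at a time and watching which distinctive status comes back (or whether the call hangs) narrows down how far a request gets before things go wrong.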

 

If it is SMM, I would probably try replacing NvramSmm with NvramSmm & NvramSmi from some Z370 BIOS first and reflashing the firmware. Then… perhaps reverse-engineer/reimplement NvramSmm with the new changes and try to debug it too.

 

If you like the idea, I can share APTIO src and let you proceed.

Edited by vit9696

Hi vit9696,

 

Yes, an uneasy situation indeed...

 

Meanwhile I ran a test, with a rather strange result.

I set a simple override for SetVariable, something like:

EFI_STATUS EFIAPI OvrSetVariable(
        IN CHAR16                       *VariableName,
        IN EFI_GUID                     *VendorGuid,
        IN UINT32                       Attributes,
        IN UINTN                        DataSize,
        IN VOID                         *Data
)
{
	EFI_GUID gEfiAppleBootGuid = {0x7C436110, 0xAB2A, 0x4BBB, {0xA8, 0x80, 0xFE, 0x41, 0x99, 0x5C, 0x9F, 0x82}};
	return gOrgRS.SetVariable(L"TestVar", &gEfiAppleBootGuid, EFI_VARIABLE_NON_VOLATILE | EFI_VARIABLE_BOOTSERVICE_ACCESS | EFI_VARIABLE_RUNTIME_ACCESS, 4, "1234");
}

... which for some reason didn't panic on restart (obviously, "TestVar" was first updated by Clover's call to SetVariable, so I don't think it actually tried to write anything to NVRAM on restart).

 

But then I changed only the preset Apple boot GUID to the VendorGuid passed in by the caller:

return gOrgRS.SetVariable(L"TestVar", VendorGuid, EFI_VARIABLE_NON_VOLATILE | EFI_VARIABLE_BOOTSERVICE_ACCESS | EFI_VARIABLE_RUNTIME_ACCESS, 4, "1234");

... which resulted again in panic on restart. Can't really explain why. Perhaps you have an idea.

EDIT: Maybe it panics only when it actually has data to change? Otherwise it probably just returns EFI_SUCCESS and exits.
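
The hypothesis in the EDIT can be illustrated with a toy variable store that skips the "flash write" when the incoming data is identical to what is already stored. This is purely a model of the guess, written in C with hypothetical names (ToyStore, toy_set_variable, flash_writes); it says nothing about what the APTIO code actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_VARS 8

typedef struct {
    char   guid[37];   /* textual GUID, 36 characters plus NUL */
    char   name[32];
    char   data[16];
    size_t size;
    bool   used;
} ToyVar;

typedef struct {
    ToyVar vars[MAX_VARS];
    int    flash_writes;   /* how many calls actually changed the store */
} ToyStore;

/* Returns 0 on success, -1 if the request does not fit the toy store. */
int toy_set_variable(ToyStore *store, const char *guid, const char *name,
                     const void *data, size_t size)
{
    ToyVar *slot = NULL;

    if (size > sizeof store->vars[0].data ||
        strlen(guid) >= sizeof store->vars[0].guid ||
        strlen(name) >= sizeof store->vars[0].name)
        return -1;

    /* Prefer an existing variable with the same GUID and name,
     * otherwise remember the first free slot. */
    for (int i = 0; i < MAX_VARS; i++) {
        ToyVar *v = &store->vars[i];
        if (v->used && strcmp(v->guid, guid) == 0 && strcmp(v->name, name) == 0) {
            slot = v;
            break;
        }
        if (!v->used && slot == NULL)
            slot = v;
    }
    if (slot == NULL)
        return -1;

    /* Hypothesised fast path: identical data means nothing to do. */
    if (slot->used && slot->size == size && memcmp(slot->data, data, size) == 0)
        return 0;

    strcpy(slot->guid, guid);
    strcpy(slot->name, name);
    memcpy(slot->data, data, size);
    slot->size = size;
    slot->used = true;
    store->flash_writes++;
    return 0;
}
```

Under this model, the first override (fixed GUID, fixed name, fixed data) performs one real write and then only hits the fast path, whereas the second override creates a new "TestVar" for every distinct VendorGuid the caller passes and so keeps performing real writes.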

 

Regarding your suggestion, it sounds like a good plan, but it's a bit too big for me at the moment, considering my limited experience with runtime EFI drivers combined with the limited free time that I currently have. But if someone else is up for this task, I'll be more than willing to test it on the Z390.

 

Edited by Pene

On 10/28/2018 at 1:54 PM, vit9696 said:

They still allocate the SMM communication buffer as EfiRuntimeServicesData, and still pass its address via NvramMailbox NVRAM variable, so it should be guarded by AptioMemoryFix. 

By the way, is there any chance it is not guarded? How can we check this?
