
Mavericks kernel testing on AMD (formerly Mountain Lion kernel testing on AMD)

Mountain Lion AMD legacy kernel x64_86 ssse3 ssse3 emulator

5471 replies to this topic

#41
theconnactic

theconnactic

    Stubborn AMD user

  • Local Moderators
  • 2,531 posts
  • Gender:Male

It may not be that bad considering the price-to-performance ratio of any non-portable Macs.


I couldn't disagree more. Mac Pros and iMacs (worse with the former) are outrageously overpriced, tdtran.

#42
SS01

SS01

    InsanelyMac Sage

  • Members
  • 265 posts
  • Gender:Male
  • Location:Ottawa

So i think there are two main paths for AMD hackintoshing right now:

1) Focus on the Mountain Lion XNU itself. Before even trying to patch it for AMD, we must find out whether it requires any ssse3 routines to run, which ones, and how to change them, and only then apply the AMD-specific patches (unless we plan to abandon non-Bulldozer users like me, lol).

2) Focus on the Lion 10.7.4 XNU, since we already have a patched one for AMD. First make it run 64-bit flawlessly on AMD with Lion itself, by correcting the ssse3 calls in the commpage and the bcopy.s routine and perfecting the AMD patches (remember that even Bulldozer, ssse3-enabled CPUs have to boot with arch=i386, which won't work with ML). Only then try either to boot ML with it, or to generate a diff file by comparing it with ML's XNU and apply that diff, together with the AMD patches and the ssse3 corrections, as a patch to the ML XNU. (I tried something like that already, applying RAW's diff directly as a patch on the 12.0 XNU, and got an epic fail.)

Personally I'd go with plan 1 - making the kernel sse3 capable, whether via an on-the-fly emulator or manually translating the instructions (the first would be hard, while the second would be easy but rather tedious) then focus on AMD patching. meklort did say that an AMD binary patch in C wouldn't be too hard, after all.

By the way, I don't know if this helps or not, but netkas compiled a 32-bit XNU kernel for ML: http://rghost.net/39532549 - it KPs almost instantly because all ML kexts are 64-bit, though.

#43
theconnactic

theconnactic

    Stubborn AMD user

  • Local Moderators
  • 2,531 posts
  • Gender:Male

Personally I'd go with plan 1 - making the kernel sse3 capable, whether via an on-the-fly emulator or manually translating the instructions (the first would be hard, while the second would be easy but rather tedious) then focus on AMD patching. meklort did say that an AMD binary patch in C wouldn't be too hard, after all.


Pursuing your suggestion, i'm right now taking a look at the commpage of the 10.8.2 XNU (http://opensource.ap.../i386/commpage/). Let's see if i manage to figure a little bit of something out of this. :)

Whoa, very interesting! Instructions!


Here's what to do if you want to add a new routine to the comm page:
*
* 1. Add a definition for it's address in osfmk/i386/cpu_capabilities.h,
*    being careful to reserve room for future expansion.
*
* 2. Write one or more versions of the routine, each with it's own
*    commpage_descriptor. The tricky part is getting the "special",
*    "musthave", and "canthave" fields right, so that exactly one
*    version of the routine is selected for every machine.
*    The source files should be in osfmk/i386/commpage/.
*
* 3. Add a ptr to your new commpage_descriptor(s) in the "routines"
*    array in osfmk/i386/commpage/commpage_asm.s. There are two
*    arrays, one for the 32-bit and one for the 64-bit commpage.
*
* 4. Write the code in Libc to use the new routine.
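
Step 2 of the comment above is the interesting one for AMD: each version of a routine carries "musthave" and "canthave" masks, and exactly one version should match a given CPU. The following is only a minimal sketch of that selection idea, not the XNU code. The `routine_desc` struct, the `CAP_*` bit values, the variant names and `select_routine` are all made up for illustration; the real fields and flags live in the commpage sources and cpu_capabilities.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits (NOT the real cpu_capabilities.h values). */
enum {
    CAP_SSE2  = 1u << 0,
    CAP_SSE3  = 1u << 1,
    CAP_SSSE3 = 1u << 2
};

/* Simplified stand-in for a commpage_descriptor (illustrative fields only). */
typedef struct {
    const char *name;
    uint32_t musthave;  /* every one of these bits must be set on the CPU */
    uint32_t canthave;  /* none of these bits may be set on the CPU */
} routine_desc;

/* Three made-up variants of one routine; the masks are chosen so that
 * exactly one of them matches any combination of capability bits. */
static const routine_desc bcopy_variants[] = {
    { "bcopy_scalar", 0,                    CAP_SSE2  },
    { "bcopy_sse2",   CAP_SSE2,             CAP_SSSE3 },
    { "bcopy_ssse3",  CAP_SSE2 | CAP_SSSE3, 0         }
};

/* Return the first variant whose masks are compatible with caps, or NULL. */
static const routine_desc *select_routine(const routine_desc *table,
                                          size_t count, uint32_t caps)
{
    for (size_t i = 0; i < count; i++) {
        int has_all  = (caps & table[i].musthave) == table[i].musthave;
        int has_none = (caps & table[i].canthave) == 0;
        if (has_all && has_none)
            return &table[i];
    }
    return NULL;
}
```

With these made-up masks, an SSE3-only CPU (CAP_SSE2 | CAP_SSE3) would get the "bcopy_sse2" variant. If a table only contained variants whose musthave included the ssse3 bit, nothing would match on such a CPU, which is the kind of situation this thread is worried about.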


#44
bcobco

bcobco

    InsanelyMac Geek

  • Members
  • 142 posts
  • Gender:Male
  • Location:Argentina
good investigation job!
so, the problem is that the comm page has instructions only for intel SSSE3 (?)
adding more routines to the comm page will make xnu compatible with all x86_64 processors (?)

#45
theconnactic

theconnactic

    Stubborn AMD user

  • Local Moderators
  • 2,531 posts
  • Gender:Male
And these statements summarize the CPU instructions. Notice the switch statement: for those not familiar with C, the machine jumps straight to the "case" whose value matches cpu_info.vector_unit and runs the code from there. Because none of the cases ends with a "break" (that's what the "fall thru" comments point out), execution continues down through every case below the matching one. Each "bits |= kHasXXX" is a bitwise OR assignment: it switches on the kHasXXX flag in the variable "bits" and leaves the flags that are already set untouched. So a CPU reporting vector_unit 6 ends up with kHasSupplementalSSE3, kHasSSE3, kHasSSE2, kHasSSE and kHasMMX all set, while one reporting 5 gets everything from kHasSSE3 down, but not kHasSupplementalSSE3. If no case matches, the default just breaks and "bits" is left as it was.

This could mean nothing (i'll have to take a look at how "bits" is used afterwards). But it could be precisely here that the kernel records which instructions a given CPU supports! If it's the case (no pun intended :D), it suddenly comes to my mind that some of these "cases" might need to be changed, lol!

Anyway, this seems crucial. Now i really need the assistance of someone more skilled programming-wise than i am!


switch (cpu_info.vector_unit) {
    case 9:
        bits |= kHasAVX1_0;
        /* fall thru */
    case 8:
        bits |= kHasSSE4_2;
        /* fall thru */
    case 7:
        bits |= kHasSSE4_1;
        /* fall thru */
    case 6:
        bits |= kHasSupplementalSSE3;
        /* fall thru */
    case 5:
        bits |= kHasSSE3;
        /* fall thru */
    case 4:
        bits |= kHasSSE2;
        /* fall thru */
    case 3:
        bits |= kHasSSE;
        /* fall thru */
    case 2:
        bits |= kHasMMX;
    default:
        break;
}
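
To see what the fall-through in a switch like this does in practice, here is a small standalone sketch. The `HAS_*` constants and `caps_from_vector_unit` are made-up names and values for illustration, not the kernel's own; only the switch structure mirrors the excerpt.

```c
#include <stdint.h>

/* Made-up flag values; the kernel's kHas* constants are defined elsewhere. */
enum {
    HAS_MMX   = 1u << 0,
    HAS_SSE   = 1u << 1,
    HAS_SSE2  = 1u << 2,
    HAS_SSE3  = 1u << 3,
    HAS_SSSE3 = 1u << 4
};

/* Jump to the matching case, then fall through every case below it,
 * OR-ing one more flag into bits at each step. */
static uint32_t caps_from_vector_unit(int vector_unit)
{
    uint32_t bits = 0;

    switch (vector_unit) {
    case 6:
        bits |= HAS_SSSE3;
        /* fall through */
    case 5:
        bits |= HAS_SSE3;
        /* fall through */
    case 4:
        bits |= HAS_SSE2;
        /* fall through */
    case 3:
        bits |= HAS_SSE;
        /* fall through */
    case 2:
        bits |= HAS_MMX;
        /* fall through */
    default:
        break;
    }
    return bits;
}
```

In this sketch a value of 5 yields SSE3, SSE2, SSE and MMX but not SSSE3, and a value of 6 yields all five flags. Nothing is compared for inequality anywhere: the switch only accumulates flags.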

good investigation job!
so, the problem is that the comm page has instructions only for intel SSSE3 (?)
adding more routines to the comm page will make xnu compatible with all x86_64 processors (?)


Maybe! It seems that the commpage explicitly requires a lot of CPU instructions to be present! Now i need some experienced advice here. We're getting close to something, it seems. The whole code is well commented, which is good. Almost like the coders are giving guidance to those who wish to play with the code. :D

#46
SS01

SS01

    InsanelyMac Sage

  • Members
  • 265 posts
  • Gender:Male
  • Location:Ottawa
Something just occurred to me: this is likely a stupid question, but do any crucial system functions actually need ssse3? If not, the fix to this problem could be as simple as something like FakeSMC.kext, tricking OS X into thinking that everything is working.

EDIT: just noticed an ad for Opteron servers is at the top of this page while I'm writing this. I think AMD is hinting at something! ;)

#47
bcobco

bcobco

    InsanelyMac Geek

  • Members
  • 142 posts
  • Gender:Male
  • Location:Argentina
Intel® 64 and IA-32 Architectures Software Developer Manuals
http://www.intel.com...er-manuals.html



the list of SupplementalSSE3 instructions
http://en.wikipedia.org/wiki/SSSE3
http://en.wikipedia....uction_listings

what can be the solution?
1- take all calls to an unsupported instruction set and implement them with the available instruction sets
2- reimplement the part that calls the unsupported instruction using the available instructions
3- fork darwin os to make a fully x86 compatible xnu branch. look at linux, bsd, ... kernels: they have options for a lot of specific processors and instruction sets, with the same code (?)
4- think of more ideas, which are always welcome
5- switch to intel and party hard



for example (option 1-), the instructions PHADDW/PHADDD, which perform a Packed Horizontal Add (of Words or Doublewords):
they take registers A = [a0 a1 a2 …] and B = [b0 b1 b2 …] and output [a0+a1 a2+a3 … b0+b1 b2+b3 …]
a very trivial operation... this can perfectly well be implemented with other instructions.
obviously it will require a small sequence of instructions that will always be slower than a single instruction designed for this task
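
As a rough illustration of option 1, the PHADDW behaviour described above can be written in plain C without any SSSE3 instruction. This is only a sketch of the semantics for 128-bit operands (eight 16-bit words each); `phaddw_emulated` is a made-up name, and a real replacement inside the kernel would have to work on the actual registers, most likely with SSE2 shuffles and adds instead of a scalar loop.

```c
#include <stdint.h>

/* Scalar emulation of PHADDW on two 128-bit operands, each seen as eight
 * 16-bit words: out = [a0+a1, a2+a3, a4+a5, a6+a7, b0+b1, b2+b3, b4+b5, b6+b7].
 * Sums wrap around at 16 bits, like the instruction. */
static void phaddw_emulated(const int16_t a[8], const int16_t b[8],
                            int16_t out[8])
{
    int16_t tmp[8];  /* temporary, so out may alias a or b */

    for (int i = 0; i < 4; i++) {
        tmp[i]     = (int16_t)(uint16_t)((uint16_t)a[2 * i] + (uint16_t)a[2 * i + 1]);
        tmp[i + 4] = (int16_t)(uint16_t)((uint16_t)b[2 * i] + (uint16_t)b[2 * i + 1]);
    }
    for (int i = 0; i < 8; i++)
        out[i] = tmp[i];
}
```

For A = [1 2 3 4 5 6 7 8] and B = [10 20 30 40 50 60 70 80] this gives [3 7 11 15 30 70 110 150], matching the pattern in the post.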



i dont like CISC architecture

Edited by bcobco, 24 September 2012 - 06:58 AM.


#48
theconnactic

theconnactic

    Stubborn AMD user

  • Local Moderators
  • 2,531 posts
  • Gender:Male
Hey, bcobco!

In the end, when needed, we all turn ourselves into black magicians, huh?


I think this file, cpuid.c, is kinda an obligatory target for AMD patches - http://opensource.ap...mk/i386/cpuid.c

Take a look at one excerpt (notice that it is another switch statement):


cpuid_set_cpufamily(i386_cpu_info_t *info_p)
{
    uint32_t cpufamily = CPUFAMILY_UNKNOWN;

    switch (info_p->cpuid_family) {
    case 6:
        switch (info_p->cpuid_model) {
#if CONFIG_YONAH
        case 14:
            cpufamily = CPUFAMILY_INTEL_YONAH;
            break;
#endif
        case 15:
            cpufamily = CPUFAMILY_INTEL_MEROM;
            break;
        case 23:
            cpufamily = CPUFAMILY_INTEL_PENRYN;
            break;
        case CPUID_MODEL_NEHALEM:
        case CPUID_MODEL_FIELDS:
        case CPUID_MODEL_DALES:
        case CPUID_MODEL_NEHALEM_EX:
            cpufamily = CPUFAMILY_INTEL_NEHALEM;
            break;
        case CPUID_MODEL_DALES_32NM:
        case CPUID_MODEL_WESTMERE:
        case CPUID_MODEL_WESTMERE_EX:
            cpufamily = CPUFAMILY_INTEL_WESTMERE;
            break;
        case CPUID_MODEL_SANDYBRIDGE:
        case CPUID_MODEL_JAKETOWN:
            cpufamily = CPUFAMILY_INTEL_SANDYBRIDGE;
            break;
        case CPUID_MODEL_IVYBRIDGE:
            cpufamily = CPUFAMILY_INTEL_IVYBRIDGE;
            break;
        }
        break;
    }
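
The excerpt only has a case for family 6, so any CPU reporting another family value keeps CPUFAMILY_UNKNOWN. The sketch below shows that effect with a cut-down classifier. Everything in it is hypothetical: the `FAMILY_*` enum, `classify`, and especially `classify_with_fallback`, which just illustrates one conceivable kind of AMD patch (reporting a known family instead of "unknown"). It is an assumption for illustration, not a description of what any existing patched kernel does.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the CPUFAMILY_* constants. */
enum cpufamily_sketch {
    FAMILY_UNKNOWN = 0,
    FAMILY_PENRYN,
    FAMILY_NEHALEM
};

/* Cut-down version of the nested switch: only family 6 is recognised. */
static enum cpufamily_sketch classify(uint32_t family, uint32_t model)
{
    enum cpufamily_sketch result = FAMILY_UNKNOWN;

    switch (family) {
    case 6:
        switch (model) {
        case 23:
            result = FAMILY_PENRYN;
            break;
        case 26:  /* illustrative model number for the Nehalem group */
            result = FAMILY_NEHALEM;
            break;
        }
        break;
    }
    return result;
}

/* Hypothetical patch idea: pretend to be a known family when the CPU
 * is not recognised (for example an AMD part reporting family 0x10). */
static enum cpufamily_sketch classify_with_fallback(uint32_t family,
                                                    uint32_t model)
{
    enum cpufamily_sketch result = classify(family, model);
    return (result == FAMILY_UNKNOWN) ? FAMILY_PENRYN : result;
}
```

Whether reporting a fake family is safe depends on what the rest of the kernel does with the value, which this sketch does not model.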

#49
theconnactic

theconnactic

    Stubborn AMD user

  • Local Moderators
  • 2,531 posts
  • Gender:Male

Something just occurred to me: this is likely a stupid question, but do any crucial system functions actually need ssse3? If not, the fix to this problem could be as simple as something like FakeSMC.kext, tricking OS X into thinking that everything is working.

EDIT: just noticed an ad for Opteron servers is at the top of this page while I'm writing this. I think AMD is hinting at something! ;)


Not a stupid question at all! But the answer is not so simple. In Lion, as Dave Elliott's XNU dev. site points out, there is indeed only one crucial function that actually needs ssse3: the bcopy.s assembly routine, precisely the one that enables 64-bit support. That's exactly why our Athlon/Phenom CPUs, despite being 64-bit capable, cannot load the 64-bit kernel.

Would your workaround idea work for Lion? 1) Well, the kernel loads first, so by the time the patched kext loads and tries to bypass the check, the kernel will already have accessed all the commpage routines to get itself running. However, the kext could be a 32-bit one, since Lion still supports them, and it could try to reload the kernel, like some weird two-stage boot method. This leads to 2) the routine is itself the thing that makes 64-bit Lion tick. It's not a check. If bypassed, the system would hang anyway. Instead, we could write some kind of executable that runs at kernel level, like a kext, and reloads the kernel with 64-bit support enabled (which is different from loading a full 64-bit kernel, but would be something anyway). With that in prospect, honestly, translating the ssse3 instructions or even writing an on-the-fly emulator seems way easier and more feasible.

As for Mountain Lion, the problem is that the system as a whole is 64-bit: all the kexts were compiled for the EM64T (x86_64) architecture. Even if there's no routine written specifically to enable 64-bit support (which would be redundant, since it's a full 64-bit kernel; more probably there's a routine to enable 32-bit app support, probably the bcopy.s file in the commpage again), a workaround like the one you suggested would be almost impossible. The kernel loads itself first and has to run in 64-bit mode to load all the kexts, so the patched kext would itself have to be a 64-bit one; but if we could load any 64-bit kext, it would mean that we already had a running 64-bit kernel. Got it? Of course, we could try something like a two-stage boot again: first load the netkas 32-bit ML kernel, which loads our executable, which in turn forces 64-bit support onto a second kernel loaded immediately after, this time a patched AMD 64-bit kernel. Seems complicated? Yes, because it is indeed! We have to cross our fingers and hope that there's no crucial task, especially none related to 64-bit CPU support, requiring ssse3 instructions, or we'll face the same issues we have with Lion, with one difference: we would not have the option of a -legacy boot. It would be either emulate the ssse3 instructions, or translate the ssse3 routines to sse3, or not have Mountain Lion at all!

Now that i just saw Jon "Bones" Jones smashing Belfort just after having his arm almost broken, i'm back to the fight here. The XNU is almost breaking my brain, but i think it will submit in the end. ;)

#50
theconnactic

theconnactic

    Stubborn AMD user

  • Local Moderators
  • 2,531 posts
  • Gender:Male
Hi, folks!

It does appear that our Russian fellas are doing their job, too: http://www.applelife...-amd-cpu.37071/

At least two developers are working on it: one of them is Bronzovka, a.k.a. Dmitrik, one of the first to develop a successful Lion AMD kernel - which is, by the way, the core for the latest and greatest, from RAWX86.

#51
PookyMacMan

PookyMacMan

    InsanelyMac Legend

  • Moderators
  • 1,445 posts
  • Gender:Male
  • Location:Earth–Western Hemisphere, specifically
  • Interests:Computer science, engineering, trumpet performance, and a host of others. :D
Aha! So it is only bcopy! :D I remember reading that bcopy was the main kernel issue back in the Leopard days and the Pentium/vanilla kernel discussion. But, if it is truly the only culprit, it makes our job that much easier! :) So, here's what I believe needs to be done:

1. Remove all cpuid checks and CPU instruction checks! Particularly with the former, that is mandatory to get the kernel to boot without a blank, black screen.
2. Investigate bcopy.s and translate all SSSE3 instructions to their SSE3 equivalents (harder than the first :P). Once that's done, most likely (I believe) all the kernel issues will be taken care of; further issues would probably be fixed by binary patching, like in the olden days. :)

(BTW, anyone with an AMD system and the Lion kernel care to try the binary patching? Look up the binary patches used, apply them to the files necessary, and then see if any issues are fixed? A user earlier mentioned I think the most crucial one, the dyld patch.)

#52
theconnactic

theconnactic

    Stubborn AMD user

  • Local Moderators
  • 2,531 posts
  • Gender:Male

Aha! So it is only bcopy! :D I remember reading that bcopy was the main kernel issue back in the Leopard days and the Pentium/vanilla kernel discussion. But, if it is truly the only culprit, it makes our job that much easier! :) So, here's what I believe needs to be done:


Well, this is a big "if" anyway, but it's indeed a feasible prospect. :)

1. Remove all cpuid checks and CPU instruction checks! Particularly with the former, that is mandatory to get the kernel to boot without a blank, black screen.


Not to mention that, if there's no obstacle other than these checks, our work will be done. This would be the best of all scenarios, because it's something quite simple to do, but it's not likely to be that easy for us. Cross fingers...

2. Investigate bcopy.s and translate all SSSE3 instructions to their SSE3 equivalents (harder than the first :P). Once that's done, most likely (I believe) all the kernel issues will be taken care of; further issues would probably be fixed by binary patching, like in the olden days. :)


Yeah, here lies the trouble. Not only because of the intrinsic difficulty of the task, but because even finding the ssse3 instructions in order to get started is very hard, at least for me. That's why i asked for a more skilled hand to help me: when i examine the bcopy.s file of the Mountain Lion XNU, all i see is ordinary general-purpose instructions, like "mov", "add", etc. Okay, but maybe the ssse3 instructions don't exist at all, sending us to the best of all worlds, or maybe they're in another file or scattered across various files (the worst possible scenario). The concrete info i have is from Dave Elliott's page: the bcopy.s routine in the Lion XNU requires ssse3. But even in that XNU, i cannot find anything but the same general-purpose instructions, which means that they are either not there, not visible, or (the most probable case) i couldn't recognize one even looking at it directly because of my insufficient programming skills (i was expecting to find instructions like, say, PHSUBSW, but it obviously doesn't work like that).

(BTW, anyone with an AMD system and the Lion kernel care to try the binary patching? Look up the binary patches used, apply them to the files necessary, and then see if any issues are fixed? A user earlier mentioned I think the most crucial one, the dyld patch.)


Well, if you or somebody else know which binary patches exactly, link them to me here in this topic, tell me which are the necessary files i have to set as targets, and teach me how to apply a binary patch (i think FileMerge won't do this job, right?), i'll gladly do it. :)
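
On the "how to apply a binary patch" question: at its simplest, a binary patch is a find-and-replace on raw bytes of the same length, done on a copy of the file loaded into memory. The following is only a generic sketch of that mechanism. `patch_bytes` is a made-up helper, and no real patch patterns are implied; the actual byte sequences and target files would have to come from whoever published the patches.

```c
#include <stddef.h>
#include <string.h>

/* Replace every occurrence of find[0..plen) in buf[0..len) with
 * repl[0..plen). Both patterns have the same length, so the file size
 * and all offsets stay unchanged. Returns the number of replacements. */
static size_t patch_bytes(unsigned char *buf, size_t len,
                          const unsigned char *find,
                          const unsigned char *repl, size_t plen)
{
    size_t count = 0;
    size_t i = 0;

    if (plen == 0 || plen > len)
        return 0;

    while (i + plen <= len) {
        if (memcmp(buf + i, find, plen) == 0) {
            memcpy(buf + i, repl, plen);
            count++;
            i += plen;  /* skip past the bytes just written */
        } else {
            i++;
        }
    }
    return count;
}
```

A blind search like this can also hit bytes that are data and not code, so real patches are normally tied to specific offsets or longer, unique patterns, and are always applied to a backup copy first.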

#53
bcobco

bcobco

    InsanelyMac Geek

  • Members
  • 142 posts
  • Gender:Male
  • Location:Argentina
if bcopy is the thing, maybe looking at both of these gives some ideas...

bcopy.S from FreeBSD project
http://www.freebsd.o...te=1.13.2.1.4.1

bcopy.s from Apple
http://www.opensourc...mk/i386/bcopy.s



asm.h maybe has some things to look at

Edited by bcobco, 24 September 2012 - 06:47 AM.


#54
theconnactic

theconnactic

    Stubborn AMD user

  • Local Moderators
  • 2,531 posts
  • Gender:Male

if bcopy.s is the thing, maybe looking at this both gives some ideas...

bcopy.S from FreeBSD project
http://www.freebsd.o...te=1.13.2.1.4.1

bcopy.s from Apple
http://www.opensourc...mk/i386/bcopy.s


Thank you, bcobco. By the way, i forgot to tell you how useful that link to Intel's manuals was.

The bcopy.s from Apple is precisely the one i've been studying, trying to find where the dreaded forbidden instructions are and what they look like. But i'll take a look at the BSD one anyway (which is the base for Apple's). :)

#55
bcobco

bcobco

    InsanelyMac Geek

  • Members
  • 142 posts
  • Gender:Male
  • Location:Argentina

Thank you, bcobco. By the way, i forgot to tell you how useful was that link to Intel's manuals.

i have updated that post. it is now more legible, given its relevance to this task (SupplementalSSE3, SSE3, ...), and it points to the correct manuals page (before, it pointed only to volume 2, instructions M to Z; now it points to the full manuals).

#56
theconnactic

theconnactic

    Stubborn AMD user

  • Local Moderators
  • 2,531 posts
  • Gender:Male
Oh, i know, and i already downloaded the combo manual, with 3000+ pages :o

Thank you anyway!

#57
bcobco

bcobco

    InsanelyMac Geek

  • Members
  • 142 posts
  • Gender:Male
  • Location:Argentina
quote from intel developers manual (the pdf with 3020 pages)

12.7.2 Checking for SSSE3 Support
Before an application attempts to use the SSSE3 extensions, the application should follow the steps illustrated in Section 11.6.2, “Checking for SSE/SSE2 Support.” Next, use the additional step provided below:
• Check that the processor supports SSSE3 (if CPUID.01H:ECX.SSSE3[bit 9] = 1).


if CPUID.01H:ECX.SSSE3[bit 9] = 1
then mach_kernel
else legacy_kernel
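
The check quoted from the manual (CPUID leaf 01H, ECX bit 9) can be tried from user space. This is a minimal sketch assuming GCC or Clang on an x86 machine, where the `<cpuid.h>` header provides `__get_cpuid`. The names `CPUID1_ECX_SSSE3`, `ecx_reports_ssse3` and `cpu_has_ssse3` are made up here, and on other compilers or architectures the function simply reports that it cannot tell.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID 1
#else
#define SKETCH_HAVE_CPUID 0
#endif

/* CPUID.01H:ECX bit 9 is the SSSE3 feature flag. */
#define CPUID1_ECX_SSSE3 (1u << 9)

/* Pure helper: does this ECX value (from CPUID leaf 1) advertise SSSE3? */
static int ecx_reports_ssse3(uint32_t ecx)
{
    return (ecx & CPUID1_ECX_SSSE3) != 0;
}

/* Returns 1 if the running CPU reports SSSE3, 0 if it does not,
 * and -1 if CPUID leaf 1 cannot be queried in this build. */
static int cpu_has_ssse3(void)
{
#if SKETCH_HAVE_CPUID
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return -1;
    return ecx_reports_ssse3(ecx);
#else
    return -1;
#endif
}
```

A bootloader or installer script could use a result like this to choose between mach_kernel and a legacy kernel, which is the idea behind the invented pseudo-code in the post.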


#58
theconnactic

theconnactic

    Stubborn AMD user

  • Local Moderators
  • 2,531 posts
  • Gender:Male

quote from intel developers manual (the pdf with 3020 pages)


if CPUID.01H:ECX.SSSE3[bit 9] = 1
then mach_kernel
else legacy_kernel


Hi, @bcobco!

I'm not sure i understand your second quote. Where is this piece of code from?

@PookyMacMan, here's the bcopy.s routine from the x86_64 section (http://opensource.apple.com/source/xnu/xnu-1699.22.73/osfmk/x86_64/bcopy.s):


#include <i386/asm.h>

/* void *memcpy((void *) to, (const void *) from, (size_t) bcount) */
/* rdi, rsi, rdx */
/*
 * Note: memcpy does not support overlapping copies
 */
ENTRY(memcpy)
        movq    %rdx,%rcx
        shrq    $3,%rcx         /* copy by 64-bit words */
        cld                     /* copy forwards */
        rep
        movsq
        movq    %rdx,%rcx
        andq    $7,%rcx         /* any bytes left? */
        rep
        movsb
        ret

/* void bcopy((const char *) from, (char *) to, (unsigned int) count) */
/* rdi, rsi, rdx */

ENTRY(bcopy_no_overwrite)
        xchgq   %rsi,%rdi
        jmp     EXT(memcpy)

/*
 * bcopy(src, dst, cnt)
 * rdi, rsi, rdx
 * ws@tools.de (Wolfgang Solfrank, TooLs GmbH) +49-228-985800
 */
ENTRY(bcopy)
        xchgq   %rsi,%rdi
        movq    %rdx,%rcx

        movq    %rdi,%rax
        subq    %rsi,%rax
        cmpq    %rcx,%rax       /* overlapping && src < dst? */
        jb      1f

        shrq    $3,%rcx         /* copy by 64-bit words */
        cld                     /* nope, copy forwards */
        rep
        movsq
        movq    %rdx,%rcx
        andq    $7,%rcx         /* any bytes left? */
        rep
        movsb
        ret

/* ALIGN_TEXT */
1:
        addq    %rcx,%rdi       /* copy backwards */
        addq    %rcx,%rsi
        decq    %rdi
        decq    %rsi
        andq    $7,%rcx         /* any fractional bytes? */
        std
        rep
        movsb
        movq    %rdx,%rcx       /* copy remainder by 64-bit words */
        shrq    $3,%rcx
        subq    $7,%rsi
        subq    $7,%rdi
        rep
        movsq
        cld
        ret
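
For readers who don't read assembly, the bcopy routine above boils down to one decision: if the destination starts inside the source range, copy backwards, otherwise copy forwards. The following is a rough C rendering of that logic only. `bcopy_sketch` is a made-up name, and it copies byte by byte where the assembly uses 64-bit words plus a byte tail.

```c
#include <stddef.h>
#include <stdint.h>

/* Overlap-safe copy of n bytes from src to dst, following the same
 * test as the assembly: (dst - src) compared unsigned against n. */
static void bcopy_sketch(const void *src, void *dst, size_t n)
{
    const unsigned char *s = (const unsigned char *)src;
    unsigned char *d = (unsigned char *)dst;

    /* The unsigned difference is below n only when dst lies inside
     * [src, src + n), i.e. a forward copy would clobber unread bytes. */
    if ((uintptr_t)d - (uintptr_t)s < n) {
        while (n--)              /* copy backwards, last byte first */
            d[n] = s[n];
    } else {
        for (size_t i = 0; i < n; i++)   /* copy forwards */
            d[i] = s[i];
    }
}
```

Nothing in this logic needs more than plain integer instructions, which fits the observation that this particular file contains only general-purpose moves.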


Did you see any ssse3 instructions somewhere? Yeah, neither do i. Therefore:

1) They're not there, we're in deep trouble because they can be anywhere;

2) They're neither there nor anywhere, we just need to apply AMD patches, fix the CPU ID checks and we're good to go;

3) They're there, but we cannot see because... well, because we know not.

:)

#59
bcobco

bcobco

    InsanelyMac Geek

  • Members
  • 142 posts
  • Gender:Male
  • Location:Argentina

Hi, @bcobco!

I'm not sure if i understand your second quote. From where is this piece of code?

@PookyMacMan, here's the bcopy.s routine from the x86_64 section (http://opensource.apple.com/source/xnu/xnu-1699.22.73/osfmk/x86_64/bcopy.s):


that piece of code in the second quote is invented. i found
if CPUID.01H:ECX.SSSE3[bit 9] = 1
in Volume 1 Chapter 12.5 (page 266) of intel developers manual. But in that manual there are a lot of more things.

ps: please put code inside [ code ] [ /code ] marks to make posts easier to read.




is there a way to tell a compiler not to make use of specific set of instructions?
would be easier to tell the compiler "instead of SupplementalSSE3 please use SSE3 ok? i invite you a beer dude"
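
For C code, compilers do offer something like this: GCC and Clang accept -mssse3 / -mno-ssse3, and they define the macro `__SSSE3__` only when SSSE3 code generation is enabled. The sketch below shows a routine with an SSSE3 path and a plain fallback selected at compile time. `abs_bytes16` is a made-up example function. Note the limit of this approach: it only affects code the compiler generates, so hand-written assembly routines would still have to be translated by hand.

```c
#include <stdint.h>

#ifdef __SSSE3__
#include <tmmintrin.h>  /* SSSE3 intrinsics, only when the compiler allows them */
#endif

/* Absolute value of 16 signed bytes. Uses the SSSE3 PABSB instruction
 * when built with SSSE3 enabled, and a plain loop otherwise. Both paths
 * map -128 to 128 (0x80). */
static void abs_bytes16(const int8_t in[16], uint8_t out[16])
{
#ifdef __SSSE3__
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, _mm_abs_epi8(v));
#else
    for (int i = 0; i < 16; i++)
        out[i] = (uint8_t)(in[i] < 0 ? -in[i] : in[i]);
#endif
}
```

Built with -mno-ssse3 (or on a target where SSSE3 is not enabled by default), only the loop is compiled, and the result is the same.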

#60
PookyMacMan

PookyMacMan

    InsanelyMac Legend

  • Moderators
  • 1,445 posts
  • Gender:Male
  • Location:Earth–Western Hemisphere, specifically
  • Interests:Computer science, engineering, trumpet performance, and a host of others. :D

@PookyMacMan, here's the bcopy.s routine from the x86_64 section (http://opensource.apple.com/source/xnu/xnu-1699.22.73/osfmk/x86_64/bcopy.s):

Did you see any ssse3 instructions somewhere? Yeah, neither do i. Therefore:

1) They're not there, we're in deep trouble because they can be anywhere;

2) They're neither there nor anywhere, we just need to apply AMD patches, fix the CPU ID checks and we're good to go;

3) They're there, but we cannot see because... well, because we know not.

:)

I have a feeling that it's #3. But I hope it's 2. :)

See if you can contact meklort on IRC (server irc.osx86.hu, channel will be either #lion or #mountainlion), if he's not there maybe someone else (conti or nawcom maybe) can help you out. :)









© 2014 InsanelyMac