
Lion kernel testing on AMD (don't ask for help here: use the Help Topic)


605 replies to this topic

#81
Andy Vandijck


    InsanelyMac Deity

  • Coders
  • 1,614 posts
  • Gender:Male
  • Location:Tienen
  • Interests:Programming stuff for Mac OS X...
    Hacking...
    Hard rock (also really big Metallica...
Like I said: I don't think it is the kernel...
It's weird...

#82
theconnactic


    Stubborn AMD user

  • Local Moderators
  • 2,894 posts
  • Gender:Male
Hi, Andy!

I'm also convinced it's not the kernel itself: it boots just fine. Something is missing in the kernel that prevents userland processes from spawning in 64-bit mode on AMD machines. I used to think it was an SSSE3-related issue, thanks to an old paper written by David Elliott (dfe), but we have SSSE3 emulation now, so what else is it? I still think it's a CPUID issue elsewhere in the kernel that's preventing us from loading the userland.

The obvious thing is to investigate kern_exec.c and mach_loader.c (and .h), but there's no reason the CPUID issue cannot occur elsewhere and prevent the userland from running, even if the kernel boots fine. That was the issue Sinetek dealt with to make his 64-bit Snow Leopard kernel a winner, and he couldn't repeat his success with Lion, just like us.
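
For anyone who wants to see what CPUID reports on their own machine, here is a minimal userland sketch in C. It is not kernel code and not how XNU does its checks: the demo_* names are made up for illustration, and it assumes a GCC or Clang toolchain on x86 (for the cpuid.h header) and a little-endian host. It only decodes two well-known facts: leaf 0 returns the vendor string in EBX, EDX, ECX, and leaf 1 reports SSSE3 in ECX bit 9.

```c
#include <stdint.h>
#include <string.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header providing __get_cpuid() */
#endif

/* CPUID leaf 1 reports SSSE3 support in ECX bit 9. */
int demo_has_ssse3(uint32_t leaf1_ecx)
{
    return (int)((leaf1_ecx >> 9) & 1u);
}

/* CPUID leaf 0 returns the 12-byte vendor string in EBX, EDX, ECX (in that order). */
void demo_vendor_from_regs(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    memcpy(out, &ebx, 4);
    memcpy(out + 4, &edx, 4);
    memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
}

/* Hypothetical helper: 1 on an AMD CPU without native SSSE3,
 * 0 otherwise, -1 if CPUID cannot be queried on this build. */
int demo_amd_without_ssse3(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    char vendor[13];

    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return -1;
    demo_vendor_from_regs(ebx, edx, ecx, vendor);

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return -1;
    return strcmp(vendor, "AuthenticAMD") == 0 && !demo_has_ssse3(ecx);
#else
    return -1;
#endif
}
```

Calling demo_amd_without_ssse3() from a small test program tells you whether the machine is one of the AMD CPUs that has to rely on SSSE3 emulation. It says nothing about which other CPUID-dependent paths the kernel takes.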

Thank you for the time you're investing in it.

Hey, Delta! Your debug version is very promising. I think it can be made even more accurate. Say you did something like this:

{
printf("exec_add_user_string() started\n");
int error = 0;

I think it's cool to know when each function starts, but it would be even better to know which value each one returns, which task it actually performs, or what each statement results in. Something like this:

/* log before returning: anything placed after return never runs */
printf("the xxxxxx function returned the value %d\n", error);
return error;
}
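
A rough sketch of how that could be done without hand-writing a printf for every function: two small macros that log the function name on entry and the value on return. TRACE_ENTER, TRACE_RETURN and demo_add_user_string are hypothetical names for illustration, not anything that exists in XNU, and the function body is a stand-in rather than the real exec_add_user_string().

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical tracing macros: __func__ expands to the enclosing function's name. */
#define TRACE_ENTER() printf("%s() started\n", __func__)

/* Evaluate the value once, log it, then return it. */
#define TRACE_RETURN(val) do {                                   \
        int trace_rv_ = (val);                                   \
        printf("%s() returned %d\n", __func__, trace_rv_);       \
        return trace_rv_;                                        \
    } while (0)

/* Stand-in for a kernel routine; the logic here is invented for the demo. */
int demo_add_user_string(const char *str)
{
    TRACE_ENTER();

    if (str == NULL)
        TRACE_RETURN(EINVAL);

    TRACE_RETURN(0);
}
```

With this, every exit path prints its return value, so a non-zero error shows up in the log next to the function that produced it.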

By the way, there's lots of good info already in the debug version you made. Notice this:


goto bad_notrans; - 1
goto bad_notrans; - 2

exec_check_permissions() started
pal_kernel_announce() started
goto bad; - 1
calling mountroot_post_hook
calling mountroot_post_hook (again)
bsd_init() done?
goto bad; - 2
goto bad; - 3

in the for loop now...
exec_mach_imgact() started
in the for loop now...
exec_fat_imgact() started
goto bad; - 3
in the for loop now...
exec_mach_imgact() started
exec_add_user_string() started
exec_apple_strings() started
exec_add_user_string() started
exec_add_user_string() started
Setting security token
goto again; - 1
bad:proc_transend(p, 0);
bad_notrans:
returning error
check_for_signature() started
skipping KERN_FAILURE
proc_lock(p)
proc_unlock(p)
switch_protect
Err: 0
end of bsdinit_task()?


I would build a version myself with the suggestions I made, but my Xcode suddenly stopped working, so I'll have to reinstall everything here.

Thank you all guys for your effort!

#83
Deltac0


    InsanelyMac Sage

  • Members
  • 263 posts
  • Gender:Male
  • Location:Finland
  • Interests:Caffeine, OS X, AMD Hackintosh

Hi, people!

theconnactic said:
It's not the kernel itself: it boots just fine. Something is missing in the kernel that prevents userland processes from spawning in 64-bit mode on AMD machines. I used to think it was an SSSE3-related issue, thanks to an old paper written by David Elliott (dfe), but we have SSSE3 emulation now, so what else is it? I still think it's a CPUID issue elsewhere in the kernel that's preventing us from loading the userland. The obvious thing is to investigate kern_exec.c and mach_loader.c (and .h), but there's no reason the CPUID issue cannot occur elsewhere and prevent the userland from running, even if the kernel boots fine. That was the issue Sinetek dealt with to make his 64-bit Snow Leopard kernel a winner, and he couldn't repeat his success with Lion, just like us.


Yes, that is our problem. However, I think I've investigated the whole of kern_exec.c, and it looks like it runs just fine.
I'll take a look at mach_loader.c later. :)

#84
theconnactic

Delta, I edited my post: take time to read it before doing anything with mach_loader, if you can.

I think kern_exec.c is not running as it should: it's returning errors (the "bad" function) where it should not. I think we should investigate why it's acting like that and correct the issues. Only after that should we focus on another file. Or maybe solving these issues will necessarily take us to mach_loader.c or some other file, who knows?

#85
Deltac0


theconnactic said:
Delta, I edited my post: take time to read it before doing anything with mach_loader, if you can.

I think kern_exec.c is not running as it should: it's returning errors (the "bad" function) where it should not. I think we should investigate why it's acting like that and correct the issues. Only after that should we focus on another file. Or maybe solving these issues will necessarily take us to mach_loader.c or some other file, who knows?


Thanks for the great idea! I'll add return values and remove some unnecessary info... :)
Will post another diff & kernel soon! :)

EDIT: And btw, for example:
goto bad_notrans; - 1

means we got PAST the first goto bad_notrans; :D
I should have made the messages a bit clearer... :D

EDIT2: lion-test-21 compiled: http://www.solidfile...m/d/fcf9be63ed/
Diff coming soon... :)

Diff: http://www.solidfile...m/d/ce042eba5a/

#86
theconnactic


Deltac0 said:
means we got PAST goto bad_notrans; (first of them) :D


Delta, don't you see? These "bad" functions aren't supposed to be reached at all! If we're getting past them, it means the errors that trigger them are happening. They should've been skipped altogether. Still, take a look at the code: the "bad" function won't hang all processes at the scene of the crime. Its output, though, could perhaps prevent some important process from running later.

About the newest debug kernel: I'm going to test it now.

#87
Deltac0


theconnactic said:
Delta, don't you see? These "bad" functions aren't supposed to be reached at all! If we're getting past them, it means the errors that trigger them are happening. They should've been skipped altogether. Still, take a look at the code: the "bad" function won't hang all processes at the scene of the crime. Its output, though, could perhaps prevent some important process from running later.

About the newest debug kernel: I'm going to test it now.


Ahh, now I get it... xD
It goes to bad and bad_notrans at some point... Needs more debugging. :D

#88
theconnactic

P.S.: No, I'm not suggesting that we artificially skip them or remove them from the source. Instead, they tell us about issues that are happening, so we'd better take a look at them and fix them, and hopefully that will get us one step further. Too bad my Xcode is screwed.

#89
Deltac0


theconnactic said:
P.S.: No, I'm not suggesting that we artificially skip them or remove them from the source. Instead, they tell us about issues that are happening, so we'd better take a look at them and fix them, and hopefully that will get us one step further. Too bad my Xcode is screwed.


Yea, it's bad to just skip them... Like we tried with the EACCES error... However, I can't even get that far anymore... :D


EDIT: Too bad that "bad" doesn't take any arguments; the code just jumps to it from somewhere... I added some messages to find the exact point, like this:
if (--iterlimit == 0) {
    printf("Going to bad (4)\n");
    error = EBADEXEC;
    goto bad;
}
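
Since a label can't take arguments, one way to avoid numbering every message by hand is to record the source line of the jump before taking it. The sketch below is only an illustration under assumptions: GOTO_BAD, last_bad_line and demo_exec are hypothetical names, the checks are invented, and ENOEXEC/ENOENT stand in for whatever error the real code would set (EBADEXEC is Darwin-specific, so it is avoided here to keep the example portable).

```c
#include <errno.h>
#include <stdio.h>

/* Line of the last "goto bad" that was taken; 0 means none was taken. */
int last_bad_line = 0;

/* Hypothetical macro: relies on a local "int error" and a "bad:" label
 * existing in the function that uses it. */
#define GOTO_BAD(err) do {              \
        error = (err);                  \
        last_bad_line = __LINE__;       \
        goto bad;                       \
    } while (0)

/* Stand-in for an exec-style routine with several failure exits. */
int demo_exec(int iterlimit, int has_image)
{
    int error = 0;

    last_bad_line = 0;

    if (--iterlimit == 0)
        GOTO_BAD(ENOEXEC);

    if (!has_image)
        GOTO_BAD(ENOENT);

bad:
    if (last_bad_line != 0)
        printf("bad: reached via goto at line %d, error=%d\n",
               last_bad_line, error);
    return error;
}
```

Each jump site then identifies itself by file line, and new goto bad sites added later are covered without renumbering the messages.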


lion-test-22: http://www.solidfile...m/d/e5a4e695a6/

#90
theconnactic

Even more important would be knowing if and where else the outputs of the "bad" functions are used. Are the "bad" functions being called somewhere else?

#91
Deltac0


theconnactic said:
Even more important would be knowing if and where else the outputs of the "bad" functions are used. Are the "bad" functions being called somewhere else?


I think the "bad" functions are just like functions inside another function. Like if the "main" function does something wrong -> the code skips to the "bad" part of the function.
The "bad" function I'm trying to figure out is located in kern_exec.c -> load_init_program() (the function that calls launchd).
I added those messages to all (gotta double-check) "goto bad;" parts, but it still goes to bad without giving me any of those "going to bad (x)" messages, so it must be called from outside?
This is damn weird... :D

EDIT: I'm sorry, I meant the exec_activate_image() function... :D Not load_init_program().

EDIT2: And the return of the "bad" function is just like the return of its main function? That's how I understand it.

EDIT3: I gotta go now, I'll be back in few hours. :D

#92
theconnactic

Thank you, Delta!

Andy, any idea how relevant this "bad" function could be? I'm looking at the source and found it nowhere but in kern_exec.c. Perhaps the search tool here is malfunctioning...?

Deltac0 said:
EDIT2: And the return of the "bad" function is just like the return of its main function? That's how I understand it.


Maybe the main function returns the value of "bad" when certain conditions are not met. So when the main function is called elsewhere, it will pass along the value of "bad", and perhaps this would hang the processes.

#93
Deltac0


theconnactic said:
Maybe the main function returns the value of "bad" when certain conditions are not met. So when the main function is called elsewhere, it will pass along the value of "bad", and perhaps this would hang the processes.


Exactly what I was thinking. It must be done this way...

If only we could build a verbose launchd? Or something to see if the code even tries to run it?

Okay, new kernel. This one has more specific debug messages about those "bad" functions like this:
bad:
printf("We are in bad of exec_mach_imgact()\n");
return(error);
}

lion-test-23: http://www.solidfile...m/d/9d673a70ae/



EDIT: How is this possible? The kernel seems to execute most (if not all) of the "bad" functions... Still needs some more work.


EDIT2: Meklort shared his wisdom in IRC... Bad functions will be executed. The problem is somewhere else... Or something.
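
Meklort's remark matches how goto labels behave in C: "bad:" is just a position in the function, so when nothing jumps to it, execution still falls through it on the way to the return. A minimal sketch, assuming a simplified function shaped like the ones discussed here (demo_imgact and bad_label_hits are made-up names, and ENOEXEC is only a placeholder error):

```c
#include <errno.h>
#include <stdio.h>

/* Counts how many times the code under the bad: label has run. */
int bad_label_hits = 0;

/* Stand-in for an image activator: fails only when told the image is invalid. */
int demo_imgact(int image_is_valid)
{
    int error = 0;

    if (!image_is_valid) {
        error = ENOEXEC;
        goto bad;              /* failure path: jump straight to the label */
    }

    /* ... normal work would happen here ... */

bad:                           /* success path falls through to here as well */
    bad_label_hits++;
    printf("We are in bad of demo_imgact(), error=%d\n", error);
    return error;
}
```

Both demo_imgact(1) and demo_imgact(0) print the "We are in bad" line. Only the value of error tells the two paths apart, which is why seeing those messages does not by itself mean anything failed.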

#94
Deltac0

We have some kind of progress, maybe...
If you're running AMD SL / Lion (or secretly even ML):

1. Download this: http://www.solidfile...m/d/428fa4efbc/
2. sudo su in terminal
3. chmod +x tiny
4. ./tiny
5. Post here what happened.

#95
Andy Vandijck


Deltac0 said:
We have some kind of progress, maybe...
If you're running AMD SL / Lion (or secretly even ML):

1. Download this: http://www.solidfile...m/d/428fa4efbc/
2. sudo su in terminal
3. chmod +x tiny
4. ./tiny
5. Post here what happened.

What does this do?

#96
theconnactic

Hi, Andy!

It creates a static Mach-O executable (that is, one that does not use dyld).

We intend to replace launchd with it, to see what effect that has.

This binary must also be able to run on an AMD machine, otherwise the experiment is DOA.
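
To double-check that a test binary really is independent of dyld, one can look for an LC_LOAD_DYLINKER load command in its Mach-O header. The following is a minimal sketch under stated assumptions: it handles only thin (non-fat) 32-bit and 64-bit Mach-O images, assumes the host byte order matches the file (little-endian x86), and uses made-up DEMO_*/demo_* names with the constant values from the Mach-O format rather than including Apple's headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MH_MAGIC          0xfeedfaceu  /* 32-bit Mach-O magic */
#define DEMO_MH_MAGIC_64       0xfeedfacfu  /* 64-bit Mach-O magic */
#define DEMO_LC_LOAD_DYLINKER  0xeu         /* load command naming the dynamic linker */

/* Returns 1 if the image asks for a dynamic linker, 0 if it does not,
 * and -1 if the buffer is not a thin Mach-O image this sketch can parse. */
int demo_needs_dyld(const uint8_t *buf, size_t len)
{
    uint32_t magic, ncmds, cmd, cmdsize, i;
    size_t off;

    if (len < 28)                      /* 32-bit header is 7 x uint32_t */
        return -1;
    memcpy(&magic, buf, 4);
    if (magic == DEMO_MH_MAGIC)
        off = 28;
    else if (magic == DEMO_MH_MAGIC_64)
        off = 32;                      /* 64-bit header adds a reserved field */
    else
        return -1;
    if (len < off)
        return -1;

    memcpy(&ncmds, buf + 16, 4);       /* ncmds is the fifth header field */

    for (i = 0; i < ncmds; i++) {
        if (len - off < 8)             /* each command starts with cmd, cmdsize */
            return -1;
        memcpy(&cmd, buf + off, 4);
        memcpy(&cmdsize, buf + off + 4, 4);
        if (cmdsize < 8 || cmdsize > len - off)
            return -1;
        if (cmd == DEMO_LC_LOAD_DYLINKER)
            return 1;
        off += cmdsize;
    }
    return 0;
}
```

Reading the file into a buffer and passing it to demo_needs_dyld() should give 0 for a static executable like the one discussed here and 1 for a normal dynamically linked binary.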

Best regards.

#97
Deltac0


Andy Vandijck said:
What does this do?


It's just an ultra-small Mach-O executable:
http://osxbook.com/b...h-o-executable/

nicertiny.asm.
Meklort told us to use it to test whether the kernel starts launchd. It doesn't need dyld, so it rules dyld out... :)
But I get "Illegal instruction" when running nicertiny on my AMD...

I changed /sbin/launchd to /tiny in the source, put tiny in the root of the HDD and booted. I got a panic! :)

#98
Andy Vandijck

Good plan... then we can see if it is dyld :)

#99
Deltac0


Andy Vandijck said:
Good plan... then we can see if it is dyld :)


But we both get "Illegal instruction" when trying to run nicertiny...
I tried to boot with it: panic... Most likely somehow related to the illegal instruction we get when it is run from the terminal.
But now we know that the kernel DOES start launchd. :)
Probably something about dyld.

#100
theconnactic

64-bit kernel, Delta?

Maybe it's just the dyld indeed... that would be good news.




