
New Driver for Realtek RTL8111


Mieze
1,592 posts in this topic

netstat.txt

 

Here is a quick look at my netstat output with the new driver. I'm back on the lnx2mac one now. I tried the debug version and the upload speed is still very slow, while the download speed is normal.

 

I took a look at my system.log and found no info regarding Realtek, only that a link has been established at 100 Mbps and that I have EEE support on the controller. No errors, nothing.

Hello RVXTM,

 

the netstat output shows evidence of significant packet loss, as the number of retransmissions compared to the total number of packets is extremely high:

 

tcp:
80345 packets sent
3764 data packets (6492888 bytes)
2589 data packets (1618611 bytes) retransmitted

 

This would also explain the low upload speed. The log messages are OK.
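To put a number on the counters quoted above, here is a minimal C++ sketch. The struct and function names are made up for illustration and are not part of any driver or of netstat itself. With the reported values it gives roughly 69 retransmitted data packets per 100 data packets sent, and about 25% when measured in bytes.

```cpp
#include <cassert>
#include <cstdint>

// Counters copied by hand from the "tcp:" section of a netstat dump.
// The struct itself is hypothetical, it only groups the four numbers.
struct TcpSendCounters {
    std::uint64_t dataPackets;           // "data packets"
    std::uint64_t dataBytes;             // "(... bytes)" next to it
    std::uint64_t retransmittedPackets;  // "data packets ... retransmitted"
    std::uint64_t retransmittedBytes;    // "(... bytes)" next to it
};

// Retransmitted data packets relative to data packets sent (0.0 if none were sent).
inline double retransmitPacketRatio(const TcpSendCounters& c) {
    return c.dataPackets == 0
               ? 0.0
               : static_cast<double>(c.retransmittedPackets) /
                     static_cast<double>(c.dataPackets);
}

// The same ratio measured in bytes instead of packets.
inline double retransmitByteRatio(const TcpSendCounters& c) {
    return c.dataBytes == 0
               ? 0.0
               : static_cast<double>(c.retransmittedBytes) /
                     static_cast<double>(c.dataBytes);
}

// Usage with the values from the netstat output in this post.
inline void exampleRetransmitCheck() {
    const TcpSendCounters reported{3764, 6492888, 2589, 1618611};
    assert(retransmitPacketRatio(reported) > 0.5);  // far above a healthy link
}
```

On a healthy wired link this ratio is normally close to zero, which is why the value here points at packet loss rather than at a slow peer.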

 

Unfortunately you are the first user with an RTL8111C (RTL8168C/8111C: (Chipset 5)) to test the driver, so I have no experience with that chip, but the issue reminds me of a report from a user with an MSI Z77MA-G45. Maybe you should take a look at this: http://www.tonymacx8...html#post556468

 

I would suggest disabling checksum offload as he did to see if it helps. If it doesn't, check the network statistics of the machine on the other side, and look out for bad packets in particular. Using Wireshark to create a packet dump might also be helpful.

 

By the way, are you using the driver on the GA EP-45-EXTREME like your signature suggests? Are both ports connected to the switch? What is their configuration (DHCP, static IP or ...)?

 

Mieze

I am using that MB, only port 0 connected to a router with DHCP ip allocations.

Ok, the interesting question is where did all the lost packets go to and why do they get lost? Did you follow the installation instructions?

 

Mieze


Have an issue: when I boot to Windows and then restart into OSX (just restart, warm reboot, no shutdown), then net is not working any more. Requires shutdown and fresh start to get it working again.

 

Windows -> restart into OSX - not working

Windows -> restart into Ubuntu -> restart into OSX - working

OSX (when net is working) -> restart into OSX - still working

 

Logs: DebugLogs.zip

When not working:

ip:
543 total packets received
327 bad header checksums

Hello dmazar,

 

thanks for your feedback. The extremely high number of packets with bad header checksums probably indicates that checksum offload isn't working properly after you have used Windows. As all drivers (Windows, Linux and OS X) load firmware into the NIC, I assume that the Windows driver leaves the NIC with settings that are incompatible with the OS X driver and that don't get replaced completely. Unfortunately the firmware has been provided by Realtek without any documentation, so there is little I can do to resolve the issue. I guess the problem doesn't show up when you shut down the system after using Windows and do a cold boot into OS X?

 

Mieze


Yes, cold boot solves the problem. Or ... booting into Linux and then warm reboot into OSX also does the trick. Looks like driver in my Ubuntu is able to "fix" after Windows.

At least we have a workaround that is really easy to apply. I will add this information to the troubleshooting section.

 

Mieze


Did some more tests ...

 

If I am in Windows and then shutdown, wait few seconds and turn the comp on and boot to OSX -> it does not work. I have to shut it down from OSX again and then start into OSX and then it works.

 

Tested sequences from Windows:

1. Windows -> soft restart into Ubuntu -> soft restart into OSX, net works

2. Windows -> soft restart into OSX (net does not work) -> soft restart into OSX, net does not work

3. Windows -> soft restart into OSX (net does not work) -> shutdown and start into OSX, net works

4. Windows -> shutdown and start into OSX (net does not work) -> shutdown and start into OSX, net works

 

From sequences 3 and 4: Simple shutdown from Windows is not enough. Your driver needs to be started on my controller twice. At first boot (warm or cold) it does something, but not enough. Second cold restart (shutdown/start) is needed, and then it works fine.

 

Is there anything that can be learned from this and done?

 

Plus, it looks to me that the same thing is happening when removing some other driver from OSX and installing this one - required several restarts/shutdowns to get it working after install.


Finally I can use WOL !!

My system have RTL8111E (GA X58A-UD3R motherboard) chip, it works perfectly now.

 

That sounds really promising since I have the same chip onboard!

 

@ Mieze

 

Thank you for this driver, can't wait to test it when I'm back at my hack next weekend...

I got a bunch of warnings about "unused variable flags" when I compiled for 10.7 with Xcode 4.5.2.

Should I use 4.4.1 or is there nothing to worry about?

No need to worry. I will comment out those lines in the next release to get rid of the warnings.

 

Mieze


Did some more tests ...

 

If I am in Windows and then shutdown, wait few seconds and turn the comp on and boot to OSX -> it does not work. I have to shut it down from OSX again and then start into OSX and then it works.

 

Tested sequences from Windows:

1. Windows -> soft restart into Ubuntu -> soft restart into OSX, net works

2. Windows -> soft restart into OSX (net does not work) -> soft restart into OSX, net does not work

3. Windows -> soft restart into OSX (net does not work) -> shutdown and start into OSX, net works

4. Windows -> shutdown and start into OSX (net does not work) -> shutdown and start into OSX, net works

 

From sequences 3 and 4: Simple shutdown from Windows is not enough. Your driver needs to be started on my controller twice. At first boot (warm or cold) it does something, but not enough. Second cold restart (shutdown/start) is needed, and then it works fine.

 

Is there anything that can be learned from this and done?

 

As the chip supports WoL it uses standby power so that it won't be off completely until you pull the plug off the wall or flick the PSU's switch. Maybe it's a firmware related problem?

 

Plus, it looks to me that the same thing is happening when removing some other driver from OSX and installing this one - required several restarts/shutdowns to get it working after install.

 

I suspected that the lnx2mac driver also causes problems when you switch over to my driver. This might be a firmware issue, but it could just as well be that the driver left something in the system preferences that is the reason for the strange behavior. As my driver is the only Realtek driver for OS X that makes use of the chip's advanced features (checksum offload and TCP segmentation offload) in order to improve performance, its interaction with the network stack is far more complex.

 

Here is another funny thing I discovered during my tests. I removed the lnx2mac driver from my test system and installed my driver. Although it was working, I noticed that https connections to Apple websites, e.g. iCloud, App Store, iTunes Store and developer.apple.com, stopped working. The strange thing was that replies to connection requests from those servers were considered by the NIC to have a bad IP header checksum, but everything else, including https connections to other servers, was working flawlessly. Ironically the problem disappeared after I wiped the disk and reinstalled OS X and my driver. This time everything was working fine.

 

After all I came to the conclusion that rx checksum offload is responsible for many of the known problems. In the next release I will address this issue and change the way received packets are handled when checksum verification in hardware failed. Instead of marking these packets as bad, they will be considered as unchecked letting the network stack repeat verification in software. So far this strategy seems to work without speed impacts and might even be a practical solution for boards with broken NICs like the MSI Z77MA-G45.
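For readers wondering what "repeat verification in software" involves for the IP header, here is a minimal C++ sketch of the standard IPv4 header checksum test (one's-complement sum over the header, as described in RFC 1071). It is not code from the driver or from the OS X network stack. The function names and the sample header bytes are illustrative assumptions and do not come from the logs in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Folded 16-bit one's-complement sum over `len` bytes, read as big-endian words.
inline std::uint16_t onesComplementSum(const std::uint8_t* data, std::size_t len) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i + 1 < len; i += 2) {
        sum += (static_cast<std::uint32_t>(data[i]) << 8) | data[i + 1];
    }
    if (len % 2 != 0) {
        sum += static_cast<std::uint32_t>(data[len - 1]) << 8;  // pad odd byte
    }
    while ((sum >> 16) != 0) {
        sum = (sum & 0xffffu) + (sum >> 16);  // fold carries back in
    }
    return static_cast<std::uint16_t>(sum);
}

// An IPv4 header is intact when the sum over the whole header,
// including the stored checksum field, folds to 0xffff.
inline bool ipv4HeaderChecksumOk(const std::uint8_t* packet, std::size_t len) {
    if (len < 20) return false;  // shorter than a minimal header
    const std::size_t headerLen = static_cast<std::size_t>(packet[0] & 0x0f) * 4;
    if (headerLen < 20 || headerLen > len) return false;
    return onesComplementSum(packet, headerLen) == 0xffff;
}

// Illustrative 20-byte header with a valid checksum (0xb861), not from this thread.
static const std::uint8_t kSampleHeader[20] = {
    0x45, 0x00, 0x00, 0x73, 0x00, 0x00, 0x40, 0x00, 0x40, 0x11,
    0xb8, 0x61, 0xc0, 0xa8, 0x00, 0x01, 0xc0, 0xa8, 0x00, 0xc7};

inline void exampleHeaderCheck() {
    assert(ipv4HeaderChecksumOk(kSampleHeader, sizeof(kSampleHeader)));
}
```

The check only touches the 20 to 60 header bytes, so running it in software for the occasional packet that the hardware rejected costs very little, which fits the observation that the new strategy has no noticeable speed impact.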

 

Mieze

 

Tested Slice's version: http://www.insanelym...20#entry1900418

Same issue, but slightly worse regarding Windows.

That's funny! I've been in a personal conversation with Slice during the last few days. He was trying to convince me that my driver has a power management issue causing this kind of trouble and that he had already found a solution for his driver. Obviously the issue isn't related to power management at all, but as we both started with Realtek's Linux driver 8.035.00, it's no wonder that both drivers are affected. Anyway, thanks for the information. At least we now know that PM is not the place to look for a solution.

 

Mieze

Glad you were able to figure out this bug! Let me know if you need any help testing

Link to comment
Share on other sites

As the chip supports WoL it uses standby power so that it won't be off completely until you pull the plug off the wall or flick the PSU's switch. Maybe it's a firmware related problem?

Tested this: booted to Windows, then shutdown, then unplugged comp from power for 10-15 secs, then started OSX - and all the same as before, net is connected, but not usable. Required another shutdown and start into OSX to get it working.

 

I'm willing to try to compare the initialization with the r8169 Linux driver (https://github.com/t...realtek/r8169.c). What should I start with, or what should I compare? Carefully go through the whole init process, or go straight to rtl8168_hw_phy_config()? r8169 identifies my card as RTL_GIGA_MAC_VER_33, uses rtl8168e_1_hw_phy_config() and uses FIRMWARE_8168E_2 (rtl_nic/rtl8168e-2.fw, I have it from Ubuntu). What is the chance of bricking my controller by experimenting with this?

 

EDIT: Just an update: shutdown after Windows and unplugging the power and ethernet cable and waiting for 30 secs did the trick. Next boot to OSX resulted in working net.

Hello dmazar,

 

you'll have a hard time trying to track down the error, in particular because Realtek's 8.035.00 driver doesn't separate the firmware from the code at all. Everything is packed into that giant function rtl8168_hw_phy_config. The chance of bricking the NIC can't be ruled out and we don't know what the firmware does.

 

But I have a better idea. You could get a copy of Realtek's current Linux driver (version 8.035.00) and test it under Linux. http://218.210.127.1...3&GetDown=false

If it shows the same issue with regard to Windows as under OS X then you could contact their technical support and hopefully they will fix it for us.

 

Can you provide me with two debug logs? One when the network is working fine, and one after you have rebooted from Windows and the network is dead. In the latter case please also open Network Utility and watch the number of packets transferred, as these counters are updated by a hardware statistics dump of the NIC. This allows us to take a look at the NIC's internal state.

 

Mieze


you'll have a hard time trying to track down the error, in particular because Realtek's 8.035.00 driver doesn't separate the firmware from the code at all. Everything is packed into that giant function rtl8168_hw_phy_config. The chance of bricking the NIC can't be ruled out and we don't know what the firmware does.

Well, that was kind of naive of me. As if I could jump in, change a few registers and get it working. :)

 

But I have a better idea. You could get a copy of Realtek's current Linux driver (version 8.035.00) and test it under Linux. http://218.210.127.1...3&GetDown=false

If it shows the same issue with regard to Windows as under OS X then you could contact their technical support and hopefully they will fix it for us.

Got it:

 

[ 1.154796] r8168 Gigabit Ethernet driver 8.035.00-NAPI loaded

[ 1.154894] r8168 0000:08:00.0: irq 53 for MSI/MSI-X

[ 1.298849] r8168: This product is covered by one or more of the following patents: US5,307,459, US5,434,872, US5,732,094, US6,570,884, US6,115,776, and US6,327,625.

[ 1.298852] r8168 Copyright © 2012 Realtek NIC software team <nicfae@realtek.com>

[ 17.289937] r8168: eth0: link down

[ 18.858229] r8168: eth0: link up

[ 19.284308] r8168: eth0: link up

Network still works fine in Ubuntu after restart from Windows. So the firmware theory is not valid any more, right? This thing is the same in yours and Linux drivers, right?

 

What has changed now is that Windows -> restart into Linux -> restart into OSX results in non-working net in OSX, while previously a restart from Linux fixed it for OSX as well. Shutdown and new start fixes it.

 

About additional logs: do you need something different from previous logs?


Network still works fine in Ubuntu after restart from Windows. So the firmware theory is not valid any more, right? This thing is the same in yours and Linux drivers, right?

 

Correct! Maybe I should check the PCI config space setup, as this had to be rewritten from scratch because the Linux code was not portable. As far as I know, the config space contents could be preserved across a reboot or while standby power is still present.

 

About additional logs: do you need something different from previous logs?

 

The log messages of the driver (debug build) when network is not working would be really helpful.

 

Mieze


The one from here is not ok: http://www.insanelymac.com/forum/topic/287161-new-driver-for-realtek-rtl8111/page__st__20#entry1899870 ?

 

Network Utility/Info: there is not really a difference here between working and 'non-working' net. Send and Receive errors and Collisions are 0. It's just that the Sent and Recv packet numbers are much smaller with 'non-working' net, but they still rise with time. By the way, ping, lookup and traceroute are working fine. Safari does not report any connection errors, it just waits to receive some data, which is not coming.

According to the logs the NIC is working, but rx checksum offload seems to be unreliable after a reboot from Windows. This brings the firmware theory back into the game, because my driver and the Linux driver have different strategies: when checksum verification in hardware fails, the Linux driver treats the packet as unchecked and lets the network stack perform the check, while my driver considers it to be a bad packet.
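The difference between the two strategies can be sketched in a few lines of C++. This is only an illustration of the decision described above. The enum and function names are hypothetical and do not correspond to identifiers in either driver or in IOKit.

```cpp
#include <cassert>

// What the driver reports to the network stack for a received packet.
enum class RxChecksumState {
    Verified,   // hardware check passed, the stack can skip its own check
    Bad,        // packet is flagged as corrupt
    Unchecked   // no verdict, the stack verifies the checksum in software
};

// How a driver reacts when the hardware check fails (names are hypothetical).
enum class HwFailurePolicy {
    TreatAsBad,       // strategy described for the OS X driver so far
    TreatAsUnchecked  // strategy described for the Linux driver
};

inline RxChecksumState classifyRxChecksum(bool hwChecksumPassed, HwFailurePolicy policy) {
    if (hwChecksumPassed) {
        return RxChecksumState::Verified;
    }
    return policy == HwFailurePolicy::TreatAsBad ? RxChecksumState::Bad
                                                 : RxChecksumState::Unchecked;
}

// Only unchecked packets cost a software checksum pass in the stack.
inline bool needsSoftwareVerification(RxChecksumState state) {
    return state == RxChecksumState::Unchecked;
}

inline void examplePolicyCheck() {
    // A hardware failure under the Linux-style policy is not a drop.
    assert(classifyRxChecksum(false, HwFailurePolicy::TreatAsUnchecked) ==
           RxChecksumState::Unchecked);
}
```

With the second policy a wrong hardware verdict costs one extra software check instead of a lost packet, so an unreliable offload engine degrades performance slightly rather than breaking connections.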

 

Please try the attached version in which I adopted the strategy of the linux driver. Good luck!

 

Mieze

 

PS: Do you need a binary or can you compile from source?


Src is fine. Thanks!

 

Tested: still does not work after Windows.

Logs: NetDebug.zip

 

The error moved from bad checksum to data size error:

ip:
356 total packets received
0 bad header checksums
0 with size smaller than minimum
159 with data size < data length

Hope this will trigger some more ideas :) .

Hello dmazar,

 

according to the documentation the NIC transfers the packet including the ethernet CRC into memory but as the CRC isn't needed by the protocol stack, the driver removes the last 4 bytes of a received packet. Maybe your NIC (Chipset 14) is different?

 

Locate the following line in the source code (it's in RTL8111::rxInterrupt()):

 

pktSize = (descStatus1 & 0x1fff) - 4;

 

Change it into

 

pktSize = (descStatus1 & 0x1fff);
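As a standalone illustration of the two variants, here is a small C++ sketch. Only the 0x1fff mask and the 4-byte CRC length come from the lines above. The function name, the constants' names and the underflow guard are assumptions added for the example and are not part of the driver source.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kRxLengthMask = 0x1fff;   // length field, as in the driver line above
constexpr std::uint32_t kEthernetCrcLength = 4;   // trailing Ethernet CRC in bytes

// Returns the packet size handed to the stack. With stripCrc == true the
// trailing CRC is removed (original line), with false it is kept (modified line).
inline std::uint32_t rxPacketSize(std::uint32_t descStatus1, bool stripCrc) {
    const std::uint32_t length = descStatus1 & kRxLengthMask;
    if (!stripCrc) {
        return length;
    }
    // Guard added for this sketch: never wrap around on a bogus short length.
    return length >= kEthernetCrcLength ? length - kEthernetCrcLength : 0;
}

inline void examplePacketSize() {
    // A descriptor reporting 64 bytes yields 60 bytes of frame data without the CRC.
    assert(rxPacketSize(0x40, true) == 60);
    assert(rxPacketSize(0x40, false) == 64);
}
```

If the chip already strips the CRC itself, subtracting another 4 bytes would truncate every packet, which would match a "data size < data length" error in the IP statistics.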

 

Good luck!

 

Mieze

 

Edit: In case this doesn't help you might also try to increase the packet size a little bit. The buffers are all 2000 bytes in size so that there is enough headroom.


Tried, but does not help. I've dumped descStatus1 and descStatus2 from working and non working net and from linux (after Windows). Maybe it will help.

NetDebug2.zip

 

Mainly, non working system contains packets with descStatus1 bits 20-23 as 6, while working system and linux do not have that.

Not working:

 

rxInterrupt(): descStatus1=0x3462c500, descStatus2=0x40000000, pktSize=1536

rxInterrupt(): descStatus1=0x3462c500, descStatus2=0x40000000, pktSize=1536

(packet sizes are invalid, increased by 256 by me)

The datasheet says:

  • Bit 22: Receive Watchdog Timer Expired: This bit is set whenever the received packet length exceeds 8192 bytes.
  • Bit 21: Receive Error summary: When set, indicates that at least one of the following errors has occurred: CRC, RUNT, RWT, FAE. This bit is valid only when LS (Last segment bit) is set.

OK, we now know that these packets are really bad because of a reception error, but I have no idea how to avoid this.
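To check a logged descriptor value against the two datasheet bits quoted above, a decoder along these lines can be used. It is a minimal C++ sketch. The struct and function names are made up, and only bits 21 and 22 are modelled because those are the only positions given in this thread. The position of the LS bit is not stated here, so the "valid only when LS is set" condition is left to the caller.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kRxWatchdogExpired = 1u << 22;  // bit 22: RWT, per the quoted datasheet
constexpr std::uint32_t kRxErrorSummary    = 1u << 21;  // bit 21: RES, per the quoted datasheet

// Hypothetical helper type for the decoded flags.
struct RxErrorInfo {
    bool errorSummary;          // bit 21 set
    bool watchdogExpired;       // bit 22 set
    std::uint32_t bits20to23;   // the nibble compared in the logs
};

inline RxErrorInfo decodeRxErrors(std::uint32_t descStatus1) {
    RxErrorInfo info;
    info.errorSummary = (descStatus1 & kRxErrorSummary) != 0;
    info.watchdogExpired = (descStatus1 & kRxWatchdogExpired) != 0;
    info.bits20to23 = (descStatus1 >> 20) & 0xfu;
    return info;
}

inline void exampleDecode() {
    // Value taken from the "Not working" log lines in this thread.
    const RxErrorInfo info = decodeRxErrors(0x3462c500u);
    assert(info.bits20to23 == 6);
    assert(info.errorSummary && info.watchdogExpired);
}
```

For 0x3462c500 the nibble is 6, i.e. binary 0110, so both the error summary and the watchdog bit are set, which matches the observation that bits 20-23 read as 6 on the non-working system.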

 

Mieze


Hello dmazar,

 

two more questions to narrow down the issue:

 

1) Which Windows driver do you use? Driver from board's manufacturer, Realtek, included in Win?

 

2) When you use Ubuntu's native driver without the firmware, is it still able to cure the problem caused by the Win driver?

 

Mieze
