AppleIntelE1000e.kext for 10.8/10.7/10.6/10.5


hnak

751 posts in this topic

The code to copy the mbuf was originally written for jumbo frames, and the scatter-gather code has existed for a long time (I recently disabled it with a macro for simplicity). IOBasicOutputQueue sounds easy to use. I will check it.

The only things you have to keep an eye on are the descriptor indices for the tx ring. I maintain three variables in my code: the next free index, the next dirty index and the number of free descriptors. The number of free descriptors is the only variable that gets modified by both outputPacket() and txInterrupt(), so I use atomic operations to modify it.

 

Provided you stop the output queue before any operation in the work loop which also affects the transmitter (chip reset, disable, etc.) there should be no problem at all.
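
The bookkeeping described above can be modelled in a few lines. This is only a sketch in Python, not code from any driver: the class name `TxRing`, its method names and the ring size are invented for illustration, and a `threading.Lock` stands in for the atomic add/subtract a kernel driver would apply to the free-descriptor count.

```python
import threading


class TxRing:
    """Toy model of a transmit descriptor ring (all names are hypothetical)."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.next_free = 0     # only advanced by the output path
        self.next_dirty = 0    # only advanced by the tx-complete path
        self.num_free = size   # shared: modified by both paths
        self._lock = threading.Lock()  # stand-in for atomic operations

    def output_packet(self, num_segments: int) -> bool:
        """Claim descriptors for one packet; False means the queue should stall."""
        with self._lock:
            if num_segments > self.num_free:
                return False
            self.num_free -= num_segments
        self.next_free = (self.next_free + num_segments) % self.size
        return True

    def tx_interrupt(self, num_completed: int) -> None:
        """Reclaim descriptors the hardware has finished sending."""
        self.next_dirty = (self.next_dirty + num_completed) % self.size
        with self._lock:
            self.num_free += num_completed


ring = TxRing(size=8)
assert ring.output_packet(5)       # 3 descriptors left
assert not ring.output_packet(4)   # not enough room: caller should stall
ring.tx_interrupt(5)               # hardware finished the first packet
assert ring.output_packet(4)       # now it fits
print(ring.next_free, ring.next_dirty, ring.num_free)
```

Only `num_free` is touched from both sides, which is why it is the one value that needs protection in this model; the two indices each have a single writer.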

 

Mieze


So I've been trying to figure out what @diddl14 did to enable WOL on 2.5.4 so I can replicate it on 2.4.14. So far 2.4.14 has been the only version that is stable for large (>5 GB) transfers across the network. Any pointers or help in making this happen (if it's possible)?


I am also experiencing issues with large file copies and any driver version greater than 2.4.14. My last attempt was with the 3.1.0 driver and the TSO property set to false. I was attempting to use Time Machine to back up to a Time Capsule. I have approximately 300 GB of data that I was trying to move. Looking back in the syslog, I started seeing the following:

Aug 12 00:10:00 --- last message repeated 33 times ---

Aug 12 00:10:13 Roberts-Mac-Pro kernel[0]: failed to getphysicalsegment in outputPacket.
 
Finally it just gave up:
Aug 12 00:43:16 Roberts-Mac-Pro kernel[0]: AppleIntelE1000e(Err): Detected Hardware Unit Hang:
 
I have tried multiple configurations and kexts and get the same error with anything over 2.4.14.  With 2.4.14 I have been able to complete my backup without error.  Am I missing out on anything staying with the older driver?  And also, is there anything else I can try to make the 3.1.0 kext work correctly?
 
Thanks

sudo kextunload /System/Library/Extensions/IONetworkingFamily.kext/Contents/PlugIns/AppleIntelE1000e.kext

 

sudo vi /System/Library/Extensions/IONetworkingFamily.kext/Contents/PlugIns/AppleIntelE1000e.kext/Contents/Info.plist

 

Type:

 

/NETIF_F_TSO<Enter>

 

to find the relevant setting. Change the associated <true/> to <false/>, by moving the cursor to the /, pressing the i key to switch to insert mode, backspace the word, and type in false. Then press Escape, then :wq to write out the file and quit.

 

Then you can reload it with:

 

sudo kextload /System/Library/Extensions/IONetworkingFamily.kext/Contents/PlugIns/AppleIntelE1000e.kext

 
You're welcome to copy the plist out and make the change with Xcode, but I think that goes a little overboard. Nano -w is also fine, if you already know your way around search, edit, and save with that.
 
As mentioned above, this feature will probably be disabled by default in future releases.
 
I further dispute the usefulness of TSO, given that it has already proven to be quite buggy. The fact that it's enabled by default on Windows for some newer chipsets does not mean that it suddenly works properly on older chipsets, or even that it has been fully tested on newer hardware. Intel probably likes to use their install base as a testbed, then cleans up the mess with later driver releases. You're welcome to argue this, but perhaps for now you could put a notice in the OP to disable this option if TX unit hangs are encountered and, for future versions, to enable it at your own risk.
 
You may also opt to make the edit and reboot instead of using the kextunload/kextload cycle, but I've found that a reboot is not entirely necessary to fix the hardware hang issue, now that I've already had to experience it a few times from attempting to back up over 40GB of data over even just a 100Mbps link. Since making this configuration change, I have been able to back up over 1TB of data over the same link without any errors.

sudo vi /System/Library/Extensions/IONetworkingFamily.kext/Contents/PlugIns/AppleIntelE1000e.kext/Contents/Info.plist

 

Type:

 

/NETIF_F_TSO<Enter>

 

to find the relevant setting. Change the associated <true/> to <false/>, by moving the cursor to the /, pressing the i key to switch to insert mode, backspace the word, and type in false. Then press Escape, then :wq to write out the file and quit.

 

Or this:

 

- Check the value, then set the value, then check the value again:

$ sudo /usr/libexec/PlistBuddy -c "Print IOKitPersonalities:e1000e:NETIF_F_TSO" /System/Library/Extensions/IONetworkingFamily.kext/Contents/PlugIns/AppleIntelE1000e.kext/Contents/Info.plist
Password:
true

sudo /usr/libexec/PlistBuddy -c "Set IOKitPersonalities:e1000e:NETIF_F_TSO false" /System/Library/Extensions/IONetworkingFamily.kext/Contents/PlugIns/AppleIntelE1000e.kext/Contents/Info.plist

sudo /usr/libexec/PlistBuddy -c "Print IOKitPersonalities:e1000e:NETIF_F_TSO" /System/Library/Extensions/IONetworkingFamily.kext/Contents/PlugIns/AppleIntelE1000e.kext/Contents/Info.plist
false

That way you avoid having to teach "vi" to someone, hehe  :D
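
For anyone who would rather script the same check-set-check cycle, here is a rough Python equivalent using the standard `plistlib` module. It is a sketch only: it works on a cut-down in-memory sample rather than the real Info.plist, and the `set_tso` helper is a made-up name. The key path `IOKitPersonalities` > `e1000e` > `NETIF_F_TSO` is the one used in the PlistBuddy commands above.

```python
import plistlib

# Cut-down stand-in for the kext's Info.plist; the real file has many more keys.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>IOKitPersonalities</key>
    <dict>
        <key>e1000e</key>
        <dict>
            <key>NETIF_F_TSO</key>
            <true/>
        </dict>
    </dict>
</dict>
</plist>
"""


def set_tso(plist_bytes: bytes, enabled: bool) -> bytes:
    """Return a copy of the plist with NETIF_F_TSO set to `enabled` (hypothetical helper)."""
    info = plistlib.loads(plist_bytes)
    info["IOKitPersonalities"]["e1000e"]["NETIF_F_TSO"] = enabled
    return plistlib.dumps(info)


# Check the value, set it, then check it again.
before = plistlib.loads(SAMPLE)["IOKitPersonalities"]["e1000e"]["NETIF_F_TSO"]
patched = set_tso(SAMPLE, False)
after = plistlib.loads(patched)["IOKitPersonalities"]["e1000e"]["NETIF_F_TSO"]
print(before, "->", after)
```

Pointing this at the real file would still need root privileges and the same kextunload/kextload (or reboot) step as the manual edit.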



Instead of complaining about a very useful feature you could just add a few lines of code which print the corresponding descriptors in case of a tx deadlock giving us a chance to find out what is wrong so that we can fix it.

 

Mieze


Odd IPv6 issue when transferring data *from* this machine to another: The transfers start very slow and quickly stall completely. *Small* amounts of data, eg: visiting an ipv6 site, or logging into an interactive shell and not doing much, works, though possibly sluggish; it's when data is transferred in quantity that it stalls.

 

Data being transferred the other way, being *received* by this machine, is fine.

 

IPv4 is fine at all times.

 

AppleIntelE1000e.kext v3.1.0; upgraded to that just now, previously 3.0.4.1, but no change in symptoms.

 

Mavericks, 10.9.4.

 

DPCIManager and System Profile report the interface is an I217-V. Vendor:8086; Device:153B; Sub-vendor: 1043; Sub-device: 859F. (This is an Asus Z87I-PRO mobo, using its onboard ethernet interface.) By the way, until a couple of months ago this machine was running Linux (Ubuntu, up to Trusty) and so presumably was running the original of the driver this is based on. There were none of these problems.

 

Tried setting that TSO option to false, it made no apparent difference.

 

What *does* make a difference, is running VMWare (Fusion Professional 6). While the VMWare app is running, IPv6 works fine both ways. Then if I quit VMWare again, it fails again as described.

 

I have a VM that normally runs in the background in bridged networking mode, which is why it took me a long time to discover this problem; only discovering it when forgetting to restart that VM after an unrelated reboot earlier today. It doesn't need the VM itself to be running to "fix" the IPv6 transmission problem; just the VMWare app.

 

My guess is that VMWare is putting the ethernet interface into some operational mode where it all works, but which isn't default, and is (as you would want) restoring it to its former state when it quits. Whatever that is though, it's not obvious from ifconfig output which shows, both while VMWare is running, and after it's quit (when the problem returns), no visible difference in the settings for the ethernet interface.

 

NB: VMWare has a separate IPv6 problem, whereby it won't speak IPv6 between host and guest, even while host and guests are separately quite happy to speak IPv6 to the rest of the world. *That* also happens when running on a real Mac (Sandy Bridge Mac Mini Server); I'm just mentioning it on the offchance it's relevant.

 

There's *another* problem, which I think is separate too and am mentioning only in passing: large data transfers between host and guest (again, in bridged-mode networking) can cause the ethernet driver to crash and all ethernet connections to fail, often forcing a hard reset to recover (the machine won't shut down). Data transfer between the guest and other machines is fine, though. I'm not sure where this problem lies, whether in the driver or in VMWare. At one point I thought the USB3 driver was implicated, though no longer, and I think it may have happened once with the real Mac too, though I haven't yet taken the time to re-test that thoroughly. So *this* isn't a bug report about *that*, yet; I'm just mentioning it in case someone goes "ahh" in relation to the (I think separate) bug I am reporting. :-)

 

I have btw confirmed I don't get the above-described IPv6 behaviour on the real Mac Mini; with or without VMWare running, ipv6 works.

 

(Pasting *complete* ifconfig output from with-vmware and without-vmware - ipv6 prefix obscured because my ssh ports are open and, well just in case ;-)):

zecora:~ rachel$ ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
	options=3<RXCSUM,TXCSUM>
	inet6 ::1 prefixlen 128 
	inet 127.0.0.1 netmask 0xff000000 
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1 
	nd6 options=1<PERFORMNUD>
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
stf0: flags=0<> mtu 1280
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	options=6b<RXCSUM,TXCSUM,VLAN_HWTAGGING,TSO4,TSO6>
	ether d8:50:e6:4d:50:a2 
	inet6 fe80::da50:e6ff:fe4d:50a2%en0 prefixlen 64 scopeid 0x4 
	inet6 xxxx:xxxx:xxxx::da50:e6ff:fe4d:50a2 prefixlen 64 autoconf 
	inet6 xxxx:xxxx:xxxx::55b7:ec32:e521:d5be prefixlen 64 autoconf temporary 
	inet 192.168.1.4 netmask 0xffffff00 broadcast 192.168.1.255
	nd6 options=1<PERFORMNUD>
	media: autoselect
	status: active
vmnet1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	ether 00:50:56:c0:00:01 
	inet 172.16.215.1 netmask 0xffffff00 broadcast 172.16.215.255
vmnet8: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	ether 00:50:56:c0:00:08 
	inet 192.168.104.1 netmask 0xffffff00 broadcast 192.168.104.255
zecora:~ rachel$ ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
	options=3<RXCSUM,TXCSUM>
	inet6 ::1 prefixlen 128 
	inet 127.0.0.1 netmask 0xff000000 
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1 
	nd6 options=1<PERFORMNUD>
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
stf0: flags=0<> mtu 1280
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	options=6b<RXCSUM,TXCSUM,VLAN_HWTAGGING,TSO4,TSO6>
	ether d8:50:e6:4d:50:a2 
	inet6 fe80::da50:e6ff:fe4d:50a2%en0 prefixlen 64 scopeid 0x4 
	inet6 xxxx:xxxx:xxxx::da50:e6ff:fe4d:50a2 prefixlen 64 autoconf 
	inet6 xxxx:xxxx:xxxx::55b7:ec32:e521:d5be prefixlen 64 autoconf temporary 
	inet 192.168.1.4 netmask 0xffffff00 broadcast 192.168.1.255
	nd6 options=1<PERFORMNUD>
	media: autoselect
	status: active
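
Comparing two captures like these by eye is error-prone, so here is a small Python sketch that pulls the flags and options out of ifconfig text and reports what differs. The function names are invented for illustration, and the sample strings are abbreviated from the two captures above (only en0 and vmnet1 are included).

```python
import re


def parse_ifconfig(text: str) -> dict:
    """Map interface name -> {'flags': set, 'options': set} from ifconfig text."""
    result = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^(\w+): flags=\w+<([^>]*)>", line)
        if header:
            current = header.group(1)
            result[current] = {
                "flags": set(filter(None, header.group(2).split(","))),
                "options": set(),
            }
            continue
        # Only the indented "options=" line, not "nd6 options=".
        options = re.match(r"^\s+options=\w+<([^>]*)>", line)
        if options and current:
            result[current]["options"] = set(options.group(1).split(","))
    return result


def diff_common(a: dict, b: dict) -> dict:
    """Interfaces present in both captures whose flags or options differ."""
    return {n: (a[n], b[n]) for n in a.keys() & b.keys() if a[n] != b[n]}


WITH_VMWARE = """\
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
\toptions=6b<RXCSUM,TXCSUM,VLAN_HWTAGGING,TSO4,TSO6>
vmnet1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
"""
WITHOUT_VMWARE = """\
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
\toptions=6b<RXCSUM,TXCSUM,VLAN_HWTAGGING,TSO4,TSO6>
"""

with_vm = parse_ifconfig(WITH_VMWARE)
without_vm = parse_ifconfig(WITHOUT_VMWARE)
print("changed:", diff_common(with_vm, without_vm))
print("only with VMWare:", sorted(with_vm.keys() - without_vm.keys()))
```

On these samples the only difference is the extra vmnet interface; en0 itself is identical in both, which matches the observation that ifconfig shows no visible change.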

In the system.log all I see is when the attempted transfer (by scp/sftp) fails, I hit ctrl-C and it logs:

Aug 14 22:48:22 zecora.local sshd: rachel [priv][428]: USER_PROCESS: 431 ttys001
Aug 14 22:48:25 zecora.local sshd: rachel [priv][428]: DEAD_PROCESS: 431 ttys001

And FWIW what's logged as VMWare starts up; because *something* it's doing, it likes. :-)

Aug 14 22:52:35 zecora kernel[0]: vmci: Loaded @ 0xffffff7f828412ad: Info 0xffffff7f8284c0f8 Name com.vmware.kext.vmci Version 90.5.7 build-1887983 at Jun  9 2014 21:46:09
Aug 14 22:52:35 zecora kernel[0]: vmci: Initializing module.
Aug 14 22:52:35 zecora kernel[0]: vmci: VMCI: shared components initialized.
Aug 14 22:52:35 zecora kernel[0]: vmcvmci: VMCI: hosti:  componentsBegi inin hetialized.lper queue t
Aug 14 22:52:35 zecora kernel[0]: hread.
Aug 14 22:52:35 zecora kernel[0]: vmci: Module initialized.
Aug 14 22:52:35 zecora kernel[0]: vsock: Loaded @ 0xffffff7f82851aa4: Info 0xffffff7f8285c030 Name com.vmware.kext.vsockets Version 90.5.7 build-1887983 at Jun  9 2014 21:46:16
Aug 14 22:52:35 zecora kernel[0]: vsock: Initializing module...
Aug 14 22:52:35 zecora kernel[0]: vsock: Begin workloop.
Aug 14 22:52:35 zecora kernel[0]: vsock: Module initialized.
Aug 14 22:52:35 zecora kernel[0]: vmnet: Loaded @ 0xffffff7f82861ab6: Info 0xffffff7f82869020 Name com.vmware.kext.vmnet Version 0188.79.83 build-1887983 at Jun  9 2014 21:46:23
Aug 14 22:52:35 zecora kernel[0]: vmnet: Initializing module.
Aug 14 22:52:35 zecora kernel[0]: vmnet: VMNet_Start allocated gOSMallocTag.
Aug 14 22:52:35 zecora kernel[0]: vmnet: VMNet_Start allocated vnetBigLock.
Aug 14 22:52:35 zecora kernel[0]: vmnet: Module initialized.
Aug 14 22:52:35 zecora kernel[0]: vmmon: Loaded @ 0xffffff7f8286ae66: Info 0xffffff7f82875040 Name com.vmware.kext.vmx86 Version 0188.79.83 build-1887983 at Jun  9 2014 21:45:58
Aug 14 22:52:35 zecora kernel[0]: vmmon: Instrumenting bug 151304...
Aug 14 22:52:35 zecora kernel[0]: vmmon: Cycles 60
Aug 14 22:52:35 zecora kernel[0]: vmmon: Timer thread started.
Aug 14 22:52:35 zecora kernel[0]: vmmon: Module initialized.
Aug 14 22:52:35 zecora.local vmnet-bridge[548]: Dynamic store changed
Aug 14 22:52:35 zecora kernel[0]: vmnet: VNetUserIf_Create: created userIf at 0xffffff8026345200.
Aug 14 22:52:35 zecora kernel[0]: vmnet: VMNetConnect: returning port 0xffffff8026345200
Aug 14 22:52:35 zecora kernel[0]: vmnet: Hub 2 does not exist, allocating memory.
Aug 14 22:52:35 zecora kernel[0]: vmnet: Allocated hub 0xffffff80276ce000 for hubNum 2.
Aug 14 22:52:35 zecora kernel[0]: vmnet: VMNET_SO_BINDTOHUB: port: paddr 00:50:56:f8:e5:2e
Aug 14 22:52:35 zecora kernel[0]: vmnet: Hub 2
Aug 14 22:52:35 zecora kernel[0]: vmnet: 	Port 0
Aug 14 22:52:35 zecora kernel[0]: vmnet: bridge-en0: media 20 dev 0xffffff8027854558 family 2
Aug 14 22:52:35 zecora kernel[0]: vmnet: bridge-en0: up
Aug 14 22:52:35 zecora kernel[0]: vmnet: bridge-en0: attached
Aug 14 22:52:35 zecora kernel[0]: vmnet: VNetUserIfFree: freeing userIf at 0xffffff8026345200.
Aug 14 22:52:35 zecora.local vmnet-bridge[548]: Started bridge for 2, en0
Aug 14 22:52:35 zecora kernel[0]: vmnet: VNetUserIf_Create: created userIf at 0xffffff8026345200.
Aug 14 22:52:35 zecora kernel[0]: vmnet: VMNetConnect: returning port 0xffffff8026345200
Aug 14 22:52:35 zecora kernel[0]: vmnet: Hub 0 does not exist, allocating memory.
Aug 14 22:52:35 zecora kernel[0]: vmnet: Allocated hub 0xffffff8027aff000 for hubNum 0.
Aug 14 22:52:35 zecora kernel[0]: vmnet: VMNET_SO_BINDTOHUB: port: paddr 00:50:56:e2:48:5f
Aug 14 22:52:35 zecora kernel[0]: vmnet: Hub 0
Aug 14 22:52:35 zecora kernel[0]: vmnet: 	Port 0
Aug 14 22:52:35 zecora kernel[0]: vmnet: bridge-en0: media 20 dev 0xffffff8027854558 family 2
Aug 14 22:52:35 zecora kernel[0]: vmnet: bridge-en0: up
Aug 14 22:52:35 zecora kernel[0]: vmnet: bridge-en0: attached
Aug 14 22:52:35 zecora kernel[0]: vmnet: VNetUserIfFree: freeing userIf at 0xffffff8026345200.
Aug 14 22:52:35 zecora.local vmnet-bridge[548]: Started bridge for 0, en0
Aug 14 22:52:36 zecora kernel[0]: vmnet: VNetUserIf_Create: created userIf at 0xffffff8026381400.
Aug 14 22:52:36 zecora kernel[0]: vmnet: VMNetConnect: returning port 0xffffff8026381400
Aug 14 22:52:36 zecora kernel[0]: vmnet: Hub 1 does not exist, allocating memory.
Aug 14 22:52:36 zecora kernel[0]: vmnet: Allocated hub 0xffffff8027366000 for hubNum 1.
Aug 14 22:52:36 zecora kernel[0]: vmnet: VMNET_SO_BINDTOHUB: port: paddr 00:50:56:ea:4d:38
Aug 14 22:52:36 zecora kernel[0]: vmnet: Hub 1
Aug 14 22:52:36 zecora kernel[0]: vmnet: 	Port 0
Aug 14 22:52:36 zecora kernel[0]: vmnet: VNetUserIfFree: freeing userIf at 0xffffff8026381400.
Aug 14 22:52:36 zecora kernel[0]: vmnet: netif-vmnet1: Adding protocol 2.
Aug 14 22:52:36 zecora kernel[0]: vmnet: netif-vmnet1: SIOCSIFFLAGS: 0x8863
Aug 14 22:52:36 --- last message repeated 1 time ---
Aug 14 22:52:36 zecora kernel[0]: vmnet: VNetUserIf_Create: created userIf at 0xffffff802625c200.
Aug 14 22:52:36 zecora kernel[0]: vmnet: VMNetConnect: returning port 0xffffff802625c200
Aug 14 22:52:36 zecora kernel[0]: vmnet: VMNET_SO_BINDTOHUB: port: paddr 00:50:56:e1:f6:de
Aug 14 22:52:36 zecora kernel[0]: vmnet: Hub 1
Aug 14 22:52:36 zecora kernel[0]: vmnet: 	Port 0
Aug 14 22:52:36 zecora kernel[0]: vmnet: 	Port 1
Aug 14 22:52:36 zecora kernel[0]: vmnet: VNetUserIf_Create: created userIf at 0xffffff802625c600.
Aug 14 22:52:36 zecora kernel[0]: vmnet: VMNetConnect: returning port 0xffffff802625c600
Aug 14 22:52:36 zecora kernel[0]: vmnet: Hub 8 does not exist, allocating memory.
Aug 14 22:52:36 zecora kernel[0]: vmnet: Allocated hub 0xffffff80273a0000 for hubNum 8.
Aug 14 22:52:36 zecora kernel[0]: vmnet: VMNET_SO_BINDTOHUB: port: paddr 00:50:56:ff:f7:86
Aug 14 22:52:36 zecora kernel[0]: vmnet: Hub 8
Aug 14 22:52:36 zecora kernel[0]: vmnet: 	Port 0
Aug 14 22:52:36 zecora kernel[0]: vmnet: VMNetSetopt: Set link state UP
Aug 14 22:52:36 zecora kernel[0]: vmnet: VNetUserIf_Create: created userIf at 0xffffff8026381a00.
Aug 14 22:52:36 zecora kernel[0]: vmnet: VMNetConnect: returning port 0xffffff8026381a00
Aug 14 22:52:36 zecora kernel[0]: vmnet: VMNET_SO_BINDTOHUB: port: paddr 00:50:56:fb:ea:02
Aug 14 22:52:36 zecora kernel[0]: vmnet: Hub 8
Aug 14 22:52:36 zecora kernel[0]: vmnet: 	Port 0
Aug 14 22:52:36 zecora kernel[0]: vmnet: 	Port 1
Aug 14 22:52:36 zecora kernel[0]: vmnet: VNetUserIfFree: freeing userIf at 0xffffff8026381a00.
Aug 14 22:52:36 zecora kernel[0]: vmnet: netif-vmnet8: Adding protocol 2.
Aug 14 22:52:36 zecora kernel[0]: vmnet: netif-vmnet8: SIOCSIFFLAGS: 0x8863
Aug 14 22:52:36 --- last message repeated 1 time ---
Aug 14 22:52:36 zecora kernel[0]: vmnet: VNetUserIf_Create: created userIf at 0xffffff8026224200.
Aug 14 22:52:36 zecora kernel[0]: vmnet: VMNetConnect: returning port 0xffffff8026224200
Aug 14 22:52:36 zecora kernel[0]: vmnet: VMNET_SO_BINDTOHUB: port: paddr 00:50:56:ee:78:03
Aug 14 22:52:36 zecora kernel[0]: vmnet: Hub 8
Aug 14 22:52:36 zecora kernel[0]: vmnet: 	Port 0
Aug 14 22:52:36 zecora kernel[0]: vmnet: 	Port 1
Aug 14 22:52:36 zecora kernel[0]: vmnet: 	Port 2
Aug 14 22:52:36 zecora kernel[0]: vmioplug: Loaded @ 0xffffff7f82878d00: Info 0xffffff7f8287b0a8 Name com.vmware.kext.vmioplug.12.1.17 Version 12.1.17 build-1887983 at Jun  9 2014 21:46:29
Aug 14 22:52:36 zecora kernel[0]: considerRebuildOfPrelinkedKernel prebuild rebuild has expired

In my half-informed way, I bet the word "bridge" matters. :-}



Sounds more like a routing problem, or something's wrong with the bridge? Anyway, transfers between host and guest don't involve the ethernet interface at all. I saw a user report a similar problem with my Realtek driver last year, while other users with the same configuration had no problems at all. I guess it's a messed-up system that causes this to happen. As VMWare is hooked tightly into the system, it might be the cause of all the trouble, in particular while it is not running.

 

Mieze


I have no problem in accessing IPv6 sites. (nic = i218v,82572)

 

I have no problem accessing IPv6 sites either. It's just when copious amounts of data get transmitted *out* through the interface on IPv6 that it stalls. Not much data tends to go in HTTP requests, unless you're uploading big files. *Incoming* data transfer is fine, so I could download as much as I liked from said IPv6 sites and see no apparent problem.


I don't think it's a routing problem; given it affects transfers between machines on the same network segment, plugged into the same switch, and *small* amounts of network traffic (such as ping6, HTTP *requests*, ssh session) are working. Zeroconf can happily resolve an IPv6 address for a given .local name before an IPv4 one, so either one can be used interchangeably unless you force the issue. Of course it doesn't *matter* which you use as long as both are *working*...

 

As for "something wrong with the bridge"... except it all goes *right* when the bridge is set up.

 

And remember the same issue isn't occurring on a real Mac (mid 2011 mac mini server) with the same VMWare installed-but-not-running.

 

The host<>guest transfer problem; you're right I wouldn't expect any actual data to be going through the interface, but when the guest is on the *bridged* connection, I can believe the *driver* is involved, though I don't know if that's the case. But I haven't tested yet if it happens when the guest is, eg: using a NAT connection and I should.

 

(I *need* bridged for this VM though: This hackintosh used to be my main Linux workstation/server, and it has a Linux RAID5 array in it. The VM is serving that out to the network. But ironically because of this problem it can't serve it to the host without risking a system crash.)

 

But in any case the host<>guest transfer problem isn't actually *the* problem I'm reporting; I just included it as a possibly-relevant observation. :-)

 

Yes, I should consider if VMWare has left some misconfiguration in which only gets resolved when its own network interfaces are set up. I'll try a full, careful deinstallation of VMWare later to see if it makes a difference. VMWare Support sent a full set of instructions for completely removing VMWare when I reported another problem. I refused to do it then because I reckoned I was stuck in their support-drone script and it was obviously make-work for an obvious GUI bug. But for *this* I'll do it, as there's a rational basis for believing it could be a factor.

 

Hoping I don't have to reinstall OSX on this machine (because I won't, at least until Yosemite's out). :-)


Removing VMWare entirely from the system (and rebooting) according to instructions here http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1017838 made no difference at all.

 

This is what it looks like (twilight.local is a Linux box about two feet away in ethernet cabling, via a stateless switch (Netgear GS108)):

zecora:~ rachel$ scp -4 Desktop/S01E01\ Kalahari.mkv twilight.local:.
S01E01 Kalahari.mkv                           100%   17GB 109.9MB/s   02:43    
zecora:~ rachel$ scp -4 twilight.local:"S01E01\ Kalahari.mkv" .
S01E01 Kalahari.mkv                           100%   17GB 111.3MB/s   02:41    
zecora:~ rachel$ scp -6 twilight.local:"S01E01\ Kalahari.mkv" .
S01E01 Kalahari.mkv                           100%   17GB  87.4MB/s   03:25    
zecora:~ rachel$ scp -6 Desktop/S01E01\ Kalahari.mkv twilight.local:.
S01E01 Kalahari.mkv                             0% 2688KB  20.2KB/s - stalled -

And on the line where it's stalled, as usual for such situations, it keeps trying to adjust the ETA; the transfer rate rises and falls, rising sometimes as high as 25KB/s. And in fact the data is *very slowly* getting through.

 

But if you think about it, 20KB/s is plenty fast enough for most HTTP GET/POST requests and for even quite fast touch-typists on an ssh login session. So accessing IPv6 sites *would* pretty much appear to work on a quick test. Uploading a file OTOH...
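
To put numbers on that, here is a quick back-of-the-envelope calculation in Python using the rates scp reported above. The 2 KB request size is an assumption picked for illustration, and the file size is the rounded 17GB that scp printed, so treat the results as rough.

```python
KB, MB, GB = 1024, 1024**2, 1024**3


def transfer_seconds(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Time to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_s


file_size = 17 * GB       # size as rounded by scp
healthy = 109.9 * MB      # IPv4 send rate from the transcript
stalled = 20.2 * KB       # IPv6 send rate when "stalled"
small_request = 2 * KB    # assumed size of a typical HTTP request

healthy_minutes = transfer_seconds(file_size, healthy) / 60
stalled_days = transfer_seconds(file_size, stalled) / 86400
request_seconds = transfer_seconds(small_request, stalled)

print(f"17GB at 109.9MB/s: {healthy_minutes:.1f} minutes")
print(f"17GB at 20.2KB/s:  {stalled_days:.1f} days")
print(f"2KB at 20.2KB/s:   {request_seconds:.2f} seconds")
```

At the stalled rate the same file would take on the order of ten days instead of under three minutes, while a small request still completes in a fraction of a second.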

 

I'm also noticing this time that IPv6 transfers coming back *into* this machine are slower than IPv4, which I hadn't noticed before; though it's still usable.

 

And then, after reinstalling VMWare - *just* that, reinstalling, not starting any VMs, but the application is running:

zecora:~ rachel$ scp -6 Desktop/S01E01\ Kalahari.mkv twilight.local:.
S01E01 Kalahari.mkv                           100%   17GB 106.0MB/s   02:49    
zecora:~ rachel$ scp -6 twilight.local:"S01E01\ Kalahari.mkv" .
S01E01 Kalahari.mkv                           100%   17GB  87.8MB/s   03:24    

IPv6 transmission speed up to full, matching IPv4 speeds, or close enough as makes no odds; IPv6 *receiving* speeds unchanged; not sure what that's about. I repeated doing it via IPv4 but got the same speeds there as before, so it's not that my SSD suddenly went go-slow on writing.

 

It certainly looks like the driver is *able* to receive and transmit fast on IPv6; just that it's not by default in the right state to do so, and VMWare, in something it's doing, is correcting that state.

 

BTW FWIW I have the VMWare setting to "require authentication to enter promiscuous mode" and it hasn't been asking me for authentication when it starts, so I don't *think* it's that, though I'd have expected it to need to do that for bridging to work?

 

Meanwhile, on the real Mac (using driver AppleBCM5701Ethernet.kext) with VMWare not running, data transfers were a bit rubbish in comparison, especially considering it's right next to the other two machines. The internet cable's a bit longer and older though; maybe that makes that much difference. There I'm getting:

celestia:~ rachel$ scp -6 twilight.local:"S01E01\ Kalahari.mkv" .
rachel@twilight.local's password: 
S01E01 Kalahari.mkv                           100%   17GB  77.9MB/s   03:50    
celestia:~ rachel$ scp -6 S01E01\ Kalahari.mkv twilight.local:.
rachel@twilight.local's password: 
S01E01 Kalahari.mkv                           100%   17GB  65.4MB/s   04:34    

But not stalling at all the way it is on this machine if VMWare's not running. So it's still usable, if clearly not as good as the hardware/driver on my hackintosh.  :)  (This simple file transfer provoked the fan into full-speed operation too!) The IPv4 timings were the same. The point here is that none of the timings were affected either way by VMWare running or not.

 

Also, rather annoyingly, today I have been unable to reproduce the network crash when transferring data between host and guest. I'd switched to NAT and it was all fine, so I switched back to bridged looking to confirm the positive, but it was all fine there today as well. Ho-hum.  :wacko:  It was always intermittent but I tested long enough to have expected a failure like those I'd had before. What *had* changed in the interim though was installing the newer AppleIntelE1000e.kext.


@StrangeNoises: What I'm trying to tell you is that the network configuration of OS X seems to be quite delicate. During my Realtek and Atheros driver development for OS X I discovered that you can easily mess up the network system and get to a point where only a reinstall (or restoring a Time Machine backup) can return your system to full operation. The symptoms range from things like the App Store not working to a complete loss of network connectivity on all interfaces.

 

A driver handles packets; with the exception of promiscuous mode and an MTU change, there is no operational mode you can put it into. But let's take a look at other driver-related features that have an influence on performance:

  • TSO can seriously degrade performance when it's not implemented correctly or the remote host isn't configured properly (small buffers, timing, etc.). While implementation bugs can be easily tracked down by examining the log files (hardware deadlocks) or packet dumps with Wireshark (a bunch of packets with bad checksums on the remote host), timing issues are much harder to find.
  • Checksum offload issues will result in bad packets too.
  • Timing issues (interrupt handling, etc.) can interact with TCP's flow control mechanisms making it believe that the connection is slow although there is plenty of bandwidth headroom.
  • EEE (Energy Efficient Ethernet) sometimes leads to an unstable connection. For example my 2011 iMac isn't able to establish any connection to my switch when EEE is enabled.
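
To make the "bad checksums" point in the list above concrete, here is the standard 16-bit ones' complement Internet checksum (the algorithm described in RFC 1071) written out in Python. It is purely illustrative and not taken from any driver; the function name is chosen for this sketch, and the sample bytes are the worked example from the RFC. A receiver that sums the data together with the transmitted checksum gets zero when the segment is intact.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                 # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                  # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


segment = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(segment)

good = segment + checksum.to_bytes(2, "big")
bad = bytearray(good)
bad[0] ^= 0x01                          # simulate one corrupted bit

print(hex(checksum))
print("intact segment verifies:", internet_checksum(good) == 0)
print("corrupted segment verifies:", internet_checksum(bytes(bad)) == 0)
```

If an offload engine computes or inserts this value incorrectly, every affected packet fails that verification on the remote host and is dropped, which is what shows up in a Wireshark capture.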

Mieze


You gave me a little checklist. ;)  It looks like some at least can be changed on the fly by ifconfig. I'm not terribly hopeful, given whatever VMWare may or may not do, it doesn't seem to change any of those.

 

Hm, no, trying to change any of those settings via ifconfig just gets "Operation not supported on socket" eg:

zecora:~ rachel$ sudo ifconfig en0 -tso
ifconfig: -tso: Operation not supported on socket
zecora:~ rachel$ sudo ifconfig en0 -rxcsum -txcsum
ifconfig: -rxcsum: Operation not supported on socket

Guessing driver needs to be written to support on-the-fly changes like that?

 

OK, anyway. MTU is 1500 throughout. It does enter promiscuous mode while the VM is running in bridged mode (PROMISC shows in the flags list, value 8963), but not when VMWare is running with no VM started (which is when the IPv6 transmission works fine). I pasted the ifconfig output in the earlier post; nothing's changed since, except the addition of that PROMISC when the VM is actually running. Transfer rates are the same between these two states.

 

I tried disabling TSO (as reported earlier); but it had no apparent effect. Basically by changing

			<key>NETIF_F_TSO</key>
			<true/>

to

			<key>NETIF_F_TSO</key>
			<false/>

in /System/Library/Extensions/AppleIntelE1000e.kext/Contents/Info.plist and rebooting. I'm going to re-test that later (dinner beckons), as I remembered I never looked at ifconfig's output during that test. Presumably I'm expecting the TSO4 and TSO6 flags to disappear. I don't know how to try flipping other parameters in the Info.plist as examples aren't already there, so presumably one can't. :)

 

All the other things you list look like things that would have a wider effect than just on IPv6 in one direction. Also to characterise this as a "performance" issue is perhaps misleading. I'm slightly interested that *receiving* on IPv6 is a few tens of MB/s slower than on IPv4; *that* is a performance issue, but I don't really care that much, and I'd never noticed it until I was doing timings. When throughput drops to between 20KB/s and zero (stalled), that's a bit more serious. That's a "it's not working" issue. :)

 

It would be interesting to hear definitively that someone with a similar interface using this driver, whose machines have never seen VMWare, is sending big files to another machine over IPv6 without any difficulty. Browsing to ipv6 enabled sites does not count; that works for me too. Unless you're going to upload big files to them.

 

I don't really have a problem; unless I'm testing stuff like this I *am* running VMWare all the time, so the problem is solved for me, though I'm sure it's an accidental workaround. I reported it because I guessed other people might have this problem, obscured only by the relative rarity of operational IPv6 out there.

 

BTW this topic, http://www.insanelymac.com/forum/topic/295254-strange-intel-82579v-detected-hardware-unit-hang-when-under-heavy-load-with-appleintel1000e/ that came up while googling the "Operation not supported" error, precisely describes the symptoms I was getting in large transfers to a VMWare guest on the bridged interface. The fix then was a newer version of the driver. So perhaps more hope that there's a fix in the *newest* version that fixes it for me when a bridge is involved. Nothing else has changed, it's my best guess. :)

Link to comment
Share on other sites

In order to track down the issue it would be best to run some tests with iperf while making a packet dump with Wireshark on the remote machine. Don't forget to disable checksum offload on the remote machine and to enable TCP checksum verification in Wireshark's preferences. Start iperf in server mode on the remote machine.

 

After that you can start iperf in client mode on your "problem" machine, connecting to the remote machine. When the transfer slows down or stalls, take a look at the packets Wireshark captured. Does the packet stream stop, or are there many packets with bad TCP checksums? If there are, calculate the difference between the actual checksum and the correct one: is there any systematic relation between the two? When iperf's stream has stalled, ping the machine to verify that it's still reachable over the network.

 

Mieze

Link to comment
Share on other sites

OK, I can try to gather some data, but the analysis you describe will be beyond me; I'm way out of my expertise zone here. ;)

 

Firstly on remote (Linux) machine before I do anything:

rachel@twilight:~$ ethtool -k eth0
Features for eth0:
rx-checksumming: on
tx-checksumming: on
	tx-checksum-ipv4: on
	tx-checksum-ip-generic: off [fixed]
	tx-checksum-ipv6: off [fixed]
	tx-checksum-fcoe-crc: off [fixed]
	tx-checksum-sctp: off [fixed]
...

Remembering that we're specifically looking at IPv6: does that "tx-checksum-ipv6: off" mean anything?

 

Anyway:

rachel@twilight:~$ sudo ethtool -K eth0 tx off
Actual changes:
tx-checksumming: off
	tx-checksum-ipv4: off
tcp-segmentation-offload: off
	tx-tcp-segmentation: off [requested on]
rachel@twilight:~$ sudo ethtool -K eth0 rx off
rachel@twilight:~$ 

OK, twilight is headless, so logged into it from a laptop (real macbook air) specifically using IPv4 to run wireshark & iperf -s remotely. Started Wireshark recording on eth0 on twilight; started iperf -s -V on twilight, ran iperf -c twilight.local -V on zecora (the hackintosh) *four* times; this is the view from the *client* side, on the hackintosh:

zecora:~ rachel$ iperf -c twilight.local -V
------------------------------------------------------------
Client connecting to twilight.local, TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  5] local xxxx:xxxx:xxxx::da50:e6ff:fe4d:50a2 port 58721 connected with xxxx:xxxx:xxxx::4a5b:39ff:fe7e:7c port 5001
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec   426 MBytes   357 Mbits/sec
zecora:~ rachel$ iperf -c twilight.local -V
------------------------------------------------------------
Client connecting to twilight.local, TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  5] local xxxx:xxxx:xxxx::da50:e6ff:fe4d:50a2 port 58728 connected with xxxx:xxxx:xxxx::4a5b:39ff:fe7e:7c port 5001
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec  1.06 GBytes   912 Mbits/sec
zecora:~ rachel$ iperf -c twilight.local -V
------------------------------------------------------------
Client connecting to twilight.local, TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  5] local xxxx:xxxx:xxxx::da50:e6ff:fe4d:50a2 port 58740 connected with xxxx:xxxx:xxxx::4a5b:39ff:fe7e:7c port 5001
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-28.8 sec   256 KBytes  72.8 Kbits/sec
zecora:~ rachel$ iperf -c twilight.local -V
------------------------------------------------------------
Client connecting to twilight.local, TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  5] local xxxx:xxxx:xxxx::da50:e6ff:fe4d:50a2 port 58747 connected with xxxx:xxxx:xxxx::4a5b:39ff:fe7e:7c port 5001
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-29.3 sec   256 KBytes  71.6 Kbits/sec
zecora:~ rachel$

(Server-side console log shows exactly the same numbers.)

 

The first time, VMWare is running, and so is a (Linux) VM using bridged-mode networking.

 

The second time, that VM has been shutdown, but VMWare is still running.

 

The third time, VMWare has been quit.

 

The fourth time, VMWare is still quit, and I'm simultaneously running a ping6 twilight.local from zecora in another terminal. The stats at the end of that btw were:

zecora:~ rachel$ ping6 twilight.local
PING6(56=40+8+8 bytes) xxxx:xxxx:xxxx::da50:e6ff:fe4d:50a2 --> xxxx:xxxx:xxxx::4a5b:39ff:fe7e:7c
...
--- twilight.local ping6 statistics ---
42 packets transmitted, 42 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 0.387/0.528/0.691/0.098 ms

The Wireshark file is 1.6GB long and being uploaded to my owncloud. I'll give you and/or hnak a link in PM if you want to look at it (given of course it has my raw IPv6 prefix in there) (I'll wait to be asked). It's the first time I've ever used this app; what I can see though as soon as the third test starts is *vast* numbers of checksum errors and retransmissions. The log is basically full of that. To take four that occurred in a sequence at random:

Checksum: 0x582b [incorrect, should be 0x512b (maybe caused by "TCP checksum offload"?)]
Checksum: 0x4ea3 [incorrect, should be 0x47a3 (maybe caused by "TCP checksum offload"?)]
Checksum: 0x4596 [incorrect, should be 0x3e96 (maybe caused by "TCP checksum offload"?)]
Checksum: 0x3c89 [incorrect, should be 0x3589 (maybe caused by "TCP checksum offload"?)]

Well, they're all 0x700 too high. That seems to be consistent as I look at a random sampling of others, except a couple (literally, two) right at the end of the last test which were different and more random-looking.
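
The comparison can be done mechanically for a larger sample. A minimal sketch, assuming the checksum pairs have been copied out of Wireshark by hand; both function names are made up for illustration:

```c
#include <stdint.h>

/* Difference between the checksum seen on the wire and the value
 * Wireshark computed, modulo 2^16. */
uint16_t checksum_delta(uint16_t actual, uint16_t expected)
{
    return (uint16_t)(actual - expected);
}

/* Returns 1 if every (actual, expected) pair shows the same delta as the
 * first pair, 0 otherwise. */
int deltas_are_constant(const uint16_t actual[], const uint16_t expected[], int n)
{
    for (int i = 1; i < n; i++) {
        if (checksum_delta(actual[i], expected[i]) !=
            checksum_delta(actual[0], expected[0]))
            return 0;
    }
    return 1;
}
```

Fed with the four pairs above it reports a constant delta of 0x700, which points to a systematic error in how the checksum is seeded rather than to random corruption.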

Link to comment
Share on other sites

@StrangeNoises: Thanks for the data. It looks like the issue is caused either by TSO6 or by TCP/IPv6 checksum offload, and VMware seems to "heal" it by disabling the feature. What does "ifconfig en0" say? Is TSO6 enabled for the interface? I assume it is related to TSO6, as small amounts of TCP data are going through anyway and OS X offloads only large TCP packets (> MTU) without IP header options.

 

Mieze

Link to comment
Share on other sites

With vmware *and* VM running (actually running the VM puts it into promiscuous mode, that's the only difference I can see):

zecora:esparto rachel$ ifconfig en0
en0: flags=8963<UP,BROADCAST,SMART,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
	options=6b<RXCSUM,TXCSUM,VLAN_HWTAGGING,TSO4,TSO6>
	ether d8:50:e6:4d:50:a2 
	inet6 fe80::da50:e6ff:fe4d:50a2%en0 prefixlen 64 scopeid 0x4 
	inet6 xxxx:xxxx:xxxx::da50:e6ff:fe4d:50a2 prefixlen 64 autoconf 
	inet6 xxxx:xxxx:xxxx::9d5b:af8c:89df:674c prefixlen 64 deprecated autoconf temporary 
	inet 192.168.1.4 netmask 0xffffff00 broadcast 192.168.1.255
	inet6 xxxx:xxxx:xxxx::48ee:7662:ea7:6928 prefixlen 64 autoconf temporary 
	nd6 options=1<PERFORMNUD>
	media: autoselect
	status: active

With the VM shut down, but VMWare still running:

zecora:esparto rachel$ ifconfig en0
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	options=6b<RXCSUM,TXCSUM,VLAN_HWTAGGING,TSO4,TSO6>
	ether d8:50:e6:4d:50:a2 
	inet6 fe80::da50:e6ff:fe4d:50a2%en0 prefixlen 64 scopeid 0x4 
	inet6 xxxx:xxxx:xxxx::da50:e6ff:fe4d:50a2 prefixlen 64 autoconf 
	inet6 xxxx:xxxx:xxxx::9d5b:af8c:89df:674c prefixlen 64 deprecated autoconf temporary 
	inet 192.168.1.4 netmask 0xffffff00 broadcast 192.168.1.255
	inet6 xxxx:xxxx:xxxx::48ee:7662:ea7:6928 prefixlen 64 autoconf temporary 
	nd6 options=1<PERFORMNUD>
	media: autoselect
	status: active

And with VMWare quit as well (I can't see a difference - and essentially, TSO6 is enabled in both states):

zecora:esparto rachel$ ifconfig en0
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	options=6b<RXCSUM,TXCSUM,VLAN_HWTAGGING,TSO4,TSO6>
	ether d8:50:e6:4d:50:a2 
	inet6 fe80::da50:e6ff:fe4d:50a2%en0 prefixlen 64 scopeid 0x4 
	inet6 xxxx:xxxx:xxxx::da50:e6ff:fe4d:50a2 prefixlen 64 autoconf 
	inet6 xxxx:xxxx:xxxx::9d5b:af8c:89df:674c prefixlen 64 deprecated autoconf temporary 
	inet 192.168.1.4 netmask 0xffffff00 broadcast 192.168.1.255
	inet6 xxxx:xxxx:xxxx::48ee:7662:ea7:6928 prefixlen 64 autoconf temporary 
	nd6 options=1<PERFORMNUD>
	media: autoselect
	status: active
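
The two hex words in that output can be compared bit by bit. A minimal sketch, assuming the bit values below, which are chosen to mirror the BSD IFF_* / IFCAP_* definitions and are consistent with the names ifconfig printed; the enum and function names are made up for illustration:

```c
#include <stdint.h>

/* Assumed interface flag bits (the "flags=" word, printed in hex). */
enum {
    FLAG_UP = 0x1, FLAG_BROADCAST = 0x2, FLAG_SMART = 0x20,
    FLAG_RUNNING = 0x40, FLAG_PROMISC = 0x100, FLAG_SIMPLEX = 0x800,
    FLAG_MULTICAST = 0x8000
};

/* Assumed capability bits (the "options=" word, printed in hex). */
enum {
    CAP_RXCSUM = 0x1, CAP_TXCSUM = 0x2, CAP_VLAN_HWTAGGING = 0x8,
    CAP_TSO4 = 0x20, CAP_TSO6 = 0x40
};

/* Non-zero if the given bit is set in the word. */
int has_bit(uint32_t word, uint32_t bit)
{
    return (word & bit) != 0;
}

/* Bits that differ between two readings of the same word. */
uint32_t changed_bits(uint32_t before, uint32_t after)
{
    return before ^ after;
}
```

changed_bits(0x8963, 0x8863) yields 0x100, i.e. only PROMISC differs between the states, and has_bit(0x6b, CAP_TSO6) is true in all three, matching what the listings show.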

I noticed while doing the capture earlier, of course Linux has ethtool, which seems to have a *lot* more detail going on. A quick google isn't revealing an obvious equivalent for OSX though.

Link to comment
Share on other sites

As far as I know, the flags in options only show that the driver supports a certain feature in principle, even if it is disabled at the moment. Can you provide me with a small extract of the captured data containing a sequence of bad packets? 1.6GB is far too much to be useful.

 

Mieze

Link to comment
Share on other sites

My first guess that TSO6 wasn't implemented correctly has turned out to be true. Attached to this post you will find a patched version, source code as well as prebuilt binary for Mavericks. The problem was that the IPv6 pseudo header checksum provided to the NIC was incorrect so that it produced lots of packets with bad TCP checksums. Due to the lack of time I only implemented a quick fix but the code needs a general cleanup. I leave this up to hnak.  ;)

 

Here is what I changed:

/* The pseudo header checksum provided by the network stack includes
 * the IP payload length but Microsoft's specification says that only
 * source and destination address as well as the protocol number
 * should be included so that we have to adjust the checksum first.
 *
 * See: http://msdn.microsoft.com/en-us/library/windows/hardware/ff568840(v=vs.85).aspx
 */

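static inline UInt16 adjustPseudoHdrCSumV6(struct ip6_hdr *ip6Hdr, struct tcphdr *tcpHdr)
{
    /* Payload length and the stack's checksum, both converted to host order. */
    UInt32 plen = ntohs(ip6Hdr->ip6_ctlun.ip6_un1.ip6_un1_plen);
    UInt32 csum = ntohs(tcpHdr->th_sum) - plen;
    
    /* One's-complement subtraction: if the subtraction wrapped, the upper
     * 16 bits are 0xffff and adding them applies the end-around borrow. */
    csum += (csum >> 16);
    
    return htons((UInt16)csum);
}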

static int e1000_tso(struct e1000_ring *tx_ring, struct sk_buff *skb,
                     int* pSegs, int* pHdrLen)
{
#ifdef NETIF_F_TSO
	struct e1000_context_desc *context_desc;
	struct e1000_buffer *buffer_info;
	unsigned int i;
	u32 cmd_length = 0;
	u16 ipcse = 0, mss;
	u8 ipcss, ipcso, tucss, tucso, hdr_len;
    int rc = 0;

    mbuf_tso_request_flags_t request;
    u_int32_t value;
    if(mbuf_get_tso_requested(skb, &request, &value) || (request & (MBUF_TSO_IPV4|MBUF_TSO_IPV6)) == 0)
		return 0;
    mss = value;

    struct tcphdr* tcph;
	int ip_hlen;
	u_int16_t csum;
	u_int32_t skbLen = mbuf_pkthdr_len(skb);
	u8* dataAddr = (u8*)mbuf_data(skb);
    if (request & MBUF_TSO_IPV4) {
		struct ip *iph = ip_hdr(skb);
		tcph = tcp_hdr(skb);
		ip_hlen = ((u8*)tcph - (u8*)iph);
		iph->ip_len = 0;
		iph->ip_sum = 0;
		csum = in_pseudo(iph->ip_src.s_addr, iph->ip_dst.s_addr,
						 htonl(IPPROTO_TCP));
		cmd_length = E1000_TXD_CMD_IP;
		ipcse = ETH_HLEN + ip_hlen - 1;
        ipcso = ETH_HLEN + offsetof(struct ip, ip_sum);
        rc = 4;
	} else if (request & MBUF_TSO_IPV6) {
		struct ip6_hdr *iph = ip6_hdr(skb);
		tcph = tcp6_hdr(skb);
        
		csum = adjustPseudoHdrCSumV6(iph, tcph);
        
		ip_hlen = ((u8*)tcph - (u8*)iph);
		iph->ip6_ctlun.ip6_un1.ip6_un1_plen = 0;
		ipcse = 0;
        ipcso = 0;
        rc = 6;
	}
	tcph->th_sum = csum;
	hdr_len = (u8*)tcph - dataAddr + (tcph->th_off * 4);

    /* TSO Workaround for 82571/2/3 Controllers -- if skb->data
     * points to just header, pull a few bytes of payload from
     * frags into skb->data
     */
    /* we do this workaround for ES2LAN, but it is un-necessary,
     * avoiding it could save a lot of cycles
     */
    if (mbuf_len(skb) == hdr_len) {
        unsigned int pull_size = 4;
        if (mbuf_pullup(&skb, pull_size)) {
            IOLog("mbuf_pullup failed.\n");
            return -1;
        }
    }

	ipcss = ETH_HLEN;
	tucss = ETH_HLEN + ip_hlen;
	tucso = ETH_HLEN + ip_hlen + offsetof(struct tcphdr, th_sum);
    
	cmd_length |= (E1000_TXD_CMD_DEXT | E1000_TXD_CMD_TSE |
                   E1000_TXD_CMD_TCP | (skbLen - (hdr_len)));
    
	i = tx_ring->next_to_use;
	context_desc = E1000_CONTEXT_DESC(*tx_ring, i);
	buffer_info = &tx_ring->buffer_info[i];
    
	context_desc->lower_setup.ip_fields.ipcss = ipcss;
	context_desc->lower_setup.ip_fields.ipcso = ipcso;
	context_desc->lower_setup.ip_fields.ipcse = cpu_to_le16(ipcse);
	context_desc->upper_setup.tcp_fields.tucss = tucss;
	context_desc->upper_setup.tcp_fields.tucso = tucso;
	context_desc->upper_setup.tcp_fields.tucse = 0;
	context_desc->tcp_seg_setup.fields.mss = cpu_to_le16(mss);
	context_desc->tcp_seg_setup.fields.hdr_len = hdr_len;
	context_desc->cmd_and_length = cpu_to_le32(cmd_length);
    
	buffer_info->time_stamp = jiffies();
	buffer_info->next_to_watch = i;
    
	i++;
	if (i == tx_ring->count)
		i = 0;
	tx_ring->next_to_use = i;

    *pSegs = ((skbLen - hdr_len) + (mss-1))/mss;
    *pHdrLen = hdr_len;

	e_dbg("e1000_tso: skbLen=%d, hdr_len=%d(%d,%d), mss=%d, segs=%d, rc=%d\n",
		  (int)skbLen, (int)hdr_len,
		  (int)ip_hlen, (int)(tcph->th_off * 4),
		  (int)mss, *pSegs,
		  rc);
	return rc;
#else /* NETIF_F_TSO */
	return 0;
#endif /* NETIF_F_TSO */
}
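
The effect of adjustPseudoHdrCSumV6() can be checked in user space. A minimal sketch, assuming host-byte-order values throughout (the driver works on network-order fields, hence its ntohs()/htons() calls) and a payload length that fits in 16 bits; all helper names are made up for illustration and are not part of the driver:

```c
#include <stdint.h>
#include <stddef.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Add len bytes (len even) to sum as big-endian 16-bit words. */
uint32_t sum_words(const uint8_t *p, size_t len, uint32_t sum)
{
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    return sum;
}

/* IPv6 pseudo-header sum over source, destination and next header,
 * optionally including the payload length (what the stack supplies). */
uint16_t pseudo_sum(const uint8_t src[16], const uint8_t dst[16],
                    uint8_t next_hdr, uint16_t payload_len, int with_len)
{
    uint32_t sum = 0;
    sum = sum_words(src, 16, sum);
    sum = sum_words(dst, 16, sum);
    sum += next_hdr;
    if (with_len)
        sum += payload_len;
    return fold16(sum);
}

/* Same arithmetic as the patch: subtract the length again, then apply
 * the end-around borrow held in the upper 16 bits.
 * Note: one's complement has two zeros, so the result may come out as
 * 0x0000 where the length-free sum is 0xffff; both mean the same. */
uint16_t remove_length(uint16_t csum, uint16_t payload_len)
{
    uint32_t v = (uint32_t)csum - payload_len;
    v += (v >> 16);
    return (uint16_t)v;
}
```

For any address pair, remove_length(pseudo_sum(src, dst, 6, len, 1), len) should equal pseudo_sum(src, dst, 6, 0, 0) (apart from the two-zeros case noted in the comment), which is the value the NIC expects as the seed before it adds the per-segment length itself.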

With the patched version iperf delivered the following test results (Intel 82574L) for a connection to a 2011 iMac in my local network:

misi-2:~ laura$ iperf -V -c fe80::ca2a:14ff:fe23:cdd1%en0
------------------------------------------------------------
Client connecting to fe80::ca2a:14ff:fe23:cdd1%en0, TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  4] local fe80::6a05:caff:fe0e:4297 port 49198 connected with fe80::ca2a:14ff:fe23:cdd1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.03 GBytes   889 Mbits/sec

Mieze

AppleIntelE1000e-TSO6-fixed.zip

Link to comment
Share on other sites

Great. :) I'll test this for my system as soon as I get up.

Link to comment
Share on other sites

yep, that's working for me now. :) Oddly seemed to need to "warm up" as I got slower (but still usable, from 50-77MB/s) times immediately after boot, but after a few minutes settling to about 108MB/s; but that applied to both IPv4 and IPv6 so, odd as it may be, I expect it's not relevant. Maybe just the system doing post-boot activities on the sly after pretending it was all done. ;)

Link to comment
Share on other sites
