Oss/libnetdevname
From DellLinuxWiki
Get Involved
- Join us in #biosdevname on FreeNode.
- File bugs against biosdevname behavior in the Red Hat Bugzilla.
- Email discussion on the linux-hotplug mailing list
- See the code in git
About
Linux Enumeration of NICs in the Enterprise
Enterprise servers carry an ever-increasing number of network ports. Current Dell PowerEdge 11th Generation machines contain four onboard LAN-on-Motherboard (LOM) NICs and accept additional PCIe interface cards with up to four ports each. With this many NICs, naming and ordering the interfaces becomes a major issue. There is currently no standard method for the operating system to enumerate these NIC ports in a particular order, whether in accordance with the BIOS or otherwise.
Dell addressed this with hardware design changes that ensure LOMs and add-on NICs appear in a desired order under PCIe.
There was also an attempt to address this problem in firmware and user space: the BIOS passes the LOM ordering information via SMBIOS table type 41, and biosdevname, an open-source udev helper developed by Dell, reads the SMBIOS table and influences the naming of the interfaces.
Neither method solved the problem entirely on all Linux distributions for all possible cases. Even with biosdevname and the restriction on where NIC devices sit in the PCI tree, NIC enumeration remains largely non-deterministic. For instance, parallel udev threads loading network drivers is one case that neither biosdevname nor the hardware design can address.
This has been a major concern for system administrators and ISVs, who spend considerable effort mapping physical ports to the names the OS assigns them in large-scale deployments. It also breaks images that enterprise customers deploy on multiple systems when those images assume a fixed mapping of network device name to physical port. Deployment scripts and firewall rules written with the same assumption break as well.
Previous Workarounds
- 2006 Dell PowerSolutions article and name_eths script to rename devices after your system is installed. Does not solve the problem at install-time.
- 2007 Whitepaper on NIC Enumeration (revised 2009)
Implementation
Proposal 1
Provide a character device interface to ethernet devices like /dev/netdev/eth0 and map them to alternative naming conventions like /dev/net/by-chassis-label/Embedded_NIC_1. Utilities that require interface names can be patched to use "Embedded_NIC_1" in addition to eth0 with the help of a library.
- Status: Rejected upstream
- Discussions
- - Char devices for network interfaces
- - Historic Details of the issue and factors affecting the issue
Proposal 2
This implementation is similar to Proposal 1, except that the device nodes are created entirely in user space, without any kernel changes. Proof of concept (POC) from Dann Frazier.
Proposal 3
This is an installer-based proposal in which the installer would offer the user options to rename network interfaces according to various policies, such as:
a) firmware names
b) MAC addresses
c) the driver
Status - Rejected.
Proposal 4
This solution exports the SMBIOS strings that system firmware provides for onboard devices to sysfs.
cat /sys/class/net/eth0/device/smbiosname
Embedded NIC 2
cat /sys/bus/pci/devices/0000\:03\:00.0/smbiosname
Embedded NIC 2
A user-space library such as libnetdevname would map these SMBIOS names to ethN names, and ethN names back to SMBIOS names.
For example:
/sbin/ethtool -p "Embedded NIC 1"    (might be eth0 or eth1)
/sbin/ip link show "Embedded NIC 2"
Status - Rejected.
- - Export smbios strings associated with onboard devices to sysfs
- - Add firmware label support to iproute2
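The lookup that a library like libnetdevname would perform under Proposal 4 can be sketched as follows. This is a minimal illustration, not the library's actual code: the function names read_smbios_names, resolve_interface and build_fake_sysfs are hypothetical, and because the smbiosname attribute was rejected upstream, the sketch builds a fake sysfs tree instead of reading a real one.

```python
import os
import tempfile


def read_smbios_names(sysfs_net="/sys/class/net"):
    """Return {kernel interface name: SMBIOS label} for interfaces exposing one."""
    labels = {}
    for ifname in sorted(os.listdir(sysfs_net)):
        # 'smbiosname' is the attribute proposed above; it was not accepted
        # upstream, so a missing file is the normal case on real kernels.
        path = os.path.join(sysfs_net, ifname, "device", "smbiosname")
        try:
            with open(path) as f:
                labels[ifname] = f.read().strip()
        except OSError:
            continue  # virtual interface, plain file, or no label exported
    return labels


def resolve_interface(name, labels):
    """Map an SMBIOS label to its kernel name; pass kernel names through."""
    by_label = {label: ifname for ifname, label in labels.items()}
    return by_label.get(name, name)


def build_fake_sysfs():
    """Create a throwaway directory tree that mimics the proposed layout."""
    root = tempfile.mkdtemp()
    for ifname, label in [("eth0", "Embedded NIC 2"), ("eth1", "Embedded NIC 1")]:
        dev = os.path.join(root, ifname, "device")
        os.makedirs(dev)
        with open(os.path.join(dev, "smbiosname"), "w") as f:
            f.write(label + "\n")
    os.makedirs(os.path.join(root, "lo"))  # loopback: no device, no label
    return root
```

A patched utility would call resolve_interface on its argument before using it, so that both "Embedded NIC 1" and eth1 keep working.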
Proposal 5
This solution uses the firmware-provided index to derive ethN names. System firmware can assign an index to each device to communicate the BIOS-designated order of the onboard network devices; dmidecode -t 41 can provide this information. If the firmware provides an index for the corresponding PCI device (pdev), N is derived from that index.
As an example, consider a PowerEdge R710 which has four BCM5709 LAN-on-Motherboard ports, one Intel 82572EI port and four 82575GB ports. The system firmware communicates the order of the four LAN-on-Motherboard ports by assigning an index to each of them.
eth0 -> index=1
eth1 -> index=2
eth2 -> index=3
eth3 -> index=4
The add-in interfaces will be named beyond eth3.
eth4 -> no index
eth5 -> no index
Status - Rejected
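The naming rule of Proposal 5 can be illustrated with a short sketch. It is not the rejected kernel patch: assign_eth_names is a hypothetical helper, the PCI addresses in the usage note are made up, and the choice to number unindexed devices in input order starting after the highest firmware index is an assumption consistent with the example above.

```python
def assign_eth_names(devices):
    """Map PCI addresses to ethN names.

    devices: list of (pci_address, firmware_index or None) tuples.
    Devices with a firmware index get N = index - 1; the rest follow.
    """
    names = {}
    indexed = [(idx, addr) for addr, idx in devices if idx is not None]
    for idx, addr in indexed:
        names[addr] = f"eth{idx - 1}"

    # Assumption: unindexed (add-in) devices continue after the highest
    # firmware index, in the order they were passed in.
    next_n = max((idx for idx, _ in indexed), default=0)
    for addr, idx in devices:
        if idx is None:
            names[addr] = f"eth{next_n}"
            next_n += 1
    return names
```

For instance, calling it with four indexed devices and two unindexed ones, in any discovery order, gives the indexed devices eth0 through eth3 and the unindexed ones eth4 and eth5.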
Proposal 6 - Biosdevname - Accepted Upstream
Biosdevname renames network interfaces into a different namespace. The policy is as follows -
- Embedded devices: em<port>
- Add-in PCI cards: pci<slot>p<port>_<virtual function instance>
Status - Accepted Upstream
For example, on a PowerEdge system with four BCM5709 LAN-on-Motherboard devices, a single-port Intel 82572EI add-in network adapter in PCI slot 4, and a dual-port Intel 82576 add-in network adapter in PCI slot 3, the names look like -
- [root@fedora-14-r710 ~]# ls /sys/class/net/
- em1 em2 em3 em4 lo pci3p1 pci3p2 pci4p1
- em1, em2, em3 and em4 are the embedded devices (em1 -> Ethernet-on-motherboard 1, em2 -> Ethernet-on-motherboard 2, and so on)
- pci3p1 - pci<slot 3>p<port 1>
- pci3p2 - pci<slot 3>p<port 2>
- pci4p1 - pci<slot 4>p<port 1>
The Intel 82576 in PCI slot 3 supports SR-IOV. When the igb driver is loaded with max_vfs=2, the names look like -
- [root@fedora-14-r710 ~]# ls /sys/class/net/
- em1 em2 em3 em4 lo pci3p1 pci3p1_0 pci3p1_1 pci3p2 pci3p2_0 pci3p2_1 pci4p1
Where
- pci3p1_0 -> virtual function instance 0
- pci3p1_1 -> virtual function instance 1
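Scripts that consume these names can decode them mechanically. The sketch below parses only the two patterns stated in the policy above (em<port> and pci<slot>p<port> with an optional _<virtual function instance> suffix). parse_name is a hypothetical helper and not part of biosdevname, and other biosdevname versions may use different patterns.

```python
import re

# Patterns taken from the naming policy described on this page.
_EMBEDDED = re.compile(r"^em(?P<port>\d+)$")
_ADD_IN = re.compile(r"^pci(?P<slot>\d+)p(?P<port>\d+)(?:_(?P<vf>\d+))?$")


def parse_name(name):
    """Decode a biosdevname-style name into its parts, or return None."""
    m = _EMBEDDED.match(name)
    if m:
        return {"kind": "embedded", "port": int(m.group("port"))}
    m = _ADD_IN.match(name)
    if m:
        vf = m.group("vf")
        return {
            "kind": "add-in",
            "slot": int(m.group("slot")),
            "port": int(m.group("port")),
            # vf is None for the physical function itself
            "vf": int(vf) if vf is not None else None,
        }
    return None  # e.g. "lo" or a legacy ethN name
```

With the listing above, pci3p1_1 decodes to slot 3, port 1, virtual function instance 1, while lo returns None.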
Discussions -
- - UDEV - Consensus on approach reached at Linux Plumbers Conf 2010
- - biosdevname v0.3.1
- - extended netdevice naming proposal
Download -
Kernel Components
Presentations at various forums
- - LinuxCon2010 by Matt Domsch
- - Linux Plumbers Conf 2010 by Matt Domsch