
Definition

MLAG (Multi-chassis Link Aggregation Group), as the name suggests, applies LAG (Link Aggregation Group) technology to member ports on a pair of devices, so that the pair appears as a single device to the downstream third device at Layer 2. The figure below shows the physical topology and the logical topology of the MLAG network in Layer 2. The two MLAG peer devices, SwitchA1 and SwitchA2, stay in communication by exchanging MLAG control plane messages, and they keep the MAC addresses learned on the LAG interfaces synchronized using L2 multicast packets. The downstream device can be any endpoint equipment (L2 switch or server) that supports LACP link aggregation. When it is dual-homed to the network through the MLAG peer devices, it is unaware that two devices sit at the other end of the link.

Figure 1. Physical Topology and Logical Topology of the MLAG Networking

MLAG is mainly used where a downstream switch or host needs dual access to the network. In Figure 1, suppose that before MLAG is deployed, SwitchB accesses the network only through SwitchA1, with spanning tree enabled. If SwitchA1 or the link fails, SwitchB can no longer communicate with the network. With MLAG, the downstream switch or host can access the network through both SwitchA1 and SwitchA2, which provides link-level and device-level redundancy and protection.

This gives the downstream switch or host two uplink paths for redundancy, and it also allows full bandwidth utilization: because the MLAG domain appears to Spanning Tree Protocol (STP) as a single switch, no ports are blocked.

As MLAG has the following advantages, it can be used to build a highly resilient and highly reliable Layer 2 network.

  • Increased Bandwidth

MLAG aggregates multiple Ethernet ports across two switches, which increases the uplink bandwidth. The maximum bandwidth of the link aggregation interface can reach the sum of the bandwidths of the individual MLAG member ports.

  • Higher Reliability

Both peer devices work at the same time to ensure high reliability. When a link or device fails, traffic is switched to the other available member links or device, which improves the reliability of the MLAG domain.

  • Load Balancing

In an MLAG domain, you can achieve load balancing across the active links of each aggregation interface.

Basic Concepts

  • MLAG domain and domain ID

An MLAG domain defines the topology scope of MLAG calculation and control. It includes a pair of MLAG peer switches, the MLAG peer link, and the MLAG member ports. The MLAG domain ID is a unique identifier for the domain and must be configured identically on both MLAG peer devices in the same domain.

Currently, only one MLAG domain can be configured on an MLAG device. A pair of MLAG peer devices can be connected to different third-party devices to form different MLAGs, so one MLAG domain can hold multiple MLAGs.

Figure 2 shows an MLAG domain with multiple MLAGs, where Switch1, Switch2 and the MLAG member ports connected to Switch3 form MLAG1, and Switch1, Switch2 and the MLAG member ports connected to Switch4 form MLAG2.

Figure 2. Multiple MLAGs Network

Use the run show mlag domain command to view the MLAG domain information:

admin@Xorplus# run show mlag domain summary

Domain ID: 1    Domain MAC: 48:6E:73:FF:00:01    Node ID: 0
----------------------------------------------------------------------------------------------
Peer Link  Peer IP          Peer Vlan  Neighbor Status    Config Matched       MAC Synced  # of Links
---------  ---------------  ---------  ---------------    --------------      ----------  ----------
ae23       1.1.1.2          4088        ESTABLISHED         Yes                Yes          2

NOTE:

  • MLAG domain ID is required to be unique within the Layer 2 network.
  • The maximum number of MLAG interfaces/ports supported by the system is limited by the maximum number of LAGs supported by the switch. The maximum number of LAGs supported by each model is described in the command reference for set interface aggregate-ethernet <lag_name>: https://intranet.pica8.com/pages/viewpage.action?pageId=34538482.
  • MLAG domain MAC

Each MLAG domain is identified by its own domain ID, which should differ from that of every other MLAG domain. Once the domain ID is configured, both MLAG peer devices use it to derive a unique MLAG domain MAC address, defined as 48:6E:73:FF:00:<MLAG domain ID in hexadecimal>. For example, if the MLAG domain ID is 12, the corresponding MLAG domain MAC address is 48:6E:73:FF:00:0C.

The MLAG domain MAC address is identical on both MLAG peer devices. LACP uses it as part of the system ID, and STP uses it as part of the bridge ID, when communicating with other L2 devices. Use the run show mlag domain {<domain-id>| summary} command to show the MLAG domain information, which includes the MLAG domain MAC.
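The derivation rule above can be sketched in a few lines of Python. This is only an illustration of the 48:6E:73:FF:00:<hex> rule; the function name mlag_domain_mac is hypothetical, and the assumption that the domain ID fits in a single octet is ours, not a statement about the range PICOS accepts.

```python
def mlag_domain_mac(domain_id: int) -> str:
    """Build the MLAG domain MAC as 48:6E:73:FF:00:<domain ID in hex>."""
    # Assumption for this sketch: the domain ID fits in the last octet.
    # The range actually accepted by the switch may be narrower.
    if not 0 <= domain_id <= 0xFF:
        raise ValueError("domain ID must fit in one octet for this sketch")
    return f"48:6E:73:FF:00:{domain_id:02X}"


print(mlag_domain_mac(12))  # 48:6E:73:FF:00:0C
print(mlag_domain_mac(1))   # 48:6E:73:FF:00:01
```

The second call matches the Domain MAC shown for Domain ID 1 in the run show mlag domain summary output.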

  • MLAG peer

MLAG peer devices are a pair of switches on which the MLAG function is enabled; one is defined as MLAG Node 0 and the other as Node 1. Use the CLI command set protocols mlag domain <domain-id> node <0 | 1> to specify the Node ID of each MLAG peer device. If one of the MLAG peer devices is configured as Node 0, the other one should be configured as Node 1. Both nodes are active, providing reliable dual access to the network for the MLAG access device.

The two nodes function equally and are not distinguished as master or slave. In most application scenarios the two nodes behave identically, except in the following two cases:

  1. A single-homed port uses its original port ID on the Node 0 peer device; on Node 1, an offset of 1024 is added to the Port Index to form the new port ID.
  2. An MLAG member port uses its original port ID on the Node 0 peer device; on Node 1, an offset of 512 is added to the Link ID to form the new port ID.

The port ID information appears in the output of the LACP/STP related show commands and in BPDU packets.
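As a rough illustration of the two offset rules, the sketch below computes the port ID that a node would advertise. The function and constant names are hypothetical. The base value is taken to be the Port Index for a single-homed port and the Link ID for an MLAG member port; treating that same base as the "original port ID" on Node 0 is an assumption of this sketch.

```python
SINGLE_HOMED_OFFSET = 1024  # added to the Port Index on Node 1
MLAG_MEMBER_OFFSET = 512    # added to the Link ID on Node 1


def advertised_port_id(node_id: int, base_id: int, is_mlag_member: bool) -> int:
    """Return the port ID used in LACP/STP for a port on the given node.

    base_id is the Port Index (single-homed port) or the Link ID
    (MLAG member port). Node 0 keeps the base value unchanged.
    """
    if node_id not in (0, 1):
        raise ValueError("node ID must be 0 or 1")
    if node_id == 0:
        return base_id
    offset = MLAG_MEMBER_OFFSET if is_mlag_member else SINGLE_HOMED_OFFSET
    return base_id + offset


print(advertised_port_id(0, 5, is_mlag_member=False))  # 5
print(advertised_port_id(1, 5, is_mlag_member=False))  # 1029
print(advertised_port_id(1, 2, is_mlag_member=True))   # 514
```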

  • MLAG peer link

The MLAG peer link is the direct link between the MLAG peer devices. It carries part of the data traffic, MLAG state, and MLAG control plane messages. Use the set protocols mlag domain <domain-id> peer-ip <peer-ipv4-address> peer-link <peer-interface-name> command to configure the remote peer-link port IP and the local peer-link interface. The interfaces directly connected to the two ends of the peer link are the peer-link ports.

A dedicated VLAN, the MLAG peer VLAN, MUST be assigned to the peer-link interface. It carries only MLAG control plane messages, not data messages. The peer VLAN is always set to forwarding so that the MLAG peers can negotiate MLAG information. Use the following CLI command to configure the MLAG peer VLAN; the recommended value is 4088.

set protocols mlag domain <domain-id> peer-ip <peer-ipv4-address> peer-vlan <vlan-id>

If the peer link is down for any reason, MLAG control plane messages cannot be exchanged properly, causing the MLAG system to operate abnormally. In particular, when the peer link is down but both MLAG member ports are up, a split-brain failure occurs. The system cannot recover automatically from this scenario.

Therefore, to ensure the reliability of peer-link, note the following points when configuring and deploying peer link:

  1. Only one peer link connecting the two peer devices is allowed in an MLAG domain.
  2. When configuring the peer link, only one LAG port can be used as peer link.
  3. Use a LAG port with at least two directly connected physical ports to guarantee reliable communication between the peer devices over the peer link. No intermediate transmission device is allowed between the two peer devices on the peer link. All of the directly connected physical ports should be added to one LAG port to form the peer link. More than one L2 connection between MLAG peer switches is not supported.
  4. 10G or 40G ports should be used for the peer link so that enough bandwidth is available when the network is deployed.
  5. Any manual action to shut down the peer link is strictly forbidden.
  6. Any MLAG VLAN and non-MLAG VLAN traffic MUST be allowed on MLAG peer-link.
  7. When numerous rapid PVST+ instances are configured, BPDU traffic can exceed the default BPDU queue processing rate of the CPU, resulting in BPDU packet loss or network loops. To resolve this problem, use the following CoPP command to increase the maximum bandwidth of the BPDU queue. The default value is 80 pps.

   set class-of-service scheduler bpdu-scheduler max-bandwidth-pps <value>

  8. When numerous MLAG instances are configured, MLAG control traffic can exceed the default MLAG queue processing rate of the CPU, resulting in MLAG control packet loss. To resolve this problem, use the following CoPP commands to increase the maximum bandwidth of the MLAG and MLAG MAC SYNC queues. The default value is 80 pps.

   set class-of-service scheduler mlag-scheduler max-bandwidth-pps <value>

   set class-of-service scheduler mlag-mac-sync-scheduler max-bandwidth-pps <value>

NOTE:

When spanning tree protocol is enabled, the peer link port is always in forwarding state and won’t participate in the spanning tree calculation after peer link is established.

  • MLAG member port

An MLAG member port is a LAG port on an MLAG peer device that connects to the downstream device.

Usually, we configure MLAG member ports on the MLAG peer devices with the same LAG ID to form an MLAG. However, this is not required.

Each MLAG member port has to be bound to an MLAG link ID. The paired MLAG member ports of the same MLAG must be bound to the same MLAG link ID, and different MLAGs are identified by different link IDs. For example, if there are two MLAGs in an MLAG domain, link ID 1 could identify the first MLAG and link ID 2 the second.

After all the MLAG configurations are finished, the MLAG peer devices send MLAG control plane messages to each other to determine the MLAG pairs. Upon receiving an MLAG Control message from the peer device, the local device checks whether the link ID carried in the message matches a locally configured link ID. If the link ID configured on the two devices is the same, the two member ports are paired into an MLAG.

Use the run show mlag link command to show information about each MLAG and the status of its MLAG member ports.

admin@XorPlus# run show mlag link summary
# of Links: 2
Link   Local LAG   Link Status   Local Status   Peer Status   Config Matched   Flood
----   ---------   -----------   ------------   -----------   --------------   -----
1      ae1         IDLE          UP             UNKNOWN       No               No  
2      ae2         IDLE          UP             UNKNOWN       No               No

Figure 3. MLAG Member Port

Access devices that connect to the MLAG domain are required to support LAG. As shown in Figure 3, SwitchB has to configure a LAG interface to interconnect with the MLAG member ports. It is strongly recommended to use LACP when configuring the LAG interface.

MLAG State Machine

The MLAG state machine describes the state of the MLAG peer link and the MLAG member ports on the local device and the remote peer device. It facilitates link fault detection and recovery. The system defines two kinds of state: the MLAG neighbor state, which tracks the establishment of the peer link, and the MLAG interface state, which tracks each MLAG configured in the MLAG domain.

MLAG uses the TCP protocol for reliably transmitting the MLAG control messages between the two peer devices to exchange the MLAG state change. The system changes the state based on the local MLAG state and the received peer MLAG Control message. You can view the MLAG interface state, and MLAG neighbor state by using related show commands.

MLAG Neighbor State

MLAG neighbor state shows the global status of MLAG peer device and peer-link, including the following values:

  • IDLE: The initial state of the global neighbor state machine when MLAG peer-link is configured.
  • CONNECTING: The peer-link ports are up and the peer-link connection has been started. Both MLAG peers try to set up a TCP connection to each other.
  • ESTABLISHED: The peer-link connection between the MLAG peer devices is established, and the peer session and neighbor relationship are set up.

You can use the run show mlag domain {<domain-id>| summary} command to view the MLAG peer-link configuration information and the neighbor state. For example,

admin@Xorplus# run show mlag domain summary
Domain ID: 1    Domain MAC: 48:6E:73:FF:00:01    Node ID: 0
----------------------------------------------------------------------------------------------
Peer Link  Peer IP          Peer Vlan  Neighbor Status    Config Matched       MAC Synced  # of Links
---------  ---------------  ---------  ---------------    --------------      ----------  ----------
ae23       1.1.1.2          4088        ESTABLISHED         Yes                Yes          2

MLAG Interface State

MLAG interface state defines the status of peer link and MLAG member port, including the following values:

  • INIT: The initial state of MLAG. MLAG is disabled and no information is exchanged in this state.
  • IDLE: The peer link is configured and the MLAG device has initiated a TCP connection to its peer, but the peer-link session has not yet been established. The MLAG link state switches from INIT to IDLE.
  • DOWN: In this state, peer-link session is established, that is, the MLAG neighbor state is ESTABLISHED, but the MLAG member port is not configured on the MLAG peer device. If the local MLAG member port is down, then the MLAG interface state is DOWN.
  • STANDBY: In this state, peer-link session is established, that is, the MLAG neighbor state is ESTABLISHED, but the MLAG member port is not configured on the MLAG peer device. If the local MLAG member port is up, then the MLAG interface state is STANDBY.
  • AS_DOWN: In this state, peer-link session is established, that is, the MLAG neighbor state is ESTABLISHED. MLAG member ports are configured on both MLAG devices. If the MLAG member ports on both sides are down, the MLAG interface state is AS_DOWN.
  • AS_PEER: In this state, peer-link session is established, that is, the MLAG neighbor state is ESTABLISHED. MLAG member ports are configured on both MLAG devices. If the local MLAG member port is down but peer MLAG member port is up, then the MLAG interface state is AS_PEER.
  • AS_LOCAL: In this state, peer-link session is established, that is, the MLAG neighbor state is ESTABLISHED. MLAG member ports are configured on both MLAG devices. If the local MLAG member port is up but peer MLAG member port is down, then the MLAG interface state is AS_LOCAL.
  • FULL: Peer session is established and MLAG member ports on both peer devices are up.

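In brief, the states can be summarized in the following table, where Yes means the condition holds and - means it does not hold or does not apply:

MLAG Interface State   Peer link session   Peer MLAG member     Local MLAG member   Peer MLAG member
                       is established      port is configured   port is up          port is up
--------------------   -----------------   ------------------   -----------------   ----------------
INIT                   -                   -                    -                   -
IDLE                   -                   -                    -                   -
DOWN                   Yes                 -                    -                   -
STANDBY                Yes                 -                    Yes                 -
AS_DOWN                Yes                 Yes                  -                   -
AS_PEER                Yes                 Yes                  -                   Yes
AS_LOCAL               Yes                 Yes                  Yes                 -
FULL                   Yes                 Yes                  Yes                 Yes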
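The state definitions can also be read as a small decision procedure. The following Python sketch is one way to express them; it is not the PICOS implementation. The function name and the five boolean inputs are hypothetical, and mapping "peer link not configured" to INIT is an assumption based on the description of INIT as the state in which MLAG is disabled.

```python
def mlag_interface_state(peer_link_configured: bool,
                         session_established: bool,
                         peer_member_configured: bool,
                         local_member_up: bool,
                         peer_member_up: bool) -> str:
    """Derive the MLAG interface state from the conditions in the state list."""
    if not peer_link_configured:
        return "INIT"   # assumption: no peer link configured means MLAG is disabled
    if not session_established:
        return "IDLE"   # peer link configured, session not yet established
    if not peer_member_configured:
        # Only the local member port status matters in this case.
        return "STANDBY" if local_member_up else "DOWN"
    if local_member_up and peer_member_up:
        return "FULL"
    if local_member_up:
        return "AS_LOCAL"
    if peer_member_up:
        return "AS_PEER"
    return "AS_DOWN"


print(mlag_interface_state(True, False, False, True, False))  # IDLE
print(mlag_interface_state(True, True, True, True, False))    # AS_LOCAL
print(mlag_interface_state(True, True, True, True, True))     # FULL
```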

You can use the run show mlag link {<link-id>| summary} command to view the state of the MLAG interface. For example,

admin@XorPlus# run show mlag link summary
# of Links: 2
Link   Local LAG   Link Status   Local Status   Peer Status   Config Matched   Flood
----   ---------   -----------   ------------   -----------   --------------   -----
1      ae1         IDLE          UP             UNKNOWN       No               No  
2      ae2         IDLE          UP             UNKNOWN       No               No

In the output, Link Status shows the MLAG interface state, Local Status shows the status of local MLAG member port.

MLAG Control Plane Messages

MLAG uses control plane messages to transmit the following information between the MLAG peer devices:

  • MLAG state information.
  • Synchronization information (including STP information synchronization and multicast control information synchronization).
  • Configuration consistency check.

The MLAG control plane messages can be divided into two categories: L2 and TCP packets.

  • For L2 packets, the destination MAC is 01:80:C2:00:00:0F and EtherType is 0x6666.
  • For TCP packets, the destination port is 0xE290.
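When inspecting a packet capture, these identifiers are enough to pick out MLAG control plane traffic. The sketch below is a minimal illustration using only the values stated above; the helper names are hypothetical, and the inputs are assumed to be fields already extracted from a frame by whatever capture tool is in use.

```python
MLAG_L2_DST_MAC = "01:80:C2:00:00:0F"
MLAG_ETHERTYPE = 0x6666
MLAG_TCP_DST_PORT = 0xE290  # 58000 in decimal


def is_mlag_l2_frame(dst_mac: str, ethertype: int) -> bool:
    """True if an Ethernet frame matches the MLAG L2 control packet identifiers."""
    return dst_mac.upper() == MLAG_L2_DST_MAC and ethertype == MLAG_ETHERTYPE


def is_mlag_tcp_segment(dst_port: int) -> bool:
    """True if a TCP segment is addressed to the MLAG control port."""
    return dst_port == MLAG_TCP_DST_PORT


print(is_mlag_l2_frame("01:80:c2:00:00:0f", 0x6666))  # True
print(is_mlag_tcp_segment(58000))                     # True
```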

The common header of MLAG control plane messages has the following format:

Field     Description
-------   ---------------------------------------------------------------
Version   Specifies the MLAG version. Currently, the version is 0x1.
Type      Specifies the type of MLAG control plane message:
            • 0x1 indicates MLAG Control message.
            • 0x2 indicates MAC Sync message.
            • 0x3 indicates STP Sync message.
            • 0x5 indicates Multicast Control Sync message.
            • 0x6 indicates Configuration Consistency message.

The message types are described below; the MAC Sync message is covered in the MAC Synchronization section.

  • MLAG Control message

The MLAG Control message is used to maintain the MLAG status.

An MLAG device sends an MLAG Control message under the following conditions:

  1. MLAG neighbor state changes to ESTABLISHED.
  2. Any MLAG interface state changes.

MLAG Control messages are encapsulated and transmitted via TCP protocol.

  • STP Sync

The STP Sync message is used to sync up STP dynamic information, such as the calculated root priority and link cost from the received BPDUs to the peer switch. The STP Sync message is encapsulated and transmitted via TCP protocol.

  • Multicast Control Sync

The Multicast Control Sync message is used to sync up IGMP/PIM dynamic information from the received IGMP/PIM message to the peer switch. The Multicast Control Sync message is encapsulated and transmitted via TCP protocol.

IGMP sees the MLAG LAG link as a single logical link, so IGMP packets are synced between the MLAG peer devices through the peer link by Multicast Control Sync messages:

An IGMP packet received on an MLAG port by either of the MLAG peer switches is synced to the other peer switch through the peer link, and is handled there as if it had been received on the local MLAG port.

  • Configuration Consistency

The Configuration Consistency message is used to check the MLAG related configuration consistency between MLAG peers. The Configuration Consistency message is encapsulated and transmitted via TCP protocol.

An MLAG device sends a Configuration Consistency message under the following conditions:

  1. A new MLAG related configuration is committed.
  2. MLAG neighbor state changes to ESTABLISHED.
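For reference, the Type codes from the common header and the transport used by each message can be collected in one lookup. This Python sketch is illustrative only: the enum and helper names are hypothetical, and the L2 multicast transport for MAC Sync is taken from the MAC Synchronization section.

```python
from enum import IntEnum


class MlagMsgType(IntEnum):
    """Type field values of the MLAG control plane common header."""
    MLAG_CONTROL = 0x1
    MAC_SYNC = 0x2
    STP_SYNC = 0x3
    MULTICAST_CONTROL_SYNC = 0x5
    CONFIG_CONSISTENCY = 0x6


# Transport per message type, as stated for each message in this document.
TRANSPORT = {
    MlagMsgType.MLAG_CONTROL: "tcp",
    MlagMsgType.MAC_SYNC: "l2-multicast",
    MlagMsgType.STP_SYNC: "tcp",
    MlagMsgType.MULTICAST_CONTROL_SYNC: "tcp",
    MlagMsgType.CONFIG_CONSISTENCY: "tcp",
}


def describe(type_code: int) -> str:
    """Return a readable description for a Type field value."""
    try:
        msg = MlagMsgType(type_code)
    except ValueError:
        return f"unknown type 0x{type_code:X}"
    return f"{msg.name} via {TRANSPORT[msg]}"


print(describe(0x3))  # STP_SYNC via tcp
print(describe(0x4))  # unknown type 0x4
```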

MAC Synchronization

To ensure that traffic from the same user is forwarded correctly by both MLAG peer devices, the MAC address tables on the two devices need to be consistent with each other. This is accomplished by the MAC synchronization mechanism, which sends MAC synchronization messages in L2 multicast packets with destination address 01:80:C2:00:00:0F. An MD5 checksum is added to each message to ensure that the MAC address table is synchronized correctly.

Meanwhile, in order to control bandwidth consumption of the MLAG peer link caused by flooding of unknown unicast traffic, the MLAG peer switches should synchronize MAC address table with each other.

A MAC Sync message is sent only when both of the following conditions are satisfied:

  • MLAG neighbor state changes to ESTABLISHED.
  • There is a change in the MAC table.

There are three types of MAC addresses defined in MLAG: Static, Dynamic, and Peer-Sync. Peer-Sync represents a dynamic MAC address synchronized from the MLAG peer device, and its priority is lower than that of a static MAC. If one of the MLAG peer switches fails, the Peer-Sync MAC addresses on the other switch are deleted from the MAC address table.

Static and learned MAC addresses from any port except the peer-link port are synced to the MLAG peer switch through the peer link. The MLAG peer’s system MAC address, which is learned on the peer link, is internally configured as a static MAC address. Newly learned MAC addresses are synced to the peer switch immediately.

  • How to update the MAC table with synced MAC addresses:
    • The MAC addresses learned on a single-homed port are synced to the peer-link port of the peer switch through the peer link.
    • The MAC addresses learned on an MLAG member port are synced to the respective MLAG member port of the peer device through the MLAG peer link.
    • System MAC will be synchronized to the peer switch MLAG peer-link port as a static MAC address.
    • The MAC addresses learned on the peer link port are not synced.
  • How to define the type of the MAC addresses:
    • If a MAC address is not statically configured and is learned only on the local MLAG switch, it is marked as “Dynamic” on the local switch and “Peer-Sync” on the peer switch.
    • If a MAC address is not statically configured and is learned on both MLAG peer switches, it is marked as “Dynamic” on the Node 0 switch and “Peer-Sync” on the Node 1 switch.
    • A static MAC address has higher priority: it is not overridden by “Dynamic” or “Peer-Sync” entries, but it can override them.
    • Static MAC addresses are not synced automatically; they should be synced manually.
    • If a static MAC address bound to a single-homed port is configured on one MLAG device, the static MAC address entry should be manually configured on the peer switch and bound to the peer-link interface.
    • If a static MAC address bound to an MLAG member port is configured on one MLAG device, the static MAC address entry should be configured on the peer switch and bound to the MLAG member port.
  • How the MAC addresses age out:

If MAC addresses (Dynamic or Peer-Sync) age out or are cleared by a CLI command on one of the MLAG peer devices, the removal is synced to the peer switch and the entries are removed there as well.
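The rules for where a synced dynamic MAC address lands on the peer, and how the entry is typed on each node, can be sketched as two small functions. This is an illustrative model, not PICOS code: all names are hypothetical, the port role strings are our own labels, and the pairing of member ports is modeled as a mapping from MLAG link ID to the peer's LAG name.

```python
from typing import Dict, Optional, Tuple


def peer_install_port(learned_on: str,
                      link_id: Optional[int],
                      peer_member_ports: Dict[int, str],
                      peer_link_port_on_peer: str) -> Optional[str]:
    """Return the interface on the peer switch that receives a synced MAC.

    learned_on is one of "single-homed", "mlag-member" or "peer-link".
    Returns None when the address is not synced.
    """
    if learned_on == "peer-link":
        return None  # addresses learned on the peer link are not synced
    if learned_on == "single-homed":
        return peer_link_port_on_peer
    if learned_on == "mlag-member":
        if link_id is None:
            raise ValueError("an MLAG member port needs a link ID")
        return peer_member_ports[link_id]  # the paired member port
    raise ValueError(f"unknown port role: {learned_on}")


def entry_types(learned_on_node0: bool, learned_on_node1: bool) -> Tuple[str, str]:
    """Return the (Node 0, Node 1) entry types for a non-static MAC address."""
    if learned_on_node0:
        # Covers "only Node 0" and "both nodes": Node 0 keeps Dynamic.
        return ("Dynamic", "Peer-Sync")
    if learned_on_node1:
        return ("Peer-Sync", "Dynamic")
    raise ValueError("address was not learned on either node")


members = {1: "ae1"}  # assumed: link ID 1 is bound to ae1 on the peer
print(peer_install_port("single-homed", None, members, "ae3"))  # ae3
print(peer_install_port("mlag-member", 1, members, "ae3"))      # ae1
print(entry_types(True, True))                                  # ('Dynamic', 'Peer-Sync')
```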

You can use the run show mac-address table command to view the information about MAC address table, such as MAC address statistics, VLAN ID, MAC address, MAC address type and outbound interface.

For example,

Figure 4. A MAC Sync Example

When showing the MAC table on Switch A and Switch B, we can see that the dynamic MAC entry learned from the MLAG member port will be synchronized to the corresponding MLAG member port on the peer device, and dynamic MAC learned from the single-homed port will be synchronized to the peer-link port on the peer device.

admin@SwitchA# run show mac-address table
Total entries in switching table:   3
Static entries in switching table:  0
Dynamic entries in switching table: 3 

VLAN      MAC address           Type         Age      Interfaces         User
----      -----------------     ---------    ----     ----------------   ------
1         08:9e:01:61:64:13     Dynamic      300      ge-1/1/2           xorp
1         cc:37:ab:4f:ad:01     Peer-Sync    300      ae1                xorp
4088      8c:ea:1b:88:5b:81     Static       300      ae3                xorp 

admin@SwitchB# run show mac-address table
Total entries in switching table:   3
Static entries in switching table:  0
Dynamic entries in switching table: 3

VLAN      MAC address           Type         Age     Interfaces         User
----      -----------------     ---------    ----    ----------------   ------
1         08:9e:01:61:64:13     Peer-Sync    300     ae3                xorp
1         cc:37:ab:4f:ad:01     Dynamic      300     ae1                xorp
4088      8c:ea:1b:88:5b:82     Static       300     ae3               xorp

When VXLAN is deployed in an MLAG domain, MAC sync between MLAG peer devices is different.

As shown in the following figure, the switches on the access side, SwitchC and SwitchD, are dual-homed to an MLAG domain. At the same time, a VXLAN tunnel is established between MLAG peer device SwitchA and SwitchB, so that Layer 2 devices on the access side can communicate over Layer 3 networks.

Figure 5. MLAG Topology with VXLAN

In this application, MAC synchronization works as follows:

  • The MAC addresses learned on the local network side port (Ge-1/1/3) are synced to the respective network side port (Ge-1/1/3) of the peer device through the MLAG peer link, as the system regards the network side ports on SwitchA and SwitchB as the same port.
  • The MAC addresses learned on the local access side port, the MLAG member port in the figure, are synced to the respective access side port of the MLAG peer device through the MLAG peer link.
  • MAC addresses learned on the single-homed port will not be synced to the peer-link port of the remote MLAG Node except in the following two cases.
    • When MLAG interface state is AS_LOCAL, the MAC addresses learned on the local access side port, the MLAG member port in the figure, are synced to the peer-link port of the MLAG peer device.
    • If the network port on one VTEP device goes down, the corresponding network port on the peer VTEP is treated as a single-homed port, and the MAC addresses learned on it are synchronized to the peer-link port of the MLAG peer device.

When showing the MAC address table, the value in Interfaces column is vxlan for MAC addresses synced from the peer VTEP device. For example,

admin@Xorplus# run show mac-address table
Total entries in switching table:   3909
Static entries in switching table:  6
Dynamic entries in switching table: 3903 
VLAN      MAC address          Type         Age     Interfaces         User
----      -----------------    ---------   ----   ----------------   ----------
1         20:04:0f:0f:49:d1    Dynamic      300     ae2                xorp 
N/A       00:00:0a:11:11:11    Peer-Sync    300     vxlan              xorp    
N/A       00:00:0a:11:11:12    Peer-Sync    300     vxlan              xorp  

You can also use the run show vxlan address-table command to show the VXLAN MAC table. For example,

admin@Xorplus# run show vxlan address-table
VNID           MAC address          Type       Interface      VTEP
----------------------------------------------------------------------------------------
10000          00:00:0a:11:11:11    Sync                    145.145.145.145
10000          00:00:0a:11:11:12    Sync                    145.145.145.145
10000          20:04:0f:0f:49:d1    Sync         ae2  

The first two lines in the display result show the MAC addresses synced from the VXLAN network side port of the peer VTEP device, and the third line shows the MAC address synced from the access side port of the peer VTEP device.

Configuration Consistency Check

To ensure that the MLAG peer devices appear as one device to the downstream device, and to make the MLAG function operate normally and smoothly, the configuration on each MLAG peer device needs to be consistent.

PICOS automatically checks the configuration consistency of the MLAG peer devices by exchanging Configuration Consistency messages.

An MLAG device sends a Configuration Consistency message under the following conditions:

  • MLAG neighbor is established.
  • An MLAG related new configuration is committed.

Configuration consistency check is divided into two types: Global configuration and Per MLAG configuration.

Global configuration refers to the global configuration of the MLAG module, STP module, DHCP snooping module and the IGMP snooping module. The configuration inconsistency affects the overall establishment and operation of the MLAG domain, the peer-link establishment, and the entire network topology. Per MLAG configuration is mainly for the configuration of a single MLAG. Inconsistent configuration only affects the establishment and operation of a single MLAG, but does not affect other MLAGs.

Table 1 shows the MLAG configuration consistency check list.

Table 1. MLAG configuration consistency check list

Configuration to Check                               Type
--------------------------------------------------   ---------------
MLAG domain ID                                       Global
MLAG node ID                                         Global
MLAG link ID                                         Global
MLAG link count                                      Global
MLAG peer VLAN                                       Global
DHCP snooping enable or disable                      Global
VLAN-based DHCP snooping enable or disable           Global
IGMP snooping enable or disable                      Global
VLAN-based IGMP snooping enable or disable           Global
STP enable or disable                                Global
STP mode                                             Global
STP MST region to VLAN mapping                       Global
STP global settings (BPDU filter, BPDU Guard,        Global/Per MLAG
Edge, Root Guard, TCN Guard, bridge priority,
interface, Max age, Hello time, Forward delay)
Port LAG mode (LACP/Static)                          Per MLAG
Port settings (MTU, Native VLAN, VLAN member)        Per MLAG
Port mode (Access/Trunk)                             Per MLAG
MAC learning enable or disable                       Per MLAG

MLAG will not function normally as long as there is inconsistency in any one of the following configuration items:

  • Global configuration: Domain ID, Node ID, Peer VLAN, Link Count, Link ID
  • Per MLAG configuration: MTU, MAC Learning, LAG Mode, Native VLAN, Port VLAN Mode

In that case, PICOS clears all the MAC address entries on both MLAG peer devices, and MAC synchronization is not performed between the MLAG peers until the configuration is modified to be consistent.
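The comparison behind a PASS or FAIL result can be pictured with a short Python sketch. It is a simplified model under stated assumptions, not the PICOS algorithm: the function names and the dictionary layout are hypothetical, every property is compared for equality, and Node ID is treated as the one exception that passes when the two peers hold 0 and 1, as in the sample summary output.

```python
from typing import Any, Dict


def check_consistency(local: Dict[str, Any], peer: Dict[str, Any]) -> Dict[str, str]:
    """Compare two configuration snapshots property by property."""
    results = {}
    for prop in sorted(set(local) | set(peer)):
        if prop == "Node ID":
            # The peers must be Node 0 and Node 1, so the values must differ.
            ok = {local.get(prop), peer.get(prop)} == {0, 1}
        else:
            ok = prop in local and prop in peer and local[prop] == peer[prop]
        results[prop] = "PASS" if ok else "FAIL"
    return results


def overall(results: Dict[str, str]) -> str:
    """Overall result is PASS only if every property passed."""
    return "PASS" if all(r == "PASS" for r in results.values()) else "FAIL"


local_cfg = {"Domain ID": 1, "Node ID": 1, "Peer VLAN": 4088,
             "Link Count": 3, "Link IDs": (1, 2, 3)}
peer_cfg = {"Domain ID": 1, "Node ID": 0, "Peer VLAN": 4088,
            "Link Count": 3, "Link IDs": (1, 2, 3)}

print(overall(check_consistency(local_cfg, peer_cfg)))  # PASS

peer_cfg["Peer VLAN"] = 4000
print(check_consistency(local_cfg, peer_cfg)["Peer VLAN"])  # FAIL
```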

To ensure that the configuration parameters are consistent, we recommend that you run the MLAG consistency check command to display the configurations and the consistency check results for each MLAG peer device once you set a new MLAG related configuration.

You can use the run show mlag consistency-parameter {link <link-id>| summary} command to view the consistency check results. For example:

admin@Xorplus# run show mlag consistency-parameter link 3
Port Configurations:
-----------------------------------------------------------------
Property                 Local Value      Peer Value       Result
-----------------------  ---------------  ---------------  ------
MTU                      1514             1514             PASS 
Mac Learning             Yes              Yes              PASS 
Lag Mode                 LACP             LACP             PASS 
Native Vlan              3                3                PASS 
Port Vlan Mode           Access           Access           PASS 

Spanning-Tree Configurations:
-----------------------------------------------------------------
Property                 Local Value      Peer Value       Result
-----------------------  ---------------  ---------------  ------
Mode                     PVST             PVST             PASS 
BPDU Guard               No               No               PASS 
Root Guard               No               No               PASS 
Manual Forwarding        No               No               PASS 
Link Type                P2P              P2P              PASS 
Instance Count           1                1                PASS 
Instance Vlan 3                                                 
-- Port Priority         128              128              PASS 
-- Path Cost             0                0                PASS
 
admin@Xorplus# run show mlag consistency-parameter summary
Overall : PASS
--------------
Global  : PASS
Link 1  : PASS
Link 2  : PASS
Link 3  : PASS
 
MLAG Configurations:
-----------------------------------------------------------------
Property                 Local Value      Peer Value       Result
-----------------------  ---------------  ---------------  ------
Domain ID                1                1                PASS 
Node ID                  1                0                PASS 
Peer VLAN                4088             4088             PASS 
Link Count               3                3                PASS 
Link IDs                 1   2   3        1   2   3        PASS               
        
Spanning-Tree Configurations:
-----------------------------------------------------------------
Property                 Local Value      Peer Value       Result
-----------------------  ---------------  ---------------  ------
Enable                   Yes              Yes              PASS 
Mode                     PVST             PVST             PASS
Instance Count           65               65               PASS 
Instance Vlan 1
-- Bridge Priority       32768            32768            PASS
-- Hello Time            2                2                PASS 
-- Forward Delay         15               15               PASS 
-- Max Age               20               20               PASS 
Instance Vlan 2                                                
-- Bridge Priority       32768            32768            PASS 
-- Hello Time            2                2                PASS 
-- Forward Delay         15               15               PASS 
-- Max Age               20               20               PASS 
 
DHCP Snooping Configurations:
-----------------------------------------------------------------
Property                 Local Value      Peer Value       Result
-----------------------  ---------------  ---------------  ------
Enable                   No               No               PASS
 
IGMP Snooping Configurations:
-----------------------------------------------------------------
Property                 Local Value      Peer Value       Result
-----------------------  ---------------  ---------------  ------
Enable                   No               No               PASS

The Result column in the command output shows the outcome of each consistency check. The value is either PASS or FAIL:

  • If Result is PASS, the configurations of the two MLAG peer devices are consistent.
  • If Result is FAIL, the configurations of the two MLAG peer devices are inconsistent.

NOTE:

  • Inconsistent configuration may cause MLAG to run abnormally.
  • After an inconsistent configuration has been corrected so that both peers are consistent, you need to restart the MLAG peer devices to ensure that MLAG functions normally.
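
When the output is long, the FAIL rows can be picked out with a small script. The following Python sketch is illustrative only and is not part of PICOS. It assumes the column layout shown in the samples above (a ruler row of four dash groups under each table header), and the function name find_failures and the sample values are hypothetical.

```python
import re

# A column ruler row consists of four groups of dashes separated by spaces.
RULER = re.compile(r"\s*-+(?:\s+-+){3}")

def find_failures(output):
    """Return (section, property, local value, peer value) for each FAIL row."""
    failures = []
    section = ""
    starts = None  # column start offsets taken from the latest ruler row
    for line in output.splitlines():
        text = line.rstrip()
        if text.endswith("Configurations:"):
            section = text.strip().rstrip(":")
            starts = None  # wait for this section's ruler row
            continue
        if RULER.fullmatch(text):
            starts = [m.start() for m in re.finditer(r"-+", text)]
            continue
        if starts is None or not text.strip():
            continue
        # Slice the row at the column offsets given by the ruler.
        bounds = starts + [len(text)]
        cells = [text[bounds[i]:bounds[i + 1]].strip() for i in range(4)]
        prop, local, peer, result = cells
        if result == "FAIL":
            failures.append((section, prop.lstrip("- "), local, peer))
    return failures

# Hypothetical output containing two mismatches.
SAMPLE = """\
MLAG Configurations:
-----------------------------------------------------------------
Property                 Local Value      Peer Value       Result
-----------------------  ---------------  ---------------  ------
Domain ID                1                1                PASS
Peer VLAN                4088             4087             FAIL

Spanning-Tree Configurations:
-----------------------------------------------------------------
Property                 Local Value      Peer Value       Result
-----------------------  ---------------  ---------------  ------
Mode                     PVST             PVST             PASS
-- Hello Time            2                4                FAIL
"""

for section, prop, local, peer in find_failures(SAMPLE):
    print(f"{section}: {prop} (local={local}, peer={peer})")
```

Slicing at the ruler offsets rather than splitting on whitespace keeps multi-value cells such as Link IDs in one piece.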

Single-homed Port

A single-homed port is a port on an MLAG peer device through which an access device connects to the network via only one of the peers, either MLAG Node 0 or Node 1. A single-homed port can connect to a host or server, or to another access switch. As shown in Figure 6, Switch 1 and Switch 3 are single-homed devices, and the ports on the MLAG peer devices that connect to Switch 1 and Switch 3 are called single-homed ports. Traffic between Switch 1 and Switch 3 always crosses the MLAG peer-link, because the two switches attach to different MLAG peer devices. Single-homed ports allow servers and other standalone switches to single-home into the network.

Figure 6. MLAG network


MAC address entries learned on a single-homed port are synchronized to the peer-link port on the MLAG peer device, where they appear in the MAC address table with the address type Peer-Sync. This synchronization takes place only while the MLAG neighbor state is ESTABLISHED. It ensures that the devices connected to the single-homed port can communicate normally.

A physical port or a LAG port can be a single-homed port. An MLAG member port can also become a single-homed port under the condition described below. The peer-link port, however, cannot be a single-homed port.

An MLAG member port becomes a single-homed port when one LAG port of the dual-homed access device goes down: the other LAG port then acts as a single-homed port. In other words, when the MLAG interface state is AS_LOCAL, the MLAG member port on the local MLAG device is a single-homed port. MAC address entries learned on this port are synchronized to the peer-link port on the MLAG peer device.

NOTE:

To make the single-homed port work normally, the peer-link ports should be added into the VLAN of the single-homed port.

Application Scenarios

As shown in Figure 7, PC 2 connects to the MLAG downlink switch (Switch 2), and communicates with PC 1 through the MLAG peer devices.

Figure 7. Network 1 of PC 1 and PC 2 Communication in MLAG Topology

Normally, the traffic from PC 1 to PC 2 will go out through Port 1 to Switch 2. Any packet received from peer-link on MLAG Node 1 device will be blocked on all MLAG member ports.

Traffic sent from PC 2 to PC 1 is hashed to one of the MLAG peer devices. If the traffic is hashed to MLAG Node 1, the Node 1 device forwards it across the peer-link to MLAG Node 0, because the MAC address learned on the single-homed port is synchronized to the peer-link port (Port 5) on the MLAG Node 1 device.

When the topology changes, as shown in Figure 8, PC 2 changes location and accesses the network through Port 3 on MLAG Node 1. The MAC address of PC 2 is then learned on Port 3 of the MLAG Node 1 device. Since Port 3 is a single-homed port, the MAC address entry learned on Port 3 is synchronized to the peer-link port (Port 4) on the MLAG Node 0 device. Traffic sent from PC 1 to PC 2 now leaves the MLAG Node 0 device through Port 4 instead of Port 1 and reaches the Node 1 device over the peer-link.

Similarly, as Port 3 is a single-homed port, MAC addresses of single-homed hosts connected to the MLAG Node 1 device will automatically be learned by MLAG Node 0 device. The traffic flow path from PC 2 to PC 1 is similar. This ensures that the devices connected to the single-homed port can communicate normally.

 Figure 8. Network 2 of PC 1 and PC 2 Communication in MLAG Topology
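
The synchronization described for Figure 8 can be modeled in a few lines of Python. This is a simplified sketch, not the device implementation: the MlagNode class, the function name, the "Dynamic" label for a locally learned entry, and the MAC address of PC 2 are all assumptions made for illustration. Only the port numbers, the Peer-Sync address type, and the ESTABLISHED condition come from the text above.

```python
from dataclasses import dataclass, field

@dataclass
class MlagNode:
    name: str
    peer_link_port: str
    # MAC table: MAC address -> (port, address type)
    mac_table: dict = field(default_factory=dict)

def learn_on_single_homed_port(local, peer, mac, port, neighbor_state):
    """Learn a MAC locally and, if the neighbor is ESTABLISHED, sync it to the peer."""
    # "Dynamic" is a placeholder label for a locally learned entry (assumption).
    local.mac_table[mac] = (port, "Dynamic")
    if neighbor_state == "ESTABLISHED":
        # The peer installs the entry on its own peer-link port as Peer-Sync.
        peer.mac_table[mac] = (peer.peer_link_port, "Peer-Sync")

node0 = MlagNode("Node 0", peer_link_port="Port 4")
node1 = MlagNode("Node 1", peer_link_port="Port 5")
PC2_MAC = "00:00:5e:00:53:02"  # hypothetical address for PC 2

# PC 2 moves behind the single-homed Port 3 on Node 1.
learn_on_single_homed_port(node1, node0, PC2_MAC, "Port 3", "ESTABLISHED")
print(node1.mac_table[PC2_MAC])  # ('Port 3', 'Dynamic')
print(node0.mac_table[PC2_MAC])  # ('Port 4', 'Peer-Sync')
```

Node 0 therefore sends traffic for PC 2 out of Port 4, its peer-link port, which matches the forwarding path described above.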


For IP routing, as shown in Figure 9, PC1 and PC2 belong to different subnets. In this scenario, you can apply VRRP in the MLAG topology so that PC1 and PC2 communicate with each other through IP routing. Configure two VRRP groups on the two devices that form the VRRP groups, each group on a different L3 VLAN interface, and configure a different virtual IP address for each group: virtual IP address 10.10.10.1 is used as the gateway for PC1, and virtual IP address 20.20.20.1 is used as the gateway for PC2.

Figure 9. Network 3 of PC 1 and PC 2 Communication in MLAG with VRRP Topology

If VXLAN is deployed in an MLAG domain, MAC addresses learned on the single-homed port will not be synced to the peer-link port of the remote MLAG Node except in the following two cases.

  • When the MLAG interface state is AS_LOCAL, the MAC addresses learned on the local access-side port (the MLAG member port in the figure) are synchronized to the peer-link port of the MLAG peer device.
  • If the network port on one VTEP device goes down, the corresponding network port on the peer VTEP is regarded as a single port, and the MAC addresses learned on this single port are synchronized to the peer-link port of the MLAG peer device.

Flood Control

To prevent downstream switches from receiving duplicate copies of a packet from both MLAG peers, a block mask stops all traffic received on the MLAG peer-link from being forwarded to the MLAG member ports. This mechanism is called flood control.

As shown in Figure 10, the peer-link is normally not used to forward data traffic: unicast traffic arriving at an MLAG peer device from the access device or from the network side is forwarded locally. When traffic is received from the peer-link, the MLAG member ports apply flood control through a forwarding block mask. That is, traffic received on the peer-link port is not forwarded out of the MLAG member ports, which prevents loops in the MLAG network.

The forwarding block mask for a given MLAG is cleared if the MLAG member port goes down on the MLAG peer.

Figure 10. MLAG Flood Control


1.   Unknown unicast, multicast, or broadcast traffic received on an MLAG member port is flooded to all other ports in the MLAG VLAN, including the peer-link port.

2.   Any packet (unicast, multicast, or broadcast) received from the peer-link is not forwarded out of the MLAG member ports.

Use the run show mlag link {<link-id>| summary} command to view the status of flood control. For example:

admin@XorPlus# run show mlag link summary
# of Links: 2
Link   Local LAG   Link Status   Local Status   Peer-Status   Config Matched   Flood
----   ---------   -----------   ------------   -----------   --------------   -----
1      ae1         IDLE          UP             UNKNOWN       No               No  
2      ae2         IDLE          UP             UNKNOWN       No               No

In the output, Link Status indicates the MLAG interface state. Flood is normally No, which indicates that all traffic received on the MLAG peer-link port is blocked from all MLAG member ports on the MLAG peer device, except DHCP Offer/Ack packets.

The one exception is when the peer MLAG member port is down: the MLAG interface state changes to AS_LOCAL and Flood changes to Yes, indicating that traffic received on the MLAG peer-link can be forwarded out of the MLAG member port.

NOTE:

All the packets received from peer-link shall be blocked to all MLAG member ports except the DHCP Offer/Ack packets.
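
The flood control rules above can be summarized as a forwarding decision. The Python sketch below is a simplified model for illustration, not the hardware block mask itself. The function name and parameters are hypothetical, and the port names (ae1 and ae2 as MLAG member ports, ae3 as the peer-link, ge-1/1/2 as a non-MLAG port) are example values.

```python
def flood_egress_ports(ingress, vlan_ports, peer_link, member_flood,
                       dhcp_offer_ack=False):
    """Return the ports a flooded frame is sent to on one MLAG peer device.

    vlan_ports    ports in the frame's VLAN
    member_flood  MLAG member port -> Flood flag (Yes when the state is AS_LOCAL)
    """
    egress = []
    for port in vlan_ports:
        if port == ingress:
            continue  # a frame is never sent back out of its ingress port
        blocked = (
            ingress == peer_link          # frame arrived over the peer-link
            and port in member_flood      # candidate is an MLAG member port
            and not member_flood[port]    # Flood is No: block mask applies
            and not dhcp_offer_ack        # DHCP Offer/Ack is exempt
        )
        if not blocked:
            egress.append(port)
    return egress

ports = ["ae1", "ae2", "ae3", "ge-1/1/2"]
flood = {"ae1": False, "ae2": False}

# Received on a member port: flooded everywhere else, including the peer-link.
print(flood_egress_ports("ae1", ports, "ae3", flood))  # ['ae2', 'ae3', 'ge-1/1/2']
# Received on the peer-link: member ports are masked out.
print(flood_egress_ports("ae3", ports, "ae3", flood))  # ['ge-1/1/2']
```

Setting the Flood flag of a member port to True models the AS_LOCAL case, in which peer-link traffic is again allowed out of that port.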

Multi-layer MLAG Application Networking

A two-layer MLAG network is shown in Figure 11. Access devices are dual-homed to the network through the lower MLAG peer devices at the access layer, and the upper MLAG peer devices serve as the active-active gateway at the aggregation layer. In the two-layer MLAG topology, the MLAG member ports of the MLAG peer devices in the same MLAG domain MUST belong to the same MLAG, that is, they must be configured with the same link ID.

NOTE:

Different pairs of MLAG peer devices must use different domain IDs. In the topology below, the domain ID of the aggregation-layer MLAG peer switches must differ from the domain ID of the access-layer MLAG peer switches.

Figure 11. Multi-layer MLAG Application Networking Diagram

Compared with one-layer MLAG, multi-layer MLAG has the following advantages:

  • Expands the Layer 2 range.
  • Provides a highly flexible architecture. In multi-layer MLAG, both the access devices and the access-layer switches are dual-homed to the network, which increases network reliability.
  • Provides greater network bandwidth from the access layer to the aggregation layer.

Interoperability with Other Features

LACP

LAG (Link Aggregation Group) is a way of binding multiple physical links into a combined logical link. MLAG domain MAC address will be used for LACP negotiation when performing link aggregation with the downlink access device.

We recommend that you enable LACP on the interfaces of each link aggregation group when configuring the peer-link port and the LAG ports connected to the downlink access devices. LACP makes it easier to detect incompatibilities between devices and link failures, and it reacts dynamically to configuration changes and link failures.

Rapid PVST+

MLAG itself has an anti-loop feature for MLAG member ports, but for non-MLAG ports, the possibility of a loop still exists in the network. There could be a number of networking scenarios that could lead to loops in the network, so to avoid unexpected loops forming in the network, it is strongly recommended to enable Rapid PVST+ protocol on all devices in the MLAG domain.

The two MLAG peer switches are seen as a single device by the PVST+ instance. The Rapid PVST+ configuration should be identical on both MLAG peer devices, including whether Rapid PVST+ is enabled, the Rapid PVST+ mode, and the Rapid PVST+ parameters (such as bridge priority, hello time, and forward delay). See Table 1 in section 1.1.6 Configuration Consistency Check for the MLAG configuration consistency check list.

After the rapid PVST+ protocol is enabled, MLAG peer devices will automatically send STP Sync messages, which are used to sync up rapid PVST+ dynamic information from the received BPDUs to the peer switch. Once the peer-link is successfully established, the two peers are virtualized into one device to perform port role calculation and fast convergence calculation by using the rapid PVST+ protocol. The peer link port is always in forwarding state and does not participate in the spanning tree calculation.

NOTE: To avoid network loops, it is strongly recommended to enable rapid PVST+ on all VLANs carried on the peer-link port, including the MLAG peer VLAN.

An offset is added to the port index that is encapsulated in the BPDUs, as shown in the following table:

               MLAG Port Index       Non-MLAG Port Index
-----------    ------------------    -----------------------
MLAG Node 0    512 + MLAG Link ID    Local Port Index
MLAG Node 1    512 + MLAG Link ID    1024 + Local Port Index
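
The offsets in the table above can be written as a small function. This Python sketch only restates the table; the function and constant names are hypothetical.

```python
MLAG_PORT_OFFSET = 512      # added to the MLAG link ID on both nodes
NODE1_LOCAL_OFFSET = 1024   # added to non-MLAG port indexes on Node 1 only

def bpdu_port_index(node_id, *, mlag_link_id=None, local_port_index=None):
    """Return the port index carried in BPDUs for an MLAG or non-MLAG port."""
    if node_id not in (0, 1):
        raise ValueError("node_id must be 0 or 1")
    if mlag_link_id is not None:
        # MLAG ports use the same index on both nodes.
        return MLAG_PORT_OFFSET + mlag_link_id
    if local_port_index is None:
        raise ValueError("give either mlag_link_id or local_port_index")
    if node_id == 0:
        return local_port_index
    return NODE1_LOCAL_OFFSET + local_port_index

print(bpdu_port_index(0, mlag_link_id=1))      # 513
print(bpdu_port_index(1, mlag_link_id=1))      # 513
print(bpdu_port_index(0, local_port_index=5))  # 5
print(bpdu_port_index(1, local_port_index=5))  # 1029
```

Because both nodes advertise the same index for a given MLAG link, the pair presents that link as one port of a single bridge. The Node 1 offset presumably keeps its non-MLAG port indexes from colliding with those of Node 0.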

Supported Root Bridge Topologies

Root bridge could be any bridge in the L2 domain, connected through the following three types of link:

1.  Non-MLAG link, i.e. single-homed connection to either one or both of the peer nodes.

In the figure above, Switch D is single-homed to MLAG Node 0 device, suppose Switch D is the root bridge, then ge-1/1/2 port is the root port of MLAG Node 0, and the peer-link port ae3 is the implicit root port of MLAG node 1 device.

In the figure above, Switch D is single-homed to both MLAG Node 0 and Node 1 devices. Suppose Switch D is the root bridge, in the upper side of the topology, one of the single-homed ports will be blocked after spanning tree calculation as there is a loop in the topology.

2.  MLAG link

In the figure above, Switch D is dual-homed to MLAG peer devices, suppose Switch D is the root bridge, LAG port ae1 is the root port of MLAG Node 0. After STP synchronization between MLAG peer devices, LAG port ae1 becomes the root port of MLAG node 1 device.

If one of the MLAG member ports goes down, the role of the paired MLAG member port remains unchanged in the MLAG domain.

For example,

In the figure above, if the MLAG member port ae1 on MLAG Node 0 goes down, the role of the paired MLAG member port in the MLAG domain is still the root port.

3.  MLAG nodes as the root bridge

In the figure above, Switch C and Switch D are dual-homed to the MLAG peer devices, and the MLAG peer devices function as the root bridge. One of the MLAG peer devices transmits configuration BPDUs.

Configuration recommendation:

  • For a small scale network, it is recommended to disable spanning tree protocol to reduce the network convergence delay caused by spanning tree calculation.
  • If the network topology changes, MAC address table will be cleared.

DHCP Snooping

A device in an MLAG topology can enable DHCP snooping function, to send DHCP requests from users (DHCP clients) to legitimate DHCP servers through trusted ports. The device then generates a DHCP snooping binding table according to the DHCP ACK packet information returned by the DHCP server. When the devices receive subsequent DHCP packets from the clients through the DHCP snooping-enabled interface, they will check the DHCP snooping binding table to effectively prevent attacks from illegal clients.

Configuration notes:

  • DHCP snooping configuration should be identical on both MLAG peer devices.
  • The peer-link port should be configured as a trust port as needed.
  • The ports toward the DHCP server should be configured as trust ports on every network device between the DHCP client and the server, including the MLAG peer devices.

IGMP Snooping

IGMP snooping can be deployed in an MLAG topology to shield hosts on a local network from receiving traffic for a multicast group they have not explicitly joined. The multicast traffic is forwarded according to the L2 multicast forwarding table which is generated by IGMP snooping. If there is no matching entry in the L2 multicast forwarding table, the multicast traffic is forwarded to the mrouter ports except the one the multicast traffic is received on.
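
The forwarding rule in the preceding paragraph can be sketched as follows. This Python model is an illustration only: the function name, the group addresses, and the port names are hypothetical, and excluding the ingress port for matched entries as well is an assumption based on ordinary Layer 2 forwarding behavior.

```python
def multicast_egress(group, ingress, forwarding_table, mrouter_ports):
    """Return the ports a multicast frame for `group` is forwarded to."""
    member_ports = forwarding_table.get(group)
    if member_ports is None:
        # No matching entry: forward to the mrouter ports only.
        candidates = mrouter_ports
    else:
        candidates = member_ports
    # The ingress port is always excluded (assumed for matched entries as well).
    return [port for port in candidates if port != ingress]

table = {"239.1.1.1": ["ae1", "ge-1/1/5"]}  # hypothetical group and ports
mrouters = ["ge-1/1/1", "ge-1/1/2"]

print(multicast_egress("239.1.1.1", "ge-1/1/1", table, mrouters))  # ['ae1', 'ge-1/1/5']
print(multicast_egress("239.2.2.2", "ge-1/1/1", table, mrouters))  # ['ge-1/1/2']
```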

IGMP sees the MLAG LAG link as a single logical link, so IGMP packets are synced between the MLAG peer devices through peer-link by Multicast Control Sync message:

IGMP packet that is received on either MLAG peer switch from their MLAG port is synced to the other peer device through peer link as if it is received by local MLAG port.

Configuration notes:

  • IGMP snooping configuration should be identical on both MLAG peer devices.
  • Generally, it is not necessary to configure mrouter port on the peer link.

VXLAN

Implementing VXLAN on the MLAG peer devices provides an overlay network on top of existing Layer 2 and Layer 3 technologies to support elastic compute architectures. This makes it easier for network engineers to scale out a cloud computing environment while logically isolating cloud applications and tenants.

The two MLAG peer switches are seen as a single device in the VXLAN network. VXLAN configurations should be identical on each VTEP device. This includes the VNI value, VLAN included in the same VNI and all the configurations on the VXLAN network side ports and access side ports.

To avoid duplicate packets being sent from the MLAG and VXLAN networks, traffic from the peer link is not forwarded on the VXLAN network-side ports or access-side ports, unless the uplink or downlink of the peer VTEP device fails.

When VXLAN is deployed in MLAG domain, pay attention to the following notes:

  • When deploying VXLAN, the peer link should not be the outgoing interface of the VXLAN tunnel for routing VXLAN traffic; otherwise, packet loss may occur.
  • In an MLAG network with VXLAN, mapping a VNI to a port with the set vxlans vni <text> interface <port> command is not supported, because it may cause the MLAG network to flap.
  • Mapping untagged packets to a VNI through the PVID of a port with the set vxlans vni <text> interface <port> vlan <vlan-id> command is not supported.
  • In an MLAG network with VXLAN, configuring static MAC address entries on VXLAN tunnel interfaces is not supported, so the set vxlans vni <text> flood vtep <ipv4-addr> mac-address <macaddr> command has no effect.
  • Traffic from the single-homed port cannot be sent from the peer spine's network side port of VXLAN because of the limitation that the traffic from the peer link will not be forwarded through the VXLAN network side ports.
  • Only one VLAN can be configured for one VNI when you use the command set vxlans vni <text> interface <port> vlan <vlan-id> to configure the VLAN permitted to pass through the VXLAN tunnel.
  • If the local VXLAN access port status is Down, packets from the VXLAN network side will be discarded on the local MLAG peer-link port if they become packets without a VLAN tag after VXLAN decapsulation.
  • The single-homed port is not supported as VXLAN access port, only dual-homed MLAG member port can be configured as VXLAN access port.

Traffic Forwarding in Typical Fault Scenarios

Downstream Link from Access Switch Down

This scenario includes the case in which an MLAG member port goes down or downstream link develops a fault. In this case, all traffic will be transmitted to and from the active MLAG member port of the peer device as illustrated in Figure 12.

From Figure 12, when the member port on Node 0 goes down, MLAG interface state on MLAG Node 1 changes to AS_LOCAL. MLAG member port on Node 1 becomes an MLAG single-homed port. Frames received from the peer link are then forwarded to the MLAG single-homed port.

Figure 12. Typical Fault Scenario of Downstream Link Down

Upstream Link to Layer-3 Device Down

In this case, traffic that is load-shared to MLAG Node 0 is sent to the MLAG Node 1 device over the MLAG peer-link and then forwarded to the upstream device.

Figure 13. Typical Fault Scenario of Upstream Link Down

MLAG Node Fault

If one of the MLAG nodes reboots, shuts down, or fails unexpectedly, traffic is carried by the other MLAG node, which is still active.

Figure 14. Typical Fault Scenario of MLAG Node 0 Fault

MLAG Peer-link Down

Manually shutting down the peer-link is strictly forbidden. If the peer-link goes down for any reason, MLAG control plane messages can no longer be exchanged, and the MLAG system operates abnormally. In particular, if the peer-link is down while the MLAG member ports on both peers are up, a split-brain failure results, from which the system cannot recover automatically.

Figure 15. Typical Fault Scenario of Peer-link Down

Therefore, the reliability of the peer-link must be ensured. Follow the points described in section 1.1.2 Basic Concepts when configuring and deploying the peer-link.

When the peer-link is down, the MLAG peer relationship cannot be established. Run the run show mlag domain summary command to check the peer-link status: Neighbor Status is not ESTABLISHED while the peer-link is being established or is down.

For example,

admin@Xorplus# run show mlag domain summary
Domain ID: 1    Domain MAC: 48:6E:73:FF:00:01    Node ID: 1
----------------------------------------------------------------------------------------------
Peer Link  Peer IP          Peer Vlan  Neighbor Status  Config Matched  MAC Synced  # of Links
---------  ---------------  ---------  ---------------  --------------  ----------  ----------
ae65       1.1.1.1          4088       ESTABLISHED      Yes             Yes         64
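
A monitoring script can apply the same check to this output. The following Python sketch is illustrative and not part of PICOS. It assumes the single data row layout shown above, with no spaces inside any value, and the field names and function name are hypothetical.

```python
FIELDS = (
    "peer_link", "peer_ip", "peer_vlan", "neighbor_status",
    "config_matched", "mac_synced", "num_links",
)

def parse_domain_summary(output):
    """Return the peer-link row of the summary as a dict keyed by FIELDS."""
    lines = [line for line in output.splitlines() if line.strip()]
    for i, line in enumerate(lines):
        if line.lstrip().startswith("Peer Link") and i + 2 < len(lines):
            # The data row follows the header row and its ruler row.
            values = lines[i + 2].split()
            if len(values) == len(FIELDS):
                return dict(zip(FIELDS, values))
    raise ValueError("peer-link row not found")

SAMPLE = """\
Domain ID: 1    Domain MAC: 48:6E:73:FF:00:01    Node ID: 1
----------------------------------------------------------------------------------------------
Peer Link  Peer IP          Peer Vlan  Neighbor Status  Config Matched  MAC Synced  # of Links
---------  ---------------  ---------  ---------------  --------------  ----------  ----------
ae65       1.1.1.1          4088       ESTABLISHED      Yes             Yes         64
"""

summary = parse_domain_summary(SAMPLE)
if summary["neighbor_status"] != "ESTABLISHED":
    print("peer-link is down or still being established")
else:
    print("peer-link", summary["peer_link"], "is established")
```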

Backward Compatibility

The new CLI is not backward compatible with the existing CLI in the following respects:

  • “set interface mlag” is changed to “set protocols mlag”.
  • “disable”/“hello-interval”/“priority”/“reload-delay”/“source”/“system-id” are obsolete.
  • “set interface aggregate-ethernet xx aggregated-ether-options mlag domain-id” is changed to “set protocols mlag domain xx interface xx link”.
  • MLAG link-level “peer-ip” and “peer-link” settings no longer exist. Instead, only a global-level configuration that applies to all MLAG links within the same domain is provided.
  • “set interface mlag peer x.x.x.x peer-link xx” is changed to “set protocols mlag peer-ip x.x.x.x peer-link xx”.
  • By default, all existing MLAG configuration is moved under MLAG domain 1.

Compared with existing MLAG behavior, the major differences are as follows:

  • No MLAG master/slave election. Instead the MLAG primary/secondary role is determined by the MLAG node configuration.
  • No interval-based Hello message. Instead the MLAG Control message is triggered on-demand.
  • The MLAG message encapsulation is changed from UDP to TCP.
  • The MLAG message format is changed to TLV style.
  • Configuration Consistency check is introduced and there is no more configuration sync from master to slave.
  • The MLAG state machine has been made more user friendly.
  • MLAG domain MAC is always used in LACP and STP instead of master’s MAC.
  • When VXLAN is enabled, the peer link port is always set to access port.