<?xml version="1.0" encoding="iso-8859-1" ?>
<!--<!DOCTYPE rfc SYSTEM "rfc4748.dtd"> -->
    <!DOCTYPE rfc SYSTEM 'rfc2629.dtd' [ 
    <!ENTITY rfc2629 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2629.xml'> 
	<!ENTITY rfc2119 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'> 
    <!ENTITY rfc8279 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.8279.xml'> 
    <!ENTITY rfc7761 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.7761.xml'>
    <!ENTITY rfc7450 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.7450.xml'>
    <!ENTITY rfc4875 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4875.xml'>
    <!ENTITY rfc6388 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.6388.xml'>
    <!ENTITY rfc9026 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.9026.xml'>
    <!ENTITY rfc5880 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5880.xml'>
    <!ENTITY rfc5884 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5884.xml'>
    <!ENTITY rfc8562 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.8562.xml'>
    <!ENTITY rfc0792 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.0792.xml'>
    <!ENTITY rfc4443 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4443.xml'>
    <!ENTITY rfc8029 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.8029.xml'>
    <!ENTITY rfc6513 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.6513.xml'>
    <!ENTITY I-D.ietf-pim-sr-p2mp-policy PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-pim-sr-p2mp-policy.xml'>
    <!ENTITY I-D.ietf-bier-bfd PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-bier-bfd.xml'>
    <!ENTITY I-D.ietf-bier-ping PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-bier-ping.xml'>
		]>

<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>		

<?rfc toc="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc disable-output-escaping="yes"?>

<rfc category="info"  docName="draft-ietf-mboned-redundant-ingress-failover-07"
     ipr="trust200902">
  <!-- ***** FRONT MATTER ***** -->
  <front>
    <title abbrev="Multicast Redundant In-Router Failover">Multicast Redundant Ingress Router Failover</title>
	
    <author fullname="Greg Shepherd" initials="G" surname="Shepherd">
      <organization>Cisco Systems, Inc.</organization>
      <address>
        <postal>
          <street>170 W. Tasman Dr.</street>
          <city>San Jose</city>
          <region></region>
          <code></code>
          <country>US</country>
        </postal>
        <email>gjshep@gmail.com</email>
      </address>
    </author>

    <author fullname="Zheng Zhang" initials="Z" role="editor" surname="Zhang">
      <organization>ZTE Corporation</organization>
      <address>
        <postal>
          <street></street>
          <city>Nanjing</city>
          <region></region>
          <code></code>
          <country>China</country>
        </postal>
        <email>zhang.zheng@zte.com.cn</email>
      </address>
    </author>

    <author fullname="Yisong Liu" initials="Y" surname="Liu">
      <organization>China Mobile</organization>
      <address>
        <postal>
          <street/>
          <city>Beijing</city>
          <region/>
          <code/>
          <country/>
        </postal>
        <email>liuyisong@chinamobile.com</email>
      </address>
    </author>
    
    <author fullname="Ying Cheng" initials="Y" surname="Cheng">
      <organization>China Unicom</organization>
      <address>
        <postal>
          <street></street>
          <city>Beijing</city>
          <region></region>
          <code></code>
          <country>China</country>
        </postal>
        <email>chengying10@chinaunicom.cn</email>
      </address>
    </author>

     <author fullname="Gyan Mishra" initials="G" surname="Mishra">
      <organization>Verizon Inc.</organization>
      <address>
        <postal>
          <street></street>
          <city></city>
          <region></region>
          <code></code>
          <country></country>
        </postal>
        <email>gyan.s.mishra@verizon.com</email>
      </address>
    </author>
    
	<date year="2025"/>	
    <area>OPS</area>
    <workgroup>MBONED WG</workgroup>
    <keyword>Multicast Redundant Ingress Router, Failover</keyword>
    <abstract>
     <t>This document analyzes the redundant ingress router failover problem in a multicast domain. 
	 It describes the possible backup modes, and the advantages and disadvantages of each mode, when multiple ingress devices are deployed to 
	 forward the same multicast flow in a multicast domain.
     </t>
    </abstract>
  </front>

  <!-- ***** MIDDLE MATTER ***** -->

  <middle>
  <section title="Introduction">
    <t>Multicast redundant ingress router failover is an important issue in multicast deployments, 
	especially in backbone multicast domains or multicast provider domains. 
	Backbone multicast domains or multicast provider domains are referred to as multicast domains in the following sections. 
	A multicast domain is a domain that forwards multicast flows using specific multicast technologies, 
	such as PIM <xref target="RFC7761"/>, BIER <xref target="RFC8279"/>, P2MP TE tunnels <xref target="RFC4875"/>, 
    or MLDP <xref target="RFC6388"/>. 
	Static configuration, tunnel-based technologies such as AMT <xref target="RFC7450"/>, 
	and SR P2MP policies <xref target="I-D.ietf-pim-sr-p2mp-policy"/> can also be used. 
	The domain may or may not be directly connected to the actual multicast source and receivers.</t>

    <t>The ingress device of the multicast domain, such as the ingress router, 
	can be connected to the multicast source by a single hop or multiple hops. 
	In PIM, it is also called the first-hop router; in BIER, it is called the BFIR; 
	and in P2MP TE tunnels or MLDP, it is called the ingress LSR.</t>

    <t>The egress device of the multicast domain, such as the egress router, 
	may be connected to the multicast receiver by a single hop or multiple hops. 
	In PIM, it is also called the last-hop router; in BIER, it is called the BFER; 
	and in P2MP TE tunnels or MLDP, it is called the egress LSR.</t>

    <t>In order to ensure the reliability of multicast flow, 
	there may be two or more ingress devices or egress devices in the multicast domain. 
	That means the same multicast flow may enter the multicast domain from multiple ingress devices of the multicast domain.
	This document does not discuss protection between the ingress device and the multicast source 
	or between the egress device and the receiver, nor does it discuss the details of technologies such as PIM and BIER. 
	It discusses only the failover of the ingress router of the multicast domain.</t>

    <t>This document discusses the deployment of multiple ingress devices in a multicast domain. 
	It describes how to switch from the primary ingress device to the backup ingress device when a fault occurs, 
	along with common fault detection methods. 
	The advantages and disadvantages of the switching methods are analyzed to provide a reference for multicast deployment.</t>

  </section>

  <section title="Terminology">
    <t>The following abbreviations are used in this document:</t>
    <t>IR: The ingress router for multicast flows in a multicast domain.</t>
    <t>ER: The egress router for multicast flows in a multicast domain.</t>
    <t>SIR: The IR responsible for sending the multicast flow, or the IR whose flow is received by the ER, 
	is called Selected-IR, or SIR for short.</t>
    <t>BIR: An IR that may or may not send multicast flows. While the SIR is working, multicast flows from this IR are not accepted by the ER.
    Once the SIR fails, this IR takes over the role of the SIR, and multicast flows from it are accepted by the ER.
    This IR is called the backup IR, or BIR for short.</t>
  </section>
 
  <section title="Multicast Redundant Ingress Router Failover"> 
    <figure  align="center">
          <artwork align="center"><![CDATA[			
                source
                 ...
           +-----+      +-----+
+----------+ IR1 +------+ IR2 +---------+
|multicast +-----+      +-----+         |
|domain            ...                  |
|                                       |
|          +-----+      +-----+         |
|          | Rm  |      | Rn  |         |
|          ++---++      +--+--+         |
|           |   |          |            |
|     +-----+   +---+      +-----+      |
|     |             |            |      |
|   +-v---+      +--v--+      +--v--+   |
+---+ ER1 +------+ ER2 +------+ ER3 +---+
    +-----+      +-----+      +-----+
     ...           ...          ...
   receiver      receiver     receiver
                Figure 1
            ]]></artwork>
       <postamble></postamble>
    </figure>

    <t>Figure 1 shows a common multicast networking scenario. The multicast domain includes the area from IR to ER. 
	The flow sent by the multicast source enters the multicast domain through at least one IR 
	and is forwarded across the multicast domain to the ER, which forwards it toward the receiver.</t>
	
	<t>The ingress device IR of the multicast domain is a key node for the normal forwarding of multicast flows. 
	When two or more IRs are deployed, there may be multiple protection modes for IR, such as cold standby, warm standby and hot standby. 
	These modes are also described in <xref target="RFC9026"/>. 
	However, <xref target="RFC9026"/> mainly focuses on signaling notifications in MVPN scenarios. 
	It does not cover the protection modes for multiple ingress devices in the multicast domain 
	or their impact on multicast flow transmission in the multicast domain.</t>

	<t>As shown in Figure 1, the same multicast flow enters the multicast domain from two IRs. 
	Both IRs are UMH (Upstream Multicast Hop) candidates of the ER. 
	Different multicast technologies may be used in the multicast domain, depending on how the network administrator deploys the network. 
	If PIM is used, for example, two multicast trees can be pre-established with the two IRs as roots.</t>

    <section title="Switchover"> 
      <t>When a node or link in the multicast domain fails, the forwarding of multicast flow may be affected. 
	  However, it is not necessary to switch multicast flow from SIR to BIR in all cases.
	  The following are situations where switching is not required:</t>
	  
	  <t>
      <list style="symbols">
        <t>
		When PIM is used as the multicast forwarding protocol in a domain, 
		a forwarding tree of (S, G) or (*, G) is pre-built. 
		When a node other than SIR or a link in the forwarding tree fails, 
		the tree is partially rebuilt.</t>

        <t>
		When BIER is used as the multicast forwarding protocol in a multicast domain
        and a node other than SIR or a link in the domain fails, there is no need to rebuild the forwarding path.
        BIER forwarding is restored as the IGP routes converge.</t>
        
        <t>When P2MP TE tunnel or MLDP is used as the multicast forwarding protocol in a multicast domain, 
		a forwarding LSP is pre-established.
        When a node other than the SIR in the LSP or a link in the domain fails, the LSP may be partially rebuilt.</t>

        <t>When a static multicast tree or SR P2MP policy is used in a multicast domain 
		and a node other than the SIR on the forwarding path, or a link, fails, 
		the controller recalculates a new forwarding path that bypasses the faulty node or link.</t>
      </list>
      </t>
  
      <t>When a critical failure occurs, it is necessary to switch from SIR to BIR. 
	  Examples include a device failure on the SIR, or a failure of the forwarding channel between SIR and ER 
	  that leaves the ER unable to receive multicast flows from the SIR and that cannot be repaired in a short time.
      In such cases, the multicast flow is forwarded by the BIR.
      The ER receives the flow forwarded by the BIR and forwards it to the receiver.</t>
  
      <figure  align="center">
          <artwork align="center"><![CDATA[			
                  source
                   ...
           +-----+      +-----+
+----------+ IR1 +------+ IR2 +---------+
|          +--+--+      +--+--+         |
|             |            |            |
|          +--+--+      +--+--+         |
|          | Rx  |      | Ry  |         |
|          +-+-+-+      ++---++         |
|            | |         |   |          |
|            | +-----------+ |          |
|            |           | | |          |
|            | +---------+ | |          |
|            | |           | |          |
|          +-v-v-+      +--v-v+         |
|          | Rm  |      | Rn  |         |
|          ++---++      +--+--+         |
|           |   |          |            |
|     +-----+   +---+      +-----+      |
|     |             |            |      |
|   +-v---+      +--v--+      +--v--+   |
+---+ ER1 +------+ ER2 +------+ ER3 +---+
    +-----+      +-----+      +-----+
     ...           ...          ...
   receiver      receiver     receiver
                Figure 2
            ]]></artwork>
       <postamble></postamble>
    </figure>

      <t>For example, in Figure 2, there is only one path in some areas of the network.
      IR1 and Rx are key nodes in the domain. 
	  When IR1 or Rx fails, there is no other path between IR1 and ER.</t>

      <t>
      <list style="symbols">
        <t>When PIM is used in the multicast domain, Rm and Rn can select Ry as the upstream node, 
		send Join messages, and build a new tree with IR2 as the root. </t>

        <t>When BIER is used in the multicast domain, IR2 should take over the forwarding 
		role and forward the flow to the ER. </t>

        <t>When P2MP TE tunnel or MLDP is used in the domain, an LSP rooted at IR2 can 
		be built to replace the LSP rooted at IR1. </t>

        <t>When a static multicast tree or SR P2MP policy is used in the multicast domain, 
		the controller should build a new forwarding path with IR2 as the root to forward the multicast flow to ER.</t>
      </list>
      </t> 
    </section>
    
    <section title="Failure detection">
    <t>The IR node itself and the key forwarding link between IR and ER are factors that affect traffic forwarding within the multicast domain.</t>


	<t>To achieve fast switching, the BIR can establish a forwarding channel with the ER in advance and monitor the status of the SIR. 
	When the SIR node fails, the BIR takes over the work of the SIR. 
	The BIR can establish a BFD <xref target="RFC5880"/> session with the SIR to detect the SIR status, 
	or it can use ping or other methods. 
	Note, however, that detection between the BIR and the SIR does not reflect the actual status of the forwarding path between the SIR and the ER. 
	If only the link between the BIR and the SIR fails while the SIR is working normally, the BIR may wrongly conclude that the SIR has failed and switch, 
	thereby generating unnecessary duplicate flow. 
	In this case, the ER must support selective reception so that it can tolerate erroneous IR switching.</t>

    <t>Conversely, the forwarding path between the SIR and the ER may fail 
	while the link between the BIR and the SIR remains normal, so the BIR cannot detect the failure. 
	Therefore, the ER can also monitor the forwarding path between the SIR and the ER and actively switch to receiving the flow from the BIR when a problem is found.
    The detection between SIR and ER can be based on multipoint BFD <xref target="RFC8562"/>. 
	When BIER is used to forward flow in the multicast domain, 
	the detection between SIR and ER can also be based on BIER BFD <xref target="I-D.ietf-bier-bfd"/>. 
	When MPLS is used to forward flow in the multicast domain, BFD <xref target="RFC5884"/> based on MPLS LSP can be used for detection.</t>
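    <t>The interaction between the two detection points described above can be summarized as a small decision table. 
	The following Python sketch is illustrative only: the function name, the two boolean inputs, and the outcome labels are assumptions 
	introduced here and are not defined by any protocol or by this document.</t>

    <figure align="left">
          <artwork align="left"><![CDATA[
```python
from enum import Enum


class Outcome(Enum):
    NO_SWITCH = "no switch needed"
    FALSE_SWITCH = "BIR switches needlessly; ER drops the duplicate"
    ER_SWITCH = "only ER sees the fault; ER switches to BIR"
    SWITCH = "both see the fault; BIR takes over"


def classify(bir_sees_sir_up: bool, er_sees_path_up: bool) -> Outcome:
    """Combine the two detection results (hypothetical inputs).

    bir_sees_sir_up: result of the BIR-to-SIR check (e.g. BFD, ping).
    er_sees_path_up: result of the SIR-to-ER path check
                     (e.g. multipoint BFD).
    """
    if bir_sees_sir_up and er_sees_path_up:
        return Outcome.NO_SWITCH
    if not bir_sees_sir_up and er_sees_path_up:
        # SIR still delivers to ER; only the BIR-SIR link is broken.
        return Outcome.FALSE_SWITCH
    if bir_sees_sir_up and not er_sees_path_up:
        # BIR cannot observe a failure on the SIR-to-ER path.
        return Outcome.ER_SWITCH
    return Outcome.SWITCH
```
            ]]></artwork>
    </figure>

    <t>The two mixed cases are the reason why detection at the BIR alone is not sufficient and why the ER needs selective reception.</t>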

    </section>
  </section>

  <section title="Stand-by Modes"> 
    <t>Detection and IR switching can operate in one of three modes: cold standby, warm standby, and hot standby.
	Depending on which mode is used to protect the IR, 
	multicast flows are transmitted differently in the multicast domain, 
	and the impact on the network differs as well.</t>

    <t>When the multicast domain uses the PIM protocol to forward flow, 
	ER will establish a multicast tree to BIR through signaling. 
	When the multicast domain uses BIER to forward flow, 
	ER will notify BIR of its request to receive the multicast flow through the BIER overlay protocol. 
	When the multicast domain uses P2MP TE or MLDP to forward flow, a multicast forwarding channel is established from BIR to ER. 
	The PIM multicast tree with BIR as the root and the P2MP TE or MLDP tunnel from BIR to ER can also be established in advance, 
	and ER directly notifies BIR to use the multicast tree or tunnel for forwarding.</t>

    <section title="Cold Standby Mode">
      <t>In cold standby mode, ER selects an IR (e.g., IR1 in Figure 1) as the SIR and signals it to obtain the multicast flow.</t>
      <t>When ER finds that it cannot receive the flow from IR1 through the detection means in Section 3.2, 
	  ER signals IR2 to obtain the multicast flow.</t>
      <t>
      <list style="symbols">
        <t>For IR, IR (including SIR and BIR) only performs the normal operation of forwarding the flow according to ER request. </t>

        <t>For ER, ER must select an IR as the SIR and signal it. When the SIR fails or the path between SIR and ER fails, 
		ER must signal BIR to obtain the flow.</t>

        <t>Intermediate routers know nothing about the role of each IR; they only forward packets. 
		There are no duplicate packets in the domain.</t>
      </list>
      </t>
  
      <t>In this scenario, the BIR does not need to detect the status of the SIR. 
	  During the IR switching process, packet loss may occur because of the need for signaling interaction. 
	  Even if a PIM multicast tree or P2MP TE/MLDP tunnel is established in advance, packet loss may still occur.</t>
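      <t>The ER behavior in cold standby mode can be sketched as follows. 
	  This is a minimal illustration under stated assumptions: the class name, the method names, and the "request" log entries are hypothetical 
	  and stand in for whatever signaling the deployed multicast technology uses.</t>

      <figure align="left">
          <artwork align="left"><![CDATA[
```python
class ColdStandbyER:
    """ER in cold standby: only one IR is signaled at a time."""

    def __init__(self, irs):
        # Candidate IRs in order of preference, e.g. ["IR1", "IR2"].
        self.irs = list(irs)
        self.sir = self.irs[0]
        # Record of signaling sent by this ER (hypothetical format).
        self.log = [("request", self.sir)]

    def on_sir_unreachable(self):
        """Called when detection reports SIR or path failure."""
        idx = self.irs.index(self.sir)
        self.sir = self.irs[(idx + 1) % len(self.irs)]
        # Signaling starts only after the failure is detected,
        # so packets are lost until the new IR starts forwarding.
        self.log.append(("request", self.sir))
        return self.sir
```
            ]]></artwork>
      </figure>

      <t>For example, an ER created with the candidates IR1 and IR2 first requests the flow from IR1 and requests it from IR2 only after IR1 is reported unreachable.</t>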
    </section>
    
    <section title="Warm Standby Mode">
      <t>In warm standby mode, the ER will signal to the SIR and BIR, such as IR1 and IR2 in Figure 2, that it needs to receive flow. 
	  The SIR (such as IR1) forwards the flow to the ER.
      The BIR (such as IR2) must not forward flow to the ER before the SIR fails.
      The BIR can detect the SIR status by the method described in Section 3.2, and automatically forward flow to the ER when the SIR fails.</t>
      <t>
      <list style="symbols">
        <t>Normally, the SIR forwards flow to the ER.
           When the SIR fails or the path between the SIR and the ER fails, the BIR must start forwarding flow to the ER.
           The BIR can detect node failures in the SIR using the method described in Section 3.2, 
		   but may lack the method to detect path failures from the SIR to the ER.</t>
        
        <t>The ER does not distinguish between the SIR and the BIR. 
		The ER only signals to both that it needs to receive a certain flow.</t>
        
        <t>For the intermediate routers, they do not know the difference between the IRs, 
		and they are only responsible for packet forwarding. 
		There are no duplicate packets in the domain.</t>
      </list>
      </t>
    
      <t>When the BIR detects the SIR failure and starts forwarding flow, packet loss will occur during the switchover.</t>

      <t>In some deployments, the SIR and BIR may be responsible for different multicast flows to share the load.
      For a certain multicast flow, the SIR may be IR1, and for another multicast flow, the SIR may be IR2. 
	  For example, IR1 sends some multicast flows to ERs and IR2 sends other multicast flows to ERs.
      Another possible deployment is that two IRs can be responsible for different ERs for the same multicast flow.
      If IR1 detects a failure between IR1 and ERs, IR1 may notify IR2 to forward flow to these ERs.</t>
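      <t>The forwarding rule that the BIR applies in warm standby mode can be sketched as follows. 
	  The class and method names are hypothetical, and the SIR status is assumed to come from a detection method such as those in Section 3.2.</t>

      <figure align="left">
          <artwork align="left"><![CDATA[
```python
class WarmStandbyBIR:
    """BIR in warm standby: holds requests, forwards on SIR failure."""

    def __init__(self):
        self.requested = set()   # flows that ERs have asked for
        self.sir_alive = True    # last known SIR status

    def on_er_request(self, flow):
        # The ER signals both IRs; the BIR only records the request.
        self.requested.add(flow)

    def on_sir_status(self, alive):
        # Fed by BIR-to-SIR detection (hypothetical callback).
        self.sir_alive = alive

    def should_forward(self, flow):
        # The BIR must not forward before the SIR fails.
        return flow in self.requested and not self.sir_alive
```
            ]]></artwork>
      </figure>

      <t>Because the request is already installed, the BIR can start forwarding as soon as it learns of the SIR failure, without waiting for new signaling from the ER.</t>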
    </section>
    
    <section title="Hot Standby Mode">
      <t>In hot standby mode, the ER signals both IRs that it wants to receive a certain flow.
	  Both IRs send flows to the ER. The ER must discard the duplicate flow from one of the IRs.
	  In this case, the IRs themselves have no SIR or BIR role. Only the ER knows which IR it treats as the SIR.</t>
      <t>
      <list style="symbols">
        <t>In this mode, the IR does not need to know whether it is the SIR or the BIR.
        Each IR only forwards the flow based on the request received from the ER.</t>
        
        <t>The ER sends flow reception signals to both IRs and discards the duplicate flow that it receives from the BIR.
        After switching, the ER receives and forwards the flow from the BIR.
        Note that the ER may choose different SIRs and BIRs for different multicast flows. </t>
        
        <t>Intermediate routers do not know the role of the IR; they only forward packets.
        There are duplicate packets within the domain. </t>
      </list>
      </t>
      
      <t>In this mode, BIR does not need to detect the status of SIR. ER will detect the failure of SIR.
      Since duplicate flow packets arrive at ER, although packet loss may occur when ER switches to receive and forward flow from BIR, 
	  the packet loss is very small compared to the previous two modes.</t>
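      <t>Selective reception at the ER in hot standby mode can be sketched as follows. 
	  The class, its methods, and the way packets are tagged with their originating IR are assumptions made for illustration; 
	  real deployments identify the upstream by technology-specific means.</t>

      <figure align="left">
          <artwork align="left"><![CDATA[
```python
class HotStandbyER:
    """ER in hot standby: accepts one copy, drops the other."""

    def __init__(self, selected, backup):
        self.selected = selected   # IR currently treated as SIR
        self.backup = backup       # IR currently treated as BIR
        self.delivered = []        # payloads passed to the receiver

    def on_packet(self, from_ir, payload):
        if from_ir == self.selected:
            self.delivered.append(payload)
            return True
        # Copy from the other IR is a duplicate and is dropped.
        return False

    def on_selected_failed(self):
        # Local decision only: no signaling toward the IRs is needed.
        self.selected, self.backup = self.backup, self.selected
```
            ]]></artwork>
      </figure>

      <t>Since the flow from the other IR is already arriving, the switch is a local change at the ER, which is why the loss window is shorter than in the other two modes.</t>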
    </section>
    
   <section title="Summary"> 
      <t>The following table is a simple comparison of the three modes.
      "SIR failover" means that the SIR fails or the path between the SIR and the ER fails.</t>
      <texttable anchor="TABLE_1" title="">

      <ttcol align="left">role</ttcol>
      <ttcol align="left">Cold Mode</ttcol>
      <ttcol align="left">Warm Mode</ttcol>
      <ttcol align="left">Hot Mode</ttcol>
     
      <c>IR</c>
      <c>Forwards flow based on ER's request. </c>
      <c>Acts as either SIR or BIR. BIR must not forward flow to ER until SIR fails over. </c>
      <c>Does not need to know SIR or BIR role, just forwards flow based on ER's request. </c>
      
      <c>ER</c>
      <c>Must select an IR as SIR to signal request, signals BIR to request flow when SIR fails over. </c>
      <c>Does not select SIR or BIR, just signals both of them. </c>
      <c>Signals both SIR and BIR. Drops duplicate flow from BIR until SIR fails over. </c>
      
      <c>Intermediate routers</c>
      <c>Know nothing about SIR or BIR. Do not forward duplicate flow. </c>
      <c>Know nothing about SIR or BIR. Do not forward duplicate flow. </c>
      <c>No knowledge of SIR or BIR. Forward duplicate flow. </c>
	  </texttable>
  
      <t>Cold standby mode is the easiest to implement, but has the longest convergence time.</t>
	  
      <t>Warm standby mode has a moderate packet loss rate and convergence time, 
	  but it is difficult for BIR to know the path failure between SIR and ER. </t>

      <t>Hot standby mode has the lowest packet loss rate, but there is duplicated packet forwarding within the domain, which consumes more bandwidth.
	  For example, in the MVPN scenario, the hot root standby mode described in Section 5 of <xref target="RFC9026"/> 
	  is the recommended method for MVPN fast failover optimization. 
	  Duplicate packets may be forwarded within the domain; 
	  they are discarded according to Section 6 of <xref target="RFC9026"/> and Section 9.1 of <xref target="RFC6513"/>. </t>
 
      <t>For network administrators, the most appropriate standby mode should be selected based on the actual network deployment, 
	  such as whether there is enough bandwidth to accommodate duplicate flow.</t>
	</section>
  </section>

  <section title="IANA Considerations">
  <t>  This document does not have any requests for IANA allocation.</t>
  </section>
  
	<section title="Security Considerations"> 
	  <t>This document adds no new security considerations.</t>
	</section>	
	
  </middle>

  <!--  *****BACK MATTER ***** -->

  <back>
    <references title='Normative References'>
    &rfc8279;
    &rfc7761;
    &rfc7450;
    &rfc4875;
    &rfc6388;
    </references>
    
    <references title='Informative References'>
    &rfc9026;
    &rfc6513;
    &rfc5880;
    &rfc5884;
    &rfc8562;
    <?rfc include="reference.I-D.ietf-pim-sr-p2mp-policy"?>
    <?rfc include="reference.I-D.ietf-bier-bfd"?>
    </references>
	</back>
</rfc>
