<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?Pub Inc?>
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-ietf-idr-5g-edge-service-metadata-18"
     ipr="trust200902">
  <front>
    <title abbrev="Metadata Path">BGP Extension for 5G Edge Service
    Metadata</title>

    <author fullname="Linda Dunbar" initials="L. " surname="Dunbar">
      <organization>Futurewei</organization>

      <address>
        <postal>
          <city>Dallas, TX</city>

          <country>USA</country>
        </postal>

        <email>ldunbar@futurewei.com</email>
      </address>
    </author>

    <author fullname="Kausik Majumdar" initials="K." surname="Majumdar">
      <organization>Microsoft Azure</organization>

      <address>
        <postal>
          <street/>

          <city>California</city>

          <country>USA</country>
        </postal>

        <email>kausikm.ietf@gmail.com</email>
      </address>
    </author>

    <author fullname="Cheng Li" initials="C. " surname="Li">
      <organization>Huawei Technologies</organization>

      <address>
        <postal>
          <street/>

          <city>Beijing</city>

          <code/>

          <country>China</country>
        </postal>

        <email>c.l@huawei.com</email>
      </address>
    </author>

    <author fullname="Gyan Mishra" initials="G. " surname="Mishra">
      <organization>Verizon</organization>

      <address>
        <postal>
          <street/>

          <code/>

          <country>USA</country>
        </postal>

        <email>gyan.s.mishra@verizon.com</email>
      </address>
    </author>

    <author fullname="Zongpeng Du" initials="Z" surname="Du">
      <organization>China Mobile</organization>

      <address>
        <postal>
          <street/>

          <city>Beijing</city>

          <region/>

          <country>China</country>
        </postal>

        <email>duzongpeng@chinamobile.com</email>
      </address>
    </author>

    <date year="2024"/>

    <abstract>
      <t>This draft describes a new Metadata Path Attribute and some Sub-TLVs
      for egress routers to advertise Metadata about the attached edge
      services (ES). Ingress routers in the 5G Local Data Network can use the
      edge service Metadata to make path selections based not only on the
      routing cost but also on the running environment of the edge services.
      The goal is to improve latency and performance for 5G edge
      services.</t>

      <t>The extension enables an edge service at one specific location to be
      preferred over the others with the same IP address (ANYCAST) for
      receiving data flows from a specific source, such as a specific User
      Equipment (UE).</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
      "OPTIONAL" in this document are to be interpreted as described in <xref
      target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they
      appear in all capitals, as shown here.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>This document describes a new Metadata Path Attribute added to a BGP
      UPDATE message [RFC4271] for egress routers to advertise Metadata
      about the 5G low latency edge services directly attached to the egress
      routers. 5G [TS.23.501-3GPP] is characterized by having edge services
      close to the Cell Towers and reachable through Local Data Networks
      (LDN). From an IP network perspective, the 5G LDN is a limited domain
      [RFC8799] with edge services a few hops away from the ingress nodes.
      Only selected UE services are considered 5G low latency edge
      services.</t>

      <t>Note: The proposed edge service Metadata Path Attribute is not
      intended for the best-effort services reachable via the public internet.
      The information carried by the Metadata Path Attribute can be used by
      the ingress routers to make path selections for selected low latency
      services based not only on the network distance but also on the running
      environment of the edge cloud sites. The goal is to improve latency and
      performance for 5G ultra-low latency services.</t>

      <t>The extension is targeted for a single domain with a Route Reflector
      (RR) controlling the propagation of the BGP UPDATE. The edge service
      Metadata Path Attribute is attached only to the low latency services
      (routes) hosted in the 5G edge cloud sites. These routes are a small
      subset of the services initiated from UEs; the attribute is not meant
      for UEs accessing general internet sites.</t>
    </section>

    <section title="Conventions used in this document">
      <t>The following conventions are used in this document.</t>

      <list style="hanging">
        <t hangText="Edge DC:">Edge Data Center, which provides the hosting
        environment for the edge services. An Edge DC might host 5G core
        functions in addition to the frequently used edge services.</t>

        <t hangText="gNB:">next generation Node B [TS.23.501-3GPP]</t>

        <t hangText="RTT:">Round-trip Time</t>

        <t hangText="PSA:">PDU Session Anchor (UPF) [TS.23.501-3GPP]</t>

        <t hangText="UE:">User Equipment</t>

        <t hangText="UPF:">User Plane Function [TS.23.501-3GPP]</t>
      </list>

    </section>

    <section title="Metadata Influenced Ingress Node Behavior">
      <t>The goal of the edge service Metadata Path Attribute is for egress
      routers to propagate metrics about the running environment of a subset
      of edge services to ingress routers, so that the ingress routers can
      make path selections based not only on the routing cost but also on the
      running environment of those edge services. BGP speakers that do not
      support the Metadata Path Attribute can ignore it in a BGP UPDATE
      message. All intermediate nodes can forward the entire BGP UPDATE as it
      is.</t>

      <t>Multiple metrics can be attached to one Metadata Path Attribute. One
      Metadata Path Attribute can contain computing service capability
      information, computing service states, computing resource states of the
      corresponding edge site, or more. Computing service capability
      information can record information about the computing power node or
      the initial deployment information used for computing service
      initialization. Computing service states can include, for example, the
      number of service connections or the service duration. Computing
      resource states can be detailed information on computing resources such
      as CPU/GPU, or an abstract metric derived from these detailed
      parameters to indicate the resource status of the edge site.</t>

      <t>More metrics about the running environment could be attached to the
      Metadata Path Attribute, e.g., some of the metrics being discussed by
      the CATS WG. This document illustrates a few examples of Sub-TLVs for
      metrics under the edge service Metadata Path Attribute:</t>

      <list style="hanging">
        <t hangText="-">the site physical availability index</t>

        <t hangText="-">the site preference index</t>

        <t hangText="-">the service delay prediction index, and</t>

        <t hangText="-">the raw load measurement.</t>
      </list>

      <t>This section specifies how those Metadata impact the ingress node's
      path selections.</t>

      <section title="Metadata Influenced BGP Path Selection">
        <t>When an ingress router receives BGP updates for the same IP prefix
        from multiple egress routers, the loopback addresses of all these
        egress routers are considered as next hops for the IP prefix. For the
        selected low latency edge services, the BGP engine of the ingress
        router calls an edge service management function that selects paths
        based on the edge service Metadata received. Section 5.1 gives an
        example algorithm to compute the weighted path cost based on the
        edge service Metadata carried by the Sub-TLV(s) specified in this
        document.</t>

        <t>Section 5 describes the edge service Metadata influenced optimal
        path selection in detail.</t>
      </section>

      <section title="Ingress Router Forwarding Behavior">
        <t>When the ingress router receives a packet and looks up the route
        in the FIB, it obtains the whole path for the destination prefix and
        encapsulates the packet towards the optimal egress node.</t>

        <t>For subsequent packets belonging to the same flow, the ingress
        router needs to forward them to the same egress router unless the
        selected egress router is no longer reachable. Keeping the packets of
        one flow on the same egress router, a.k.a. Flow Affinity, is supported
        by many commercial routers. Most registered EC services have
        relatively short flows.</t>

        <t>How Flow Affinity is implemented is out of scope for this
        document.</t>
      </section>

      <section title="Forwarding Behavior when UEs Move">
        <t>When a UE moves to a new 5G gNB that is anchored to the same UPF,
        the packets from the UE traverse the same ingress router. Path
        selection and forwarding behavior are the same as before.</t>

        <t>If the UE maintains the same IP address when anchored to a new UPF,
        the directly connected ingress router might use the information passed
        from a neighboring router to derive the optimal Next Hop for this
        route. The detailed algorithm is out of the scope of this
        document.</t>
      </section>
    </section>

    <section title="Edge Service Metadata Encoding">
      <section title="Metadata Path Attribute">
        <t>The Metadata Path Attribute is an optional non-transitive BGP Path
        Attribute that carries metrics and metadata about the edge services
        attached to the egress router. Its type code is to be assigned by
        IANA [RFC2042]. The attribute consists of a set of Sub-TLVs, each of
        which contains information for a specific metric of the edge
        services.</t>

        <section title="Metadata Path Attribute Characteristics">
          <t>Only a small subset of BGP UPDATE messages include the Metadata
          Path Attribute. Local policies determine which prefixes carry the
          Metadata Path Attribute. The Metadata Path Attribute can be
          included in a BGP UPDATE message [RFC4271] together with other BGP
          Path Attributes [IANA-BGP-PARAMS], such as Communities [RFC4360],
          NEXT_HOP, Tunnel Encapsulation Path Attribute [RFC9012], etc.</t>

          <t>The Metadata Path Attribute has the following
          characteristics:</t>

          <list style="hanging">
            <t hangText="-">Non-transitive</t>

            <t hangText="-">Boundary node filtering SHOULD be deployed to
            remove the BGP Metadata Path attribute at the administrative
            boundary to prevent the distribution of the BGP Metadata Path
            Attribute beyond its intended scope of applicability.</t>

            <t hangText="-">Can be packed with NLRI(AFI/SAFI) Unicast (1/1,
            2/1), Label Unicast (AFI/SAFI - ) [RFC8277], IPv6 Anycast
            [RFC4786]</t>

            <t hangText="-">MUST contain at least one Metadata Sub-TLV.
            Multiple Metadata Sub-TLVs can be included in a Metadata Path
            Attribute in one BGP UPDATE message. The choice of the Sub-TLVs
            present in the Metadata Path Attribute is determined by local
            policies.</t>
          </list>

          <t>The metrics Sub-TLVs included in the Metadata Path Attribute
          apply to all the address families carried in the NLRI field of the
          BGP UPDATE message [RFC4271]. For a multi-protocol BGP UPDATE
          message [RFC4760] [RFC7606], the metrics Sub-TLVs included in the
          Metadata Path Attribute apply to all the AFI/SAFI address families
          carried by the MP_REACH_NLRI.</t>
        </section>

        <section title="Metadata Path Attribute Processing">
          <t>A BGP speaker that advertises a path received from one of its
          neighbors SHOULD advertise the BGP Metadata Path Attribute received
          with the path without modification, as long as the BGP Metadata Path
          Attribute was acceptable. If the path did not come with a BGP
          Metadata Path Attribute, the speaker MAY attach a BGP Metadata Path
          Attribute to the path if configured to do so.</t>

          <t>A BGP Peer receiving a BGP Metadata Path Attribute should ignore
          Sub-TLVs with unknown types and process the recognized Sub-TLVs. BGP
          Peers should not delete any Sub-TLV from the BGP Metadata Path
          Attribute.</t>
        </section>

        <section title="Sub-TLVs Data Processing">
          <t>By default, a BGP speaker does not report any unrecognized
          Sub-TLVs within a Metadata Path Attribute unless configured to send
          a notification to its management system. The ingress node should be
          configured with an algorithm to combine the recognized metrics
          carried by the Sub-TLVs within a Metadata Path Attribute of the
          received BGP UPDATE message.</t>
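
          <t>As an illustration of the default behavior described above, the
          following Python sketch separates the recognized Sub-TLVs from the
          unrecognized ones so that the latter can be skipped without being
          deleted. It is not a normative parser. The function name
          split_sub_tlvs and the fixed 8-octet Sub-TLV size are assumptions
          made for this sketch only; the size matches the Site Preference
          Index, Site Physical Availability Index, and Service Delay
          Prediction Sub-TLV figures, but a real implementation would need a
          length rule that also covers other Sub-TLVs.</t>

          <figure>
            <artwork><![CDATA[
```python
import struct

# Sub-Type codes listed in this document.
KNOWN_SUB_TYPES = {
    1: "site-preference",
    2: "phy-availability",
    3: "delay-prediction",
}

# Assumption for this sketch only: every Sub-TLV occupies 8 octets,
# a 2-octet Sub-Type followed by 6 octets of type-specific fields.
SUB_TLV_SIZE = 8


def split_sub_tlvs(value: bytes):
    """Return (recognized, unrecognized) lists of (sub_type, raw_bytes)."""
    if len(value) % SUB_TLV_SIZE != 0:
        raise ValueError("value is not a whole number of Sub-TLVs")
    recognized, unrecognized = [], []
    for offset in range(0, len(value), SUB_TLV_SIZE):
        raw = value[offset:offset + SUB_TLV_SIZE]
        (sub_type,) = struct.unpack("!H", raw[:2])
        if sub_type in KNOWN_SUB_TYPES:
            recognized.append((sub_type, raw))
        else:
            # Kept as raw bytes so the Sub-TLV is ignored, not deleted.
            unrecognized.append((sub_type, raw))
    return recognized, unrecognized
```
]]></artwork>
          </figure>

          <t>Keeping the unrecognized Sub-TLVs as raw bytes lets the speaker
          pass them along unchanged and, if configured, report them to its
          management system.</t>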
        </section>

        <section title="Metadata Path Attribute Handling Procedure">
          <t>The Metadata Path Attribute MUST contain at least one Metadata
          Sub-TLV. Multiple Metadata Sub-TLVs can be included in a Metadata
          Path Attribute in one BGP UPDATE message. The content of the
          Sub-TLVs present in the Metadata Path Attribute is determined by
          the configuration. When a BGP Speaker does not recognize some of the
          Sub-TLVs within a Metadata Path Attribute in a BGP UPDATE message,
          the BGP Speaker should forward the received BGP UPDATE message
          without any change if the transitive bit is set to 1 [RFC4271]. The
          domain ingress nodes should process the recognized Sub-TLVs carried
          by the Metadata Path Attribute and ignore the unrecognized
          Sub-TLVs.</t>

        </section>
      </section>

      <section title="Metadata Path Attribute TLV">
        <figure anchor="Metadata Path Attribute"
                title="Metadata Path Attribute">
          <artwork><![CDATA[  
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |AttFlag|Reserve|MetaDataPathAtt|Length(1 Octet)| Reserved      |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     |         Value (multiple Metadata Sub-TLVs)                    |
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
	]]></artwork>
        </figure>

        <t><list style="hanging">
            <t hangText="AttFlag:">Attribute flags, defined as:</t>
          </list> <list style="hanging">
            <t hangText="-">The high-order bit (bit 0): set to 1 to indicate
            that the attribute is optional [RFC4271].</t>

            <t hangText="-">The second high-order bit (bit 1): set to 0 to
            indicate that the Metadata Path Attribute is non-transitive. It
            is intended only for the receiving router.</t>

            <t hangText="-">The third high-order bit (bit 2): same as
            specified by [RFC4271].</t>

            <t hangText="-">The fourth high-order bit (bit 3): set to 0 to
            indicate that the Length field is one octet [RFC4271].</t>

            <t hangText="-">The fifth through eighth high-order bits (bits
            4-7): reserved.</t>
          </list> <list style="hanging">
            <t hangText="MetaDataPathAtt:">Metadata Path Attribute: TBD1
            (assigned by IANA).</t>

            <t hangText="Length:">The total number of octets of the value
            field.</t>

            <t hangText="Reserved:">For future expansion.</t>
          </list></t>

        <t>All values in the Sub-TLVs are unsigned 32-bit integers.</t>
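
        <t>The following Python sketch shows one way the 4-octet header in
        the figure could be assembled. It is illustrative only. The constant
        PLACEHOLDER_ATTR_TYPE (255) stands in for TBD1, which has not been
        assigned, and build_metadata_attribute is a hypothetical helper. The
        sketch also assumes that Length counts every octet after the Length
        field (the Reserved octet plus the Sub-TLVs), so that a parser
        following [RFC4271] can skip the attribute as a whole.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct

FLAG_OPTIONAL = 0x80     # bit 0 (high-order): optional attribute
FLAG_TRANSITIVE = 0x40   # bit 1: left clear, attribute is non-transitive
FLAG_PARTIAL = 0x20      # bit 2: as specified by RFC 4271
FLAG_EXT_LENGTH = 0x10   # bit 3: left clear, Length is one octet

# TBD1 is not assigned yet; 255 is only a placeholder for this sketch.
PLACEHOLDER_ATTR_TYPE = 255


def build_metadata_attribute(sub_tlvs: bytes,
                             attr_type: int = PLACEHOLDER_ATTR_TYPE) -> bytes:
    """Prepend flags, type, Length, and Reserved to the Sub-TLVs."""
    if not sub_tlvs:
        raise ValueError("at least one Sub-TLV is required")
    # Assumption: Length covers the Reserved octet plus the Sub-TLVs.
    length = 1 + len(sub_tlvs)
    if length > 255:
        raise ValueError("value does not fit a one-octet Length")
    flags = FLAG_OPTIONAL  # optional, non-transitive, one-octet Length
    header = struct.pack("!BBBB", flags, attr_type, length, 0)
    return header + sub_tlvs
```
]]></artwork>
        </figure>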
      </section>

      <section title="The Site Preference Index Sub-TLV">
        <t>Different services might have different preference index values
        configured for the same site. For example, Service-A requires high
        computing power, Service-B requires high bandwidth among its
        microservices, and Service-C requires high volume storage capacity.
        For a DC with relatively low storage capacity but high bisection
        bandwidth, its preference index value is higher for Service-B and
        lower for Service-C. The Site Preference Index can also be used to
        achieve stickiness for some services.</t>

        <t>How the preference index is determined or configured is out of the
        scope of this document.</t>

        <t>The Preference Index Sub-TLV has the following format:</t>

        <figure anchor="Preference Index Sub-TLV"
                title="Preference Index Sub-TLV">
          <artwork><![CDATA[ 
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |Site-Preference-Index Sub-Type | Length        | Reserved      |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                   Preference Index value                      |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
	]]></artwork>
        </figure>

        <t><list style="hanging">
            <t hangText="-">Site-Preference-Index Sub-Type =1 (specified in
            this document).</t>

            <t hangText="-">Preference Index value: 1 .. (2^32-1); the higher
            the value, the more preferred the site. Preference Index value ==
            0 is reserved.</t>
          </list></t>
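
        <t>A minimal Python sketch of encoding and decoding this Sub-TLV
        according to the figure above is given here for illustration. The
        function names are hypothetical, and the sketch assumes that the
        Length octet counts the 4 octets of the Preference Index value, which
        this document does not state explicitly.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct

SITE_PREFERENCE_SUB_TYPE = 1


def encode_site_preference(preference: int) -> bytes:
    """Pack Sub-Type (2 octets), Length, Reserved, and the value."""
    if not 1 <= preference <= 2**32 - 1:
        raise ValueError("preference must be 1 .. 2^32-1; 0 is reserved")
    # Assumption: Length counts the 4 octets of the value field.
    return struct.pack("!HBBI", SITE_PREFERENCE_SUB_TYPE, 4, 0, preference)


def decode_site_preference(raw: bytes) -> int:
    """Return the Preference Index value carried in an 8-octet Sub-TLV."""
    sub_type, length, _reserved, preference = struct.unpack("!HBBI", raw)
    if sub_type != SITE_PREFERENCE_SUB_TYPE or length != 4:
        raise ValueError("not a Site Preference Index Sub-TLV")
    return preference
```
]]></artwork>
        </figure>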
      </section>

      <section title="Site Physical Availability Index Metadata">
        <t>The Site Physical Availability Index indicates the percentage of
        impact on a group of routes associated with a common physical
        characteristic, for example, a pod, a row of server racks, a floor, or
        an entire DC. The purpose is to use one UPDATE message to indicate a
        group of routes of different NLRIs impacted by a physical event. For
        example, a power outage to a pod can cause the Site Physical
        Availability Index to be 0% for all the routes in the pod. A partial
        fiber cut to a row of shelves can cause the Site Physical Availability
        Index to be 50% for all the routes in those shelves. The value ranges
        from 0 to 100, with 100% indicating that the site is fully functional,
        0% indicating that the site is entirely out of service, and 50%
        indicating that the site is 50% degraded.</t>

        <t>It is recommended to assign one Site-ID to each route. Depending
        on the deployment, one DC can use the pod number as the Site-ID, while
        another DC can use the row of shelves as the Site-ID.</t>

        <t>Cloud site/pod failures and degradation include, but are not
        limited to, a site degradation or an entire site going down. The
        causes vary: a cut in the fiber connecting to the site or among pods,
        cooling failures, insufficient backup power, cyber attacks, too many
        changes outside of the maintenance window, etc. Fiber cuts are not
        uncommon within a Cloud site or between sites.</t>

        <t>When those failure events happen, the edge (egress) router itself
        is still running fine. Therefore, the ingress routers with paths to
        the egress router can't use BFD to detect the failures.</t>

        <t>When a failure occurs at an edge site (or a pod), many instances
        can be impacted. In addition, the routes (i.e., the IP addresses) in
        the site might not be aggregated nicely. Instead of sending many BGP
        UPDATE messages to the ingress routers, one for each impacted
        instance (i.e., route), the egress router can send one single BGP
        UPDATE to indicate the capacity availability of the site. The ingress
        routers can switch all or a portion of the instances associated with
        the site depending on how much the site is degraded.</t>

        <t>The BGP UPDATE for the individual instances (i.e., the routes) can
        include the Site Physical Availability Index solely for ingress
        routers to associate the routes with the Site-ID. The actual Site
        Physical Availability Index value, i.e., the percentage for all the
        routes associated with the Site-ID, is generated by the egress router
        with the egress router's loopback address as the NLRI.</t>

        <t>The Site Physical Availability Index Sub-TLV has a fixed length of
        4 octets. Therefore, there is no Length field.</t>

        <figure anchor="Site Physical Availability Index Sub-TLV"
                title="Site Physical Availability Index Sub-TLV">
          <artwork><![CDATA[ 
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      PhyAvailIdx Sub-Type     |I|         Reserved            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Site-ID (2 octets)     | Site Availability Percentage  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
	]]></artwork>
        </figure>

        <t><list style="hanging">
            <t hangText="- PhyAvailIdx:">Site-Physical-Availability-Index
            Sub-Type=2 (specified in this document).</t>

            <t hangText="- Site-ID:">an identifier for a group of routes
            associated with a common physical characteristic, for example, a
            pod, a row of server racks, a floor, or an entire DC. The purpose
            is to use one UPDATE message to indicate a group of routes
            impacted by a physical event. Those routes might be from different
            address families or NLRIs. There could be multiple sites connected
            to one egress router (a.k.a. Edge DC GW).</t>

            <t hangText="- RouteFlag-I:">a flag bit. When set to 1, the
            Sub-TLV is only for BGP speakers (receivers) to associate the
            routes with the Site-ID, and the Site Availability Percentage
            value is ignored. When set to 0, the BGP speakers (receivers)
            should apply the Site Availability Percentage value to all the
            routes associated with the Site-ID.</t>

            <t hangText="- Site Availability Percentage:">When RouteFlag-I
            is set to 1, the Site Availability Percentage is ignored by the
            ingress routers. When RouteFlag-I is set to 0, the Site
            Availability Percentage represents the percentage of the site
            availability for all the routes associated with the Site-ID,
            e.g., 100%, 50%, or 0%. When a site goes dark, the value is set
            to 0. A value of 50 means that the site is 50% functioning. When
            the value is outside the 0-100 range, the value carried in this
            Sub-TLV is ignored.</t>

            <t hangText="- Reserved:">bits reserved for future use. They are
            set to zero upon transmission and ignored upon reception.</t>
          </list></t>
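
        <t>The field rules above can be sketched in Python as follows. This
        is an illustration, not a reference implementation. The function
        names are hypothetical, and the position of RouteFlag-I (the
        high-order bit of the 16 bits that follow the Sub-Type) is read from
        the figure and treated as an assumption.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct

PHY_AVAIL_SUB_TYPE = 2
# Assumed from the figure: high-order bit of the 16 bits after Sub-Type.
ROUTE_FLAG_I = 0x8000


def encode_phy_avail(site_id: int, percentage: int,
                     associate_only: bool) -> bytes:
    """Pack Sub-Type, flag bits, Site-ID, and percentage (2 octets each)."""
    flags = ROUTE_FLAG_I if associate_only else 0
    return struct.pack("!HHHH", PHY_AVAIL_SUB_TYPE, flags,
                       site_id, percentage)


def decode_phy_avail(raw: bytes):
    """Return (site_id, percentage); percentage is None when ignored."""
    sub_type, flags, site_id, percentage = struct.unpack("!HHHH", raw)
    if sub_type != PHY_AVAIL_SUB_TYPE:
        raise ValueError("not a Site Physical Availability Index Sub-TLV")
    # Ignored when RouteFlag-I is 1 or the value is outside 0-100.
    if (flags & ROUTE_FLAG_I) or percentage > 100:
        return site_id, None
    return site_id, percentage
```
]]></artwork>
        </figure>

        <t>Returning None for an ignored percentage lets the caller keep the
        Site-ID association while leaving any previously learned availability
        value untouched.</t>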

        <section title="Site Index Associated to Routes">
          <t>An egress router sets itself as the next hop for a BGP peer
          before sending an UPDATE with the Metadata Path Attribute that
          includes the Site Physical Availability Index Sub-TLV. The Site
          Physical Availability Index Sub-TLV (with RouteFlag-I=1) is for
          ingress routers to associate the Site Identifier with the
          prefixes.</t>

          <t>However, it is not necessary to include the Site Physical
          Availability Index Sub-TLV in every BGP UPDATE message if there is
          no change to the Site Identifier or the Site Physical Availability
          value for the prefixes.</t>
        </section>

        <section title="BGP UPDATE with standalone Site Availability Index">
          <t>Upon receiving a BGP UPDATE message from Router-X containing the
          Metadata Path Attribute with the Site Physical Availability Index
          Sub-TLV, the next hop should be the loopback of Router-X. The local
          BGP peer uses local policy to evaluate the route (prefix and path
          attributes). When the local policy processes the Metadata Path
          Attribute with the Site Physical Availability Index, it uses the
          site availability value to reduce or increase the preference for
          all BGP routes whose next hop is the Router-X loopback.</t>

          <t>The BGP UPDATE with a standalone Site Availability Index is NOT
          intended for resolving NextHop.</t>
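
          <t>The association between routes and Site-IDs, and the effect of
          a standalone availability update, can be sketched in Python as
          follows. The class and method names are hypothetical. The sketch
          assumes that a site with no availability report is treated as fully
          available; this document does not specify that default.</t>

          <figure>
            <artwork><![CDATA[
```python
class SiteAvailabilityTable:
    """Tracks which routes belong to which (egress, Site-ID) pair."""

    def __init__(self):
        self._site_of_route = {}  # prefix -> (egress loopback, site_id)
        self._availability = {}   # (egress loopback, site_id) -> percent

    def associate(self, prefix, egress, site_id):
        # Learned from an UPDATE whose Sub-TLV has RouteFlag-I set to 1.
        self._site_of_route[prefix] = (egress, site_id)

    def set_availability(self, egress, site_id, percent):
        # Learned from a standalone UPDATE with RouteFlag-I set to 0.
        # Values outside 0-100 are ignored.
        if 0 <= percent <= 100:
            self._availability[(egress, site_id)] = percent

    def availability(self, prefix):
        # Assumption: no report yet means fully available (100).
        site = self._site_of_route.get(prefix)
        return self._availability.get(site, 100)
```
]]></artwork>
          </figure>

          <t>With this structure, one standalone update changes the
          availability seen by every route associated with the same egress
          router and Site-ID, without a per-route UPDATE.</t>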
        </section>
      </section>

      <section title="Service Delay Prediction Index">
        <t>It is desirable for an ingress router to select a site with the
        shortest processing time for an ultra-low latency service. But it is
        not easy to predict which site has "the fastest processing time" or
        "the shortest processing delay" for an incoming service request
        because:</t>

        <t><list style="hanging">
            <t hangText="-">The given service instance shares the same
            physical infrastructure with many other applications and service
            instances. Service requests from other applications or UEs, or
            the runtime behavior of other applications, can impact the
            processing time for the given service instance.</t>

            <t hangText="-">The given service instance can be served by a
            cluster of servers behind a Load Balancer. To the network, the
            service is identified by one service ID.</t>

            <t hangText="-">Service complexity differs. One service may call
            many microservices, access multiple backend databases, and go
            through sophisticated security scrubbing functions. Another
            service can be processed in a few simple steps. Without knowledge
            of the application's internal logic, it is not easy to estimate
            the processing time for future service requests.</t>
          </list></t>

        <t>Even though utilization measurements, like those below, are
        collected by most data centers, they cannot indicate which site has
        the shortest processing time. A service request might be processed
        faster on Site-A even if Site-A is overutilized.</t>

        <t><list style="hanging">
            <t hangText="o">Server utilization for the server where the
            instance is instantiated.</t>

            <t hangText="o">The network utilization for the links to the
            server where the instance is instantiated.</t>

            <t hangText="o">The number of databases that the service instance
            will access.</t>

            <t hangText="o">The memory utilization of the databases</t>
          </list></t>

        <t>The remaining available resources at a site are a more reasonable
        indication of the processing delay for future service requests.</t>

        <t><list style="hanging">
            <t hangText="o">The remaining available Server resources.</t>

            <t hangText="o">The remaining available capacity of the links
            to the server where the instance is instantiated.</t>

            <t hangText="o">The number of databases that the service instance
            will access.</t>

            <t hangText="o">The remaining storage available for the
            databases.</t>
          </list></t>

        <t>The Service Delay Prediction Index is a value that predicts
        processing delays at the site for future service requests. The higher
        the value, the longer the delay.</t>

        <section title="Service Delay Prediction Sub-TLV">
          <t>Although it is out of scope, this document assumes that there is
          an algorithm that can derive the Service Delay Prediction Index and
          that the result can be assigned to the egress router. When the
          Service Delay Prediction value is updated, for example because the
          available resources change, the egress router can attach the
          updated Service Delay Prediction value in a Sub-TLV under the
          Metadata Path Attribute of the BGP UPDATE message sent to the
          ingress routers.</t>

          <figure anchor="Service Delay Prediction Index Sub-TLV"
                  title="Service Delay Prediction Index Sub-TLV">
            <artwork><![CDATA[ 
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ServiceDelayPredict Sub-Type  |   Length      |F| Reserved    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Service Delay Prediction Value                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
]]></artwork>
          </figure>

          <list style="hanging">
            <t hangText="- ServiceDelayPredict:">(Service Delay Prediction)
            Sub-Type=3 (specified in this document).</t>

            <t hangText="- Flag (F):">A single-bit flag that indicates how
            the Service Delay Prediction Value is to be interpreted.</t>

            <t
            hangText="- Service Delay Prediction Value (when the Flag bit is set to 1):">an
            integer in the range of 0-100, with 0 indicating that the service
            delay is negligible and 100 indicating that the site has the most
            significant delay compared to all other sites for the same
            service. When the value is outside the 0-100 range, the value
            carried in this Sub-TLV is ignored.</t>

            <t
            hangText="- Service Delay Prediction Value (when the Flag bit is set to 0):">the
            estimated delay time as defined in RFC5905.</t>
          </list>
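
          <t>The two interpretations of the value field can be sketched in
          Python as follows. The sketch is illustrative only. The function
          names are hypothetical; the position of the F bit (the high-order
          bit of the octet after Length) is read from the figure; and the
          Length octet is assumed to count the 4 octets of the value. When F
          is 0, the sketch returns the 32-bit value uninterpreted.</t>

          <figure>
            <artwork><![CDATA[
```python
import struct

DELAY_PREDICT_SUB_TYPE = 3
# Assumed from the figure: high-order bit of the octet after Length.
FLAG_F = 0x80


def encode_delay_prediction(value: int, is_index: bool) -> bytes:
    """is_index=True sets F to 1 and carries a 0-100 index."""
    flags = FLAG_F if is_index else 0
    # Assumption: Length counts the 4 octets of the value field.
    return struct.pack("!HBBI", DELAY_PREDICT_SUB_TYPE, 4, flags, value)


def decode_delay_prediction(raw: bytes):
    """Return (is_index, value); value is None when it is ignored."""
    sub_type, _length, flags, value = struct.unpack("!HBBI", raw)
    if sub_type != DELAY_PREDICT_SUB_TYPE:
        raise ValueError("not a Service Delay Prediction Sub-TLV")
    is_index = bool(flags & FLAG_F)
    if is_index and value > 100:
        return True, None  # index outside 0-100 is ignored
    return is_index, value
```
]]></artwork>
          </figure>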
        </section>

        <section title="Service Delay Prediction Based on Load Measurement">
          <t>When the detailed running status of data centers is not exposed
          to the network operator, historic traffic patterns through the
          egress nodes can be used to predict the load on a specific service.
          For example, when the traffic volume to one service at one data
          center suddenly increases by a large percentage compared with the
          average of the past 24 hours, the increase is likely caused by a
          larger-than-normal demand for the service. When this happens,
          another data center with lower-than-average traffic volume for the
          same service might have a shorter processing time for that
          service.</t>

          <t>Here are some measurements that can be used to derive the
          Service Delay Prediction for a service ID:</t>

          <list style="hanging">
            <t hangText="-">Total number of packets to the attached service
            instance (ToPackets);</t>

            <t hangText="-">Total number of packets from the attached service
            instance (FromPackets);</t>

            <t hangText="-">Total number of bytes to the attached service
            instance (ToBytes);</t>

            <t hangText="-">Total number of bytes from the attached service
            instance (FromBytes);</t>

            <t hangText="-">The actual load measurement for the service
            instance attached to an egress router can be based on one of the
            metrics above or on all four metrics, with a different weight
            applied to each, such as:</t>

            <t>LoadIndex =
            w1*ToPackets+w2*FromPackets+w3*ToBytes+w4*FromBytes</t>

            <t>where w1, w2, w3, and w4 are each between 0 and 1, and w1 + w2
            + w3 + w4 = 1.</t>

            <t>The weight of each metric contributing to the index of the
            service instance attached to an egress router can be configured or
            learned by self-adjustment based on user feedback.</t>
          </list>

          <t>The Service Delay Prediction Index can be derived from
          LoadIndex/24Hour-Average. A higher value means a longer predicted
          delay. The egress router can use the ServiceDelayPredict Sub-TLV to
          signal to the ingress routers the delay prediction derived from the
          traffic pattern.</t>
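
          <t>As a non-normative illustration, the LoadIndex and the ratio to
          its 24-hour average could be computed as in the Python sketch below.
          The function names and the default of four equal weights are
          assumptions made for the example; this document does not specify
          weight values.</t>

          <figure>
            <artwork><![CDATA[
```python
def load_index(to_packets, from_packets, to_bytes, from_bytes,
               weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted LoadIndex over the four counters.

    The equal default weights are an assumption for illustration only.
    """
    if any(w < 0 or w > 1 for w in weights):
        raise ValueError("each weight must be between 0 and 1")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    w1, w2, w3, w4 = weights
    return (w1 * to_packets + w2 * from_packets
            + w3 * to_bytes + w4 * from_bytes)


def delay_prediction_index(current_load_index, average_24h_load_index):
    """Ratio of the current LoadIndex to its 24-hour average."""
    if average_24h_load_index <= 0:
        raise ValueError("24-hour average must be positive")
    return current_load_index / average_24h_load_index
```
]]></artwork>
          </figure>

          <t>With these definitions, a current LoadIndex of 300 against a
          24-hour average of 100 yields a ratio of 3, i.e., a longer predicted
          delay than a site whose ratio is close to 1.</t>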

          <t>Note: The proposed IP layer load measurement is only an estimate
          based on the amount of traffic through the egress router, which
          might not truly reflect the load of the servers attached to the
          egress router. The measurements are listed here only for special
          deployments in which those metrics help the ingress routers select
          the optimal paths.</t>
        </section>

        <section title="Raw Load Measurement Sub-TLV">
          <t>When ingress routers have an embedded analytics tool that relies
          on the raw measurements, it is useful for the egress router to send
          the raw measurements.</t>

          <t>Raw Load Measurement Sub-TLV has the following format:</t>

          <figure anchor="Raw Load Measurement Sub-TLV"
                  title="Raw Load Measurement Sub-TLV">
            <artwork><![CDATA[ 
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  | Raw-Load-Measurement Sub-Type | Length        |  Reserved     |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                   Measurement Period                          |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
  |           total number of packets to the Edge Service         |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
  |           total number of packets from the Edge Service       |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  
  |           total number of bytes to the Edge Service           |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
  |           total number of bytes from the Edge Service         |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
          </figure>

          <t>- Raw-Load-Measurement Sub-Type =4 (specified in this document):
          Raw measurements of packets/bytes to/from the edge service
          address.</t>

          <t>- The receiver nodes can compute the Service Delay Prediction for
          the Service based on the raw measurements sent from the egress node
          and preconfigured algorithms.</t>

          <t>- Measurement Period: the BGP Update period, in seconds, or a
          user-specified period.</t>
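
          <t>A non-normative Python sketch of how an egress router might pack
          this Sub-TLV is given below. The field order follows the figure
          above. The 16-bit Sub-Type width, the reading of Length as the
          number of octets that follow the Length field (21 here), and the
          function name are assumptions made for the example.</t>

          <figure>
            <artwork><![CDATA[
```python
import struct

RAW_LOAD_MEASUREMENT_SUBTYPE = 4  # Sub-type value given in the text


def encode_raw_load_measurement(period_s, to_packets, from_packets,
                                to_bytes, from_bytes):
    """Pack the Sub-TLV in the field order shown in the figure.

    Assumptions: 16-bit Sub-Type, 8-bit Length counting the octets after
    the Length field (1 Reserved octet + 5 * 4 octets = 21), Reserved = 0.
    Values that do not fit in 32 bits raise struct.error.
    """
    header = struct.pack("!HBB", RAW_LOAD_MEASUREMENT_SUBTYPE, 21, 0)
    body = struct.pack("!5I", period_s, to_packets, from_packets,
                       to_bytes, from_bytes)
    return header + body
```
]]></artwork>
          </figure>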
        </section>
      </section>

      <section title="Service-Oriented Capability Sub-TLV">
        <t>The service-oriented capability Sub-TLV is for distributing
        information regarding the capabilities of a specific service in a
        deployment environment. Depending on the deployment, a deployment
        environment can be an edge site or other types of environments. This
        information provides ingress routers or controllers with the
        available resources for the specific service in each deployment
        environment. It enables them to make well-informed decisions for the
        optimal paths to the selected deployment environment.</t>

        <t>Currently, the Sub-TLV only has an abstract value derived from
        various metrics, although the specifics of this derivation are beyond
        the scope of this document. Importantly, this value is significant
        only when comparing multiple data center sites for the same service;
        it is not meaningful when comparing different services, meaning the
        capability value relevant to Service A cannot be directly compared
        with that for Service B. Future enhancements may expand this sub-TLV
        to include more types of metrics or even raw data that represents
        direct metrics. This information is important in 5G network
        environments where efficient resource utilization is crucial for
        enhancing performance and service quality.</t>

        <figure anchor="Service-Oriented Capability Sub-TLV"
                title="Service-Oriented Capability Sub-TLV">
          <artwork><![CDATA[ 
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ServiceOriented Cap Sub-Type  |   Length      |A| Reserved    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  SO-CapValue  |            Reserved                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
]]></artwork>
        </figure>

        <list style="hanging">
          <t hangText="- ServiceOriented Cap:">(Service-Oriented Capability)
          Sub-type=5 (specified in this document).</t>

          <t hangText="- Flag (A):">Abstract value flag. When it is set to 1,
          it indicates that the SO-CapValue field's value is the
          Service-Oriented Capability Abstract Value. When the "A" flag is set
          to 0, the Capability Abstract Value is not included. In the future,
          other values can be added when the flag is set to 0.</t>

          <t
          hangText="- SO-CapValue (when the Flag bit is set to 1):">Service-Oriented
          Capability Abstract Value is an integer in the range of 0-100, with
          0 indicating that the site has the lowest capability for the service
          and 100 indicating that the site has the highest capability for the
          service, relative to all other sites that host the service. When the
          value is outside the 0-100 range, the value carried in this Sub-TLV
          is ignored.</t>
        </list>
      </section>

      <section title="Service-Oriented Utilization Sub-TLV">
        <t>The "Service-Oriented Utilization Sub-TLV" is for distributing a
        metric that measures the real-time utilization of resources allocated
        for processing specific services or applications at an edge site. This
        Sub-TLV complements the "Service-Oriented Capability Sub-TLV"
        described in Section 4.6, which addresses the static resource
        capability of a site for a service. While the Capability Abstract
        Value provides a baseline understanding of a site's potential to
        handle a service, the Utilization metric offers a dynamic perspective
        by quantifying how much of this capacity is currently being consumed.
        This distinction is crucial for managing resource efficiency and
        responsiveness in network operations, ensuring that capabilities are
        not only available but also optimally used to meet the actual service
        demands.</t>

        <figure anchor="Service-Oriented Utilization Sub-TLV"
                title="Service-Oriented Utilization Sub-TLV">
          <artwork><![CDATA[ 
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ServiceOriented Util Sub-Type |   Length      |P| Reserved    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  SO-UtilValue |            Reserved                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
]]></artwork>
        </figure>

        <list style="hanging">
          <t hangText="- ServiceOriented Util Sub-Type:">(Service-Oriented
          Utilization Sub-Type) Sub-type=6 (specified in this document).</t>

          <t hangText="- Flag (P):">is a single-bit Percentage flag. When it
          is set to 1, it indicates the value is the Service-Oriented
          Utilization in percentage. When the "P" flag is set to 0, the value
          in this Sub-TLV is the abstract value of the consumed resource.</t>

          <t
          hangText="- SO-UtilValue (when the Flag bit is set to 1):">Service-Oriented
          Utilization Value is a percentage (0-100), with 0 indicating that 0%
          of the capability has been consumed and 100 indicating that 100% has
          been consumed. When the value is outside the 0-100 range, the value
          carried in this Sub-TLV is ignored. For example, when the capacity
          value is 50 and the SO-UtilValue is 50 with the P flag set, 50% of
          the 50 units of resource is consumed, leaving 25 units of resource
          available at this site for the service. When the P flag is 0, the
          value of this field is the abstract value of the consumed resource.
          For example, when the capacity value is 50 and the SO-UtilValue is
          50, all of the resource has been consumed.</t>
        </list>
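
        <t>The two interpretations of SO-UtilValue can be illustrated with
        the non-normative Python sketch below. The function name and its
        return convention are assumptions made for the example; the capacity
        argument stands for the capacity value used in the examples
        above.</t>

        <figure>
          <artwork><![CDATA[
```python
def consumed_and_available(capacity, so_util_value, p_flag):
    """Return (consumed, available) in the same abstract units as capacity.

    Returns None when P=1 and the value is outside 0-100, since the
    value carried in the Sub-TLV is then ignored.
    """
    if p_flag:
        if not 0 <= so_util_value <= 100:
            return None
        # P=1: the value is a percentage of the capacity.
        consumed = capacity * so_util_value / 100
    else:
        # P=0: the value is already the abstract amount consumed.
        consumed = so_util_value
    return consumed, capacity - consumed
```
]]></artwork>
        </figure>

        <t>For a capacity value of 50 and an SO-UtilValue of 50, the sketch
        reports 25 units consumed and 25 available when the P flag is set,
        and 50 consumed and 0 available when it is not.</t>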
      </section>
    </section>

    <section title="Service Metadata Influenced Decision Process">
      <section title="Egress Node Behavior">
        <t>Multiple instances of the same service could be attached to one
        egress router. When all instances of the same service are grouped
        behind one application layer load balancer, they appear to the egress
        router as one single route, i.e., the application load balancer's
        prefix. In this scenario, the compute metrics for all the instances
        behind the application layer load balancer are aggregated by the load
        balancer and are visible to the egress router as associated with the
        load balancer's prefix. How the application layer load balancer
        distributes the traffic among different instances is out of the scope
        of this document. When multiple instances of the same service are
        reachable from the egress router over different paths or links,
        multiple groups of metrics from the respective paths could be exposed
        to the egress router. The egress router can have preconfigured
        policies for aggregating the metrics from different paths and
        corresponding policies for selecting a path on which to forward the
        packets received from ingress routers. The aggregated metrics can be
        carried in the BGP UPDATE messages instead of detailed measurements to
        reduce the number of entries advertised by the control plane and to
        dampen route updates in the forwarding plane. Upon receiving packets
        from ingress routers, the egress router can use its policies to choose
        an optimal path to one service instance. How the measurements are
        aggregated on egress routers, and how ingress routers are configured
        with the algorithms to integrate the aggregated metrics with network
        layer metrics, are out of the scope of this document.</t>

        <t>Many measurements could impact and correspondingly reflect service
        performance. To simplify the selection process, egress routers can
        have preconfigured policies or algorithms to aggregate multiple
        metrics into one simple metric that is advertised to ingress routers.
        Though out of the scope of this document, an egress router can also
        have an algorithm to convert multiple metrics to a network metric,
        such as an IGP cost for each instance, to pass to ingress nodes. This
        decision-making
        process integrates network metrics computed by traditional IGP/BGP and
        the service delay metrics from egress routers to achieve a
        well-informed and adaptive routing approach. This intelligent
        orchestration at the edge enhances the service's overall performance
        and optimizes resource utilization across the distributed
        infrastructure. When the egress has merged the compute metrics from
        the local sites behind it, it can include one or more aggregated
        compute metrics in the Metadata Path Attribute in the BGP UPDATE to
        the Ingress. Also, an identifier or flag can be carried to indicate
        that the metrics are merged ones. After receiving the routes for the
        Service ID with the identifier, the ingress would do the route
        selection based on pre-configured algorithms (see Section 3 of this
        document).</t>
      </section>

      <section title="Integrating Network Delay with the Service Metrics">
        <t>As the service metrics and network delays are in different units,
        the following is an example algorithm that an ingress router can use
        to compare the cost of reaching the service instances at Site-i and
        Site-j.</t>

        <artwork><![CDATA[
                ServD-i * CP-j               Pref-j * NetD-i
Cost-i=min(w *(----------------) + (1-w) *(------------------))
                ServD-j * CP-i               Pref-i * NetD-j  

	]]></artwork>

        <list style="hanging">
          <t hangText="CP-i:">Capacity Availability Index at Site-i. A higher
          value means higher capacity available.</t>

          <t hangText="NetD-i:">Network latency measurement (RTT) to the
          Egress Router at the site-i.</t>

          <t hangText="Pref-i:">Preference Index for Site-i, a higher value
          means higher preference.</t>

          <t hangText="ServD-i:">Service Delay Prediction Index at Site-i for
          the service, i.e., the ANYCAST address [RFC4786] for the
          service.</t>

          <t hangText="w:">Weight is a value between 0 and 1. If smaller than
          0.5, Network latency and the site Preference have more influence;
          otherwise, Service Delay and capacity availability have more
          influence.</t>
        </list>
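
        <t>The weighted expression inside the formula above can be sketched
        in Python as follows. This is a non-normative illustration: the
        function name and the dictionary keys are assumptions made for the
        example, the inputs are assumed to be non-zero, and the outer min()
        of the formula is left out because its range is not spelled out
        here.</t>

        <figure>
          <artwork><![CDATA[
```python
def relative_cost(site_i, site_j, w):
    """Weighted expression of the formula for Site-i measured against Site-j.

    Each site is a dict with the keys "ServD", "CP", "Pref" and "NetD".
    All values are assumed to be non-zero.
    """
    if not 0 <= w <= 1:
        raise ValueError("w must be between 0 and 1")
    # Service term: grows with Site-i's predicted delay, shrinks with
    # Site-i's available capacity.
    service_term = ((site_i["ServD"] * site_j["CP"])
                    / (site_j["ServD"] * site_i["CP"]))
    # Network term: grows with Site-i's network latency, shrinks with
    # Site-i's preference.
    network_term = ((site_j["Pref"] * site_i["NetD"])
                    / (site_i["Pref"] * site_j["NetD"]))
    return w * service_term + (1 - w) * network_term
```
]]></artwork>
        </figure>

        <t>When the two sites differ only in ServD, with Site-i at half the
        predicted delay of Site-j, and w is 0.5, the expression evaluates to
        0.75 for Site-i against Site-j and to 1.5 for Site-j against Site-i,
        so Site-i is the cheaper choice.</t>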

        <t>When a set of service Metadata is converted to a simple metric, a
        decision process is determined by the metric semantics and deployment
        situations. The goal is to integrate the conventional network decision
        process with the service Metadata into a unified decision-making
        process for path selection.</t>
      </section>

      <section title="Integrating with BGP decision process">
        <t>When an ingress router receives BGP updates for the same IP address
        from multiple egress routers, all those egress routers are considered
        as the next hops for the IP address. For the selected services
        configured to be influenced by the edge service Metadata, the ingress
        router BGP Decision process [IDR-CUSTOM-DECISION] would trigger the
        edge service Management function to compute the weight to be applied
        to the route's next hop in the forwarding plane. The decision process
        is influenced by the edge service Metadata associated with the client
        routes, such as the Capacity Availability Index, Site Preference, and
        Service Delay Prediction Index, in addition to the attributes used by
        the traditional BGP multipath computation, such as Weight, Local
        Preference, Origin, and MED, as shown below:</t>

        <figure anchor="Metadata Influenced Decision"
                title="Metadata Influenced Decision">
          <artwork><![CDATA[ 
		     BGP ANYCAST Update 
   +--------+ with Metadata    +---------------+
   | BGP    |----------------->| EdgeServiceMgn| 
   |Decision|< - - - - - - - - |               | 
   +---^-|--+                  +-------|-------+ 
       | | BGP ANYCAST                 | Update Anycast
       | | Route                       | Route Nexthops
       | | Multi-path NH install       | with weight 
   +---|-V--+                          |
   |   RIB  |                          |
   +----+---+                          |
        |                              |
    +---V------------------------------V-------+ 
    |               Forwarding Plane           | 
    |                                          | 
    +------------------------------------------+  
]]></artwork>
        </figure>

        <t>When any of those metadata values goes to 0, the effect is the same
        as the routes via the egress router that originated the metadata
        UPDATE becoming ineligible. When a metadata value merely degrades,
        the egress router can still remain the optimal next hop, although
        this is less likely.</t>

        <t>Suppose the destination address aa08::4450 can be reached via
        three next hops (R1, R2, R3). Further, suppose the local BGP Decision
        Process, based on the traditional network layer policies and metrics,
        identifies R1 as the optimal next hop for this destination
        (aa08::4450). If the edge service Metadata results in R2 being the
        optimal next hop for the prefix, the Forwarding Plane will have R2 as
        the next hop for the destination address aa08::4450.</t>

        <t>The edge service Metadata influencing next hop selection is
        different from the metric (or weight) to the next hop. The metric to a
        next hop can impact many (sometimes tens of thousands of) routes that
        have the node as their next hop, whereas the edge service Metadata
        only impacts the optimal next hop selection for the subset of client
        routes that are identified as edge services.</t>

        <t>When the BGP custom decision [IDR-CUSTOM-DECISION] is used, the
        edge service Management function would have an algorithm to combine
        the edge service Metadata attributes with the custom decision to
        derive the optimal next hop for the edge service routes.</t>

        <t>Note: For a BGP UPDATE message that includes the edge service
        Metadata Path Attribute with the RouteFlag-I=0 and the egress router's
        loopback prefix as the NLRI, the Site Capacity Availability Index
        value is applied to all the routes associated with the Site-ID.</t>
      </section>
    </section>

    <section title="Service Metadata Propagation Scope">
      <t>Service Metadata are only distributed to the relevant ingress nodes
      interested in the Service, which can be configured or automatically
      formed.</t>

      <t>For each registered low-latency Service, BGP RT Constrained
      Distribution [RFC4684] can be used to form the Group interested in the
      Service. The "Service ID", an IP address prefix, is the Route Target.
      When an ingress router receives the first packet of a flow destined to a
      Service ID (i.e., IP prefix), the ingress router sends a BGP UPDATE that
      advertises the Route Target membership NLRI per [RFC4684]. The ingress
      router must assign a Timer for the Service ID, as the UE that uses the
      Service ID might move away. Upon receiving a packet destined for the
      Service ID, the ingress router must refresh the Timer. The ingress
      router must send a BGP Withdraw UPDATE for the Service ID upon
      expiration of the Timer.</t>
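
      <t>The timer handling described above can be sketched in Python as
      follows. This is a non-normative illustration: the class name, the
      method names, and the timeout value are assumptions made for the
      example (this document does not define a timeout), and sending the
      actual BGP UPDATE messages is left to the caller.</t>

      <figure>
        <artwork><![CDATA[
```python
import time


class ServiceInterestTable:
    """Tracks which Service IDs this ingress router is still interested in."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s  # hypothetical, operator-chosen value
        self.clock = clock
        self.expiry = {}  # Service ID prefix -> time at which interest lapses

    def on_packet(self, service_id):
        """Refresh the timer for service_id.

        Returns True for the first packet of a Service ID, i.e., when the
        caller needs to advertise the Route Target membership NLRI.
        """
        first_packet = service_id not in self.expiry
        self.expiry[service_id] = self.clock() + self.timeout_s
        return first_packet

    def expire(self):
        """Return the Service IDs whose timers ran out (to be withdrawn)."""
        now = self.clock()
        lapsed = [sid for sid, t in self.expiry.items() if t <= now]
        for sid in lapsed:
            del self.expiry[sid]
        return lapsed
```
]]></artwork>
      </figure>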

      <t>[RFC4684] specifies SAFI=132 for the Route Target membership NLRI
      Advertisements.</t>
    </section>

    <section title="Minimum Interval for Metrics Change Advertisement">
      <t>As metric changes can affect path selection, the Minimum Interval
      for Metrics Change Advertisement is configured to control the update
      frequency and avoid route oscillations. The default is 30 seconds.</t>
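
      <t>One way to enforce such an interval is sketched in Python below.
      This is a non-normative illustration: the class and method names are
      assumptions made for the example, and a suppressed change is simply
      reported to the caller, which may retry later.</t>

      <figure>
        <artwork><![CDATA[
```python
import time


class MetricAdvertisementPacer:
    """Suppresses metric updates sent sooner than the minimum interval."""

    def __init__(self, min_interval_s=30.0, clock=time.monotonic):
        self.min_interval_s = min_interval_s  # default of 30 seconds
        self.clock = clock
        self.last_sent = None    # time of the last advertisement
        self.last_value = None   # metric value last advertised

    def should_advertise(self, value):
        """Return True if the changed value may be advertised now."""
        if value == self.last_value:
            return False         # nothing changed since the last update
        now = self.clock()
        if (self.last_sent is not None
                and now - self.last_sent < self.min_interval_s):
            return False         # too soon; the caller may retry later
        self.last_sent, self.last_value = now, value
        return True
```
]]></artwork>
      </figure>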

      <t>Significant load changes at EC data centers can be triggered by
      short-term gatherings of UEs, like conventions, lasting a few hours or
      days, which are too short to justify adjusting EC server capacities
      among DCs. Therefore, the load metrics change rate can be in the
      magnitude of hours or days.</t>
    </section>

    <section title="Validation and Error Handling">
      <t>In addition to the Error Handling procedure described in [RFC7606], a
      BGP speaker should ignore the Metadata Path Attribute if more than one
      Metadata Path Attribute is within one BGP Update message.</t>

      <t>The Metadata Path Attribute contains a sequence of Sub-TLVs. The
      Metadata Path Attribute's length determines the total number of octets
      for all the Sub-TLVs under the Metadata Path Attribute. The sum of the
      lengths from all the Sub-TLVs under the Metadata Path Attribute should
      equal the length of the Metadata Path Attribute. If this is not the
      case, the TLV should be considered malformed, and the
      "Treat-as-withdraw" procedure of [RFC7606] is applied.</t>

      <t>When more than one Sub-TLV is present in a Metadata Path Attribute,
      they are processed independently. If a Metadata Path Attribute can be
      parsed correctly but contains a Sub-TLV whose type is not recognized by
      a particular BGP speaker, that BGP speaker MUST NOT consider the
      attribute malformed. Instead, it MUST interpret the attribute as if that
      Sub-TLV had not been present. Logging the error locally or to a
      management system is optional. If the route carrying the Metadata Path
      Attribute is propagated with the attribute, the unrecognized Sub-TLV
      remains in the attribute.</t>
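
      <t>The length check and the handling of unrecognized Sub-TLVs can be
      sketched in Python as follows. This is a non-normative illustration
      under stated assumptions: a Sub-TLV header of a 16-bit Sub-Type and an
      8-bit Length, with Length counting the octets that follow the Length
      field. The function name and the returned status strings are also
      assumptions made for the example.</t>

      <figure>
        <artwork><![CDATA[
```python
import struct

KNOWN_SUBTYPES = {1, 2, 3, 4, 5, 6}  # Sub-types listed in this document


def parse_metadata_attribute(value: bytes):
    """Split a Metadata Path Attribute value into (sub_type, body) pairs.

    Returns ("ok", sub_tlvs) or ("treat-as-withdraw", []).
    Assumed header: 16-bit Sub-Type, 8-bit Length counting the octets
    that follow the Length field.
    """
    sub_tlvs, offset = [], 0
    while offset < len(value):
        if len(value) - offset < 3:
            return "treat-as-withdraw", []  # truncated Sub-TLV header
        sub_type, length = struct.unpack_from("!HB", value, offset)
        offset += 3
        if offset + length > len(value):
            return "treat-as-withdraw", []  # lengths exceed the attribute
        if sub_type in KNOWN_SUBTYPES:
            sub_tlvs.append((sub_type, value[offset:offset + length]))
        # Unrecognized Sub-TLVs are skipped, not treated as errors.
        offset += length
    return "ok", sub_tlvs
```
]]></artwork>
      </figure>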
    </section>

    <section title="Manageability Considerations">
      <t>The edge service Metadata described in this document are only
      intended for propagation between ingress and egress routers of one
      single BGP domain, i.e., the 5G Local Data Network (LDN), which is a
      limited domain with edge services a few hops away from the ingress
      nodes. Only selected services used by UEs are considered 5G edge
      services. The 5G LDN is usually managed by one operator, even though
      the routers can be from different vendors.</t>
    </section>

    <section title="Security Considerations">
      <t>The proposed edge service Metadata are advertised within the trusted
      domain of 5G LDN's ingress and egress routers. The ingress routers
      should not propagate the edge service Metadata to any nodes that are not
      within the trusted domain.</t>

      <t>To prevent the BGP UPDATE receivers (a.k.a. ingress routers in this
      document) from leaking the Metadata Path Attribute by accident to nodes
      outside the trusted domain [ATTRIBUTE-ESCAPE], the following practice
      should be enforced:</t>

      <list style="hanging">
        <t hangText="-">The Metadata Path Attribute originator sets the
        attribute as Non-transitive when sending the BGP UPDATE message to its
        corresponding RR. According to [RFC4271], Non-transitive Path
        Attributes are only guaranteed to be dropped during BGP route
        propagation by implementations that do not recognize them.</t>

        <t hangText="-">The RR (Route Reflector) can append the NO-ADVERTISE
        well-known community to the BGP UPDATE message with Metadata Path
        Attribute when forwarding to the ingress routers. By doing so, the
        Route Reflector signals to ingress nodes that the associated route's
        Metadata Path Attribute should not be further advertised beyond their
        scope. This precautionary measure ensures that the receiver of the BGP
        UPDATE message refrains from forwarding the received update to its
        peers, preventing the undesired propagation of the information carried
        by the Metadata Path Attribute.</t>
      </list>

      <t>BGP Route Filtering or BGP Route Policies [RFC5291] can also be used
      to ensure that BGP update messages with Metadata Path Attribute attached
      do not get forwarded out of the administrative domain. BGP route
      filtering [RFC5291] allows network administrators to control the
      advertisements and acceptance of BGP routes, ensuring that specific
      routes do not leak outside the intended administrative domain. Here are
      the steps to achieve this:</t>

      <list style="hanging">
        <t hangText="-">Use Route Filtering: Implement route filtering
        policies on the ingress routers to restrict the propagation of BGP
        update messages for the registered 5G edge services beyond the
        administrative domain. Access control lists (ACLs), prefix lists, or
        route maps can be used to filter the BGP routes classified as 5G edge
        services, i.e., the routes that need the Metadata Path Attributes to
        be distributed from egress routers to ingress routers.</t>

        <t hangText="-">Filter by Prefix: Use prefix filtering to specify
        which IP prefixes should be advertised to peers and which should be
        suppressed. This step ensures that only authorized routes are sent to
        external peers.</t>

        <t hangText="-">Use Route Maps: Route maps provide a flexible way to
        filter and manipulate BGP route advertisements. Route maps can be
        created to match specific conditions and then applied to the BGP
        configuration.</t>
      </list>
    </section>

    <section title="IANA Considerations">
      <section title="Metadata Path Attribute">
        <t>IANA is requested to assign a new path attribute from the "BGP Path
        Attributes" registry. The symbolic name of the attribute is
        "Metadata", and the reference is [This Document].</t>

        <artwork><![CDATA[
    +=======+======================================+=================+
    | Value |             Description              |    Reference    |
    +=======+======================================+=================+
    |  TBD1 |      Metadata Path Attribute         | [this document] |
    +-------+--------------------------------------+-----------------+ 
	]]></artwork>
      </section>

      <section title="Metadata Path Attribute Sub-Types">
        <t>IANA is requested to create a new sub-registry under the Metadata
        Path Attribute registry as follows:</t>

        <list style="hanging">
          <t hangText="Name:">Sub-TLVs under the "Metadata Path Attribute"</t>

          <t hangText="Registration Procedure:">Expert Review [RFC8126].</t>

          <t>Detailed Expert Review procedure will be added per RFC8126.</t>

          <t hangText="Reference:">[this document]</t>
        </list>

        <artwork><![CDATA[
+========+==========================+=================+
|Sub-Type|   Description            | Reference       |
+========+==========================+=================+
|      0 | reserved                 | [this document] |
+--------+--------------------------+-----------------+
|      1 | Site Preference Index    | [this document] |
+--------+--------------------------+-----------------+
|      2 | Site Availability Index  | [this document] |
+--------+--------------------------+-----------------+
|      3 | Service Delay Prediction | [this document] |
+--------+--------------------------+-----------------+
|      4 | Raw Load Measurement     | [this document] |
+--------+--------------------------+-----------------+ 
|      5 | Service-Oriented         |                 |
|        | Capability               | [this document] |
+--------+--------------------------+-----------------+
|      6 | Service-Oriented         |                 |
|        | Utilization              | [this document] |
+--------+--------------------------+-----------------+
|  7-254 | unassigned               | [this document] |
+--------+--------------------------+-----------------+
|    255 | reserved                 | [this document] |
+--------+--------------------------+-----------------+
	]]></artwork>
      </section>
    </section>

    <section title="Acknowledgements">
      <t>Acknowledgements to Jeff Haas, Tom Petch, Adrian Farrel, Alvaro
      Retana, Robert Raszuk, Sue Hares, Shunwan Zhuang, Donald Eastlake, Dhruv
      Dhody, Cheng Li, DongYu Yuan, and Vincent Shi for their suggestions and
      contributions.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.2119"?>

      <?rfc include="reference.RFC.4271"?>

      <?rfc include="reference.RFC.4360"?>

      <?rfc include="reference.RFC.4760"?>

      <?rfc include="reference.RFC.4786"?>

      <?rfc include="reference.RFC.5291"?>

      <?rfc include="reference.RFC.7606"?>

      <?rfc include="reference.RFC.8126"?>

      <?rfc include="reference.RFC.8277"?>

      <?rfc include="reference.RFC.9012"?>
    </references>

    <references title="Informative References">
      <reference anchor="IANA-BGP-PARAMS">
        <front>
          <title>BGP Path Attributes</title>

          <author>
            <organization abbrev="IANA">Internet Assigned Numbers
            Authority</organization>
          </author>
        </front>

        <seriesInfo name="BGP Path Attributes"
                    value="https://www.iana.org/assignments/bgp-parameters/"/>
      </reference>

      <reference anchor="ATTRIBUTE-ESCAPE"
                 target="https://datatracker.ietf.org/doc/draft-haas-idr-bgp-attribute-escape/">
        <front>
          <title>BGP Attribute Escape</title>

          <author>
            <organization>J. Haas</organization>
          </author>

          <date month="July" year="2023"/>
        </front>
      </reference>

      <?rfc include="reference.RFC.2042"?>

      <?rfc include="reference.RFC.4684"?>

      <?rfc include="reference.RFC.8174"?>

      <?rfc include="reference.RFC.8799"?>

      <reference anchor="TS.23.501-3GPP">
        <front>
          <title>System Architecture for 5G System; Stage 2, 3GPP TS 23.501
          v2.0.1</title>

          <author>
            <organization>3rd Generation Partnership Project
            (3GPP)</organization>
          </author>

          <date month="December" year="2017"/>
        </front>
      </reference>

      <reference anchor="IDR-CUSTOM-DECISION"
                 target="https://datatracker.ietf.org/doc/draft-ietf-idr-custom-decision/">
        <front>
          <title>BGP Custom Decision Process</title>

          <author>
            <organization>A. Retana, R. White</organization>
          </author>

          <date month="August" year="2017"/>
        </front>
      </reference>
    </references>
  </back>
</rfc>
