Network design

How we set up our network

        |      +-+-----+
        |        |       +-------+
  +-----+-+      | +-----|pl-klc1|
  |us-pno1|----+ | |     +-------+
  +-------+    | | |


Our eBGP sessions are mostly set up using extended next hop. We currently set only the route origin country and region BGP communities. Route selection is mostly based on the shortest AS path. To find out more, read about FRR’s route selection algorithm.

IGP and iBGP

The iBGP sessions in our network are set up with extended next hop as well. I am planning to set up MPLS between our nodes.

IGP sessions are set up with the Babel routing protocol. We chose Babel because it can route IPv4 and IPv6 over a single session. The rxcost is based on the ping round-trip time to the respective node. For example, the ping from at-vie1 to us-pno1 is about 143 ms, so the rxcost for that interface is 143.

Important: if you want to use Babel in your network, make sure to only export routes that don’t originate from BGP. Otherwise you will cause route hijacking, because the AS path information is stripped when the routes are sent over Babel.

If you want to know more about route hijacking, read this blog post from lantian.

To add routes originating from the node, we use a dummy interface (provided by the dummy kernel module; we spent way too much time trying to find it).

To learn more about Babel and IGPs, jlu5 has an excellent blog post, which also helped us massively in setting up our internal routing.

Configuration examples

To separate internal and external routes, we are rejecting all routes that originate from BGP:

router babel
 distribute-list internal in
 distribute-list internal out
 ipv6 distribute-list internalv6 in
 ipv6 distribute-list internalv6 out
 redistribute ipv4 connected
 redistribute ipv6 connected

ip prefix-list internal seq 1001 permit ge 27 le 32
ip prefix-list internal seq 9999 deny le 32
ipv6 prefix-list internalv6 seq 1001 permit fd42:deca:fbad::/48 ge 48 le 128
ipv6 prefix-list internalv6 seq 9999 deny ::/0 le 128
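To make the filtering above easier to reason about, here is a minimal Python sketch of how a prefix list with ge/le bounds is evaluated: entries are checked in sequence order and the first match decides. The function names `matches` and `evaluate` and the example routes are made up for illustration; this is not FRR code, only a model of the matching rules as we understand them.

```python
import ipaddress

# The internalv6 prefix list from the config above: (seq, action, prefix, ge, le)
INTERNAL_V6 = [
    (1001, "permit", "fd42:deca:fbad::/48", 48, 128),
    (9999, "deny", "::/0", None, 128),
]


def matches(route, base, ge=None, le=None):
    """Return True if `route` matches the entry `base [ge N] [le M]`."""
    route = ipaddress.ip_network(route)
    base = ipaddress.ip_network(base)
    if route.version != base.version:
        return False
    # The route has to lie inside the base prefix.
    if not route.subnet_of(base):
        return False
    # Without ge/le only an exact prefix length matches.
    if ge is None and le is None:
        return route.prefixlen == base.prefixlen
    low = ge if ge is not None else base.prefixlen
    high = le if le is not None else route.max_prefixlen
    return low <= route.prefixlen <= high


def evaluate(prefix_list, route):
    """Walk the entries by sequence number; the first match wins."""
    for _seq, action, base, ge, le in sorted(prefix_list, key=lambda e: e[0]):
        if matches(route, base, ge, le):
            return action
    return "deny"  # nothing matched: implicit deny


if __name__ == "__main__":
    # Hypothetical routes: one inside our /48, one from somebody else.
    for candidate in ("fd42:deca:fbad:1::/64", "fd42:d42:d42::/48"):
        print(candidate, evaluate(INTERNAL_V6, candidate))
```

A route inside fd42:deca:fbad::/48 hits seq 1001 and is permitted, while anything else falls through to seq 9999 and is denied, so routes learned over BGP never enter Babel.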

An example of the rxcost:

famfo@frog :: ~ » ping                        
PING ( 56(84) bytes of data.
64 bytes from ( icmp_seq=1 ttl=45 time=142 ms
64 bytes from ( icmp_seq=2 ttl=45 time=144 ms
64 bytes from ( icmp_seq=3 ttl=45 time=143 ms
64 bytes from ( icmp_seq=4 ttl=45 time=144 ms
--- ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3001ms
rtt min/avg/max/mdev = 141.549/142.846/143.765/0.882 ms

interface igp_bagpipe
 babel rxcost 143
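The step from the ping output to the rxcost value can be sketched in Python. This is only an illustration and rests on assumptions: that the average RTT is the number used, that it is rounded to whole milliseconds, and that the value is clamped to the range 1 to 65534 which FRR accepts for `babel rxcost`. The helper name `rxcost_from_ping` is made up; we set the value by hand.

```python
import re

# Assumed valid range for `babel rxcost` in FRR.
RXCOST_MIN, RXCOST_MAX = 1, 65534


def rxcost_from_ping(summary_line):
    """Derive an rxcost from the 'rtt min/avg/max/mdev' line of Linux ping."""
    m = re.search(r"= ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms", summary_line)
    if m is None:
        raise ValueError("no rtt summary found in ping output")
    avg_ms = float(m.group(2))  # second field is the average RTT
    # Assumption: round to whole milliseconds and keep within the valid range.
    return min(RXCOST_MAX, max(RXCOST_MIN, round(avg_ms)))


if __name__ == "__main__":
    line = "rtt min/avg/max/mdev = 141.549/142.846/143.765/0.882 ms"
    print(f" babel rxcost {rxcost_from_ping(line)}")
```

With the summary line from the ping above, the average of 142.846 ms rounds to 143, which is the value in the interface configuration.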

The (potentially outdated) configuration of all of our nodes can be found on our git. But please do us a favor: understand the basic concepts of BGP and FRR, and don’t just copy someone’s config.