E2EE VPN setup across CG-NAT using Wireguard, nftables, radvd and keepalived

Created at: Feb 23, 2025 / Last updated at: Apr 27, 2025

This page describes a provider-, network- and environment-agnostic site-to-site tunnel through which an end-to-end encrypted Wireguard tunnel can be routed.

There is a tunnel-in-tunnel version documented by Pro Custodibus that uses a wireguard tunnel between every hop in a hub-and-spoke configuration. The version described here follows a similar idea, but instead of using wireguard tunnels for each hop (which is quite costly due to re-encryption), it uses a combination of ULA advertisements and static routes.

High Level Overview

The problem

When attempting to create a site-to-site VPN on residential networks, there are usually a few caveats that need to be considered:

Some networks are behind CG-NAT
Some networks only support either IPv4 or IPv6
Some networks don’t hand out static IPs
Some networks might have multiple WAN access points (think SIM switch)

The configuration on this page can be used to create a mesh VPN that works across all these scenarios.

Conceptual overview

There are multiple sites which have a standard ISP internet router, possibly behind CG-NAT. On each site there are nodes on a single link.

Figure: Physical Networks

A /48 ULA range is chosen, and an edge router is placed on each site that advertises a unique /64 ULA subnet of that range on the LAN, for example:

ULA Range: <Overlay ULA Prefix>::/48
Site A: <Overlay ULA Prefix>:A::/64
Site B: <Overlay ULA Prefix>:B::/64

The configuration sections further down refer to a site’s /64 as <Site Overlay ULA Prefix>::/64.

The advertisement allows other machines on the same link to assign themselves a stable IPv6 address in the respective /64 range via SLAAC. As such, each server is going to have a unique ULA address across all sites. The edge router additionally advertises itself as the router for the entire /48 range, such that all traffic that targets the /48 but is not on link (i.e. outside the site’s /64) is routed through the edge router automatically.
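The relationship between the /48 range and the per-site /64 subnets can be illustrated with a small Python sketch. The prefix fd12:3456:789a::/48 and the helper name site_subnet are assumptions made for this example only; they are not part of the setup.

```python
import ipaddress

# Hypothetical overlay prefix; substitute the /48 chosen for the deployment.
overlay = ipaddress.IPv6Network("fd12:3456:789a::/48")


def site_subnet(overlay_net, subnet_id):
    """Return the /64 with the given 16-bit subnet ID inside the /48."""
    if overlay_net.prefixlen != 48:
        raise ValueError("expected a /48 overlay prefix")
    if not 0 <= subnet_id <= 0xFFFF:
        raise ValueError("subnet ID must fit into 16 bits")
    base = int(overlay_net.network_address)
    # The subnet ID occupies the fourth 16-bit group of the address.
    return ipaddress.IPv6Network((base | (subnet_id << 64), 64))


site_a = site_subnet(overlay, 0xA)
site_b = site_subnet(overlay, 0xB)
print(site_a)  # fd12:3456:789a:a::/64
print(site_b)  # fd12:3456:789a:b::/64
```

Since the subnet ID has 16 bits, a single /48 leaves room for up to 65536 sites.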

Between the edge routers there is a relay. Traffic destined for a range outside the /64 handled by the site’s edge router is forwarded to the relay, which then forwards it to the edge router of the destination site (if any) that handles the destination range. To avoid issues with CG-NAT or similar, Wireguard tunnels are configured between the edge routers and the relay, with a PersistentKeepalive configured on the edge router’s end so that bi-directional communication remains possible at all times. The IP used within this routing tunnel is irrelevant, but it should be a /128 ULA address for each device, outside the chosen /48 range.
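How the relay picks the edge router for a packet can be sketched as a longest-prefix match over the AllowedIPs of its peers. The following Python snippet is only a simplified model of that lookup, not Wireguard's implementation; the peer names and all addresses are made up for illustration.

```python
import ipaddress

# Hypothetical AllowedIPs per peer, mirroring [Peer] sections on a relay.
PEERS = {
    "edge-a": ["fd00:aaaa::a/128", "fd12:3456:789a:a::/64"],
    "edge-b": ["fd00:aaaa::b/128", "fd12:3456:789a:b::/64"],
}


def select_peer(destination, peers=PEERS):
    """Return the peer whose AllowedIPs match the destination most specifically."""
    dest = ipaddress.ip_address(destination)
    best_name, best_len = None, -1
    for name, allowed in peers.items():
        for cidr in allowed:
            net = ipaddress.ip_network(cidr)
            if dest in net and net.prefixlen > best_len:
                best_name, best_len = name, net.prefixlen
    return best_name


print(select_peer("fd12:3456:789a:b::1"))  # edge-b
print(select_peer("fd12:3456:789a:c::1"))  # None, no site handles this range
```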

Figure: Site-To-Site VPN

Once connectivity between the sites has been established, the servers can address each other via their ULA IP addresses and use them to build a wireguard mesh, where the endpoint for each peer is the respective ULA IP.

With this configuration, servers do not need to know about the backing routing using the edge routers and relay. This allows an edge router to be portable and easily replaceable, without the need for additional configuration on the servers.

Figure: Wireguard Mesh

Potential drawbacks

The following sections describe some drawbacks or additional considerations that should be taken into account with this setup.

Limited client access

Clients cannot access the tunnel easily if their default gateway is not the edge router itself. Android does SLAAC and assigns an IP, but it seems that it does not accept router advertisements. As such, it can establish tunnels to servers on the same network, but not on other sites. Tunnel-in-tunnel only works with rooted devices. For computers it works fine if they’re on the same link (and registered as peers on the servers). Outside the link they either need a tunnel-in-tunnel or port forwarding on the relay server.

A portable router can be used as a “mobile edge gateway”, such as a Raspberry Pi that provides an access point.

Relay failure

This configuration contains no failover for the relay (it does for the edge routers however). If the relay fails, the site-to-site tunnel no longer works. Servers at the same site can still communicate with each other.

One could create secondary relays (setting one up should be no different from setting up the first). This has not been tested, however.

Information exposure and traffic isolation

Traffic between servers is end-to-end-encrypted. Source/destination addresses (between edge routers and relays) are only encrypted in transit. As such, routing information is visible on each relay node that routes the respective traffic. This design has no reason to hide this information.

Any node participating in the network can see every other node over the overlay network. “Node” includes any device that is on the same link as one of the edge routers. This is by design, however.

VLAN separation could be used to isolate server networks and mitigate some issues with regard to spoofing or foreign network access (for rogue relays).

Inter-Site communication

Assume a source server at site A wants to communicate with a destination server at site B. The communication is end-to-end encrypted with a tunnel-in-tunnel for each hop between sites.

Server A configures a wireguard interface which contains the destination server B as a peer, with the endpoint IP being server B’s ULA address. When a message is sent, it is encrypted for the destination server (wireguard derives the session keys from the peers’ key pairs) and routed to the edge router, which encapsulates it and forwards it to the relay through the routing tunnel. The relay removes the outer layer and re-encapsulates the packet to send it to the edge router at site B. The edge router at site B decrypts the outer layer and sends the packet to the destination server, which can then decrypt the message using its key.

Intra-Site communication

When two servers at the same site want to communicate with each other, they will do so directly on layer 2 through neighbor discovery.

Configuration

The sections below describe the basic configuration for the VPN to work. Any up-to-date Linux-based OS should work for the edge routers and the relay; however, the configuration was only tested on an Ubuntu 24.04 minimal server.

Prerequisites

For this configuration the following is required:

A server on each site’s link used to advertise a ULA range
An internet-addressable server to handle relaying traffic between sites, addressable via IPv4 and IPv6
A DNS A and AAAA entry pointing to the relay server

In addition, the design requires at least the following ranges and addresses:

1. A /48 Mesh ULA range, used for addressing servers (further down referred to as Overlay ULA Prefix)
2. A unique /128 ULA address for each server, outside the /48 range of (1), used within the wireguard tunnel mesh (further down referred to as Mesh Tunnel IP)
3. A unique /128 ULA address for each edge router and relay, outside the /48 range of (1), used within the routing wireguard tunnels (further down referred to as Routing Tunnel IP)

Note that GUA addresses can be used as well (if any are available), however, LLA addresses should be avoided as they are not routable (and require interfaces to be supplied for communication since they aren’t necessarily unique).
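As a rough aid for picking these values, the Python sketch below generates a random /48 inside fd00::/8 and checks that a candidate tunnel address is a ULA outside the overlay range. It is a simplification: RFC 4193 describes a specific procedure for deriving the 40-bit global ID, whereas this sketch just uses random bits. The function names and the sample addresses are assumptions for illustration.

```python
import ipaddress
import secrets

ULA_RANGE = ipaddress.IPv6Network("fc00::/7")


def random_ula_prefix():
    """Return a random /48 inside fd00::/8 (40 random bits as global ID)."""
    global_id = secrets.randbits(40)
    value = (0xFD << 120) | (global_id << 80)
    return ipaddress.IPv6Network((value, 48))


def is_valid_tunnel_ip(address, overlay):
    """A tunnel IP should be a ULA that lies outside the overlay /48."""
    addr = ipaddress.IPv6Address(address)
    return addr in ULA_RANGE and addr not in overlay


# Hypothetical overlay prefix and candidate addresses
overlay = ipaddress.IPv6Network("fd12:3456:789a::/48")
print(random_ula_prefix())                                  # differs on every run
print(is_valid_tunnel_ip("fd00:aaaa::1", overlay))          # True
print(is_valid_tunnel_ip("fd12:3456:789a:a::1", overlay))   # False, inside the /48
```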

VPN Relay

The relay can be a cheap VPS somewhere on the internet. Optimally, it would be geographically close / in-between the sites. It should be reachable via a public, static IPv4 and IPv6 address and have DNS A/AAAA entries assigned to it (this will be important for the edge routers later on).

Wireguard routing tunnel configuration

First of all, wireguard-tools should be installed. This gives access to wg-quick, which is very convenient for setting up the interface.

sudo apt install wireguard-tools

Then generate the private/public key pair:

# Run the following as root
umask 077
wg genkey > /etc/wireguard/wgvpnr01-privatekey
wg pubkey < /etc/wireguard/wgvpnr01-privatekey > /etc/wireguard/wgvpnr01-publickey

Next, a wireguard interface needs to be configured. The interface is used for relaying traffic between the different sites.

# /etc/wireguard/wgvpnr01.conf
 
[Interface]
Address = <Routing Tunnel IP>/128
ListenPort = <Listen Port>
PrivateKey = <Private Key>
 
# Enable IP forwarding while the interface is up
PreUp = sysctl -w net.ipv6.conf.all.forwarding=1
PostDown = sysctl -w net.ipv6.conf.all.forwarding=0
 
# Routing tunnel to edge router at site A
[Peer]
PublicKey = <public key of site A edge router>
AllowedIPs = <Routing Tunnel IP>/128, <Site Overlay ULA Prefix>::/64
 
# Routing tunnel to edge router at site B
[Peer]
PublicKey = <public key of site B edge router>
AllowedIPs = <Routing Tunnel IP>/128, <Site Overlay ULA Prefix>::/64
 
# ...

Enable the wireguard interface on startup:

sudo systemctl enable --now wg-quick@wgvpnr01.service

Firewall configuration

The firewall configuration below configures the following rules:

Input: Accept incoming traffic on the wireguard port as well as ICMP
Forward: Allow forward across the wireguard interface if the destination is in the ULA /48 range and the target port is in the wireguard port range configured on the servers

# /etc/nftables.conf
flush ruleset
 
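# The interface through which forwarding between sites is allowed
# i.e. the wireguard interface configured in the previous step
define vpn_routing_interface = "wgvpnr01"
 
# The internet-facing interface (referenced by the DHCP client rules below)
define lan_interface = <LAN interface name>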
 
# The port the interface listens on
define vpn_routing_wireguard_port = <Port>
 
# The ULA range that is used to address devices across sites
define vpn_routing_full_range = <Overlay ULA Prefix>::/48
 
# The port range on which devices are listening for wireguard connections
define vpn_routing_dport_range = 58000-60000
 
table inet filter {
    chain input {
        # Default to drop
        type filter hook input priority filter; policy drop;
 
        # Allow established and related connections
        ct state established,related accept
 
        # Allow loopback traffic
        iif lo accept
 
        # Allow DHCP client
        iif $lan_interface udp dport 68 accept
        iif $lan_interface udp dport 546 accept
 
        # Reject traceroute for 30 hops (this allows clients to see the hop IP)
        udp dport { 33434-33474 } reject
 
        # Allow incoming traffic on the routing port
        udp dport $vpn_routing_wireguard_port accept
 
        # Allow ICMP
        icmp type { echo-request, echo-reply, destination-unreachable, time-exceeded } accept
        icmpv6 type { echo-request, echo-reply, destination-unreachable, packet-too-big, time-exceeded, parameter-problem, nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert } accept
 
        log prefix "Dropped Input: "
        drop
    }
    chain forward {
        type filter hook forward priority 0; policy drop;
 
        # Allow established and related connections
        ct state vmap { invalid : drop, established : accept, related : accept }
 
        # Allow forward of ICMPv6 and establishing wireguard connections across relay network
        iifname $vpn_routing_interface oifname $vpn_routing_interface ip6 daddr $vpn_routing_full_range udp dport $vpn_routing_dport_range accept
        iifname $vpn_routing_interface oifname $vpn_routing_interface ip6 daddr $vpn_routing_full_range icmpv6 type { echo-request, echo-reply, destination-unreachable, packet-too-big, time-exceeded, parameter-problem, nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert } accept
 
        log prefix "Rejected Forward: "
        reject with icmpx type host-unreachable
    }
    chain output {
        type filter hook output priority filter;
    }
}

Start the firewall:

sudo systemctl enable --now nftables

At this point, the relay is ready to accept connections from the edge routers and to forward messages between the two sites to the specified wireguard port range.
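To reason about which new flows the relay lets through, the accept rules of the forward chain can be approximated in Python. This is only a mental model under simplifying assumptions: it ignores connection tracking and the ICMPv6 type filter, and the overlay prefix is a made-up example. The interface name and the port range are taken from the ruleset above.

```python
import ipaddress

OVERLAY = ipaddress.IPv6Network("fd12:3456:789a::/48")  # hypothetical /48
WG_INTERFACE = "wgvpnr01"
DPORT_RANGE = range(58000, 60001)  # 58000-60000 inclusive


def relay_forwards(in_iface, out_iface, daddr, proto, dport=None):
    """Approximate the forward chain's accept rules for a new flow."""
    # Both rules require the packet to enter and leave via the routing tunnel.
    if in_iface != WG_INTERFACE or out_iface != WG_INTERFACE:
        return False
    # Both rules require a destination inside the overlay /48.
    if ipaddress.IPv6Address(daddr) not in OVERLAY:
        return False
    if proto == "udp":
        return dport is not None and dport in DPORT_RANGE
    return proto == "icmpv6"


print(relay_forwards("wgvpnr01", "wgvpnr01", "fd12:3456:789a:b::1", "udp", 58001))  # True
print(relay_forwards("wgvpnr01", "wgvpnr01", "fd12:3456:789a:b::1", "udp", 22))     # False
```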

VPN Edge Router

The edge router can be any device on a site’s network. It uses wireguard for communication with the relay, radvd for advertising the site’s subnet and keepalived for preventing edge router conflicts:

sudo apt install wireguard-tools radvd keepalived

The configuration in the following sections assumes that the edge router is part of the <Overlay ULA Prefix>::/48 network and is responsible for advertising <Site Overlay ULA Prefix>::/64.

Internet-facing interface configuration

The default interface should simply use DHCP to get an address. This allows the router to be portable and agnostic to the network it is connected to (as long as the underlying network uses DHCP).

# /etc/systemd/network/<pri>-<interface>.network
 
[Match]
Name=<interface name>
 
[Network]
DHCP=yes
IPv6AcceptRA=true

keepalived/VRRP configuration

To avoid issues with multiple edge routers for the same site prefix conflicting on the same link, only one of the edge routers is assigned the router IP. This is done using VRRP/keepalived. The VRRP IP must be the same for the routers that advertise the same subnet.

Using VRRP is optional, of course, but it makes things easier in case fallback routers are configured.

Note that the VRRP IP must be an LLA for SLAAC to work properly.

# /etc/keepalived/keepalived.conf
 
vrrp_sync_group G1 {
    group {
        wg_vpn_edge_router
    }
}
 
vrrp_instance wg_vpn_edge_router {
    # Configure the interface to set the IP on
    interface <lan interface name>
 
    # Configure the virtual router ID, should be the same across all routers
    # that advertise the same subnet
    virtual_router_id <router id>
 
    # Configure the priority, the higher the more important
    priority 100
 
    # The interval to send VRRP advertisements
    advert_int 1.0
 
    # The IP addresses to assign. It must be a link local address
    virtual_ipaddress {
        <VRRP LLA>/128
    }
 
    # Disable preemption
    nopreempt
 
    # Set the GARP delay for publishing the MAC address
    garp_master_delay 1
}

Enable it on system startup:

sudo systemctl enable --now keepalived

radvd configuration

Radvd is used to advertise the site’s /64 ULA prefix as well as the supported routes.

# /etc/radvd.conf
 
# Replace <lan interface name> with the interface which should be used
# to advertise the subnet
interface <lan interface name> {
    # Send advertisements
    AdvSendAdvert on;
 
    # Advertise every 10 seconds
    MinRtrAdvInterval 10;
 
    # Advertise at least once every 30 seconds
    MaxRtrAdvInterval 30;
 
    # This line is important, setting the lifetime to 0 prevents it from
    # advertising itself as default gateway
    AdvDefaultLifetime 0;
 
    # Consider a device reachable if available within the last 10 minutes
    AdvReachableTime 600000;
 
    # Advertise the VRRP IP as source address (same as the VRRP configuration)
    AdvRASrcAddress {
        <VRRP LLA>;
    };
 
    # Choose a /64 prefix under the ULA /48 for the current site, here site A
    prefix <Site Overlay ULA Prefix>::/64 {
        # The /64 is on link
        AdvOnLink on;
 
        # Enable SLAAC
        AdvAutonomous on;
 
        # Advertise the router address
        AdvRouterAddr on;
    };
 
    # Add the entire /48 to RIO to advertise routing to the subnet being supported
    route <Overlay ULA Prefix>::/48 {
        AdvRoutePreference high;
    };
};

Enable radvd on startup:

sudo systemctl enable --now radvd

Wireguard routing tunnel configuration

Similar to the relay server, create a new wireguard interface. This configuration establishes the connection to the relay server.

# /etc/wireguard/wgvpnr01.conf
 
# Use the address configured as peer on the relay
[Interface]
Address = <Routing Tunnel IP>/128
ListenPort = <Listen Port>
PrivateKey = <Private key>
MTU = <Main interface MTU - 80>
 
# Enable IP forwarding while the interface is up
PreUp = sysctl -w net.ipv6.conf.all.forwarding=1
PreDown = sysctl -w net.ipv6.conf.all.forwarding=0
 
# Add the relay as peer
# AllowedIPs should contain the relay's Routing Tunnel IP (the Address configured on
# the relay's wireguard interface) as well as the entire overlay /48.
#
# The Endpoint should be a DNS entry so that it will work no matter whether
# the internet connection is IPv4 or IPv6.
#
# The PersistentKeepalive is to keep the connection alive behind CG-NAT. A value lower than
# 30 should be OK for most ISPs
[Peer]
PublicKey = <Public Key>
AllowedIPs = <Routing Tunnel IP>/128, <Overlay ULA Prefix>::/48
Endpoint = <Relay Domain>:<Port>
PersistentKeepalive = 25

Enable the wireguard interface on startup:

sudo systemctl enable --now wg-quick@wgvpnr01.service

Note: wg-quick loads the configuration on start and shutdown. If stopping the service fails, manual removal of the interface might be necessary with wg-quick down <interface name>

nftables configuration

The firewall should be configured to allow at least ICMP, VRRP and forwarding to the relay. The configuration below can be used as a reference. It forwards traffic destined for the /48 to the relay and blocks forwarding for its own /64 (as this should be done via neighbour discovery).

#!/usr/sbin/nft -f
 
flush ruleset
 
define lan_interface = <LAN interface name>
define vpn_routing_interface = wgvpnr01
define vpn_routing_wireguard_port = <Wireguard port>
define vpn_routing_full_range = <Overlay ULA Prefix>::/48
define vpn_routing_site_range = <Site Overlay ULA Prefix>::/64
 
table ip filter {
    chain input {
        # Drop input by default
        type filter hook input priority 0; policy drop;
 
        # Accept established/related
        ct state established,related accept
 
        # Allow loopback traffic
        iif lo accept
 
        # Traceroute rejects
        udp dport { 33434-33474 } reject
 
        # Allow DHCP client
        iif $lan_interface udp dport 68 accept
 
        # Allow wireguard
        udp dport $vpn_routing_wireguard_port accept
 
        # Allow ICMP
        icmp type { echo-request, echo-reply, destination-unreachable, time-exceeded } accept
 
        log prefix "IPv4 Input denied: "
        drop
    }
    chain forward {
        type filter hook forward priority 0; policy drop;
        log prefix "IPv4 Forward denied: "
    }
    chain output {
        type filter hook output priority 0; policy accept;
    }
}
 
table ip6 filter {
    chain input {
        # Drop input by default
        type filter hook input priority 0; policy drop;
 
        # Accept established/related
        ct state established,related accept
 
        # Allow loopback traffic
        iif lo accept
 
        # Traceroute rejects
        udp dport { 33434-33474 } reject
 
        # Allow DHCP client
        iif $lan_interface udp dport 546 accept
 
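        # Allow wireguard
        udp dport $vpn_routing_wireguard_port accept
 
        # Allow VRRP advertisements (IP protocol 112) from other edge routers on the link
        iif $lan_interface meta l4proto 112 accept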
 
        # Allow ICMP
        icmpv6 type { echo-request, echo-reply, destination-unreachable, packet-too-big, time-exceeded, parameter-problem, nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert } accept
 
        log prefix "IPv6 Input denied: "
        drop
    }
    chain forward {
        type filter hook forward priority 0; policy drop;
 
        # Allow established and related connections
        ct state vmap { invalid : drop, established : accept, related : accept }
 
        # Allow forwarding to relay via wireguard interface for other subnets but not the one managed by this router
        iifname $lan_interface oifname $vpn_routing_interface ip6 daddr $vpn_routing_site_range reject with icmpv6 type port-unreachable
        iifname $lan_interface oifname $vpn_routing_interface ip6 daddr $vpn_routing_full_range accept
        iifname $vpn_routing_interface oifname $lan_interface ip6 daddr $vpn_routing_site_range accept
 
        log prefix "IPv6 Forward denied: "
        reject with icmpv6 type port-unreachable
    }
    chain output {
        type filter hook output priority 0; policy accept;
    }
}

Start the firewall:

sudo systemctl enable --now nftables

Once this step has been completed, communication should be possible between the router and the relay. Once another router has been configured, the communication should work site-to-site.
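The order of the rules in the IPv6 forward chain matters: the reject for the site's own /64 has to come before the accept for the /48, because the /64 is contained in the /48. The Python sketch below models that evaluation order for new flows. The prefixes and the LAN interface name eth0 are assumptions for illustration, and connection tracking is left out.

```python
import ipaddress

OVERLAY = ipaddress.IPv6Network("fd12:3456:789a::/48")  # hypothetical /48
SITE = ipaddress.IPv6Network("fd12:3456:789a:a::/64")   # hypothetical site /64


def edge_forward_verdict(in_iface, out_iface, daddr, lan="eth0", wg="wgvpnr01"):
    """Evaluate the forward rules top to bottom; the first match wins."""
    dest = ipaddress.IPv6Address(daddr)
    # 1. LAN -> tunnel for the own /64: on-link traffic must not leave the site
    if in_iface == lan and out_iface == wg and dest in SITE:
        return "reject"
    # 2. LAN -> tunnel for the rest of the /48: hand over to the relay
    if in_iface == lan and out_iface == wg and dest in OVERLAY:
        return "accept"
    # 3. Tunnel -> LAN for the own /64: traffic arriving from other sites
    if in_iface == wg and out_iface == lan and dest in SITE:
        return "accept"
    # Everything else is rejected at the end of the chain
    return "reject"


print(edge_forward_verdict("eth0", "wgvpnr01", "fd12:3456:789a:b::1"))  # accept
print(edge_forward_verdict("eth0", "wgvpnr01", "fd12:3456:789a:a::1"))  # reject
print(edge_forward_verdict("wgvpnr01", "eth0", "fd12:3456:789a:a::1"))  # accept
```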

Note: The edge router needs to be configured as a peer on the relay if this has not happened yet.

Servers

As long as at least one of the routers is on the same link as the server, the latter should auto-configure its IP address via SLAAC after a few seconds.

Note that any IPv6 privacy extension should be disabled to make the IP stable, meaning that SLAAC should always derive the same IP address for the prefix advertised by the respective edge router. With privacy extensions enabled the IP address is partially randomized, in which case the server’s IP address cannot be configured as Wireguard endpoint on other servers (since the IP keeps changing). Alternatively, the systemd configuration below can be used to configure the IP address statically for any router advertisement.

In any case, after a few seconds the server should have an IP address in the <Site Overlay ULA Prefix>::/64 range, flagged as global dynamic noprefixroute (see ip -6 a), and the edge router and the server should be able to communicate with each other.

# /etc/systemd/network/<pri>-<interface>.network
[Match]
Name=<interface name>
 
[Network]
DHCP=no
 
# Accept router advertisements
IPv6AcceptRA=true
 
# See https://man7.org/linux/man-pages/man5/systemd.network.5.html#[IPV6ACCEPTRA]_SECTION_OPTIONS
#
# Token should be anything that is stable (see systemd documentation for details)
# static expects the "full" part after the prefix, i.e. should start with ::,
# for example ::1234 (resulting in Token=static:::1234).
#
# In addition, optionally, create allow/deny list for advertised prefixes
# via PrefixAllowList / PrefixDenyList
[IPv6AcceptRA]
UseRedirect=true
Token=static:<'Host' Part>
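With a static token, the resulting address is simply the advertised /64 prefix combined with the token as the lower 64 bits. The following Python sketch shows that combination; the prefix, the token and the function name slaac_address are example values chosen here, not values from the setup.

```python
import ipaddress


def slaac_address(prefix, token):
    """Combine an advertised /64 prefix with a static interface token."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("SLAAC expects a /64 prefix")
    iid = int(ipaddress.IPv6Address(token))
    if iid >> 64:
        raise ValueError("token must only set the lower 64 bits")
    return ipaddress.IPv6Address(int(net.network_address) | iid)


# Hypothetical site prefix and token
print(slaac_address("fd12:3456:789a:a::/64", "::1234"))  # fd12:3456:789a:a::1234
```

Because the token stays the same, a server moved to another site keeps its host part and only the /64 prefix changes.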

Wireguard mesh configuration

Once more than one server is set up, they can establish wireguard tunnels between them.

Here, the MTU should be set lower than the value wireguard configures automatically, as the chance of traffic flowing across the router -> relay tunnel is quite high. As such, the MTU should be the MTU of that routing tunnel minus 80.
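The MTU arithmetic can be spelled out as follows. The 80 bytes are the per-packet overhead of a Wireguard tunnel over IPv6 (40 bytes IPv6 header, 8 bytes UDP header, 32 bytes Wireguard data message overhead). The starting value of 1500 is an assumed Ethernet MTU; the function name is made up for this sketch.

```python
# IPv6 header + UDP header + Wireguard data message overhead
WG_OVERHEAD_IPV6 = 40 + 8 + 32

# Minimum link MTU required by IPv6
IPV6_MIN_MTU = 1280


def tunnel_mtu(underlying_mtu, overhead=WG_OVERHEAD_IPV6):
    """Return the MTU for a tunnel that runs over the given underlying MTU."""
    mtu = underlying_mtu - overhead
    if mtu < IPV6_MIN_MTU:
        raise ValueError("resulting MTU is below the IPv6 minimum of 1280 bytes")
    return mtu


routing_mtu = tunnel_mtu(1500)       # edge router <-> relay tunnel
mesh_mtu = tunnel_mtu(routing_mtu)   # server <-> server tunnel inside it
print(routing_mtu)  # 1420
print(mesh_mtu)     # 1340
```

Note that two nested tunnels only leave 60 bytes of headroom above the IPv6 minimum when starting from 1500, so an underlying MTU below 1440 would push the mesh MTU under 1280.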

The Mesh Tunnel IP below is a unique ULA address for communication inside the tunnel. The address should be outside the VPN’s /48 range.

# /etc/wireguard/wgvpn01.conf
 
[Interface]
Address = <Mesh Tunnel IP>/128
ListenPort = <Listen Port>
PrivateKey = <Private key>
MTU = <Routing Tunnel MTU - 80>
 
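# Add all other servers as peers
# AllowedIPs is the peer's Mesh Tunnel IP, the Endpoint is the peer's overlay ULA address
# (IPv6 addresses must be wrapped in brackets) and the port the peer listens on, which
# must be within the port range forwarded by the relay
[Peer]
PublicKey = <Public Key>
AllowedIPs = <Mesh Tunnel IP>/128
Endpoint = [<Server Overlay ULA IP>]:<Wireguard Port>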

And register it as a service:

sudo systemctl enable --now wg-quick@wgvpn01.service

Troubleshooting

SLAAC not configuring an IP

Check the router advertisements on the client machine using rdisc6 <interface>. It should look as follows (placeholders in <>):

Hop limit                 :           64 (      0x40)
Stateful address conf.    :           No
Stateful other conf.      :           No
Mobile home agent         :           No
Router preference         :       Medium
Neighbour discovery proxy :           No
Router lifetime           :            0 (0x00000000) seconds
Reachable time            :       600000 (0x000927c0) milliseconds
Retransmit time           :  unspecified (0x00000000)
 Prefix                   : <SLAAC PREFIX>/64
  On-link                 :          Yes
  Autonomous address conf.:          Yes
  Valid time              :        86400 (0x00015180) seconds
  Pref. time              :        14400 (0x00003840) seconds
 Route                    : <ULA PREFIX>/48
  Route preference        :         High
  Route lifetime          :           90 (0x0000005a) seconds
 Source link-layer address: <ROUTER MAC ADDRESS>
 from <VRRP IP ADDRESS>

Note that for SLAAC to be configured, the source IP (from) must be a link local address.

Alternatively, dump the advertisements using tcpdump -i <interface> icmp6.
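The link-local requirement for the advertisement source can also be checked quickly in Python, for example when copying the "from" address out of the rdisc6 output. This is a minimal sketch; the function name and the sample addresses are made up for illustration.

```python
import ipaddress


def is_usable_ra_source(address):
    """Return True if the address is an IPv6 link-local address (fe80::/10)."""
    # Strip an optional zone suffix such as %eth0 before parsing.
    addr = ipaddress.ip_address(address.split("%")[0])
    return addr.version == 6 and addr.is_link_local


print(is_usable_ra_source("fe80::1%eth0"))         # True
print(is_usable_ra_source("fd12:3456:789a:a::1"))  # False, a ULA is not link local
```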

Worklog

24.02.2025: Extended the concept with some thoughts on each component and a note on third party solutions. Set up a small PoC (to be documented)
27.02.2025: Added configuration notes
05.03.2025: Extended configurations
09.03.2025: Added VRRP
17.03.2025: Updated relay configurations, renamed gateways to edge routers
17.03.2025: Revised individual configurations & concept documentations
11.04.2025: Updated images, updated/fixed some configuration descriptions
13.03.2025: Updated title / description
16.04.2025: Add missing DHCP nft rules
27.04.2025: Minor cleanup, added some additional considerations

Matteias Collet
