| The purpose of this document is to explain
how to tunnel ESP packets through an existing tunnel. The example below
uses four Datacryptor® units, but there is no reason why it should not be equally
relevant for any other ESP implementation. NB - this document only
applies to Datacryptor® units in tunnel mode.
The diagram below shows four Datacryptor® units. The blue
arrows show the tunnels that are configured between
them - Datacryptor®_1A has a tunnel with Datacryptor®_1B
and Datacryptor®_2A has a tunnel with Datacryptor®_2B.
[Diagram: the four Datacryptor® units and the tunnels between them]
To understand how this works, consider what happens to an IP packet sent from
PC_A to PC_B.
1. The packet is sent from 172.16.0.10 to the PC's default gateway
(172.16.0.1) - the host port of Datacryptor®_1A.
2. Datacryptor®_1A encrypts the packet and sends it to its network gateway
(192.168.0.10) - the host port of Datacryptor®_2A. The original packet is
encapsulated in an ESP packet with a new IP header (source: 192.168.0.1,
destination: 192.168.0.10).
3. Datacryptor®_2A receives the ESP packet, encrypts it again and sends
it to its network gateway (192.168.1.2) - the network port of Datacryptor®_2B.
The ESP packet is encapsulated in another ESP packet with a new IP
header (source: 192.168.1.1, destination: 192.168.1.2).
4. Datacryptor®_2B receives the ESP packet, decrypts it and sends it to
its host gateway (192.168.2.1) - the network port of Datacryptor®_1B. At
this point the original ESP packet (the one that emerged from Datacryptor®_1A in
Step 2) is revealed.
5. Datacryptor®_1B receives the ESP packet, decrypts it and sends it to
PC_B (172.16.1.1).
So what is happening here is that the IP packet is encapsulated
once into an ESP packet and then encapsulated again into another
ESP packet - it is effectively being encrypted (and decrypted) twice.
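
The five steps above can be sketched as nested encapsulation. This is
only an illustration: the class and function names are hypothetical,
nothing is actually encrypted, and only the addresses come from the
walkthrough.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class ESP:
    # In a real tunnel the inner packet is encrypted; it is kept in the
    # clear here so the nesting can be inspected.
    inner: "IPPacket"


@dataclass
class IPPacket:
    src: str
    dst: str
    payload: Union[ESP, str]


def encapsulate(packet: IPPacket, src: str, dst: str) -> IPPacket:
    """Wrap a packet in ESP and add a new outer IP header."""
    return IPPacket(src, dst, ESP(packet))


def decapsulate(packet: IPPacket) -> IPPacket:
    """Strip the outer IP header and ESP layer, revealing the inner packet."""
    if not isinstance(packet.payload, ESP):
        raise ValueError("not an ESP packet")
    return packet.payload.inner


def esp_depth(packet: IPPacket) -> int:
    """Count how many ESP layers surround the original packet."""
    depth = 0
    while isinstance(packet.payload, ESP):
        depth += 1
        packet = packet.payload.inner
    return depth


original = IPPacket("172.16.0.10", "172.16.1.1", "data")         # Step 1
after_1a = encapsulate(original, "192.168.0.1", "192.168.0.10")  # Step 2
after_2a = encapsulate(after_1a, "192.168.1.1", "192.168.1.2")   # Step 3
after_2b = decapsulate(after_2a)                                 # Step 4
after_1b = decapsulate(after_2b)                                 # Step 5

print(esp_depth(after_2a), after_2b == after_1a, after_1b == original)
```

Between Datacryptor®_2A and Datacryptor®_2B the packet carries two ESP
layers; each decryption removes one of them.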
There is another important issue here which must be considered.
When an IP packet is sent through a tunnel, the encryption process
and the new IP header increase the overall packet length. Because
our IP packets are being sent through two tunnels, the packet length
is increased twice. This can cause a problem due to a limitation in
the current version of code (3.41 - 1.14.04 at time of writing).
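
As rough arithmetic, the growth can be sketched as follows. Both
constants are assumptions rather than Datacryptor® figures: a 1500 byte
MTU is the usual Ethernet value, and 60 bytes of overhead per tunnel is
a placeholder, since the real overhead depends on the cipher and padding
in use.

```python
MTU = 1500          # assumed link MTU in bytes
ESP_OVERHEAD = 60   # assumed bytes added per tunnel (hypothetical value)


def size_after_tunnels(packet_len: int, tunnels: int,
                       overhead: int = ESP_OVERHEAD) -> int:
    """Length of a packet after passing through the given number of tunnels."""
    return packet_len + tunnels * overhead


for tunnels in (0, 1, 2):
    size = size_after_tunnels(1450, tunnels)
    verdict = "exceeds MTU" if size > MTU else "fits"
    print(f"{tunnels} tunnel(s): {size} bytes, {verdict}")
```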
An example:
1. PC_A sends a 1450 byte packet destined for PC_B.
2. Datacryptor®_1A receives the packet, fragments it into two packets
and encrypts them. It does this because it knows that it cannot encrypt
the 1450 byte packet without pushing the packet length over the MTU
(Maximum Transmission Unit). Instead it splits the packet into two
packets and then encrypts them - this is normal behaviour. It is
worth noting that the fragmentation process does not produce
two packets of equal length - it produces one long
packet and one short packet. From Datacryptor®_1A we would see two ESP packets
- one of about 1450 bytes and one of about 70 bytes.
3. Datacryptor®_1A sends these two packets to the host port of Datacryptor®_2A.
Datacryptor®_2A must now fragment the bigger of the two packets because,
again, it cannot encrypt the packet without pushing the packet length over
the MTU. So from Datacryptor®_2A we would see three ESP packets: one of about
1450 bytes, one of about 110 bytes and one of about 70 bytes. Again,
this is all normal behaviour for fragmentation of packets.
4. Datacryptor®_2B receives these three ESP packets. This is where the problem
occurs - the Datacryptor® is not able to reassemble the fragmented packets,
which it must do in order to decrypt the data. This is because the
network port of the Datacryptor® is not able to reassemble fragmented IP
packets - it expects to receive a complete IP packet (i.e. the "more
fragments" bit set to zero). Datacryptor®_2B will actually reply to
Datacryptor®_1A with an ICMP "fragmentation needed but DF bit set" message (this
is actually the wrong ICMP message). From PC_A there is no indication
of what is causing the failure because the ICMP error message is
sent back to Datacryptor®_1A rather than back to the originating host.
5. There is another issue here - the Datacryptor® is sending ESP packets
out of its network port with the DF (don't fragment) bit set to zero.
This implies that the Datacryptor® is happy for intermediate devices to fragment
ESP packets. This is incorrect - the Datacryptor® at the other end of the
tunnel is unable to reassemble packets as we can see above. This
is a contradiction - the Datacryptor® should either set the DF bit to one
(which would prevent intermediate devices from fragmenting the ESP
packets) or be able to reassemble fragmented ESP packets.
6. There is a workaround which, whilst not ideal, does avoid the
problem. As of 3.41 there is a CLI command "MTU" which
sets the MTU size on the Datacryptor®. The value assigned to this defines
the biggest packet that the Datacryptor® will send out of its network port after the
encryption overhead has been added. By defining a lower MTU on the
two edge units (Datacryptor®_1A and Datacryptor®_1B) we can limit the size of the
ESP packets and reduce the fragmentation taking place. For example,
if we set the MTU on Datacryptor®_1A and Datacryptor®_1B to 1300, the packet can be
encrypted twice without any fragmentation taking place. This may
give a small reduction in throughput as, overall, more packets of
shorter length will be created. An alternative is to set the
MTU on PC_A and PC_B to a lower value - this will also solve the
problem but is obviously not a realistic solution if you have many
hosts at either end.
7. One other issue here concerns the MTU of the Datacryptor® in general.
The Datacryptor® does not perform path MTU discovery (the process of
finding the biggest packet that can be sent to
a remote destination without being fragmented) and assumes that ESP packets will not be
fragmented. If an intermediate device does fragment the packet, the
Datacryptor® will not be able to reassemble it. For example, in the diagram
above, if the link between Datacryptor®_2A and Datacryptor®_2B were a serial link
with a low MTU that caused ESP packets to be fragmented, then the
packets would not be reassembled and data transfer would fail. This
can be irritating as it means the path MTU must be maintained manually.
One of the problems with this is that it can be hard to pinpoint
an MTU problem - you may find that you can ping
from one end of a tunnel to the other without any problems, but when
you try to pass "real" data (e.g. web browsing, file transfer
etc.) it fails. This is because ping uses small packets
(typically 32 bytes), which are too small to need fragmentation. This
can lead to intermittent throughput where some data gets through
but other data does not, which in turn can cause applications to hang.
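
The fragmentation in steps 2 and 3, the MTU workaround in step 6 and the
ping behaviour in step 7 can be modelled with a short sketch. It is a
simplified model, not the Datacryptor® algorithm: the function names are
hypothetical, the 60 byte ESP overhead and the 1500 byte default MTU are
assumed values, and the resulting sizes therefore differ from the
approximate figures quoted above.

```python
IP_HEADER = 20   # bytes, IPv4 header without options
OVERHEAD = 60    # assumed ESP overhead per packet in bytes (hypothetical)


def encrypt_with_fragmentation(packet_len, mtu, overhead=OVERHEAD):
    """Sizes of the ESP packets emitted for one incoming IP packet.

    The packet is fragmented first, if needed, so that every encrypted
    result fits within the MTU.
    """
    max_plain = mtu - overhead
    if packet_len <= max_plain:
        return [packet_len + overhead]
    # Fragment payloads, except the last, must be multiples of 8 bytes.
    per_fragment = (max_plain - IP_HEADER) // 8 * 8
    remaining = packet_len - IP_HEADER
    sizes = []
    while remaining > 0:
        chunk = min(per_fragment, remaining)
        sizes.append(IP_HEADER + chunk + overhead)
        remaining -= chunk
    return sizes


def through_two_tunnels(packet_len, edge_mtu, inner_mtu=1500):
    """Pass a packet through the edge unit and then the inner unit.

    Returns the ESP sizes after each unit and whether the inner unit
    had to fragment any ESP packet from the edge unit.
    """
    after_edge = encrypt_with_fragmentation(packet_len, edge_mtu)
    after_inner = []
    esp_fragmented = False
    for esp_len in after_edge:
        out = encrypt_with_fragmentation(esp_len, inner_mtu)
        esp_fragmented = esp_fragmented or len(out) > 1
        after_inner.extend(out)
    return after_edge, after_inner, esp_fragmented


for label, length, edge_mtu in [
    ("1450 byte packet, default MTU", 1450, 1500),
    ("1450 byte packet, edge MTU 1300", 1450, 1300),
    ("ping (32 data + 8 ICMP + 20 IP)", 60, 1500),
]:
    print(label, through_two_tunnels(length, edge_mtu))
```

Under these assumptions the default MTU leaves three ESP packets on the
inner tunnel, one of them created by fragmenting an ESP packet. With the
edge MTU at 1300 the edge unit still splits the host packet before
encrypting it, but no ESP packet is fragmented afterwards, and the ping
passes through as a single packet in both cases.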
|