lnbook/contrib/onion-routing-htlc-forwardi...

Chapter overview:
  * introduce multi-hop HTLC forwarding + onion routing at a lower level

Relevant questions to answer:
  * Onion Routing:
    * What is onion routing? What security guarantees does it offer?
    * What is the Sphinx format? How does it differ from prior mixnet packet formats?
    * How does DH randomization work? (possibly too low-level)
    * Why is it important to keep the packet fixed sized at all times?
    * What routing information is contained in the payload for an node?
    * How can the routing information be extended in the future?
  * HTLC Forwarding:
    * How does a node gain fees when it forwards a payment?
    * What is the "cltv delta" why does it matter?
    * What does a successful HTLC forwarding look like?
    * How can a node cancel back an HTLC off-chain?
    * What is a typical network switch/hub? How can a similar analogy be applied to forwarding HTLCs?
    * What is the circuit map, why does a node need to maintain this?
    * Can fails or settles be safely pipelined on the network?
    * How does a node send an error back to the sender without knowing who they are?
    * What dangers exist w.r.t time-locks and timely on-chain confirmation?


=== Why is it important to keep the packet fixed sized at all times?

As we have already discussed earlier in the chapter, the "onion" packet transmitted via the routing nodes has multiple layers.
The packet starts at the sender of the payment and reaches the first routing node, who peels off the first layer.
This layer gives it information about the payment being transmitted, such as who is the next routing node to pass it to.
It then passes this packet to the second routing note, who peels off the second layer, and so on until the final routing node (i.e. the recipient of the payment) is reached.

We know from the current design of the Lightning Network that there can be a maximum of 20 hops per Lightning payment.
We can think of the data relating to each of these possible 20 hops as one of 20 "layers" of the packet.
If there are 6 hops in the payment, then the first 6 layers of the packet contain information about the first six routing nodes, and the remaining 14 layers contain junk.
If there are 20 hops in the payment, then all 20 layers of the packet contain information about the twenty routing nodes.

Let us now consider the adverse case, where the packet size is NOT fixed i.e. every time a layer is peeled off of the packet, the size of the packet reduces.
If a malicious routing node receives a packet, it can use the size of the packet to estimate how many "layers" are left in this packet.
If it receives a packet and estimates that there are 20 layers left, i.e. a full packet, then it knows that the node who sent it the packet is the originator of the payment.
If it receives a packet and estimates that there is 1 layer left, then it knows that the node that it is sending the packet to is the final recipient of the payment.
Even if it receives a packet and estimates that there are 2 layers left, then it knows that it is either transmitting the packet to the last routing node (i.e. the payment recipient) or it is transmitting the packet to the second last routing node before the recipient.
It can graph out all the channels of the second last routing node and it knows that the final recipient of the payment is either the second last routing node itself, or one of their channel partners (we ignore the case of private channels).
In all cases, some privacy of the payer and the recipient are lost.

[[malicious-routing-diagram]]
.If the Malicious node knows there are 2 layers left, then it knows that the payment recipient is either Node 19 (and there were only 19 hops) or one of Node 19's channel partners
image:images/malicious-routing-diagram.PNG["If the Malicious node knows there are 2 layers left, then it knows that the payment recipient is either Node 19 (and there were only 19 hops) or one Node 19's channel partners"]

We can extend this example.
Imagine a malicious entity sets up multiple Lightning nodes connected to other well-connected nodes, and it also connects itself across popular payment routes.
These malicious Lightning nodes would then be popular routing choices for those wanting to send payments, especially if they set their routing fees low.
The malicious nodes can then capture the data of all packets that pass through their routing nodes.
With additional information, such as the names of the other routing nodes, it could infer who is making these payments, who is receiving them, and for what amounts.
footnote:[Note that not all Lightning nodes are anonymous.
It is known, for example, that the nodes "aantonop" and "1.ln.aantonop.com" are owned by the author of this book Andreas Antonopolous.
Furthermore, companies and businesses in this space can claim ownership of a node by publicizing their node's alias and pubkey on their website or social media
If we see, for example, a payment with destination "Bitrefill" with a node pubkey that matches Bitrefill's publicized pubkey, we could infer that someone is making a purchase from Bitrefill.
If we know the prices of their services, we could even infer what they purchased. ]
If it has multiple routing nodes connected to each other, it might even be responsible for transmitting several of the hops in a single payment and could form a more complete picture of the route.
We see this example as spiritually similar to the chain analysis already performed on the Bitcoin network; even an incomplete picture of payments can be used to infer things about the parties involved and potentially de-anonymize them.

Fixing the packet size solves the problem of routing nodes knowing how far a packet is along the route.
Whenever a routing node peels off a layer, it adds another layer of junk data to the end, restoring the packet to its "full" size.
In this way, even if a packet has already been transmitted 18 hops, the 19th routing node would still see that the packet contains enough data for up to 20 hops.
The packet size would not provide useful information to the 19th routing node for it to determine if it was the first routing node or the second last routing node.
If it is not the recipient itself, then it knows only that the recipient is someone it is connected to by up to 20 hops.
Assuming a sufficiently complex network with a large number of nodes, the size of the packet then gives them no useful information about the source or the destination of the payment.