lnbook/11_gossip_channel_graph.asc...

[[gossip]]
== Gossip and the Channel Graph

In this chapter we will describe the Lightning Network's _Gossip Protocol_ and how it is used by nodes to construct and maintain a _Channel Graph_. We will also review the _DNS Bootstrap_ mechanism used to find peers to "gossip" with.

The gossip protocol and routing information section is highlighted by a double outline spanning the "Routing Layer" and "Peer 2 Peer Layer" of <<LN_protocol_gossip_highlight>>:

[[LN_protocol_gossip_highlight]]
.The Lightning Network Protocol Suite
image::images/mtln_1101.png["The Lightning Network Protocol Suite"]

As we've learned already, the Lightning Network uses a source-based onion routing protocol to deliver a payment from a sender to the recipient.
To do this, the sending node must be able to construct a path of payment channels that connects it with the recipient, as we will see in <<path_finding>>.
Thus, the sender has to be able to map the Lightning Network, by constructing a _Channel Graph_.
The channel graph is the interconnected set of publicly advertised channels, and the nodes that these channels interlink.

As channels are backed by a funding transaction that is happening on-chain, one might falsely believe that Lightning Nodes could just extract the existing channels from the Bitcoin Blockchain.
However this is only possible to a certain extent.
The funding transactions are Pay-to-Witness-Script-Hash (P2WSH) addresses and the nature of the script (a 2-of-2 multisig) will only be revealed once the funding transaction output is spent.
Even if the nature of the script was known it's important to remember that not all 2-of-2 multisig scripts correspond to payment channels.

There are even more reasons why looking at the Bitcoin blockchain might not be helpful.
For example on the Lightning Network the Bitcoin keys that are used for signing are rotated by the nodes for every channel and update.
Thus even if we could reliably detect funding transactions on the Bitcoin blockchain we would not know which two nodes on the Lightning Network own that particular channel.

The Lightning Network solves this problem by implementing a _gossip protocol_.
Gossip protocols are typical for peer-to-peer (P2P) networks and allow nodes to share information with the whole network with just a few direct connections to peers.
Lightning nodes open encrypted peer-to-peer connections to each other and share (gossip) information that they have received from other peers.
As soon as a node wants to share some information, for example about a newly created channel, it sends a message to all its peers.
Upon receiving a message a node decides if the received message was novel and, if so, forwards the information to its peers.
In this way if the peer-to-peer network is well connected, all new information that is necessary for the operation of the network will eventually be propagated to all other peers.

Obviously if a new peer joins the network for the first time, it needs to know some other peers on the network, so it can connect to others and participate in the network.

In this chapter, we'll explore exactly _how_ Lightning nodes discover each other, discover and update their node status, and communicate with one another.

When most refer to the _network_ part of the Lightning Network, they're referring to the channel graph which itself is a unique authenticated data structure _anchored_ in the base Bitcoin
blockchain.

However, the Lightning network is also a peer-to-peer network of nodes that gossip information about payment channels and nodes. Usually for two peers to maintain a payment channel they need to talk to each other directly which means that there will be a peer connection between them.
This suggests that the channel graph is a subnetwork of the peer to peer network.
However this is not true because payment channels can remain open even if one or both peers go temporarily offline.

Let's revisit some of the terminology that we have used throughout the book, specifically looking at what they mean in terms of the channel graph and the peer-to-peer network (see <<network_terminology>>).

[[network_terminology]]
.Terminology of the different networks
[options="header"]
|===
| Channel graph  |Peer-to-peer network
|  channel | connection
| open | connect
| close | disconnect
|  funding transaction | encrypted TCP/IP connection
| send	|	transmit
| payment |  message
|===

As the Lightning Network is a peer-to-peer network, some initial bootstrapping is required in order for peers to discover each other.  Within this chapter we'll follow the story of a new peer connecting to the network for the first time, and examine each step in the bootstrapping process from initial peer discovery to channel graph syncing and validation.

As an initial step, our new node needs to somehow _discover_ at least _one_ peer that is already connected to the network and has a full channel graph (as we'll see later, there's no canonical version of the channel graph). Using one of many initial bootstrapping protocols to find that first peer, after a connection is established, our new
peer now needs to _download_ and _validate_ the channel graph. Once the channel graph has been fully validated, our new peer is ready to start opening channels and sending payments on the network.

After initial bootstrap, a node on the network needs to continue to maintain its view of the channel graph by: processing new channel routing policy updates, discovering and validating new channels, removing channels that have been closed on-chain, and finally pruning channels that fail to send out a proper "heartbeat" every 2 weeks or so.

Upon completion of this chapter, you will understand a key component of
the peer-to-peer Lightning Network: namely how peers discover each other and maintain a local copy (perspective) of the channel graph. We'll begin by exploring the story of a new node that has just booted up, and needs to find other peers to connect to on the network.

=== Peer Discovery

In this section, we'll begin to follow a new Lightning node that wishes to join the network through 3 steps:

. Discover a set of bootstrap peers
. Download and validate the channel graph
. Begin the process of ongoing maintenance of the channel graph itself


==== P2P Bootstrapping

Before doing any thing else, our new node first needs to discover a set of peers who are already part of the network. We call this process initial peer bootstrapping, and it's something that every peer to peer network needs to implement properly in order to ensure a robust, healthy network.

Bootstrapping new peers to existing peer to peer networks is a very well studied problem with several known solutions, each with their own distinct trade offs. The simplest solution to this problem is simply to package a set of _hard coded_ bootstrap peers into the packaged p2p node software. This is simple in that each new node has a list of bootstrap peers in the software they're running, but rather fragile given that if the set of bootstrap peers goes offline, then no new nodes will be able to join the network. Due to this fragility, this
option is usually used as a fallback in case none of the other p2p bootstrapping mechanisms work properly.

Rather than hard coding the set of bootstrap peers within the software/binary itself, we can instead allow peers to dynamically obtain a fresh/new set of bootstrap peers they can use to join the network. We'll call this process _initial peer discovery_. Typically we'll leverage
existing Internet protocols to maintain and distribute a set of bootstrapping peers. A non-exhaustive list of protocols that have been used in the past to accomplish initial peer discovery include:

  * Domain Name Service (DNS)
  * Internet Relay Chat (IRC)
  * Hyper-Text Transfer Protocol (HTTP)

Similar to the Bitcoin protocol, the primary initial peer discovery mechanism used in the Lightning Network happens via DNS. As initial peer discovery is a critical and universal task for the network, the process has been _standardized_ in https://github.com/lightningnetwork/lightning-rfc/blob/master/10-dns-bootstrap.md[BOLT #10 - DNS Bootstrap].

==== DNS Bootstrapping

The https://github.com/lightningnetwork/lightning-rfc/blob/master/10-dns-bootstrap.md[BOLT #10] document describes a standardized way of implementing peer
discovery using the Domain Name System (DNS). Lightning's flavor of DNS based bootstrapping uses up to 3 distinct record types:

  * +SRV+ records for discovering a set of _node public keys_.
  * +A+ records for mapping a node's public key to its current +IPv4+ address.
  * +AAA+ records for mapping a node's public key to its current +IPv6+ address.

Those somewhat familiar with the DNS protocol may already be familiar with the +A+ (name to IPv4 address) and +AAA+ (name to IPv6 address) record types, but not the +SRV+ type. The +SRV+ record type is used by protocols built on top of DNS, to determine the _location_ for a specified service. In our context, the service in question is a given Lightning node, and the location its IP address. We need to use this additional record type as unlike nodes within the Bitcoin protocol, we need both a public key _and_ an IP address in order to connect to a node. As we see in <<wire_protocol>> the transport encryption protocol used in LN requires knowledge of the public key of a node before connecting, so as to implement identity hiding for nodes in the network.

===== A new peer's bootstrapping workflow

Before diving into the specifics of https://github.com/lightningnetwork/lightning-rfc/blob/master/10-dns-bootstrap.md[BOLT #10], we'll first outline the high level flow of a new node that wishes to use BOLT #10 to join the network.

First, a node needs to identify a single, or set of DNS servers that understand BOLT #10 so they can be used for p2p bootstrapping.

While BOLT #10 uses lseed.bitcoinstats.com as the seed server, there exists no "official" set of DNS seeds for this purpose, but each of the major implementations maintain their own DNS seed, and cross query each other's seeds for redundancy purposes. In <<dns_seeds>> you'll see a list non-exhaustive list of some popular DNS seed servers.

[[dns_seeds]]
.Table of known lightning dns seed servers
[options="header"]
|===
| dns server     | Maintainer
| lseed.bitcoinstats.com | Christian Decker
| nodes.lightning.directory | lightning labs (Olaoluwa Osuntokun)
| soa.nodes.lightning.directory | lightning labs (Olaoluwa Osuntokun)
| lseed.darosior.ninja | Antoine Poinsot
|===


DNS seeds exist for both Bitcoin's mainnet and testnet. For the sake
of our example, we'll assume the existence of a valid BOLT #10 DNS seed at +nodes.lightning.directory+.

Next, our new node will issue an +SRV+ query to obtain a set of _candidate bootstrap peers_. The response to our query will be a series of _bech32_ encoded public keys. As DNS is a text based protocol, we can't send raw binary data, so an encoding scheme is required. BOLT #10 specifies a bech32 encoding due to its use in the wider Bitcoin ecosystem. The number of encoded public keys returned depends on the server returning the query, as well as all the resolvers that stand between the client and the authoritative server.

Using the widely available +dig+ command-line tool, we can query the _testnet_ version of the DNS seed mentioned above with the following command:

----
$ dig @8.8.8.8 test.nodes.lightning.directory SRV
----

We use the +@+ argument to force resolution via Google's nameserver (with IP address 8.8.8.8) as they do not filter large SRV query responses. At the end of the command, we specify that we only want +SRV+ records to be returned. A sample response looks something like:

====
----
$ dig @8.8.8.8 test.nodes.lightning.directory SRV

; <<>> DiG 9.10.6 <<>> @8.8.8.8 test.nodes.lightning.directory SRV
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43610
;; flags: qr rd ra; QUERY: 1, ANSWER: 25, AUTHORITY: 0, ADDITIONAL: 1

;; QUESTION SECTION:
;test.nodes.lightning.directory.	IN	SRV

;; ANSWER SECTION:
test.nodes.lightning.directory.	59 IN	SRV	10 10 9735 <1>
ln1qfkxfad87fxx7lcwr4hvsalj8vhkwta539nuy4zlyf7hqcmrjh40xx5frs7.test.nodes.lightning.directory. <2>
test.nodes.lightning.directory.	59 IN	SRV	10 10 15735 ln1qtgsl3efj8verd4z27k44xu0a59kncvsarxatahm334exgnuvwhnz8dkhx8.test.nodes.lightning.directory.

 [...]

;; Query time: 89 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Thu Dec 31 16:41:07 PST 2020
----
<1> TCP port number where LN node can be reached.
<2> Node public key (ID) encoded as a virtual domain name.
====

We've truncated the response for brevity and show only two of the returned responses. The responses contain a "virtual" domain name for a target node, then to the left we have the _TCP port_ where this node can be reached. The first response uses the standard TCP port for LN: +9735+. The second response uses a custom port which is permitted by the protocol.

Next, we'll attempt to obtain the other piece of information we need to connect to a node: its IP address. Before we can query for this however, we'll first _decode_ the bech32 encoding of the public key from the virtual domain name:

----
ln1qfkxfad87fxx7lcwr4hvsalj8vhkwta539nuy4zlyf7hqcmrjh40xx5frs7
----

Decoding this bech32 string we obtain the following valid
+secp256k1+ public key:

----
026c64f5a7f24c6f7f0e1d6ec877f23b2f672fb48967c2545f227d70636395eaf3
----

Now that we have the raw public key, we'll now ask the DNS server to _resolve_ the virtual host given so we can obtain the IP information (+A+ record) for the node:

====
----
$ dig ln1qfkxfad87fxx7lcwr4hvsalj8vhkwta539nuy4zlyf7hqcmrjh40xx5frs7.test.nodes.lightning.directory A

; <<>> DiG 9.10.6 <<>> ln1qfkxfad87fxx7lcwr4hvsalj8vhkwta539nuy4zlyf7hqcmrjh40xx5frs7.test.nodes.lightning.directory A
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 41934
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;ln1qfkxfad87fxx7lcwr4hvsalj8vhkwta539nuy4zlyf7hqcmrjh40xx5frs7.test.nodes.lightning.directory. IN A

;; ANSWER SECTION:
ln1qfkxfad87fxx7lcwr4hvsalj8vhkwta539nuy4zlyf7hqcmrjh40xx5frs7.test.nodes.lightning.directory. 60 IN A X.X.X.X <1>

;; Query time: 83 msec
;; SERVER: 2600:1700:6971:6dd0::1#53(2600:1700:6971:6dd0::1)
;; WHEN: Thu Dec 31 16:59:22 PST 2020
;; MSG SIZE  rcvd: 138
----
<1> The DNS server returns an IP Address X.X.X.X. We've replaced it with X's in the text here so as to avoid presenting a real IP address.
====

In the above command, we've queried the server so we can obtain an +IPv4+ (+A+ record) address for our target node (replaced by X.X.X.X in the example above). Now that we have both the raw public key, IP address, and TCP port, we can connect to the node transport protocol at:
+026c64f5a7f24c6f7f0e1d6ec877f23b2f672fb48967c2545f227d70636395eaf3@X.X.X.X:9735+

Querying the current DNS +A+ record for a given node can also be used to look up the _latest_ set of addresses. Such queries can be used to more quickly sync the latest addressing information for a node, compared to waiting for address updates on the gossip network (see <<node_announcement>>).

At this point in our journey, our new Lightning node has found its first
peer and established its first connection! Now we can begin the second phase of new peer bootstrapping: channel graph synchronization and validation.

First, we'll explore more of the intricacies of BOLT 10 itself to take a deeper look into how things work under the hood.

==== SRV Query Options

The https://github.com/lightningnetwork/lightning-rfc/blob/master/10-dns-bootstrap.md[BOLT #10] standard is highly extensible due to its usage of nested
sub-domains as a communication layer for additional query options. The
bootstrapping protocol allows clients to further specify the _type_ of nodes they're attempting to query for vs the default of receiving a random subset of nodes in the query responses.

The query option sub-domain scheme uses a series of key-value pairs where the key itself is a _single letter_ and the remaining set of text is the value itself. The following query types exist in the current version of the https://github.com/lightningnetwork/lightning-rfc/blob/master/10-dns-bootstrap.md[BOLT #10] standards document:

  * +r+: The "realm" byte which is used to determine which chain or realm    queries should be returned for. As is, the only value for this key is +0+ which denotes "Bitcoin".

  * +a+: Allows clients to filter out returned nodes based on the _types_ of addresses they advertise. As an example, this can be used to only obtain nodes that advertise a valid IPv6 address. The value that follows this type is based on a bitfled that _indexes_ into the set of specified address _type_ which are defined in https://github.com/lightningnetwork/lightning-rfc/blob/master/07-routing-gossip.md[BOLT #7]. The default value for this field is +6+, which which represents both IPv4 and IPv6 (bits 1 and 2 are set)

  * +l+: A valid node public key serialized in compressed format. This allows a client to query for a specified node rather than receiving a set of random nodes.

  * +n+: The number of records to return. The default value for this field is +25+.

An example query with additional query options looks something like the following:

----
r0.a2.n10.nodes.lightning.directory
----

Breaking down the query one key-value pair at a time we gain the following
insights:

  * +r0+: The query targets the Bitcoin realm
  * +a2+: The query only wants IPv4 addresses to be returned
  * +n10+: The query requests

Try some combinations of the various flags using the +dig+ DNS command-line tool yourself:

----
dig @8.8.8.8 r0.a6.nodes.lightning.directory SRV
----

=== The Channel Graph

Now that our new node is able to use the DNS bootstrapping protocol to connect to their very first peer, it can start to sync the channel graph! However, before we sync the channel graph, we'll need to learn exactly _what_ we mean by the channel graph. In this section we'll explore the precise _structure_ of the channel graph and examine the unique aspects of the channel graph compared to the typical abstract "graph" data structure which is well known/used in the field of computer science.

==== A Directed Graph

A graph in computer science is a special data structure composed of vertices (typically referred to as nodes) and edges (also known as links). Two nodes may be connected by one or more edges. The channel graph is also _directed_ given that a payment is able to flow in either direction over a given edge (a channel). An example of a _directed graph_ is shown below:

[[directed_graph]]
.A directed graph (Source: Wikimedia Commons)
image::images/mtln_1102.png["A directed graph"]

In the context of the Lightning Network, our vertices are the Lightning nodes themselves, with our edges being the payment channels connecting these nodes. As we're concerned with _routing payments_, in our model a node with no edges (no payment channels) isn't considered to be a part of the graph as it isn't useful.

As channels themselves are UTXOs (funded 2-of-2 multisig addresses), we can view the channel graph as a special subset of the Bitcoin UTXO set, on top of which we can add some additional information (the nodes, etc) to arrive at the final overlay structure which is the channel graph. This anchoring of fundamental components of the channel graph in the
base Bitcoin blockchain means that it's impossible to _fake_ a valid channel graph, which has useful properties when it comes to spam prevention as we'll see later.

=== Gossip Protocol Messages

The channel graph information is propagated across the Lightning P2P Network as three messages, which are described in https://github.com/lightningnetwork/lightning-rfc/blob/master/07-routing-gossip.md[BOLT #7]:

 * +node_announcement+: The vertex in our graph which communicates the public key of a node, as well as how to reach the node over the internet and some additional metadata describing the set of _features_ the node supports.

 * +channel_announcement+: A blockchain anchored proof of the existence of a channel between two individual nodes. Any 3rd party can verify this proof in order to ensure that a _real_ channel is actually being advertised. Similar to the +node_announcement+ this message also contains information describing the _capabilities_ of the channel which is useful when attempting to route a payment.

 * +channel_update+: A _pair_ of structures that describes the set of _routing policies_ for a given channel. +channel_update+ messages come in a _pair_ as a channel is a directed edge, so each side of the channel is able to specify its own custom routing policy.

It's important to note that each of components of the channel graph are
themselves _authenticated_ allowing a 3rd party to ensure that the owner of a channel/update/node is actually the one sending out an update. This effectively makes the channel graph a unique type of _authenticated data structure_ that cannot be counterfeited. For authentication, we use an +secp256k1+ ECDSA digital signature (or a series of them) over the serialized digest of the message itself. We won't get into the specific of the messaging framing/serialization used in the LN in this chapter, as we'll cover that information in <<wire_protocol>>

With the high level structure of the channel graph laid out, we'll now dive down into the precise structure of each of the three messages used to gossip the channel graph. We'll also explain how one can also _verify_ each message and component of the channel graph.

[[node_announcement]]
==== The node_announcement Message

First, we have the +node_announcement+ message, which serves two primary
purposes:

 1. To advertise connection information so other nodes can connect to a node either to bootstrap to the network, or to attempt to establish a  new payment channel with that node.

 2. To communicate the set of protocol-level features (capabilities) a node understands/supports. Feature negotiation between nodes allows developers to add new features independently and support them with any other node on an opt-in basis.

Unlike channel announcements, node announcements are not anchored in
the base blockchain. Therefore, node announcements are
only considered "valid" if they have propagated with a corresponding channel announcement. In other words, we always reject nodes without payment channels in order to ensure a malicious peer can't flood the network with bogus nodes that are not part of the channel graph.

===== The node_announcement message structure

The node_announcement is comprised of
the following fields:

  * +signature+: A valid ECDSA signature that covers the serialized digest of all fields listed below. This signature must correspond to the public key of the advertised node.

  * +features+: A bit vector that describes the set of protocol features that this node understands. We'll cover this field in more detail in <<feature_bits>> on the extensibility of the Lightning protocol. At a high level, this field carries a set of bits that represent the features a node understands. As an example, a node may signal that it understands the latest channel type.

  * +timestamp+: A UNIX "epoch" encoded timestamp. This allows clients to enforce a partial ordering over the updates to a node's announcement.

  * +node_id+: The +secp256k1+ public key that this node announcement belongs to. There can only be a single +node_announcement+ for a given node in the channel graph at any given time. As a result, a +node_announcement+ can supersede a prior +node_announcement+ for the same node if it carries a higher (later) timestamp.

  * +rgb_color+: A field that allows a node to specify an RGB "color" to be associated with it, often used in channel graph visualizations and node directories.

  * +alias+: A UTF-8 string to serve as the nickname for a given node. Note that these aliases aren't required to be globally unique, nor are they verified in any way. As a result, they should not be relied on as a form of identity - they can be easily spoofed.

  * +addresses+: A set of public internet reachable addresses that are to be associated with a given node. In the current version of the protocol four address types are supported: IPv4 (1), IPv6 (2), Tor v2 (3), Tor v3 (4). On the wire, each of these address types are denoted by an integer type which is included in parenthesis after the address type.

===== Validating node announcements

Validating an incoming +node_announcement+ is straight forward. The following assertions should be upheld when examining a node announcement:

  * If an existing +node_announcement+ for that node is already known, then the +timestamp+ field of a new incoming +node_announcement+ MUST be greater than the prior one.

    * With this constraint, we enforce a forced level of "freshness".

  * If no +node_announcement+ exists for the given node, then an existing +channel_announcement+ that references the given node (more on that later) MUST already exist in one's local channel graph.

  * The included +signature+ MUST be a valid ECDSA signature verified using the included +node_id+ public key and the double-sha256 digest of the raw message encoding (minus the signature and frame header) as the message.

  * All included +addresses+ MUST be sorted in ascending order based on their address identifier.

  * The included +alias+ bytes MUST be a valid UTF-8 string.

==== The channel_announcement Message

Next, we have the +channel_announcement+ message, which is used to _announce_ a new _public_ channel to the wider network. Note that announcing a channel is _optional_. A channel only needs to be announced if it is intended to be used for routing by the Lightning network. Active routing nodes may wish to announce all their channels. However, certain nodes like mobile nodes likely don't have the
uptime or desire to be an active routing node. As a result, these
mobile nodes (which typically use light clients to connect to the Bitcoin p2p network), instead may have purely _unannounced_ (also known as "private") channels.

===== Unannounced "private" channels

An unannounced channel isn't part of the known public channel graph, but can still be used to send/receive payments. An astute reader may now be wondering how a channel which isn't part of the public channel graph is able to receive payments. The solution to this problem is a set of "path finding helpers" that we call "routing hints. As we'll see in <<invoices>>, invoices created by nodes with unadvertised channels will include information to help the sender route to them assuming the node has at least a single channel with an existing public routing node.

Due to the existence of unadvertised channels, the _true_ size of the channel graph (both the public and private components) is unknown.

===== Locating a channel on the bitcoin blockchain

As mentioned earlier, the channel graph is authenticated due to its usage of public key cryptography, as well as the Bitcoin blockchain as a spam prevention system. In order to have a node accept a new +channel_announcement+, the advertisement must _prove_ that the channel actually exists in the Bitcoin blockchain. This proof system adds an upfront cost to adding a new entry to the channel graph (the on-chain fees one must pay to create the UTXO of the channel). As a result, we mitigate spam and ensure that a dishonest node on the network can't fill up the memory of an honest node at no cost with bogus channels.

Given that we need to construct a proof of the existence of a channel, a
natural question that arises is: how do we "point to" or reference a given channel for the verifier? Given that a payment channel is anchored in a Unspent Transaction Output (see <<utxo>>), an initial thought might be to first attempt to advertise the full outpoint (+txid:index+) of the channel. Given the outpoint is globally unique and confirmed in the chain, this sounds like a good idea, however it has a drawback: the verifier must maintain a full copy of the UTXO set in order to verify channels. This works fine for Bitcoin full-nodes, but clients which rely on lightweight verification don't typically maintain a full UTXO set. As we want to ensure we can support mobile nodes in the Lightning Network, we're forced to find another solution.

What if rather than referencing a channel by its UTXO, we reference it based on its "location" in the chain? In order to do this, we'll need a scheme that allows us to reference a given block, then a transaction within that block, and finally a specific output created by that transaction. Such an identifier is described in https://github.com/lightningnetwork/lightning-rfc/blob/master/07-routing-gossip.md[BOLT #7] and is referred to as a _short channel ID_, or +scid+.
The +scid+ is used both in +channel_announcement+ (and +channel_update+) as well as within the onion encrypted routing packet included within HTLCs as we learned <<onion_routing>>.

[[short_channel_id]]
[[scid]]
===== The short channel id

Based on the information above, we have three pieces of information we need to encode in order to uniquely reference a given channel. As we want a compact representation, we'll attempt to encode the information into a _single_ integer. Our integer format of choice is an unsigned 64-bit integer, comprised of 8 bytes.

First, the block height: Using 3 bytes (24-bits) we can encode 16777216 blocks. That leaves 5 bytes for us to encode the transaction index and the output index respectively. We'll use the next 3
bytes to encode the transaction index _within_ a block. This is more than enough given that it's only possible to fix tens of thousands of transactions in a block at current block sizes. This leaves 2 bytes left for us to encode the output index of the channel within the transaction.

Our final +scid+ format resembles:
----
block_height (3 bytes) || transaction_index (3 bytes) || output_index (2 bytes)
----

Using bit packing techniques, we first encode the most significant 3 bytes as the block height, the next 3 bytes as the transaction index, and the least significant 2 bytes as the output index of that creates the channel UTXO.

A short channel ID can be represented as a single integer
(+695313561322258433+) or as a more human friendly string: +632384x1568x1+. Here we see the channel was mined in block +632384+, was the +1568+'th transaction in the block, with the channel output as the second (UTXOs are zero-indexed) output produced by the transaction.

Now that we're able to succinctly point to a given channel funding output in the chain, we can now examine the full structure of the +channel_announcement+ message, as well as how to verify the proof-of-existence included within the message.

===== The channel_announcement message structure

A channel_announcement primarily communicates two things:

 1. A proof that a channel exists between node A and node B with both nodes controlling the mulitsig keys in that channel output.

 2. The set of capabilities of the channel (what types of HTLCs can it route, etc)

When describing the proof, we'll typically refer to node +1+ and node +2+. Out of the two nodes that a channel connects, the "first" node is the node that has a "lower" public key encoding when we compare the public key of the two nodes in compressed format hex-encoded in lexicographical order. Correspondingly, in addition to a node public key on the network, each node should also control a public key within the Bitcoin blockchain.

Similar to the +node_announcement+ message, all included signatures of the +channel_announcement+ message should be signed/verified against the raw encoding of the message (minus the header) that follows _after_ the final signature (as it isn't possible for a digital signature to sign itself).

With that said, a +channel_announcement+ message has the following fields:

 * +node_signature_1+: The signature of the first node over the message digest.

 * +node_signature_2+: The signature of the second node over the message
   digest.

 * +bitcoin_signature_1+: The signature of the multisig key (in the funding output) of the first node over the message digest.

 * +bitcoin_signature_2+:  The signature of the multisig key (in the funding output) of the second node over the message digest.

 * +features+: A feature bit vector that describes the set of protocol level features supported by this channel.

 * +chain_hash+: A 32 byte hash which is typically the genesis block hash of the blockchain (e.g. Bitcoin mainnet) the channel was opened within.

 * +short_channel_id+: The +scid+ that uniquely locates the given channel funding output within the blockchain.

 * +node_id_1+: The public key of the first node in the network.

 * +node_id_2+: The public key of the second node in the network.

 * +bitcoin_key_1+: The raw multisig key for the channel funding output for the first node in the network.

 * +bitcoin_key_2+: The raw multisig key for the channel funding output for the second node in the network.

===== Channel announcement validation

Now that we know what a +channel_announcement+ contains, we can look at how to verify the channel's existence on-chain.

Armed with the information in the +channel_announcement+, any Lightning node (even one without a full copy of the Bitcoin blockchain) can verify the existence and authenticity of the payment channel.

First, the verifier will use the short channel ID to find which Bitcoin block contains the channel funding output. With the block height information, the verifier can request only that specific block from a Bitcoin node. The block can then be linked back to the genesis block by following the block header chain backwards (verifying the proof-of-work), confirming that this is in fact a block belonging to the Bitcoin blockchain.

Next, the verifier uses the transaction index number to identify the transaction ID of the transaction containing the payment channel. Most modern Bitcoin libraries will allow indexing into the transaction of a block based on the index of the transaction within the greater block.

Next, uses a Bitcoin library (in the verifier's language) to extract the relevant transaction according to its index within the block. The verifier will validate the transaction (checking that it is properly signed and produces the same transaction ID when hashed).

Next, the verifier will extract the Pay-to-Witness-Script-Hash output referenced by the output index number of the short channel ID. This is the address of the channel funding output. Additionally, the verified will ensure that the size of the allegd channel matches the value of the output produced at the specified output index.

Finally, the verifier will reconstruct the multisig script from +bitcoin_key_1+ and +bitcoin_key_2+ and confirm that it produces the same address as in the output.

The verifier has now independently verified that the payment channel in the announcement is funded and confirmed on the Bitcoin blockchain!

==== The channel_update Message

The third and final message used in the gossip protocol is the +channel_update+ message. Two of these are generated for each payment channel (one by each channel partner) announcing their routing fees, timelock expectations and capabilities.

The +channel_update+ message also contains a timestamp, allowing a node to update its routing fees and other expectations and capabilities by sending a new +channel_update+ message with a higher (later) timestamp that supersedes any older updates.

The +channel_update+ message contains the following fields:


* signature: A digital signature matching the node's public key, to authenticate the source and integrity of the channel update

* chain_hash: The hash of the genesis block of the chain containing the channel

* short_channel_id: The short channel ID to identify the channel

* timestamp: The timestamp of this update, to allow recipients to sequence updates and replace older updates.

* message_flags: A bit field indicating the presence of additional fields in the channel_update message

* channel_flags: A bit field showing the direction of the channel and other channel options

* cltv_expiry_delta: The timelock delta expectations of this node for routing (see <<onion_routing>>)

* htlc_minimum_msat: The minimum HTLC amount that will be routed

* fee_base_msat: The base fee that will be charged for routing

* fee_proportional_millionths: The proportional fee rate that will be charged for routing

* htlc_maximum_msat (option_channel_htlc_max): The maximum amount that will be routed

A node that receives the +channel_update+ message can attach this metadata to the channel graph edge to enable path finding,  as we will see in <<path_finding>>.

=== Ongoing Channel Graph Maintenance

The construction of a channel graph is not a one-time event, but rather an ongoing activity. As a node bootstraps into the network it will start receiving "gossip", in the form of the three update messages. It will use these messages to immediately start building a validated channel graph.

The more information a node receives, the better its "map" of the Lightning Network becomes and the more effective it can be at path finding and payment delivery.

A node won't only add information to the channel graph. It will also keep track of the last time a channel was updated and will delete "stale" channels that have not been updated in more than two weeks. Finally, if it sees that some node no longer has any channels, it will also remove that node.

The information collected from the gossip protocol is not the only information that can stored in the channel graph. Different Lightning node implementations may attach other metadata to nodes and channels. For example, some node implementations calculate a "score" that evaluates a node's "quality" as a routing peer. This score is used as part of path finding to prioritize or de-prioritize paths.

=== Conclusion
In this chapter, we've learned how Lightning nodes discover each
other, discover and update their node status, and communicate with one another. We've learned how channel graphs are created and maintained and we've explored a few ways that the Lightning Network discourages bad actors or dishonest nodes from spamming the network.