In traditional messengers, you send messages to another account by sending them to a centralised server interface, which stores and distributes everything to the receiver(s). Some problems emerge from this. For example: Normally, traditional messengers hide everything they store behind thick walls of access control to avoid unwanted access by third parties. But this also means that nobody except for the people with access to the databases knows for sure how messages are stored and if they are adequately encrypted. To mitigate this server-side trust problem, messengers introduced procedures like the Signal protocol, which implements robust end-to-end encryption with advanced cryptographic algorithms. The goal of end-to-end encryption is to avoid third-party access to any messages sent in a conversation, even when some intermediate cryptographic keys are compromised.
But core mechanisms of such end-to-end protocols like the “double hash ratchet” procedure also have their weaknesses. For example, hidden participants can be introduced into conversations by the service provider behind the visible parts of messenger apps designed to create a trustful feeling. How could this be solved?
Roughly a year ago, I stumbled across a whitepaper. It was about a messenger built on a blockchain network. It approached this problem by employing two solutions. Firstly, the responsibility of storing message data is decentralised between multiple nodes so that there is no central control authority for all data. And secondly, all messages are encrypted, signed and most importantly, visible to everyone. Everybody participating in such a network can verify that their messages are stored correctly and inaccessible to third parties due to encryption, as long as the chosen encryption is strong enough.
Additionally, a messenger based on the blockchain can be built such that users are fully anonymous behind their cryptographic public keys. However, this use case is arguably not strictly limited to blockchain technology but fits here very well. Blockchain messengers initially need their blockchain to decentralise the message storage responsibility globally across clients who want to participate in the network but don’t trust each other. The blockchain is used as a computational foundation to employ a consensus algorithm between non-trustworthy nodes and subsequently ensure that messages are stored according to the reference algorithm.
I found the idea very compelling and started to code my own blockchain-based messenger PeerBridge. After a lot of research and development together with my colleague Felix Kästner over a few months, we finally came to a working prototype with which we could send messages to each other via our blockchain.
We did the project initially because we wanted to learn more about blockchains, exceeding the fundamental theories by creating one ourselves. And we learned a lot. Here are the key takeaways.
Don’t use blockchain to sell your product.
Blockchains are nothing new, but they tend to be sold as non plus ultra solutions. Please don’t do this: think of blockchain first and then think about the products which could be built on it. This was the first mistake we made. We wanted to create something on blockchain in the first place. Realistically, there are often a lot of more accessible and cheaper solutions than using a blockchain. Now we can say we built a working and valuable blockchain, which sounds smart and looks great on a CV. But from an economic perspective, I would consider the project inefficient regarding the solution that we aimed to facilitate. Especially, if it would’ve been aimed toward building an actual market-ready product. Otherwise, we could’ve come to a much cheaper solution like building a system based on or similar to Matrix.
Deploying a Blockchain messenger is like founding a bank. And it comes with similar problems.
As noted above, blockchain conjuncts non-trustworthy participants. This is achieved by providing collective incentives (and penalties). The primary motivation within a blockchain ecosystem is always its cryptocurrency and the corresponding value that backs it. It’s pretty easy to implement a cryptocurrency because, as the original developer of a cryptosystem, you can freely choose how the currency is distributed. Add numbers to blocks and transactions and name them block rewards and transaction fees. Then, add up the numbers well-defined, and you have your account balance.
However, much more difficult is to bind the cryptocurrency to other valuable assets. To achieve that, you either need to start your exchange that underlies the local financial regulations or list your currency on one of the existing cryptocurrency exchanges (often bound to a fee of a couple of million $) and let the market decide its price. There might be other solutions, such as attaching your currency to another crypto ecosystem (like with ERC20-tokens) but none was feasible for a 2-man-team like us. At that moment, we would’ve had to start up a company and get venture capital to facilitate a real deployment.
There is no entirely satisfactory consensus protocol.
Proof-of-Work is very easy to implement and robust, but it uses too much energy. There are Proof-of-Stake or other (also hybridised) consensus protocols as alternatives. Although Proof-of-Stake is better than Proof-of-Work in its environmental sustainability, it comes with a self-reinforcing oligarchy. Participants who initially own the most cryptocurrency will also continuously get a more significant fraction of the rewards. It is also easy to get lost in the variations of consensus protocols. Often, the same abstract consensus protocol can be differentiated further by their concrete selection of rewards, the delegation of authority or the introduction of cryptocurrency burning. But so far, we didn’t come across consensus algorithms that didn’t also have their drawbacks.
We ultimately decided on (non-delegated) Proof-of-Stake in PeerBridge because it was comparatively easy to comprehend and implement with our available resources. But beyond the already mentioned built-in oligarchy, we also had to consider specialised attacks on the system, such as the shuffling attack. In such an attack, the attacker continuously shuffles cryptocurrency to own public keys that are more probable to create the next block, enlargening the attacker's relative stake. Mitigations include introducing concepts such as counting only stakes as relevant for the consensus, which was held for a long enough time.
Proof-of-Stake is surprisingly easy to compute if you don’t introduce advanced governance mechanisms or delegation.
We wanted our consensus protocol to create a block roughly once in a second. This requirement emerged from the idea that messages shouldn’t take longer than a second to be stored in a block. Keep in mind that block frequencies cannot be lowered freely since they impose a natural limit on the scalability and robustness of the distributed network. We found 1 second to be a good tradeoff. In Proof-of-Work blockchains, block creation times can be easily regulated by the hashing difficulty of new blocks. We adapted this for our Proof-of-Stake consensus protocol.
The core principle of our chosen Proof-of-Stake is that an upper bound increases with time after the last block creation, and nodes can create a new block once their hit is below the rising upper bound. Imagine it like the Limbo dance, but the Limbo bar is our upper bound and the person trying to pass it is our hit. The winner gets the reward by creating a new block. How fast the upper bound is rised depends on the current block difficulty. The block difficulty is automatically regulated by the system such that blocks are created roughly every 1s. The actual Proof-of-Stake is implemented such that a participant with a higher stake encounters a faster increase in the current upper bound. Metaphorically speaking, participants with more cryptocurrency have a higher chance of passing the Limbo bar. And most importantly, the hit of each participant is redistributed after each block generation, which introduces a pseudo-random factor to give participants with equal stakes an approximately equal chance of winning. Here is the code:
The key idea of the proof calculation is that every participant can verify another participant's right to create a block. So if a block is rightfully created, every other participant can integrate the block into its blockchain without trusting the block creator's self-authorisation. Vice versa, if a participant was not eligible to create a block, it can be dismissed by all other clients in the network. However, this would’ve become more complicated if we approached delegation or voting for collective governance. We didn’t.
A working blockchain can be implemented in less than 100 lines of code. Or not?
We’ve come across many tutorials in our language of choice (Go), which proclaimed to teach “how to implement a blockchain”. This typically included:
- Declaration of the block and transaction models
- Declaration of the blockchain model (just a collection of blocks)
- Implementation of a simple try-to-hit-a-hash-mechanism
- Show how blocks are created after a genesis block
This is where most tutorials end. No multi-node connectivity or consensus, which misses the point of a real blockchain. This is where it got fiddly for us. We needed to find out how to connect nodes in a peer-to-peer network. After lots of trial-and-error, we came to a solution utilising the distributed hash table Kademlia with go-libp2p. However, achieving local inter-node connectivity is only half of the rent. In local networks, everything is easy-peasy. In global networks interconnecting multiple local networks, this is another story:
Peer-to-peer networks become challenging when it comes to NAT traversal.
We read a lot about concepts like “hole-punching” to achieve a connection to clients behind a NAT (as with every typical router). This is just one example of how implementing these overlay networks goes down the rabbit hole. From our experience, most development time went into implementing and testing the overlay network functionality.
Achieving consensus and fault tolerance with multiple concurrent connections is additionally tricky. We needed to implement migration and fallback mechanisms to account for problems like lost blocks or new nodes in the network that needed to catch up with the current blockchain head. By the way, we should mention:
Blockchains are trees (or DAGs). This makes them slow and additionally complicated.
With the previously mentioned block creation time of 1s, it frequently occurs that multiple nodes create blocks concurrently due to network latencies. This imposes the problem mentioned above of a robust block and transaction migration (see ACID criteria). It also necessitates side chains. We chose a tree structure for the blockchain head of an arbitrary height in our implementation. Blocks that are pushed below this height by new incoming blocks get finalised, and blocks from short side chains are orphaned together with their contained transactions. Those transactions get queued again for new blocks if they are not yet included in the chain to avoid double-spending. Problems like these make the migration of recent transactions and blocks additionally tricky. And once the blockchain reaches sizes of multiple GB, it becomes vital to look out for bottlenecks in the implementation. In the end, we had to construct our own custom recursive SQL queries to be able to fetch blockchain information in a potentially scalable manner:
Coming back to the original point. Even if tutorials on blockchain seem easy at first, they often oversimplify and leave out critical issues. At least from what we experienced.
This is not a rant. But as discussed above, in many different perspectives, complexity was our main enemy during the development of our blockchain messenger. It often seemed like we always needed more complex solutions to account for problems resulting from other complicated solutions, which slowed down our progress towards a finished product significantly. But there was one case where we found a sleek and satisfactory solution.
We love the secp256k1 elliptic curve.
Every transaction in a blockchain needs a sender and a receiver. Both are identified with their public key, resulting from a cryptographic key pair generation. Additionally, transaction senders need to sign their transactions with the same key pair such that all spectators can verify the authenticity of the transaction. Cryptosystems like Bitcoin or Ethereum use the secp256k1 elliptic curve digital signature algorithm (ECDSA) to facilitate this. We adapted the existing implementation from Bitcoin (in C++) for our Go-based blockchain, similar to how Ethereum does it. We included it in our client applications as well to achieve seamless interoperability.
Another advantage of the ECDSA is the shortness of public keys compared to RSA. This drastically reduces transaction payloads. And using the elliptic curve Diffie-Hellman (ECDH) procedure, we were able to encrypt the messages from our clients with the generation of a shared secret. The trick with this shared secret is that both conversation participants only need to know each other's public keys and their private key. Then both can independently derive a shared secret without sharing any message in advance—a straightforward solution for multiple problems.
After having developed our application, we presented it at OUTPUT.DD, a project board, organised by the computer science faculty at TU Dresden. It attracted a lot of attention. Our presentation described the original reasoning behind developing a blockchain messenger and the significant problems we encountered, highlighting the weaknesses of such a concept.
After the presentation, we discussed a bit more with colleagues. In this way, we found more significant problems with the general idea of a blockchain messenger. For example, we discussed the hypothetical possibility that currently safe encryption might not forever be secure, increasing cracking power and further breakthroughs in quantum computing. But since all transactions are supposed to become final and therefore non-removable, there might be the possibility of decrypting messages in the future. And it is possible to use the metadata on a blockchain to trace messages from specific individuals. Additionally, since the system of a blockchain messenger is decentralised with no singular control authority, this imposes problems with criminal networking or sharing other illegal content. Other issues are the limited scalability, as is visible with highly frequented blockchains like Ethereum and its exploding gas fees, or the ever-increasing cost of storage a node would need to offer.
As with our final presentation at OUTPUT.DD, we subsequently concluded that blockchain messengers are concepts with too many flaws and problems to become a possible way of communication in the future. Please don't do it from the perspective of building a good product for the abstract use cases. Use Matrix or a similar system instead. From our perspective as software engineers, we still don’t regret developing the blockchain messenger since we were able to account for our initial goal to learn more about blockchain ecosystems in general. So from that perspective, please do it!