April 23, 2024

Blockchain and P2P Exchange: Repeating Scalability Mistakes

In 2001, when BitTorrent was invented, another p2p network that was already quite well known and worked on the very same principles began to lose ground. It became a victim of its own success: it had almost 150,000 users, and the community-supported directory servers couldn't handle the load. Three hours after going online, a new “server” would reach its bandwidth limit and shut down.


How cryptocurrency can (not) learn from the mistakes of others

How is this related to cryptocurrencies? You may encounter similar problems when trying to fully synchronize a crypto wallet, for example a full Ethereum client. Distributed technology has scalability issues. For those who began their “online career” by unwittingly documenting the decline of the p2p network mentioned above, this situation is surprisingly familiar.

In this article I will try to describe the roots of the problem, show its connection with cryptocurrencies, analyze how it was solved in p2p networks, and consider why these solutions are partially or completely unsuitable for the cryptosphere.

Distributed Scalability History

There are many parallels to be drawn between the cryptosphere and p2p file sharing. Many of them are beautifully covered in the four-part series “Bittorrent Lessons for Crypto” by Simon Morris. I agree with most of his points except one: BitTorrent was not the first system to provide simple and reliable file exchange over slow and unreliable channels. Even leaving aside the 1986 Z-Modem, which already had many robustness features, BT was only a simplified version of an existing technology: a system that allowed many people to download a single file and share pieces of it with each other to increase download speed and, most importantly, reduce the load on the person who originally shared the file. This system also made it possible to post hyperlinks to shared files on regular websites. There was at least one such system: eDonkey 2000, also known as eDonkey or ed2k. The principles of operation of BitTorrent and eDonkey are so similar that it is difficult not to call BitTorrent a clone of the latter.

The main difference between BT and ed2k is that in ed2k you could access files without a third-party website. No need to go to Pirate Bay. You could just search the eDonkey directory server you were connected to and access all the files shared by other members. This could be done using a good old search bar. What happened behind that search bar was both the blessing and the curse of the network.

If you do not have technical knowledge, then simply skip the text below in italics.

To download a file from BitTorrent, you must first find and download the torrent file from some web server and load it into the BT client. The client then takes care of all the fairly simple things itself: it connects to the tracker or trackers for the desired file, gets a list of everyone from whom fragments of this file can be downloaded, and starts downloading.

Neither the tracker nor the participants can be found without a torrent file, which someone must post on a website. This means that if you want to publish a file, you need to:

  • run a tracker for this file
  • create a torrent file with the path to the tracker
  • upload the torrent file to a website
  • distribute the link to the torrent file

Here an additional level of complexity arises, an additional intermediary. Of course, sites that publish such links, such as Pirate Bay, very often break the law.
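
To make the “torrent file with the path to the tracker” step more concrete, here is a minimal sketch of how such a file could be assembled. It assumes the usual bencoded metainfo layout for a single file (an announce URL plus an info dictionary with SHA-1 piece hashes). The tracker URL, the file name, and the helper names bencode and make_torrent are hypothetical, chosen only for illustration.

```python
import hashlib

def bencode(value):
    """Encode ints, strings/bytes, lists and dicts using bencoding."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and are written in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")

def make_torrent(data, name, tracker_url, piece_length=16384):
    """Build single-file torrent metadata that points at one tracker."""
    # One 20-byte SHA-1 digest per piece, concatenated.
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    meta = {
        "announce": tracker_url,  # the "path to the tracker"
        "info": {
            "length": len(data),
            "name": name,
            "piece length": piece_length,
            "pieces": pieces,
        },
    }
    return bencode(meta)

# Hypothetical example: 40,000 bytes of dummy data and a made-up tracker.
torrent = make_torrent(b"x" * 40000, "demo.bin", "http://tracker.example/announce")
print(len(torrent), "bytes of metadata")
```

Note that the result contains only hashes and the tracker address, not the file itself, and it still has to be uploaded to a website before anyone can find it.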

In order to publish a file in eDonkey, you had to:

  • launch the eDonkey client
  • put the file in the desired folder on the computer

And that’s it! Anyone connected to the same "eDonkey server" could find the file by simply typing part of its name in the search bar. Push the button, and you're done. Everyone who was downloading the same file automatically shared fragments of it with others, as in BitTorrent. Thus, the ed2k server played the role of both a website and a tracker.

Please note that neither party stores the actual file data, so neither of them infringes copyright directly. Of course, everyone who uploads or downloads an actual file does violate these rights.

In addition, ed2k made it possible to publish hyperlinks on websites. If users found it more convenient to get a link through a trusted website, they could use this option as well.

Of course, no one expected a single server to do all the work of the search service and track information about every file on the Internet. Therefore, anyone could run their own eDonkey server. The servers connected to each other, forming their own network, as is the case with cryptocurrencies.

However, there was one small problem. How to find the files that “my server” - the server to which I connect - does not track?

How did BitTorrent solve this problem? It didn't, at least in the beginning. If your torrent file does not have an active tracker, then nothing will work. (This feature of the system also prevented some issues from occurring in BT, but I will talk about this another time *.)

On the other hand, eDonkey servers sent a list of all the servers they knew at the time of connection. This allowed a client to... send a file request to all servers. That is why the eDonkey network died.

There could be an unlimited number of eDonkey servers on the Internet. Every client could ask your server whether it had a file. Adding a new server did not reduce the load on other servers, but only increased eDonkey's overall search traffic on the Internet. As more and more clients became aware of an eDonkey server, its Internet connection, often a home DSL line, became overloaded. The only way to continue was to get a different IP address from the ISP.
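
The effect can be sketched with a toy model. It assumes, for simplicity, that every client knows every server and sends each search to all of them. The function name and the numbers are illustrative, not measurements.

```python
def flooding_load(clients, servers, queries_per_client):
    """Query rates when each client sends every search to every server.

    Returns (per_server, network_total) in queries per second.
    """
    per_server = clients * queries_per_client   # does not depend on `servers`
    network_total = per_server * servers        # grows with every new server
    return per_server, network_total

for servers in (10, 100, 1000):
    per_server, total = flooding_load(150_000, servers, 0.01)
    print(f"{servers:5d} servers: {per_server:8.0f} queries/s each, {total:10.0f} in total")
```

In this model, adding servers never lowers the per-server figure. It only multiplies the total traffic.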

Adding a new client also affected every server that this client could find. But I will tell you how this discovery happened another time. It turned out that the maximum number of clients across the whole network that the servers could withstand was about 150,000. This is nothing by modern standards, but we must remember that a DSL connection then had a bandwidth of 256 Kbps.
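
A quick back-of-the-envelope calculation shows how tight that budget was. The link speed and client count come from the text above. The 100-byte query size is purely an assumption for illustration.

```python
LINK_BPS = 256_000     # 256 Kbps DSL line, as quoted in the text
CLIENTS = 150_000      # network-wide client count, as quoted in the text
QUERY_BYTES = 100      # ASSUMED size of one search request

# Share of the server's line available to each client, in bits per second.
bits_per_client = LINK_BPS / CLIENTS

# How long a client has to wait between queries to stay within its share.
seconds_between_queries = QUERY_BYTES * 8 / bits_per_client

print(f"{bits_per_client:.2f} bit/s of server bandwidth per client")
print(f"one {QUERY_BYTES}-byte query per client every "
      f"{seconds_between_queries / 60:.1f} minutes")
```

Under these assumptions each client gets less than two bits per second of a server's line, so even one small query every few minutes from each client is enough to saturate it.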

I participated in the eDonkey community and was in the midst of this disaster. I tried to find solutions to the problem and tried, unsuccessfully, to convince people not to use network-loading tools that increased traffic, and so things got out of control.

That is why I laughed when I first heard about how Bitcoin works, which happened long before its official appearance. I should have started mining instead. Understanding the technical limitations does not guarantee an understanding of the principle of greed.

The problem of cryptocurrency scalability

If you know anything about distributed ledgers, then you probably know that every full client of such a network must have a full copy of the ledger (or at least the current state of each account plus some history), which means that all network updates must be downloaded. As with eDonkey servers, creating another full client does not reduce the load on the other full clients, but only puts an even greater burden on everyone, since clients need to send copies of transactions to yet another computer, one that also generates transactions of its own.

Creating a “light” client, or “leech”, does not help the full clients either. All of them still have to process one more client's transactions, and at least one of them now also receives a request from the light client every time it needs to check its balance.

Thus, the load on the network increases with each new client, whether full or light. It cannot be reduced by any means built into the tools. Does this remind you of anything?
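
The same kind of toy model applies here. It assumes every client, full or light, produces transactions at the same rate and that every full client must receive all of them. The function name and the figures are illustrative only.

```python
def broadcast_load(full_nodes, light_clients, tx_per_client):
    """Transaction rates when every full node must see every transaction.

    Returns (per_full_node, network_total) in transactions per second.
    """
    tx_rate = (full_nodes + light_clients) * tx_per_client
    per_full_node = tx_rate                 # each full node processes all of it
    network_total = tx_rate * full_nodes    # one delivery per full node
    return per_full_node, network_total

for full, light in ((10, 0), (20, 0), (10, 90)):
    per_node, total = broadcast_load(full, light, 1)
    print(f"{full:3d} full + {light:3d} light: {per_node:4d} tx/s per full node, "
          f"{total:5d} deliveries/s overall")
```

Doubling the number of full clients doubles the load on each of them and quadruples the total traffic. Light clients add load to the full clients as well.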

If there are miners on the network, then the same problem affects them too: they have to process and sign each transaction. Typically, their computers and connections are better suited to the server role; after all, they make money on transactions. But even then, overloading the connection is only a matter of time, especially if miners become the de facto servers of the network once all clients turn into light ones because the requirements for full clients are too high for non-commercial use.

The file sharing solution

If you are an avid techie, you might think, “Why didn't they just use distributed hash tables?”

And here's why: it all happened before distributed hash tables were invented.

DHTs (distributed hash tables) spread a hash table, essentially an ordinary database, across a network of computers so that each of them is responsible for part of the data. As a bonus, this lets you store much more data than fits into the memory of any single participating computer. If everything is done correctly, each participant receives requests only for “its” part of the hash table, which distributes the load among many nodes. Adding nodes to a DHT reduces the load on each node rather than increasing it. This is why DHT variants are used almost everywhere today, starting with Google.
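
A minimal sketch of this partitioning idea follows. It assumes a Kademlia-style rule in which the node whose ID is closest to the key by XOR distance is responsible for it. The node and file labels and the function names are made up for the example, and a real DHT would add routing and replication on top.

```python
import hashlib
from collections import Counter

def make_id(label):
    """Map a label to a 160-bit integer ID (SHA-1 is used only to spread IDs)."""
    return int.from_bytes(hashlib.sha1(label.encode()).digest(), "big")

def responsible_node(key, node_ids):
    """Kademlia-style rule: the node with the smallest XOR distance owns the key."""
    return min(node_ids, key=lambda node: node ^ key)

def loads(node_count, key_count=5000):
    """Count how many keys each of `node_count` nodes is responsible for."""
    nodes = [make_id(f"node-{i}") for i in range(node_count)]
    return Counter(
        responsible_node(make_id(f"file-{k}"), nodes) for k in range(key_count)
    )

for n in (4, 16, 64):
    print(f"{n:3d} nodes: busiest node holds {max(loads(n).values())} of 5000 keys")
```

Unlike the flooding model used by eDonkey servers, the share of the busiest node shrinks as nodes join, because each key is the business of one node rather than of all of them.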

In fact, a DHT was the solution both for ed2k (more precisely, in the form of the “Overnet” protocol) and for BitTorrent's tracker separation problem. In Overnet, as in BitTorrent's DHT, each client becomes a tracker for a specific part of the network, providing many more resources for searches. Overnet eliminates the need for an entire network of eDonkey servers.

Can cryptocurrency learn from the mistakes of others?

Distributed hash tables are like a magic wand. They let you shard almost any information, protecting the nodes from too many requests. Can this solution be used for cryptocurrencies?

In short, no (a more detailed answer is provided below). The longer answer: it is possible, but not while keeping all the original promises of the blockchain.

The reliability of a “distributed ledger” is for the most part ensured by the fact that distributed ledgers, unlike distributed hash tables, are distributed only in the way newspapers are: everyone gets their own full copy. Splitting the ledger will reduce the availability of data and, consequently, its reliability.

Here is another analogy with file sharing: you can download a movie only if there are enough sources on the network at the same time to download the entire file. If 300 sources have the beginning of the file, 50 have its end, and no one has the middle, then the movie cannot be downloaded in full. It is sad, but not the end of the world. It is much worse when part of the money in your wallet is missing.

Of course, you can replicate each part of the ledger many times to make sure enough clients hold it... But how much is enough? Is it good when all your money can “go offline” with a 1 in a thousand chance? 1 in a million? As long as all the nodes are owned by ordinary people, you never know how many participants may go offline at any given time. What if there is a massive power outage in Sudan and all copies of your wallet are stored there, even if you are in the US?
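
The “how much is enough” question can be put into numbers with a small sketch. It assumes that each node holding a copy is offline independently with the same probability, which is exactly the assumption that a regional power outage breaks. The 50% offline figure and the function names are illustrative.

```python
def unavailable_probability(p_offline, replicas):
    """Chance that every replica is offline at once, ASSUMING independence."""
    return p_offline ** replicas

def replicas_needed(p_offline, target):
    """Smallest number of replicas that keeps the outage chance at or below target."""
    if not 0 <= p_offline < 1:
        raise ValueError("p_offline must be in [0, 1)")
    replicas = 1
    while unavailable_probability(p_offline, replicas) > target:
        replicas += 1
    return replicas

# Home computers that are switched off half of the time (assumed figure).
for target in (1e-3, 1e-6):
    print(f"target {target:g}: {replicas_needed(0.5, target)} replicas")
```

With independent failures the required number of copies grows slowly. If all copies share one power grid, the real probability is that of the grid failing, no matter how many copies there are.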

Of course, you can always run a full client yourself, and the fragmentation protocol will ensure that a copy of your wallet and transaction history is always available on your computer. But look at this from the other side: let's say you are selling something, and the buyer is the only participant who has proof of his funds... Would you agree to such a deal?

There is also the small matter of network security, which is provided by miners. Of course, in this case not every miner can sign every block. Otherwise, in a network without servers, the server role would simply shift to the miners: each miner would have to process all transactions on the network, heavily loading its connection.

This means that fragmentation automatically reduces the security spending of the cryptocurrency network, making it easier to attack. Today, there are only two cryptocurrency networks that can afford to reduce security spending: Bitcoin and Ethereum. Almost all other cryptocurrency networks have already suffered 51% attacks. This proves one thing: the computational power available to attackers is simply staggering.
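
The dilution can be illustrated with simple arithmetic. The sketch assumes that mining power is split evenly between fragments and that each miner works on one fragment only. This is a simplification made for the example, not a description of any particular network.

```python
def attack_threshold(shards):
    """Fraction of TOTAL mining power needed for a majority in one fragment,
    assuming power is split evenly across `shards` fragments."""
    if shards < 1:
        raise ValueError("there must be at least one fragment")
    return 0.5 / shards

for shards in (1, 2, 10, 100):
    print(f"{shards:3d} fragments: {attack_threshold(shards):.1%} of total power "
          f"controls one fragment")
```

Under these assumptions, every tenfold increase in the number of fragments makes attacking any one of them ten times cheaper.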

All this leads us to the following. If you fragment the cryptocurrency ledger, then who can guarantee that the fragment that you have is really part of the ledger? In the end, you get a network of nodes none of which can be trusted. What happens if some part of the ledger is “in the hands” of an attacker?

Three counterarguments are usually put forward here:

  1. Logistics (no one will be able to take over all the nodes on which a certain fragment is stored).
  2. Statistics (the probability that all nodes with a certain fragment will be on the same network is very small).
  3. Encryption (cryptographic protection prevents this problem).

The problem is that people tend to greatly underestimate the size and strength of attackers' networks in the world of cryptocurrencies. For example, at the beginning of 2019 there was a 51% attack on Ethereum Classic, which could have cost the attackers $55 million (or at least half of this amount, given the decline in cryptocurrency prices). The attackers later returned $100 thousand worth of stolen tokens. It turned out that it was just a test or a warning. It seems that this attack cost the attacker no more than $1 million, because presumably he had approximately that amount in his wallet.

If someone can obtain a huge amount of processing power at a relatively low price (compared with the money that can be stolen), this opens up incredible opportunities. Items 1 and 2 can be defeated by a DDoS attack on the "real" nodes, followed by taking their place. As for item 3, I propose to turn to history.

On the eDonkey network, a file's name was not its key. Any member could change the file name. Files were addressed using cryptographic hashes. Suppose you clicked on a verified link, for example, to download a new blockbuster. You would get the hash and the file size.
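
As an illustration, here is a small sketch that pulls the name, size, and hash out of an ed2k file link of the form ed2k://|file|name|size|hash|/. The file names and the hash value below are made up, and the Ed2kFile type and parse_ed2k_link helper are hypothetical names for this example.

```python
from typing import NamedTuple

class Ed2kFile(NamedTuple):
    name: str
    size: int
    md4_hex: str

    @property
    def identity(self):
        """What the network actually used to address the file."""
        return (self.size, self.md4_hex)

def parse_ed2k_link(link):
    """Split an 'ed2k://|file|name|size|hash|/' link into its fields."""
    prefix = "ed2k://|file|"
    if not link.startswith(prefix):
        raise ValueError("not an ed2k file link")
    fields = link[len(prefix):].split("|")
    if len(fields) < 3:
        raise ValueError("link is missing name, size or hash")
    name, size, md4_hex = fields[0], fields[1], fields[2]
    return Ed2kFile(name, int(size), md4_hex.lower())

# Two links with different names but the same (made-up) size and hash.
a = parse_ed2k_link("ed2k://|file|blockbuster.avi|734003200|0123456789abcdef0123456789abcdef|/")
b = parse_ed2k_link("ed2k://|file|holiday_video.avi|734003200|0123456789ABCDEF0123456789ABCDEF|/")
print(a.identity == b.identity)  # same file as far as the network is concerned
```

The name is only a label. Whatever data matches the size and hash is, for the network, the file.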

After a few hours or days the movie finished downloading. You made popcorn in the microwave, sat down on the sofa with your significant other, started the movie and... immediately closed it, because it turned out to be hardcore porn.

After lengthy explanations to your other half, perhaps, you start figuring out what happened. And here is what happened. The MD4 hashing algorithm was used to address files. You cannot decode a hash back into data, because information is lost, but with some effort you can find data that has the same hash and size. Breaking a hash by exhaustive search is hard; it takes a lot of computing power and time. But breaking MD4 does not require exhaustive search: a few years ago a vulnerability was found in the algorithm that made it possible to construct data for a specific hash.
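
The principle can be demonstrated with a toy example. This is not the MD4 attack itself: the sketch uses a deliberately weakened stand-in (only the first 16 bits of SHA-256) so that plain brute force finishes in a moment. The names weak_hash and forge and the sample data are assumptions made for the illustration.

```python
import hashlib
from itertools import count

def weak_hash(data, bits=16):
    """Deliberately weak stand-in for a broken hash: top `bits` bits of SHA-256."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - bits)

def forge(original, bits=16):
    """Find different data with the same length and the same weak hash."""
    target = weak_hash(original, bits)
    for n in count():
        # Pad the counter so the fake has the same size as the original.
        candidate = str(n).encode().rjust(len(original), b"#")
        if candidate != original and weak_hash(candidate, bits) == target:
            return candidate

original = b"the real blockbuster.avi"
fake = forge(original)
print(fake, weak_hash(fake) == weak_hash(original))
```

A client that checks only the hash and the size accepts the fake as the requested file. With a sound hash this search is infeasible, and a weakness in the algorithm is what makes it cheap.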

Breaking modern hashes requires more processing power than before. But, as we have already seen, the computing power available to attackers is simply colossal and, it seems, much easier to acquire than one might think. For security, you need more than a couple of cryptographic hashes.

I am not saying it is impossible to scale a blockchain network. It can be done by transforming it and sacrificing some of the properties mentioned above. The result will be something other than the world's most heavily replicated distributed ledger, of which everyone has a copy. But then why is it needed at all? It is much easier to solve all these problems with traditional databases or private blockchains, without spending a huge amount of electricity on proof of work.

*: I was one of those people who provided the ed2k community with a service that BitTorrent simply didn't need: server lists. Torrent files were an ugly intermediate step, but they included the path to the trackers. ed2k:// links did not; at that time almost everyone had dynamic IP addresses, and DNS support in the eDonkey protocol was severely lacking, which made it impossible to even enter the eDonkey network without an up-to-date list of servers.