Why exchanges collapse during high volume

Steve Sokolowski
Posts: 3114
Joined: Wed Aug 27, 2014 3:27 pm
Location: State College, PA

Postby Steve Sokolowski » Thu Jun 15, 2017 5:59 pm

The cryptocurrency industry is collapsing under the recent media attention. As people flood trading sites, join mining pools, and start new projects, the infrastructure has grown more fragile than many people believe. In this post, I'll argue that the bubble cycle stems, in part, from rational decisions that businesses make in response to market forces.

It's no secret that businesses in general, and exchanges in particular, are reaching their breaking points. Coinbase, one of the leading exchanges, is so overloaded with customer service requests that I have not heard of a single person receiving a reply to a support ticket at all. This cost them tens of thousands of dollars in potential business from our customers alone, solely because they couldn't reply to an E-Mail. Poloniex recently added 12 servers to handle increased load, and issued a warning to traders that it may not be able to maintain service quality in the future. Novaexchange disabled its API during one period of high volume, which led to our diverting $100,000 of trades to another site.

The networks themselves are completely unprepared for the demand. Bitcoin is useless, with people unable to afford to use it anymore and our payout thresholds 250 times higher than they were just 6 months ago. Ethereum, while poised to grow its blocksizes slowly due to its superior design, currently finds itself limited during initial coin offerings. Litecoin has seen a dramatic increase in people paying transaction fees, which were close to zero until recently. Networks like Gamecredits and Bitconnectcoin see large blocks regularly.

In our case, we first encountered performance issues in November 2014, when database triggers imposed a 6GH/s capacity limit on the pool. Through strategies like removing the triggers, parallelizing operations, rethinking coin assignment algorithms, and reducing the amount of data that is stored, we have been able to increase performance by a factor of roughly 170, to today's theoretical 1.05TH/s limit. I suspect that if we rewrote this code in C++ and performed additional parallelization over a few years, we could improve performance up to at least 10,000x the original system. At our current rate of effort, however, we won't reach that capacity for years, and therefore every time we spend another two weeks improving performance, more hashrate appears and the system struggles again.
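
As a quick check on the figures above, the ratio of the two capacity limits and the hashrate implied by a 10,000x improvement can be worked out directly. This is only arithmetic on the numbers quoted in the post; the variable names are illustrative.

```python
# Capacity figures quoted in the post, in hashes per second.
GH = 10**9
TH = 10**12

original_limit = 6 * GH        # November 2014, trigger-bound limit
current_limit = 1_050 * GH     # today's theoretical 1.05 TH/s limit

# Improvement so far: 175, which the post rounds to about 170x.
improvement = current_limit / original_limit

# Capacity implied by the suggested 10,000x gain over the original system.
target_limit = 10_000 * original_limit

print(f"improvement so far: {improvement:.0f}x")
print(f"10,000x target: {target_limit / TH:.0f} TH/s")
print(f"remaining headroom: {target_limit / current_limit:.1f}x")
```

In other words, the 10,000x figure corresponds to about 60 TH/s, a little over 57 times today's limit.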

Some customers have questioned why we haven't added new features in a long time and why 80% of our time is spent improving performance. There are two causes that are applicable across the entire industry and which should be discussed independently: performance improvements cause bugs, and there is too much uncertainty caused by the blocksize debate to invest more effort.

On the technical side, the most important reason that performance improvements are problematic is that they always cause bugs, and the bugs in mining are particularly difficult to predict. In most software, bug testing is relatively simple: you write automated tests that run every time you make changes, and add tests every time you add a new feature. These tests will identify problems with the code before a release. Some testing also involves users trying out features in the software, but usually this is in a controlled environment. Games are more difficult to test, but with games it is still possible to spend a small amount and buy a number of different computers to see whether different system configurations affect the software.

In cryptocurrency, there are some features that are impossible to test outside of production. For example, you can find a block on the Litecoin testnet, but you never really know whether the code is right until a block is found on the production network. Most of the time, a few initial blocks are lost, at a cost of $800 each, due to an unforeseen circumstance. When we start SHA-256 mining, we expect that the first bitcoin block will be lost, at an enormous cost, because you can't test finding main network blocks. In fact, when Segregated Witness went live on the Litecoin network, I'm told that a bug caused one of the largest mining pools to orphan many blocks.

Theoretically, one could purchase mining equipment and mine on those networks to confirm the correctness of the code, which would not cost any less but would at least improve the customer experience. However, the impact of market conditions is impossible to test, which is why we have problems after releases and why exchanges are always scrambling to fix bugs.

Consider what happens if we test the system with "fake" miners, as we do now. While we can find fake shares and execute fake sells, they don't affect the prices on exchanges. Upon deployment, exchanges get hit with sells and buys, leading automated bots to react and create unexpected conditions. In one case, we found that bots would place small orders one satoshi higher than a large spread, causing inaccurate pricing. In another, we found that nobody was placing buy orders in the Bitconnectcoin market on Novaexchange, causing us to use up all the liquidity and crash the market. We had to ask people to manually take advantage of the huge arbitrage opportunity between Novaexchange and the "official" exchange, which does not have an API (there are unfortunately still times when people are not placing buy orders at Novaexchange and the arbitrage is still insufficient). Then, there are miners who buy large amounts of hashrate from external systems that don't comply with interface standards. We have encountered hundreds of these issues over the past three years, all of which are untestable in a simulated environment.
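
The one-satoshi bot orders described above are one reason a coin cannot be priced from the best bid alone. A minimal sketch of one common mitigation follows: walk the order book and compute the average price that a sell of a given size would actually obtain. It is an illustration only, not the pool's production code; the function name `effective_sell_price` and the `(price, quantity)` book layout are assumptions made for the example.

```python
from decimal import Decimal


def effective_sell_price(bids, amount):
    """Average price obtained by selling `amount` coins into `bids`.

    bids: list of (price, quantity) pairs, best (highest) price first.
    Returns None when the book lacks the liquidity to fill the order.
    """
    amount = Decimal(amount)
    if amount <= 0:
        raise ValueError("amount must be positive")

    remaining = amount
    proceeds = Decimal(0)
    for price, quantity in bids:
        # Sell as much as this price level can absorb.
        take = min(remaining, Decimal(quantity))
        proceeds += take * Decimal(price)
        remaining -= take
        if remaining == 0:
            return proceeds / amount
    return None  # order would exhaust the book


# Hypothetical book: a tiny bot order sits just above a wide spread.
book = [("0.00010001", "1"), ("0.00009000", "1000")]

top_of_book = Decimal(book[0][0])
realistic = effective_sell_price(book, "100")
print(top_of_book, realistic)
```

With this made-up book, the best bid suggests a price about 11% higher than what a 100-coin sell would really fetch, and an order larger than the book returns `None` instead of a misleading number.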

But after all those problems are identified, we almost always arrive at a stable system. Soon thereafter, performance issues hit as miners are drawn to the stability, so changes need to be made. Perhaps the changes unknowingly eliminate a function that works around the bug in the third-party interface, because the function showed up in the profiler as being a huge resource user and it was not obvious why it was there. The new release improves performance but reintroduces the incompatibility. The next step is to add a fix that restores compatibility while retaining the other performance improvements. The system is finally stable again at a higher performance level, until more customers arrive and the cycle repeats with other bugs.

Hardware is rarely a solution to performance issues, and when it is a solution, software can almost always produce better gains than hardware. One of the unrecognized issues in computer science is that while people continually express concern that Moore's Law is coming to an end, the improvements available in software are much greater. Once the chip makers realize this and more resources are assigned to software improvements, we will see even more dramatic gains in speed than have been coming through hardware. The instant startup time of Windows 10, which contains far more code and features than the slow Windows 95, is an example of this potential.

Software performance is the reason why the cryptocurrency industry is on a precipice right now. All of the current issues (quadratic hashing in blockchains, overloaded pools, exchanges that crash, and so on) are the result of software. Fortunately, as I showed above, unbelievable gains of 100x or 1000x are available if enough effort is put into improving software, and improving software performance is not a particularly novel or unknown field. Most of the gains to be had in the industry's software are simple fixes to common problems. And even if bugs occur from the software fixes, correcting them is much easier than having the innovative idea necessary to write the original code. No scientific breakthroughs are required to improve performance of almost everything out there.

So if there is so much performance to be gained, why do sites still struggle under the load? After all, if I were able to work 70 hours per week on this project, I could easily get ahead of the issues and stay 20% or 50% higher than current capacity. Exchanges could hire more developers to make optimizations so that there is always 5x or more "surge capacity" on days of high volume. Similarly, Coinbase could earn huge profit increases if they simply trained a few low-wage customer service representatives to handle the most basic questions like "how do I add a credit card" and delete duplicates, which likely comprise 95% of their support backlog.

The reason we don't, and they probably don't, is uncertainty, the second cause. Uncertainty is one of the critical factors that prevents growth in the broader economy. Risk-averse businesses don't invest when uncertainty means that growth potential could be stunted in the future by unexpected events.

The primary causes of uncertainty today result from the blocksize debate and the bubble cycle itself. The blocksize debate causes uncertainty because it is no longer obvious whether the industry will be able to expand. I, for example, see Bitcoin's only chance at survival resting on an unconditional hard fork to bigger blocks. If either the User Activated Soft Fork or the SegWit 2x proposal wins, then Bitcoin will not have enough capacity, and Ethereum or Litecoin will lead, most likely permanently. (Oddly, as judged by the ridiculously low price of litecoins, few seem to recognize that Litecoin will still be more capable than Bitcoin if either of these two proposals is adopted.) Until at least August 1, we don't want to make any serious investments in infrastructure because we don't even know which algorithm will be the most profitable to mine.

Over the longer term, if we're going to seek the half million dollars necessary for making a play at becoming the largest mining pool in the world, or to try to find a buyer for our trading software to create a seamless multi-currency wallet, then we had better be pretty certain that this industry is going to be a reputable sector with room to grow. I refuse to spend money while we are still dependent upon childish developers and blind CEOs who won't act upon the obvious fact that there is enormous suppressed demand for someone willing to take a stand and hard fork the Bitcoin blockchain to Unlimited-sized blocks on August 1. If you want to grow the sector and make huge profits for yourself (i.e. by allowing other businesses to become your customers), be brave and allow people like us to move forward by removing the primary contributor to uncertainty on one definite date. If you don't want to make money and would rather be "right," then feel free to support the user-activated soft fork and drive people away while they are even more uncertain whether a chain reorganization will wipe out their life savings for years to come. Or, you can watch the 2MB blocks fill up the day after SegWit 2x is activated and continue fighting for years to come, with three, four, or more forks to show for it.

In the meantime, I continue to sell every bitcoin I earn, invest my profits into tech stocks, work the 30 hours a week I can to keep the current pool running smoothly, and wait. After all, I don't recall reading that Nvidia plans to limit the size of roads for its self-driving cars.

The bubble cycle itself is another source of uncertainty. It doesn't make sense for Poloniex to pay people to improve performance by 5x if the current capacity is sufficient for the baseline during the five years after the bubble. It is easy for companies to go bankrupt by overinvesting during good times and then having useless capacity later, and the exchange is better off crashing during periods of heavy volume. In the case of mining, it's difficult to argue that scrypt profitability isn't going to drop from the current 11 cents to at least its historical breakeven rate of 2 cents. In fact, it's likely profitability will be even worse than that, because there is a delay between ordering and bringing miners online, so hashrate will continue to increase with even more efficient miners coming online for some time during the crash, probably driving profitability to 0.5 cents or lower.

The current bubble is strange because it has lasted so much longer than previous bubbles, which increases the uncertainty even more. To me, the lengthy buildup seems to suggest that the fall will be even more lengthy. I'm surprised how many people I talk to seem to have forgotten what it was like in 2014. Imagine if the next down cycle lasts five or even six years, rather than three, and it takes two years just to reach the bottom. This is why it's necessary to be prepared, exercise caution, and not overinvest.

In conclusion, the difficulties customers are experiencing with cryptocurrency businesses are a direct result of uncertainty. The uncertainty is caused primarily by the blocksize debate. Software improvements are available to improve capacity issues, but businesses are making rational decisions to limit investment until Bitcoin's blocksize debate is permanently resolved, or until Bitcoin fails and another currency ascends due to not resolving it. When you try to buy bitcoins and your exchange is down, it's because they didn't spend the money necessary to handle the surges they know will happen. The industry is in a holding pattern, with cash sitting on the sidelines and investors (smartly) unwilling to move forward until they can see if cryptocurrency will be permitted to expand. By holding out, they can expand into other sectors instead of bitcoin should the blocksize debate roll on.

Until uncertainty declines, cryptocurrency businesses will continue to invest just enough to improve their software to handle the minimum load, and customers will continue to be frustrated by their unwillingness to expand to meet surge conditions or future demand. The companies and developers that survive the coming months or years will be those that focus on the same things that have always mattered to successful businesses: exceptional customer service, listening to what people actually want, and budgeting wisely to survive until the day that certainty arrives.
Last edited by Steve Sokolowski on Thu Jun 15, 2017 6:49 pm, edited 3 times in total.
lilbob
Posts: 21
Joined: Mon May 22, 2017 8:49 am

Re: Why exchanges collapse during high volume

Postby lilbob » Thu Jun 15, 2017 6:30 pm

Swiss banks are entering into discussions on how to proceed to make crypto funds. No worries buddy, on the potential future, your software concepts are most wanted... A little more time for them to start teaching it in school would be great. But at the end of the day the new models begin arriving soon. I propose to stay with what is working best and at a great efficiency, people are only now waking up, sales of old machines and orbs are everywhere on the web. Prohashing atm provides possibly the most important access for these people. Recent server update prepared you, stick with it :D
mmfiore
Posts: 12
Joined: Wed Apr 12, 2017 10:42 am

Re: Why exchanges collapse during high volume

Postby mmfiore » Thu Jun 15, 2017 9:13 pm

Steve I really enjoy your posts. They are very well reasoned and very thought provoking. Keep up the good work!
frphm
Posts: 31
Joined: Thu Jun 01, 2017 1:39 am

Re: Why exchanges collapse during high volume

Postby frphm » Thu Jun 15, 2017 11:21 pm

Steve Sokolowski wrote:I'm surprised how many people I talk to seem to have forgotten what it was like in 2014.


I'm still confused by this bubble you speak of. I was under the impression that the 2013-2014 crash was precipitated by the Mt. Gox fiasco, which eroded any last bit of confidence anyone had in Bitcoin. Seeing as it is now three years later and the industry is finally seeing a great deal of media attention and innovation...I just don't understand what would cause a crash. Crypto is worthless - it's all code, it's just another commodity to trade. Why would all the traders just give up and jump ship for good...simply because of a dip like the one we have seen the past couple days?

I see a market that is maturing, and the growth we are experiencing is just a byproduct, not a bubble. I feel like it would take something pretty devastating to cause a crash similar to 2013-2014. If August 1st is expected to be such an event, what's stopping those traders from shifting to a different coin? I understand your points on fragile infrastructure, but is the market really fragile enough to be taken down by one coin? In a commodity market?

The housing bubble of 2007-2008 was caused by reckless lending, and consumers running out of money. That type of situation really doesn't apply here. The dotcom bubble was caused by the frenzy to take advantage of the Internet, and then competition killed off the weak companies, and investors were reminded that bad companies don't make money (go figure). Again, it doesn't really apply, because the values the investors were paying relied on companies with products or services. Crypto is an empty commodity. We aren't trading stocks, value isn't determined by corporate behavior, it's determined by demand alone.

As always, feel free to correct me if I'm off-base.
JKDReaper
Posts: 101
Joined: Fri Mar 31, 2017 11:17 am

Re: Why exchanges collapse during high volume

Postby JKDReaper » Thu Jun 15, 2017 11:45 pm


frphm wrote:I see a market that is maturing, and the growth we are experiencing is just a byproduct, not a bubble. I feel like it would take something pretty devastating to cause a crash similar to 2013-2014. If August 1st is expected to be such an event, what's stopping those traders from shifting to a different coin? I understand your points on fragile infrastructure, but is the market really fragile enough to be taken down by one coin? In a commodity market?


You, me, this forum, etc...all see the vast possibilities and benefits crypto can offer in a huge variety of ways. I posted the other day about this and for myself...I think what could cause a crash/collapse, is just that...a crash/collapse. As you said, the news and such are covering crypto in a positive way more than ever before. This is bringing in an influx of new investors, miners, etc...and a majority of these new people are catching the "next big thing", like you mentioned with the tech collapse. They don't truly understand yet the possibilities or nuances of how this all works. So, if these new people start losing money left and right due to a decline in the market (remember, many many of these new people have never invested in stocks or anything besides their company IRA), and suddenly the coverage starts being a little less "bright" and positive. What does this do? At a minimum this slows any recovery, but quite possibly increases the decline. Which leads to less positive press...which increases the decline...which leads to negative press...which.......it could be a chain reaction.

So to avoid a crash, we have to do just that...avoid a crash. I don't mean the type of decline we've had past few days...most reasonable people will understand some small bumps and dips...but if it continues too long or if we get a large drop quickly...the ball could start rolling down hill and just keep picking up speed.

Is there an easy solution to this? Honestly, no...sure we can blog, tweet, call reporters or whatever...but to not only maintain, but to GROW this world, the masses need a reason to not only accept it, but to embrace it.

I used this example the other day...compare using 2 types of credit cards to crypto/standard economic ideals. If you could pick between 2 credit cards...1 that was well established, but just "basic", or this new one that is accepted all the same places, same rates, etc...but also offered cash back (verified funds in an account that could be seen by everyone) and rewarded you for your purchases with free ball game tickets and electronics...which would you choose? This is what the crypto world needs to offer the masses, in my humble opinion at least, in order for it to take hold on a mass scale.

Any who, just my 2 sats....
