How, and with what tools, should you analyze the relationships between cryptocurrency pairs?

cryptosensors.info

How do you analyze the relationships between cryptocurrencies? What laws govern their movement? What is distinctive about the way crypto prices move?

Contents:

  1. Introduction.
  2. Is everything all right with your distribution?
  3. Mining data about cryptocurrencies. Active crypto pairs.
  4. How is the crypto market moving?
  5. A nonparametric tool for analyzing relationships between cryptocurrency pairs.
  6. Conclusion.

Introduction

Not so long ago, our world gained another phenomenon: cryptocurrencies. At first, they were treated as a tool of interest only to geeks. Today it is obvious that “crypto” is becoming a backbone. Its air of mystery and its frightening volatility attract more and more people, from speculators to lovers of conspiracy theories. There is a natural desire to understand how crypto moves. The urge to organize and put everything in its place forces you to turn to data processing tools. Very often these tools are used, to put it mildly, not quite correctly.

For example, in the flow of information about cryptocurrency markets, a lot of attention is paid to the relationships between cryptocurrencies. Which cryptocurrencies rise or fall at the same time (move in the same direction)? And, conversely, which ones move in opposite directions? The answers to these questions can enrich the investor in the truest sense of the word. This is a classic task of correlation analysis. It would seem simple: take the cryptocurrency quotes, open a spreadsheet package, press “Go”, and it is done. But I am reminded of a phrase from an advertisement: “not all yoghurts are equally useful.”

The devil, as they say, is in the details. The fact is that there are different methods for calculating correlation. Spreadsheets use Pearson’s correlation, although they refer to it simply as “correlation.” But there is a catch. Pearson’s correlation can be relied on only if the data we are trying to analyze are normally distributed.

Is everything all right with your distribution?

Let’s recall what the normal distribution is. First some boring formulas, then an entertaining example. A one-dimensional random variable that follows a normal distribution has the following probability density function:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Informally, it is also called the Gaussian function ;) It has only two parameters. The first, µ, is the mean (mathematical expectation), which is also the median and the mode of the distribution. The second, σ, is the standard deviation of the distribution (σ² is its variance). The graph of the probability density function is the familiar bell-shaped curve.

To describe the normal distribution without formulas, consider, for example, the height of the people around us. Think of your friends, acquaintances, work colleagues. Are there many giants among them? Are there many people of extremely small stature? The most common value is likely to be “average height”.

The normal distribution has another remarkable property. We measure three standard deviations down from the average height. We measure three standard deviations up from the average height. 99.73% of your subjects will be within this range. In other words, the lion’s share of the sample is within the “three sigma” range.
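The “three sigma” figure is easy to check numerically. The following is a minimal sketch using scipy’s standard normal distribution; the helper name `prob_within` is introduced here purely for illustration.

```python
from scipy.stats import norm


def prob_within(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return norm.cdf(k) - norm.cdf(-k)


# Share of a normally distributed sample inside 1, 2 and 3 sigma.
for k in (1, 2, 3):
    print(f"within {k} sigma: {prob_within(k):.2%}")
```

The result does not depend on µ or σ, which is why the rule can be stated once for every normal distribution.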

Imagine now that we are in the fabled Middle Ages. In addition to people, other creatures live on earth: giants and gnomes. Dragons hover in the sky. Elves lurk in the woods. Let’s form a sample of the heights of these fantastic creatures. The histogram of such a sample has heavy tails.

Why? It’s simple – a meeting with a gnome or a giant is not so rare. The distribution of the height of the inhabitants of the fairy forest differs significantly from the normal one.
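The fairy-forest picture can be imitated with a small simulation. This is only a sketch: the population shares (90% people, 5% gnomes, 5% giants) and the heights in centimetres are invented for illustration and do not come from any data set. It compares the share of observations outside three sigma and the excess kurtosis (zero for a normal distribution) for the two samples.

```python
import numpy as np
from scipy.stats import kurtosis


def share_outside_three_sigma(sample: np.ndarray) -> float:
    """Fraction of observations further than 3 standard deviations from the sample mean."""
    mean, std = sample.mean(), sample.std()
    return float(np.mean(np.abs(sample - mean) > 3 * std))


rng = np.random.default_rng(42)
n = 100_000

# Ordinary world: heights of people only (hypothetical mean 170 cm, sigma 8 cm).
humans_only = rng.normal(loc=170.0, scale=8.0, size=n)

# Hypothetical fairy forest: 90% people, 5% gnomes (100 cm), 5% giants (260 cm).
centers = rng.choice([170.0, 100.0, 260.0], size=n, p=[0.90, 0.05, 0.05])
fairy_forest = rng.normal(loc=centers, scale=8.0)

for name, sample in (("humans only", humans_only), ("fairy forest", fairy_forest)):
    print(
        f"{name}: outside 3 sigma = {share_outside_three_sigma(sample):.2%}, "
        f"excess kurtosis = {kurtosis(sample):.2f}"
    )
```

With these assumed numbers, the fairy-forest sample puts far more mass outside the three-sigma range than a normal sample would, which is exactly what “heavy tails” means.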

It is especially worth noting that if a random variable is the combined result of many random factors that are practically independent of each other, its behavior is approximately described by a normal distribution. The normal distribution is well understood, and a lot of data processing techniques are based on it. The Pearson correlation, which requires normality, is only one of a thousand such tools. And here a fundamental question arises: what is the distribution of price changes in cryptocurrencies?

Mining data about cryptocurrencies. Active crypto pairs.

We will use the Binance crypto exchange as the source of data on cryptocurrency pairs. The processing is done in Python 3.7.7 with the scipy, numpy, pandas and plotly libraries.
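As an illustration of how daily candles for one pair might be loaded, here is a minimal sketch. It assumes Binance’s public REST endpoint `/api/v3/klines`; the article does not say which endpoint or client library was actually used, and the function names and column labels below are chosen here for illustration.

```python
import json
import urllib.parse
import urllib.request

import pandas as pd

# Field order of one kline row as returned by the endpoint; the labels are ours.
KLINE_COLUMNS = [
    "open_time", "open", "high", "low", "close", "volume",
    "close_time", "quote_volume", "trades",
    "taker_buy_base", "taker_buy_quote", "ignore",
]


def klines_to_frame(rows: list) -> pd.DataFrame:
    """Convert raw kline rows into a DataFrame indexed by candle open time."""
    frame = pd.DataFrame(rows, columns=KLINE_COLUMNS)
    frame["open_time"] = pd.to_datetime(frame["open_time"], unit="ms")
    for col in ("open", "high", "low", "close", "volume"):
        frame[col] = frame[col].astype(float)  # prices arrive as strings
    frame["trades"] = frame["trades"].astype(int)
    return frame.set_index("open_time")[["open", "high", "low", "close", "volume", "trades"]]


def fetch_daily_klines(symbol: str, days: int = 90) -> pd.DataFrame:
    """Request the last `days` daily candles for one symbol (network access required)."""
    query = urllib.parse.urlencode({"symbol": symbol, "interval": "1d", "limit": days})
    url = f"https://api.binance.com/api/v3/klines?{query}"
    with urllib.request.urlopen(url, timeout=10) as response:
        return klines_to_frame(json.load(response))


# Example call (not executed here because it needs network access):
# btc = fetch_daily_klines("BTCUSDT", days=90)
```

Keeping the parsing step separate from the network call makes it possible to check the conversion on stored or synthetic rows.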

At the time of this writing, we have historical data on 600 cryptocurrency pairs. We will consider daily data, retrieved through the Binance API. The sample size is 90 days. The indicator under study is the following:

Growth_rate_Close = Close (current day) / Close (previous day)

That is, if this ratio is, for example, 1.015, the closing price has increased by 1.5%. If the value is, for example, 0.98, the price has dropped by 2%. Thus, we analyze not the absolute closing prices of cryptocurrency pairs, but their day-over-day relative changes.
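The indicator can be computed in one line with pandas. The sketch below uses three made-up closing prices chosen so that the ratios reproduce the 1.015 and 0.98 examples from the text; the function name `growth_rate` is introduced here for illustration.

```python
import pandas as pd


def growth_rate(close: pd.Series) -> pd.Series:
    """Ratio of each closing price to the previous day's closing price."""
    # The first day has no previous close, so it is dropped.
    return (close / close.shift(1)).dropna()


# Made-up closing prices for three consecutive days.
close = pd.Series([100.0, 101.5, 99.47], name="close")
rates = growth_rate(close)
print(rates.round(4).tolist())
```

A sample of 90 daily closes therefore yields 89 growth-rate values.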

Note that not all crypto pairs are actively traded. Let’s select the most liquid ones. There are two possible criteria for activity: the number of trades or the trading volume. Let’s choose the number of trades. If you sort all cryptocurrency pairs by the number of trades and plot the result on a chart, you get the following.

Let’s take the 35 most actively traded cryptocurrency pairs. They are shown in red on the graph, while the data for all cryptocurrency pairs are shown in blue. The graph shows that the most active pairs make up only about 6% of the entire list:

35 / 600 ≈ 0.0583

Yes, that’s a fact. These are the realities of the crypto market.
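The selection step can be sketched as follows. The pair names and trade counts below are hypothetical placeholders, and `most_active` is a helper name introduced here; in the study itself the ranking covered all 600 pairs, from which the top 35 were kept.

```python
import pandas as pd


def most_active(trade_counts: pd.Series, top_n: int = 35) -> pd.Series:
    """Return the top_n pairs with the largest number of trades, in descending order."""
    return trade_counts.nlargest(top_n)


# Hypothetical total number of trades per pair over the sample period.
trade_counts = pd.Series(
    {"aaausdt": 120_000, "bbbbtc": 950, "cccusdt": 48_000, "dddeth": 15, "eeeusdt": 7_300}
)

top = most_active(trade_counts, top_n=2)
share = len(top) / len(trade_counts)
print(top.index.tolist(), f"share of the list: {share:.1%}")

# The share quoted in the text: 35 pairs out of 600.
print(f"{35 / 600:.4f}")
```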

How is the crypto market moving?

What does the histogram of price changes for these 35 most active pairs look like? How is the crypto market moving? Something like this:

Conformity to the normal distribution is checked both visually, by comparing the histogram with the normal distribution curve, and by calculating test statistics. How close is this picture to a normal distribution? Visually? Doubtful …

Additionally, we will run the Shapiro-Wilk test. This test is used to determine whether a sample is consistent with a normal distribution. The following results were obtained:

pair                Shapiro-Wilk W        p-value
pair_adausdt_1d     0.9671995043754578    0.02252993
pair_ankrbtc_1d     0.9147272706031799    0.00001967
pair_ankrusdt_1d    0.943658173084259     0.00069367
pair_bttbnb_1d      0.9101758599281311    0.00001190
pair_bttbusd_1d     0.9795295000076294    0.16745451
pair_btttrx_1d      0.9735089540481567    0.06264874
pair_bttusdt_1d     0.9781306385993958    0.13348551
pair_cocosusdt_1d   0.9613599181175232    0.00900380
pair_denteth_1d     0.9790965914726257    0.15614410
pair_dentusdt_1d    0.9696356058120728    0.03333539
pair_dogeusdt_1d    0.9118067026138306    0.00001422
pair_hotbtc_1d      0.8912015557289124    0.00000167
pair_hoteth_1d      0.9076758027076721    0.00000907
pair_hotusdt_1d     0.9536965489387512    0.00285049
pair_iostbtc_1d     0.9279756546020508    0.00009242
pair_iostusdt_1d    0.9479432702064514    0.00125220
pair_keyusdt_1d     0.6302748918533325    0.00000000
pair_maticusdt_1d   0.9804877638816833    0.19529814
pair_mblusdt_1d     0.9175736904144287    0.00002714
pair_mftusdt_1d     0.8282192349433899    0.00000001
pair_npxseth_1d     0.8064180612564087    0.00000000
pair_npxsusdt_1d    0.7406529188156128    0.00000000
pair_onebtc_1d      0.9784880876541138    0.14147447
pair_oneusdt_1d     0.9795064926147461    0.16683325
pair_scbtc_1d       0.9732644557952881    0.06019074
pair_trxbtc_1d      0.8679629564285278    0.00000019
pair_trxusdt_1d     0.9682722687721252    0.02675677
pair_vetbtc_1d      0.986322283744812     0.47012007
pair_vetusdt_1d     0.9671380519866943    0.02230977
pair_vthousdt_1d    0.9375638961791992    0.00030876
pair_winbnb_1d      0.9504895210266113    0.00179445
pair_wintrx_1d      0.9059931635856628    0.00000758
pair_winusdc_1d     0.9044052362442017    0.00000641
pair_winusdt_1d     0.908967137336731     0.00001043
pair_zilusdt_1d     0.9742823839187622    0.07111172
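Statistics of this kind can be obtained with `scipy.stats.shapiro`. The sketch below is not the author’s script: it runs the test on two synthetic 90-point samples, an idealized normal one built from normal quantiles and the same sample with two extreme “shock” days substituted. The 0.05 threshold, the sample values and the helper name `normality_report` are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import norm, shapiro

ALPHA = 0.05  # assumed significance level


def normality_report(sample) -> dict:
    """Run the Shapiro-Wilk test and flag whether normality is rejected at ALPHA."""
    statistic, p_value = shapiro(sample)
    return {
        "W": float(statistic),
        "p": float(p_value),
        "reject_normality": bool(p_value < ALPHA),
    }


# Idealized normal sample: 90 evenly spaced normal quantiles around a growth rate of 1.0.
ideal = 1.0 + 0.03 * norm.ppf((np.arange(1, 91) - 0.5) / 90)

# Same sample, but with two extreme daily moves substituted (synthetic heavy tails).
with_shocks = ideal.copy()
with_shocks[[0, -1]] = [0.60, 1.70]

for name, sample in (("ideal", ideal), ("with shocks", with_shocks)):
    print(name, normality_report(sample))
```

A p-value below the threshold means normality is rejected; a p-value above it only means the test found no evidence against normality, not that normality is proven.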

Only 10 of the 35 crypto pairs have a p-value above the significance level of the Shapiro-Wilk test (p > 0.05). This means that, technically, for these 10 pairs we cannot reject the hypothesis that the samples are normally distributed, while for the other 25 we can. Here’s what the histogram of those 10 crypto pairs looks like:

To what extent does their appearance correspond to the normal distribution? Even though the Shapiro-Wilk test does not reject normality for them, it is highly doubtful. Heavy tails are visible on both the right and the left. Remember the gnomes and giants ;)

Cryptocurrency pairs do not live by the laws of the normal distribution! This fact has an important consequence: we need analysis tools that do not depend on the type of distribution. We are talking about nonparametric statistics. And studying relationships is possible there too!

A nonparametric tool for analyzing relationships between cryptocurrency pairs

The nonparametric analogue of the Pearson coefficient is the Spearman coefficient. It belongs to the methods of rank correlation, but rank correlation is applicable to real-valued variables as well: the values are simply replaced by their ranks. Calculating the Spearman coefficient for two cryptocurrency pairs, we get two values: the coefficient itself and a p-value, which allows us to assess the statistical significance of the coefficient.
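A pairwise scan of this kind might look like the sketch below, which uses `scipy.stats.spearmanr`. The column names and the synthetic growth rates are invented for illustration, as are the function name `significant_pairs` and the 0.05 threshold; the article’s own criteria for an “interesting” relationship are not reproduced here.

```python
from itertools import combinations

import numpy as np
import pandas as pd
from scipy.stats import spearmanr


def significant_pairs(growth: pd.DataFrame, alpha: float = 0.05) -> pd.DataFrame:
    """Spearman coefficient and p-value for every pair of columns with p < alpha."""
    rows = []
    for a, b in combinations(growth.columns, 2):
        rho, p_value = spearmanr(growth[a], growth[b])
        if p_value < alpha:
            rows.append({"pair_a": a, "pair_b": b, "rho": float(rho), "p": float(p_value)})
    return pd.DataFrame(rows, columns=["pair_a", "pair_b", "rho", "p"])


# Synthetic daily growth rates for three hypothetical pairs over 90 days.
rng = np.random.default_rng(7)
base = rng.normal(loc=1.0, scale=0.03, size=90)
growth = pd.DataFrame(
    {
        "aaausdt": base,
        "bbbusdt": base ** 3,  # monotone but nonlinear function of aaausdt
        "cccusdt": rng.normal(loc=1.0, scale=0.03, size=90),  # unrelated series
    }
)

result = significant_pairs(growth)
print(result)
```

Because Spearman’s coefficient works on ranks, the monotone but nonlinear link between the first two columns is reported as a perfect relationship, regardless of how the values themselves are distributed.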

At the time of the research, we identified only 35 pairs of instruments with an interesting and statistically significant relationship. Why is the term “interesting” used here, rather than “strong” or “weak”? Because the magnitude of the correlation is not the only parameter that indicates how interesting a relationship is in terms of making a profit. The nuances of studying relationships are covered in the research “Guide indicators in cryptocurrency trading or the Truman effect in action. Weak correlations are in the arsenal of a trader.” at www.cryptosensors.info.

The cryptocurrency market is volatile, but our monitoring software keeps up with it. You can find out which relationships exist between cryptocurrency pairs right now (not at the time of writing, but at the time you are reading these lines) at www.cryptosensors.info.

Conclusion

Cryptocurrency pairs do not behave according to the laws of the normal distribution, so the tool for analyzing relationships between them must be chosen accordingly. Spearman’s correlation coefficient can be used to analyze the crypto market, since it is a nonparametric measure.
