How do you analyze relationships between cryptocurrencies? What laws govern the movement of cryptocurrencies? What are the characteristic features of crypto price movements?
- Is everything all right with your distribution?
- Mining data about cryptocurrencies. Active crypto pairs.
- How is the crypto market moving?
- A nonparametric tool for analyzing relationships between cryptocurrency pairs.
Not so long ago, a new phenomenon entered our world: cryptocurrencies. At first, they were treated as a toy that only geeks cared about. Today it is obvious that “crypto” is becoming a mainstay. Its mystery and frightening volatility attract more and more people, from speculators to conspiracy theory enthusiasts. There is a natural desire to understand how crypto moves. The urge to organize and sort everything neatly leads us to data processing tools. Very often, to put it mildly, these tools are not used quite correctly.
For example, much of the commentary on cryptocurrency markets focuses on relationships between cryptocurrencies. Which cryptocurrencies rise or fall at the same time (move in the same direction)? And, conversely, which move in opposite directions? The answers to these questions can enrich an investor in the most literal sense. This is a classic correlation analysis problem. It seems simple: take the cryptocurrency quotes, open a spreadsheet package, press “Go”, and you are done. But I am reminded of a phrase from an advertisement: “not all yoghurts are equally useful.”
The devil, as they say, is in the details. There are different methods for calculating correlation. Spreadsheets use Pearson’s correlation, although they label it simply “correlation.” But there is a catch. Pearson’s correlation, and especially its significance test, can only be trusted if the data we are analyzing are at least approximately normally distributed.
Is everything all right with your distribution?
Let’s recall what the normal distribution is. First come the boring formulas, then an entertaining example. A one-dimensional random variable that follows a normal distribution has the following probability density function:
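For reference, the standard density of a normal distribution with mean µ and standard deviation σ can be written as follows (this is the textbook form, given here in the same µ and σ notation):

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```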
Informally, it is also called the Gaussian function ;) It has only two parameters. The first, µ, is the mathematical expectation (mean), which is also the median and the mode of the distribution. The second, σ, is the standard deviation (σ² is the variance). The graph of the probability density function is the well-known bell-shaped curve.
In order to describe the normal distribution without formulas, consider, for example, the height of the people around us. Think of your friends, acquaintances, work colleagues. Are there many giants among them? Are there many people of extremely small stature among them? The most common value is likely to be “average height”.
The normal distribution has another remarkable property. Measure three standard deviations down from the average height and three standard deviations up. About 99.73% of the people in your sample will fall within this range. In other words, the lion’s share of the sample lies within the “three sigma” range.
Now imagine that we are in a fairy-tale Middle Ages. In addition to people, other creatures live on earth: giants, gnomes. Dragons hover in the sky. Elves lurk in the woods. Let’s form a sample of the heights of these fantastic creatures. The histogram of such a sample has heavy tails.
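The three-sigma figure can be checked directly from the normal cumulative distribution function. The following is a minimal sketch using scipy; the helper name `share_within` is made up for this illustration.

```python
from scipy.stats import norm


def share_within(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return norm.cdf(k) - norm.cdf(-k)


# Print the share of a normal sample expected within 1, 2 and 3 sigma.
for k in (1, 2, 3):
    print(f"within {k} sigma: {share_within(k):.2%}")
```

The result does not depend on µ or σ, because the range is measured in units of the standard deviation.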
Why? It’s simple – a meeting with a gnome or a giant is not so rare. The distribution of the height of the inhabitants of the fairy forest differs significantly from the normal one.
It is worth noting that if a random variable is the combined result of many random factors that are practically independent of each other, none of which dominates, its behavior is well described by a normal distribution. The normal distribution is well understood, and many data processing techniques are built on it. The Pearson correlation, which relies on normality, is only one of many such tools. And here a fundamental question arises: what is the distribution of price changes in cryptocurrencies?
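The contrast between a sum of many independent factors and a heavy-tailed sample can be illustrated with simulated data. The sketch below is only an illustration: the uniform factors, the Student's t distribution with 3 degrees of freedom, and the sample sizes are arbitrary choices, not data from this study. Excess kurtosis is used as a rough tail measure (it is 0 for a normal distribution).

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable

# Each observation is the sum of 50 independent uniform "influences".
sum_of_many = rng.uniform(-1.0, 1.0, size=(10_000, 50)).sum(axis=1)

# A heavy-tailed stand-in for the "fairy forest": Student's t with 3 degrees of freedom.
heavy_tailed = rng.standard_t(df=3, size=10_000)

# scipy's kurtosis() returns excess kurtosis by default (normal distribution -> 0).
kurt_sum = kurtosis(sum_of_many)
kurt_heavy = kurtosis(heavy_tailed)

print(f"sum of many factors, excess kurtosis: {kurt_sum:.2f}")
print(f"heavy-tailed sample, excess kurtosis: {kurt_heavy:.2f}")
```

The first value should come out close to zero, while the second should be clearly positive, which is the numerical counterpart of "meeting a gnome or a giant is not so rare".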
Mining data about cryptocurrencies. Active crypto pairs.
We will use the Binance crypto exchange as the source of data on cryptocurrency pairs. Processing is done in Python 3.7.7 with the scipy, numpy, pandas, and plotly libraries.
At the time of writing, we have historical data on 600 cryptocurrency pairs. We consider daily data, retrieved through the Binance API. The sample size is 90 days. The quantity under study is the following indicator:
Growth_rate_Close = Close of the current day / Close of the previous day
That is, if this ratio is 1.015, the closing price has risen by 1.5%; if it is 0.98, the price has dropped by 2%. Thus, we analyze not the absolute closing prices of cryptocurrency pairs but their relative changes.
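The retrieval and the indicator can be sketched as follows. This is not the code used for the study: the function names and the sample rows are invented for illustration, and the sketch assumes the public Binance REST endpoint `GET /api/v3/klines`, which returns each candle as a list with the closing price in the fifth position and the number of trades in the ninth.

```python
import json
import urllib.parse
import urllib.request

import pandas as pd

# Column order of one kline row in the /api/v3/klines response.
KLINE_COLUMNS = [
    "open_time", "open", "high", "low", "close", "volume", "close_time",
    "quote_volume", "trades", "taker_base_volume", "taker_quote_volume", "ignore",
]


def fetch_daily_klines(symbol: str, limit: int = 90):
    """Download the last `limit` daily candles for one pair (requires network access)."""
    query = urllib.parse.urlencode({"symbol": symbol, "interval": "1d", "limit": limit})
    url = f"https://api.binance.com/api/v3/klines?{query}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def klines_to_frame(klines) -> pd.DataFrame:
    """Turn raw kline rows into a DataFrame indexed by the candle's open time."""
    frame = pd.DataFrame(klines, columns=KLINE_COLUMNS)
    frame["open_time"] = pd.to_datetime(frame["open_time"], unit="ms")
    frame["close"] = frame["close"].astype(float)  # prices arrive as strings
    frame["trades"] = frame["trades"].astype(int)
    return frame.set_index("open_time")


def growth_rate_close(close: pd.Series) -> pd.Series:
    """Close of the current day divided by close of the previous day."""
    return (close / close.shift(1)).dropna()


# Made-up rows in the kline layout, used instead of a live request.
sample_klines = [
    [1590969600000, "100.0", "102.0", "99.0", "100.0", "10", 1591055999999, "1000", 50, "5", "500", "0"],
    [1591056000000, "100.0", "103.0", "99.5", "101.5", "12", 1591142399999, "1200", 60, "6", "600", "0"],
    [1591142400000, "101.5", "102.0", "98.0", "99.47", "11", 1591228799999, "1100", 55, "5", "550", "0"],
]
frame = klines_to_frame(sample_klines)
rates = growth_rate_close(frame["close"])
print(rates.round(3).tolist())  # one ratio per day, starting from the second day
```

With real data, `klines_to_frame(fetch_daily_klines("ADAUSDT"))` would take the place of the made-up rows. The first day has no previous close, so a 90-day sample yields 89 growth rates.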
Note that not all crypto pairs are actively traded. Let’s select the most liquid ones. There are two possible criteria for activity: the number of trades or the trading volume. We choose the number of trades. If you sort all cryptocurrency pairs by the number of trades and plot the result, you get the following.
Let’s take the 35 most actively traded pairs. They are shown in red on the graph; all pairs are shown in blue. The graph shows that the most active pairs make up only about 6% of the entire list:
35 / 600 ≈ 0.0583
Yes, that’s a fact. These are the realities of the crypto market.
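The selection step can be sketched with pandas. The pair names and trade counts below are invented placeholders, and `most_active_pairs` is a hypothetical helper; in the study the same idea is applied with 35 pairs out of 600.

```python
import pandas as pd


def most_active_pairs(trade_counts: pd.Series, top_n: int) -> pd.Series:
    """Return the top_n pairs with the largest number of trades."""
    return trade_counts.sort_values(ascending=False).head(top_n)


# Invented trade counts for five placeholder pairs.
trade_counts = pd.Series(
    {"aaausdt": 120_000, "bbbbtc": 300, "cccusdt": 45_000, "dddeth": 12, "eeeusdt": 9_800}
)

active = most_active_pairs(trade_counts, top_n=2)
share = len(active) / len(trade_counts)

print(active)
print(f"share of active pairs: {share:.1%}")
print(f"share in the study: {35 / 600:.2%}")
```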
How is the crypto market moving?
What does the histogram of growth rates for these 35 most active pairs look like? How is the crypto market moving? Something like this:
Conformity to the normal distribution is checked both visually, by comparing the histogram with the normal density curve, and by calculating test statistics. How close is this picture to a normal distribution? Visually, it is doubtful.
Additionally, we run the Shapiro-Wilk test, which checks the null hypothesis that a sample comes from a normal distribution. The following results were obtained:
pair Shapiro-Wilk-W p-value
pair_adausdt_1d 0.9671995043754578 0.02252993
pair_ankrbtc_1d 0.9147272706031799 0.00001967
pair_ankrusdt_1d 0.943658173084259 0.00069367
pair_bttbnb_1d 0.9101758599281311 0.00001190
pair_bttbusd_1d 0.9795295000076294 0.16745451
pair_btttrx_1d 0.9735089540481567 0.06264874
pair_bttusdt_1d 0.9781306385993958 0.13348551
pair_cocosusdt_1d 0.9613599181175232 0.00900380
pair_denteth_1d 0.9790965914726257 0.15614410
pair_dentusdt_1d 0.9696356058120728 0.03333539
pair_dogeusdt_1d 0.9118067026138306 0.00001422
pair_hotbtc_1d 0.8912015557289124 0.00000167
pair_hoteth_1d 0.9076758027076721 0.00000907
pair_hotusdt_1d 0.9536965489387512 0.00285049
pair_iostbtc_1d 0.9279756546020508 0.00009242
pair_iostusdt_1d 0.9479432702064514 0.00125220
pair_keyusdt_1d 0.6302748918533325 0.00000000
pair_maticusdt_1d 0.9804877638816833 0.19529814
pair_mblusdt_1d 0.9175736904144287 0.00002714
pair_mftusdt_1d 0.8282192349433899 0.00000001
pair_npxseth_1d 0.8064180612564087 0.00000000
pair_npxsusdt_1d 0.7406529188156128 0.00000000
pair_onebtc_1d 0.9784880876541138 0.14147447
pair_oneusdt_1d 0.9795064926147461 0.16683325
pair_scbtc_1d 0.9732644557952881 0.06019074
pair_trxbtc_1d 0.8679629564285278 0.00000019
pair_trxusdt_1d 0.9682722687721252 0.02675677
pair_vetbtc_1d 0.986322283744812 0.47012007
pair_vetusdt_1d 0.9671380519866943 0.02230977
pair_vthousdt_1d 0.9375638961791992 0.00030876
pair_winbnb_1d 0.9504895210266113 0.00179445
pair_wintrx_1d 0.9059931635856628 0.00000758
pair_winusdc_1d 0.9044052362442017 0.00000641
pair_winusdt_1d 0.908967137336731 0.00001043
pair_zilusdt_1d 0.9742823839187622 0.07111172
Only 10 of the 35 crypto pairs have a p-value above the significance level α (the count corresponds to the conventional α = 0.05). For these pairs, technically, we cannot reject the hypothesis that the samples are normally distributed. Here’s what the histogram of these 10 crypto pairs looks like:
How well does their shape match the normal distribution? Even though these pairs pass the Shapiro-Wilk test, it is highly doubtful. Thickened tails are visible on both the right and the left. Remember the gnomes and giants ;)
Cryptocurrency pairs do not live by the laws of the normal distribution! This fact has an important consequence: we need analysis tools that do not depend on the type of distribution. In other words, we need nonparametric statistics. And relationships can be studied there too!
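A check of this kind can be sketched with `scipy.stats.shapiro`. The two samples below are synthetic and deterministic, built only to show the two outcomes; the 0.05 threshold and the helper name `looks_normal` are assumptions of this sketch.

```python
import numpy as np
from scipy.stats import norm, shapiro

ALPHA = 0.05  # assumed significance level


def looks_normal(sample, alpha: float = ALPHA):
    """Run Shapiro-Wilk and report whether normality is NOT rejected at level alpha."""
    statistic, p_value = shapiro(sample)
    return statistic, p_value, bool(p_value > alpha)


# 90 synthetic "growth rates" placed exactly on normal quantiles around 1.0.
ideal_quantiles = norm.ppf((np.arange(1, 91) - 0.5) / 90)
bell_shaped = 1.0 + 0.03 * ideal_quantiles

# The same series, but with one day on which the price doubled.
with_spike = np.append(bell_shaped[:-1], 2.0)

for name, sample in (("bell_shaped", bell_shaped), ("with_spike", with_spike)):
    w, p, ok = looks_normal(sample)
    print(f"{name}: W={w:.4f} p={p:.8f} normality not rejected: {ok}")
```

A p-value above the threshold only means that the test found no evidence against normality in 90 observations. It does not prove that the distribution is normal, which is why the histogram is worth inspecting as well.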
A nonparametric tool for analyzing relationships between cryptocurrency pairs
The nonparametric analogue of the Pearson coefficient is the Spearman coefficient. Its calculation belongs to the methods of rank correlation, but rank correlation is applicable to real-valued variables as well: the values are simply replaced by their ranks. Calculating the Spearman coefficient for each combination of cryptocurrency pairs gives two values: the coefficient itself and a p-value, which lets us assess the statistical significance of the coefficient.
At the time of the research, we identified only 35 pairs of instruments with an interesting and statistically significant relationship. Why the term “interesting” rather than “strong” or “weak”? Because the magnitude of the correlation is not the only parameter that indicates how interesting a relationship is in terms of making a profit. The research “Guide indicators in cryptocurrency trading or the Truman effect in action. Weak correlations are in the arsenal of a trader.” at www.cryptosensors.info covers the nuances of studying relationships.
The cryptocurrency market is volatile, but our monitoring software keeps up with it. You can find out which relationships exist between cryptocurrency pairs right now (not at the time of writing these lines, but at the time you are reading them) at www.cryptosensors.info.
Cryptocurrency pairs do not behave according to the laws of the normal distribution. When analyzing relationships between cryptocurrencies, you must choose an appropriate tool. Spearman’s correlation coefficient can be used to analyze the crypto market, since it is a nonparametric measure.
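The difference between the two coefficients can be seen on a small synthetic example. The sketch below uses `scipy.stats.spearmanr` and `scipy.stats.pearsonr` on made-up data in which one series is a monotonic but strongly non-linear function of the other; it is an illustration, not the monitoring code mentioned in this article.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Made-up data: y always grows with x, but far from linearly.
x = np.arange(1, 11, dtype=float)
y = np.exp(x)

rho, rho_p = spearmanr(x, y)  # rank correlation and its p-value
r, r_p = pearsonr(x, y)       # linear correlation and its p-value

print(f"Spearman rho = {rho:.3f} (p = {rho_p:.4f})")
print(f"Pearson  r   = {r:.3f} (p = {r_p:.4f})")
```

Because Spearman's coefficient works on ranks, it reports a perfect monotonic relationship here, while Pearson's coefficient is pulled down by the few very large values. For two cryptocurrency pairs, `x` and `y` would be their growth-rate series over the same days.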
You may be interested in research / data:
- Research: Guide Indicators in Cryptocurrency Trading or the Truman Effect in Action. Weak correlations are in the arsenal of a trader.
- Research: Quotes of cryptocurrency pairs. Collection and processing. What should a trader know about?
- Research: Statistics on the effectiveness of candlestick analysis for trading cryptocurrencies. Patterns: bullish hammer, bearish hammer.
- Data: Cryptopairs quotes in xlsx format.
- Data: Comparable data for ten well-known cryptopairs.
- Data: Exchange candlestick analysis. Evaluating the use and effectiveness of patterns. Patterns: Bull hammer Bear hammer.
- Data: Search data for Truman zones ALMOST ALL (guide indicators for cryptocurrency pairs).
- Data: Cryptopairs-relationships.