13.12.11 17:03 By Sergei Shpilkin, translated by Semen Kvasha
Mathematicians at the protest rally on Bolotnaya Square displayed the Gaussian formula. Photo: Kirill Lebedev
Statistical analysis of the Duma election results shows how artificial United Russia's success was.
Every election is a huge experiment in finding out voters' opinions and preferences. Federal elections provide approximately 96,000 'measurement results': the protocols of local election commissions. Even if we suspect that this data was compromised somehow, we can still learn a lot from it, both about voters' preferences and about the "measurement apparatus", i.e. the election system itself.
Before we start our statistical analysis, let's put the data into a clear, graphic form. We will examine two graphic approaches that are useful for election data analysis.
A scatterplot is a diagram that lets us find the connection between two factors. Measurement results are plotted on a Cartesian plane as dots.
The advantage of this diagram is that it works even with a really small amount of data. Below are three scatterplots showing election results in three districts of Moscow: South Tushino, Strogino and Golyanovo. Each dot is the result of one party at one voting station. The horizontal axis shows attendance; the vertical axis shows the share of registered voters who voted for each party. For each voting station there are five dots, one for each of the five parties with the biggest results.
All three districts are Moscow suburbs with a fairly typical population, but the pictures we see on these diagrams are very different.
South Tushino graph
The only dot that doesn't fit the picture is a small
The Strogino diagram looks different. The attendance range is wider, and at voting stations with higher attendance the share of votes for United Russia increases while the other parties' shares don't change.
This is easy to explain if we look at how the position of a voting station on a scatterplot changes when votes are artificially added for one party.
First, all the parties' dots move towards higher attendance by the same percentage as the share of added votes. Second, the dot of the party to which the votes are added moves up by the same percentage. In the end, the dot of the party for whom the 'ballot-stuffing is committed'
We can see another feature in the Golyanovo diagram. Not only do the dots show votes added for UR, as on the previous scatterplot, but there are also groups of stations where the votes for the other parties are significantly lower than at neighboring voting stations and the share of UR is even larger.
This is consistent with a situation where votes are both added for United Russia and taken away from the other parties.
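The two effects described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the party labels and the station figures are made up for the example and are not taken from the election data. Attendance and party shares are both expressed as fractions of registered voters, as on the scatterplots.

```python
def stuff_ballots(attendance, shares, party, added):
    """Add extra ballots for `party`; `added` is a fraction of registered voters."""
    new_shares = dict(shares)
    new_shares[party] += added             # the favoured party's dot moves up...
    return attendance + added, new_shares  # ...and every dot moves right


def transfer_votes(attendance, shares, party, taken):
    """Move votes from other parties to `party`; attendance does not change."""
    new_shares = dict(shares)
    for other, amount in taken.items():
        new_shares[other] -= amount        # the other party's dot moves down
        new_shares[party] += amount        # the favoured party's dot moves up
    return attendance, new_shares


# Hypothetical station: 50% attendance, shares as fractions of registered voters.
station = {"UR": 0.20, "Party B": 0.12, "Party C": 0.08}

att, shares = stuff_ballots(0.50, station, "UR", 0.10)
print(round(att, 2), {p: round(s, 2) for p, s in shares.items()})

att, shares = transfer_votes(att, shares, "UR", {"Party B": 0.04})
print(round(att, 2), {p: round(s, 2) for p, s in shares.items()})
```

In the first step only the favoured party's share changes while all dots shift right, which is the Strogino pattern. In the second step attendance stays put and another party's share drops, which is the additional feature seen in Golyanovo.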
We can see on this 'all Moscow voting stations' scatterplot a tight group of blue dots left below
These are the stations where UR received a little more than 50% of the votes.
Since these stations form a tight group, 50% was probably a planned target for them.
Vote density histograms are an even more convenient and revealing way of visualizing election results. Let's clarify how this diagram works with the classic example of a coin toss.
The probability of a coin landing "heads" is 50%. But that doesn't mean that if you toss the coin 100 times there will be exactly 50 "heads". However, if we conduct numerous
The number of series with a given count of "heads" is itself random, so the distribution will not be perfectly smooth and will never come out exactly the same twice, but it can still be described by the normal, or Gaussian, distribution known from probability theory.
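The coin-toss experiment is easy to reproduce. The sketch below is an illustration, not part of the original analysis: the number of series (10,000), the seed and the function name are arbitrary choices. It repeats a series of 100 tosses many times and counts how often each number of "heads" occurs.

```python
import random
from collections import Counter


def heads_histogram(n_series, tosses_per_series=100, seed=0):
    """Count how many series ended with each possible number of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter()
    for _ in range(n_series):
        heads = sum(rng.random() < 0.5 for _ in range(tosses_per_series))
        counts[heads] += 1
    return counts


hist = heads_histogram(10_000)

# Crude text histogram: one '#' per 20 series.
for heads in range(35, 66, 5):
    print(f"{heads:3d} heads: {'#' * (hist[heads] // 20)}")
```

The bars are tallest around 50 heads and fall off on both sides, giving the rough bell shape; a different seed changes the details but not the overall form.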
The graph of the Gaussian distribution is a symmetric bell-shaped curve. It is typical for many variables where there are many independent random factors
Election attendance, and the share of votes cast for a party when each voter decides independently, are variables of this kind too.
Because voting stations differ in size, in how committed their voters are, in local sympathies and so on, the distribution of these variables can be represented as a sum of many Gaussian distributions with different widths and different centers, but we can expect its general feature, a symmetric bell-shaped curve, to be preserved.
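A small simulation can make this expectation plausible. It is a sketch under assumptions that are ours, not the article's: each hypothetical station gets its own center (scattered around 60%) and its own width, and one attendance value is drawn from that station's Gaussian. All of the numbers are invented for illustration.

```python
import random
import statistics


def pooled_attendance(n_stations=20_000, seed=1):
    """Draw one attendance value per station, each from its own Gaussian."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_stations):
        center = rng.gauss(0.60, 0.05)   # hypothetical local center
        width = rng.uniform(0.02, 0.06)  # hypothetical local width
        values.append(rng.gauss(center, width))
    return values


values = pooled_attendance()
mean = statistics.mean(values)
median = statistics.median(values)

# For a symmetric distribution the mean and the median nearly coincide.
print(f"mean={mean:.3f} median={median:.3f}")
```

With these assumptions the pooled values stay symmetric around the common center. The result depends on the local centers themselves being scattered symmetrically, which is the premise of the argument above.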
It's convenient to analyze elections by building histograms of the number of voting stations by attendance and by the percentage of votes for different parties. First, as an example, let's look at histograms of attendance at voting stations in foreign countries. The horizontal axis shows attendance, the vertical axis shows the number of voting stations.
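Such a histogram could be built roughly as follows. This is a minimal sketch: the input format (pairs of registered voters and ballots cast), the function name and the choice to place each station in the bin that starts at or below its attendance are assumptions made for the example, and the sample stations are invented.

```python
from collections import Counter


def attendance_histogram(stations, bin_width=1.0):
    """Count stations per attendance bin; `stations` holds (registered, cast) pairs."""
    hist = Counter()
    for registered, cast in stations:
        if registered <= 0:
            continue  # skip stations with no registered voters
        attendance = 100.0 * cast / registered
        # Left edge of the bin; 100% goes into the last bin.
        bin_start = min(int(attendance // bin_width) * bin_width, 100.0 - bin_width)
        hist[bin_start] += 1
    return hist


# Invented sample: attendance of 79.6%, 80.0%, 80.4% and 100%.
sample = [(1000, 796), (1000, 800), (500, 402), (200, 200)]
for bin_start, n in sorted(attendance_histogram(sample).items()):
    print(f"{bin_start:5.1f}%: {n}")
```

Passing `bin_width=0.5` gives the finer interval; the same function works for party percentages if the share of votes is supplied in place of attendance.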
All the countries show more or less smooth bell-shaped curves, similar to a Gaussian curve.
It's not like that in the Russian elections.
There are two peculiarities:
The attendance distribution isn't bell-shaped, unlike the foreign attendance diagrams. This mainly concerns high attendance, where the distribution doesn't fall off symmetrically with the left side and stays high all the way up to 100% attendance. To put it differently, the number of voting stations with high attendance is too large.
The distributions have sharp peaks at figures divisible by five.
For example, in the 2008 presidential election there were 1,429 stations with attendance at 79%, 2,069 with attendance at 80% and 1,787 with attendance at 81%.
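The size of such a peak can be expressed as a ratio to its neighbors. The sketch below uses the three 2008 figures quoted above; the function name and the idea of comparing a bin with the average of its two neighbors are our illustration, not a method stated in the article.

```python
def peak_excess(hist, value):
    """Ratio of the count at `value` to the average of its two neighbors."""
    neighbors = (hist[value - 1] + hist[value + 1]) / 2
    return hist[value] / neighbors


# Stations by attendance in the 2008 presidential election (figures from the text).
counts_2008 = {79: 1429, 80: 2069, 81: 1787}

print(round(peak_excess(counts_2008, 80), 2))  # prints 1.29
```

The 80% bin holds about 29% more stations than the average of the 79% and 81% bins (2,069 against 1,608), while a smooth bell-shaped curve would put it close to that average.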
As we already saw in the scatterplots, a large number of voting stations with high attendance appears when votes are artificially added for one of the parties, although one could try to explain it by traditionally high attendance in the North Caucasus republics and in the countryside.
What natural mechanisms cannot explain is attendance figures clustering at round numbers.
The only reasonable explanation is that such attendance was reached by 'manual control', i.e. election results were falsified according to a plan set 'from above.'
By the way, in the recent Duma election we didn't see many neat peaks at round attendance numbers. But that doesn't mean the election was clean. The next diagram shows the distribution of votes for different parties across voting stations. The horizontal axis shows the percentage of votes received by a party, on the right is the number of voting stations with that result, and the interval is 0.5%.
This is even more interesting.
The United Russia curve looks like a normal distribution only on its left side, up to the peak.
The right side is stretched unnaturally up to 100%, and the peaks at figures divisible by five clearly show that both the UR percentage and the attendance were ordered in advance. We should add that, whatever can be done to raise attendance by catching people in the streets and bringing them to the stations, a nice round percentage can only be produced after the ballot boxes are opened and the votes counted, i.e. by falsifying the results.
On the other hand, the Communist party's
Thus, we see that the only party whose curve differs radically from Gaussian is United Russia.
There is direct evidence that the voting figures for this party were falsified. This is another argument for the idea that the strangeness of the Russian attendance distribution results from adding votes for United Russia. Let's see how this affects the election results quantitatively. The next graph shows the results of the different parties at voting stations with different attendance.
The horizontal axis shows attendance, the vertical axis shows the number of votes received at stations with that attendance, and the attendance interval is 1%.
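A graph of this kind amounts to summing each party's votes over all stations that fall into the same 1% attendance interval. The following sketch shows one way to do that; the record layout (dictionaries with `registered`, `cast` and one key per party), the function name and the sample stations are assumptions made for the example.

```python
from collections import defaultdict


def votes_by_attendance(stations, parties):
    """Sum each party's votes over stations grouped into 1% attendance bins."""
    totals = defaultdict(lambda: dict.fromkeys(parties, 0))
    for st in stations:
        if st["registered"] <= 0:
            continue  # skip stations with no registered voters
        # Whole-percent bin; 100% attendance goes into the 99% bin.
        attendance_bin = min(int(100 * st["cast"] / st["registered"]), 99)
        for party in parties:
            totals[attendance_bin][party] += st[party]
    return dict(totals)


# Invented stations, for illustration only.
sample = [
    {"registered": 1000, "cast": 450, "UR": 150, "Party B": 120},
    {"registered": 2000, "cast": 910, "UR": 300, "Party B": 250},
    {"registered": 1000, "cast": 800, "UR": 600, "Party B": 100},
]
for attendance_bin, votes in sorted(votes_by_attendance(sample, ["UR", "Party B"]).items()):
    print(attendance_bin, votes)
```

Note that the vertical axis here counts votes, not stations, so large stations weigh more than small ones.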
The distributions of votes are very similar for all parties except United Russia: the higher the attendance, the more votes UR receives. This is what we saw on the scatterplots: when votes are added for UR, the other parties spread out along the attendance axis while keeping the same proportions among themselves, and the relative share of UR grows.
Assuming that the difference between the vote distributions of UR and of the other parties as a function of attendance results from an artificial increase in the number of votes for UR, we can try to estimate the size of this increase. We will try to separate out of the UR vote distribution a part proportional to the sum of the votes for all the other parties.
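The separation described above can be sketched as follows. The article does not say how the proportionality coefficient is chosen, so the rule used here is purely our assumption: the coefficient is the ratio of UR votes to all other votes over the bins with attendance up to a baseline threshold. The function name, the threshold parameter and the numbers are invented for illustration and are not real election data.

```python
def split_ur_votes(ur_by_bin, others_by_bin, baseline_max_attendance=50):
    """Split UR votes into a part proportional to the other parties and a remainder.

    Assumption: the coefficient k is fitted on bins with attendance at or
    below `baseline_max_attendance`, where no votes are presumed added.
    """
    base_bins = [b for b in ur_by_bin if b <= baseline_max_attendance]
    k = sum(ur_by_bin[b] for b in base_bins) / sum(others_by_bin[b] for b in base_bins)
    proportional = {b: k * others_by_bin[b] for b in ur_by_bin}
    remainder = {b: ur_by_bin[b] - proportional[b] for b in ur_by_bin}
    return k, proportional, remainder


# Invented vote totals per attendance bin (percent).
ur = {40: 50, 50: 100, 60: 180, 70: 200}
others = {40: 100, 50: 200, 60: 200, 70: 100}

k, proportional, remainder = split_ur_votes(ur, others)
print(k, remainder)
print("estimated added votes:", sum(remainder.values()))
```

In this toy data the remainder is zero in the baseline bins and grows with attendance; summing it gives the estimate of added votes under the stated assumption.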
As we see in this diagram, we could separate from the votes for United Russia a part proportional to all the other votes, so that up to an attendance of 50% - 52% the remainder of the UR votes after subtracting this part is practically zero. Under our assumptions this means that there were almost no added votes there. This correlates with the lower cloud on a scatterplot
After the division between 'normal'
When this article was being written, the election results were not final and covered 108.6 million registered voters