December 22, 2014 13:30


Elections improbability

13.12.11 17:03    By Sergei Shpilkin, translated by Semen Kvasha

Mathematicians at the protest rally on Bolotnaya Square held up the Gaussian formula.   Photo: Kirill Lebedev


Statistical analysis of the Duma election results shows how artificial United Russia's success was.

Every election is a huge experiment in finding out voters' opinions and preferences. Federal elections provide approximately 96,000 'measurement results': the protocols of local election commissions. Even if we suspect that this data was compromised somehow, we can still learn a lot from it, about both the voters' preferences and the "measurement apparatus", i.e. the election system itself.

Before we start our statistical analysis, let's put the data into a clear, graphic form. We will examine two graphic approaches that are useful for analyzing election data.


The first is the scatterplot, a diagram that lets us find the connection between two factors. Measurement results are plotted on a Cartesian plane as dots.

The advantage of this diagram is that it works even with a very small amount of data. Below are three scatterplots showing the election results in three districts of Moscow: South Tushino, Strogino and Golyanovo. Each dot is the result of one party at one voting station. The horizontal axis shows attendance (voter turnout); the vertical axis shows the share of registered voters who voted for each party. Each voting station contributes five dots, one for each of the five parties with the biggest results (we don't show data for the two other parties that took part in the election).

All three districts are Moscow suburbs with a very average population, yet the pictures we see in these diagrams are very different.

The South Tushino graph (see above) looks very natural. The attendance range is not very large, as can be expected in a small territory, and the proportions between the different parties' results are more or less the same at all voting stations.

The only dot that doesn't fit the picture belongs to a small (approximately 300 voters) "closed" voting station.

The Strogino diagram looks different. The attendance range is larger, and at voting stations with higher attendance the share of votes for United Russia increases while the other parties' shares don't change.

This can easily be explained if we look at how the position of a voting station on a scatterplot changes when votes are artificially added for one party (shown with arrows).

First, all the parties' dots move towards higher attendance by the same percentage as the share of added votes. Second, the dot of the party to which the votes are added moves up by the same percentage. As a result, the dot of the party for which the ballot-stuffing is committed (we'll call it ballot-stuffing, although the same could be done by simply forging the election protocol) moves diagonally up and to the right, while all the other dots move horizontally.
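The shift described above can be checked with a little arithmetic. The sketch below is only an illustration: the station, its vote counts and the helper `station_points` are hypothetical and do not come from the real protocols. It assumes, as in the scatterplots, that both axes are measured as fractions of registered voters.

```python
def station_points(registered, votes):
    """Return (attendance, {party: share}), both as fractions of registered voters."""
    attendance = sum(votes.values()) / registered
    shares = {party: count / registered for party, count in votes.items()}
    return attendance, shares

# Hypothetical station: the numbers are invented for illustration only.
registered = 1000
honest = {"UR": 200, "CPRF": 150, "LDPR": 100, "Other": 50}

# Stuff 200 extra ballots for one party (20% of registered voters).
stuffed = dict(honest)
stuffed["UR"] += 200

att_before, shares_before = station_points(registered, honest)
att_after, shares_after = station_points(registered, stuffed)

print(f"attendance: {att_before:.0%} -> {att_after:.0%}")
for party in honest:
    print(f"{party}: {shares_before[party]:.0%} -> {shares_after[party]:.0%}")
```

In this toy case every dot moves 20 percentage points to the right, and only the UR dot also moves 20 points up, which is the diagonal shift shown by the arrows.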

We can see another feature in the Golyanovo diagram. Not only do the dots show votes added for UR, as on the previous scatterplot, but there are also groups of stations where the other parties' votes are significantly lower than at neighboring voting stations, and the share of UR is even larger.

This corresponds to a situation where votes are added for United Russia and also taken away from the other parties.
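A minimal sketch of this second mechanism, again with an invented station and a hypothetical helper `dots`: moving votes from other parties to UR leaves attendance unchanged, so the UR dot goes straight up and the losing parties' dots go straight down. Adding ballots on top of that then shifts everything to the right.

```python
def dots(registered, votes):
    """Return attendance and each party's share, as fractions of registered voters."""
    attendance = sum(votes.values()) / registered
    return attendance, {party: n / registered for party, n in votes.items()}

# Hypothetical station; all numbers are invented for illustration.
registered = 1000
honest = {"UR": 200, "CPRF": 150, "LDPR": 100, "Other": 50}

# Step 1: rewrite the protocol, moving 50 votes each from CPRF and LDPR to UR.
rewritten = dict(honest, UR=300, CPRF=100, LDPR=50)
# Step 2: additionally add 200 ballots for UR.
stuffed = dict(rewritten, UR=500)

for label, votes in [("honest", honest),
                     ("rewritten", rewritten),
                     ("rewritten + stuffed", stuffed)]:
    attendance, shares = dots(registered, votes)
    row = ", ".join(f"{p} {s:.0%}" for p, s in shares.items())
    print(f"{label}: attendance {attendance:.0%}; {row}")
```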

On this scatterplot of all Moscow voting stations we can see a tight group of blue dots (UR votes) at the lower left. They are distributed roughly as on the South Tushino scatterplot, so we can naturally assume that these are the stations where the votes were counted fairly. The other UR dots are scattered in a diagonal cloud corresponding to an artificial increase in UR votes and probably a decrease in the votes for the other parties. If we look closely, we can see a tight diagonal line slightly above the line where the number of votes for UR equals half of the attendance.

These are the stations where UR received a little more than 50% of the votes.

Since these stations form a tight group, 50% was probably a planned target for them.


Vote density histograms are an even more convenient and revealing way to visualize election results. Let's clarify how this diagram works with the classic example of a coin toss.

The probability of a coin landing heads is 50%. But that doesn't mean that if you toss the coin 100 times there will be exactly 50 heads. However, if we conduct numerous (say, 10,000) experiments, tossing the coin 100 times in each (we can model the coin toss on a computer), and make a histogram of the number of heads, we'll get something like this diagram.

The number of experiments that yield a given count of heads is itself random, so the histogram will not be perfectly smooth and will never come out exactly the same twice, but it is still well described by the normal, or Gaussian, distribution known from probability theory.

The graph of a Gaussian distribution is a symmetric bell-shaped curve. It is typical of variables that result from many independent random factors (like the coin tosses in our case).
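The coin-toss experiment is easy to reproduce. The following is a minimal sketch using only the Python standard library; the function name `simulate_heads`, the seed and the text-histogram scale are arbitrary choices made here for illustration.

```python
import random
from collections import Counter
from statistics import mean, pstdev

def simulate_heads(experiments=10_000, tosses=100, seed=0):
    """Return the number of heads in each experiment of fair coin tosses."""
    rng = random.Random(seed)
    # Each of the `tosses` random bits is one fair toss; 1-bits count as heads.
    return [bin(rng.getrandbits(tosses)).count("1") for _ in range(experiments)]

heads = simulate_heads()
histogram = Counter(heads)

# Theory for a fair coin: mean = n/2, standard deviation = sqrt(n)/2.
print(f"mean = {mean(heads):.2f} (theory 50), std = {pstdev(heads):.2f} (theory 5)")

# Crude text histogram: one '#' per 20 experiments.
for k in range(30, 71, 2):
    print(f"{k:3d} {'#' * (histogram[k] // 20)}")
```

Changing the seed changes the exact bar heights but not the overall bell shape centered near 50 heads.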

Election attendance, and the share of votes for a party when each voter decides independently, are also variables shaped by many independent random factors.

Because voting stations differ in size, in how committed their voters are, in their voters' sympathies, and so on, the distribution of these variables can be represented as a sum of many Gaussian distributions with different widths and different central positions, but we can expect its general feature, a symmetric bell-shaped curve, to be preserved.

It's convenient to analyze elections by building histograms of voting stations by attendance and by the percentage of votes for different parties. First, as an example, let's look at histograms of attendance at voting stations in foreign countries. The horizontal axis shows attendance, the vertical axis shows the number of voting stations.

All the countries show more or less smooth bell-shaped curves, similar to a Gaussian curve.

It's not like that in Russian elections.

There are two peculiarities:

First, the attendance distribution is not bell-shaped, unlike in the diagrams of foreign attendance. This mainly concerns high attendance, where the distribution does not fall off symmetrically with the left side but stays high all the way up to 100% attendance. In other words, there are too many voting stations with high attendance.

Second, the distributions have high peaks at figures divisible by five.

For example, in the 2008 presidential election there were 1,429 stations with attendance of 79%, 2,069 with attendance of 80% and 1,787 with attendance of 81%.
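To get a feel for the size of such a peak, one can compare the count at 80% with the average of its two neighbours. This is only a rough sketch: it assumes that on a smooth curve the middle value would lie close to that average, and the helper `excess_over_neighbours` is a hypothetical name introduced here.

```python
# Station counts for the 2008 presidential election, as quoted above.
stations = {79: 1429, 80: 2069, 81: 1787}

def excess_over_neighbours(counts, value):
    """Compare the count at `value` with the mean of its two neighbours."""
    expected = (counts[value - 1] + counts[value + 1]) / 2
    return counts[value] - expected, counts[value] / expected

excess, ratio = excess_over_neighbours(stations, 80)
print(f"excess stations at 80%: {excess:.0f} "
      f"({ratio - 1:.0%} above the neighbours' mean)")
```

Under this assumption there are 461 more stations at exactly 80% than the neighbouring values would suggest, roughly 29% above their mean.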

As we already saw in the scatterplots, a large number of voting stations with high attendance appears when votes are artificially added for one of the parties, although one could try to explain it by traditionally high attendance in the North Caucasus republics and in the countryside.

What cannot be explained by natural mechanisms is the clustering of attendance figures at round numbers.

The only reasonable explanation is that such attendance was reached by 'manual control', i.e. election results were falsified according to a plan set 'from above.'

By the way, in the recent Duma election we didn't see many neat peaks at round numbers. But that doesn't mean the election was clean. The next diagram shows the distribution of votes for different parties across voting stations. The horizontal axis shows the percentage of votes received by each party, the axis on the right shows the number of voting stations with that percentage, and the bin width is 0.5%.

This is even more interesting.

The United Russia curve looks like a normal distribution only on its left side, up to the peak.

The right part is stretched unnaturally up to 100%, and the peaks at figures divisible by five clearly show that both the UR percentage and the attendance were pre-ordered. Note that while the necessary attendance can still be achieved by catching people in the streets and bringing them to the stations, a nice-looking round percentage can only be produced after the ballot boxes are opened and the votes counted, i.e. by falsifying the results.

On the other hand, the curves of the Communist Party (CPRF), Just Russia and LDPR would look quite natural if not for the artificially high number of stations with a very low percentage of votes (left side of the diagram) and the depleted peak of the CPRF vote distribution curve. This gives statistical support to the numerous witness reports of results being rewritten to increase the number of votes for UR and decrease the number of votes for the other parties.
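The binning behind a histogram of this kind can be sketched as follows. This is not the author's code: the function `vote_share_histogram` and the three sample stations are hypothetical, and the sketch assumes that a party's percentage is taken relative to all votes cast at the station.

```python
from collections import Counter

def vote_share_histogram(stations, party, bin_width=0.5):
    """Count stations per bin of the party's percentage of votes cast."""
    histogram = Counter()
    last_bin = int(100 / bin_width) - 1
    for votes in stations:
        total = sum(votes.values())
        if total == 0:
            continue  # skip stations where no votes were cast
        percent = 100 * votes[party] / total
        # Index of the bin; exactly 100% falls into the last bin.
        bin_index = min(int(percent / bin_width), last_bin)
        histogram[bin_index * bin_width] += 1
    return histogram

# Invented stations, for illustration only.
sample = [
    {"UR": 300, "CPRF": 200, "Other": 500},
    {"UR": 302, "CPRF": 198, "Other": 500},
    {"UR": 650, "CPRF": 100, "Other": 250},
]
ur_hist = vote_share_histogram(sample, "UR")
for left_edge in sorted(ur_hist):
    print(f"{left_edge:5.1f}% .. {left_edge + 0.5:5.1f}%: {ur_hist[left_edge]} station(s)")
```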

Thus, we see that the only party whose curve differs radically from Gaussian is United Russia.

There is direct evidence that the voting figures for this party were falsified. This is another argument that the strangeness of the Russian attendance distribution results from adding votes for United Russia. Let's see how this affects the election results quantitatively. The next graph shows the different parties' results at voting stations with different attendance.

The horizontal axis shows attendance, the vertical axis shows the number of votes at stations, the interval of attendance is 1%.

The parties' vote distributions are very similar, except for United Russia's: the higher the attendance, the more votes UR receives. This is what we saw on the scatterplots: when votes are added for UR, the other parties' dots spread out along the attendance axis while keeping the proportions among themselves, and the relative share of UR grows.
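A graph like this can be built by summing each party's votes over all stations that fall into the same 1% attendance bin. The sketch below shows one way to do it; the function `votes_by_attendance`, the data layout and the sample stations are assumptions made for illustration, and attendance is approximated as votes cast divided by registered voters.

```python
from collections import defaultdict

def votes_by_attendance(stations, bin_width=1.0):
    """Sum each party's votes over stations grouped by attendance bin (in %)."""
    totals = defaultdict(lambda: defaultdict(int))
    for station in stations:
        cast = sum(station["votes"].values())
        attendance = 100 * cast / station["registered"]
        # Left edge of the bin; exactly 100% goes into the last bin.
        bin_left = min(int(attendance / bin_width) * bin_width, 100 - bin_width)
        for party, count in station["votes"].items():
            totals[bin_left][party] += count
    return totals

# Invented stations, for illustration only.
sample = [
    {"registered": 1000, "votes": {"UR": 200, "CPRF": 150, "Other": 150}},
    {"registered": 2000, "votes": {"UR": 420, "CPRF": 300, "Other": 290}},
    {"registered": 1000, "votes": {"UR": 600, "CPRF": 100, "Other": 100}},
]
totals = votes_by_attendance(sample)
for bin_left in sorted(totals):
    print(f"attendance {bin_left:.0f}%: {dict(totals[bin_left])}")
```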

Assuming that the difference between the way UR's votes and the other parties' votes depend on attendance is the result of an artificial increase in the number of votes for UR, we can try to estimate the size of this increase. We will try to separate out of the UR vote distribution a part proportional to the sum of the votes for all the other parties.

As this diagram shows, we can separate from the votes for United Russia a part proportional to all the other votes, such that up to an attendance of 50%-52% the remainder of the UR votes after subtracting this part is practically zero. Under our assumptions, this means there were almost no added votes there. This agrees with the lower cloud on the scatterplot (pic.4) and justifies our method.

The remaining (abnormal) part of this curve should be considered an artificial increase of the amount of votes for the United Russia party.

After the division between 'normal' (proportional to the votes for the other parties) and 'abnormal' is done, we can appraise them quantitatively and try to reconstruct the "corrected" voting results without such an increase.
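The separation described above can be sketched as follows. The article does not say how the proportionality factor is chosen, so this sketch assumes it is the ratio of UR votes to all other votes summed over the low-attendance bins. The function `split_normal_abnormal`, the cutoff and the per-bin numbers are all hypothetical.

```python
def split_normal_abnormal(ur, others, cutoff=50):
    """Split UR votes per attendance bin into a part proportional to the
    other parties' votes ('normal') and a remainder ('abnormal').

    ur, others: dicts {attendance_bin_percent: votes}.
    cutoff: bins at or below this attendance are assumed to be unaffected.
    """
    low = [b for b in ur if b <= cutoff]
    # Assumption: scale factor = ratio of totals over the low-attendance bins.
    k = sum(ur[b] for b in low) / sum(others[b] for b in low)
    # Clamp so the 'normal' part never exceeds the votes actually reported.
    normal = {b: min(ur[b], k * others[b]) for b in ur}
    abnormal = {b: ur[b] - normal[b] for b in ur}
    return k, normal, abnormal

# Invented per-bin totals, for illustration only.
others = {40: 100, 50: 200, 60: 100, 70: 40, 80: 20}
ur = {40: 50, 50: 100, 60: 90, 70: 80, 80: 70}

k, normal, abnormal = split_normal_abnormal(ur, others)
print(f"k = {k}")
print(f"normal total = {sum(normal.values()):.0f}, "
      f"abnormal total = {sum(abnormal.values()):.0f}")
```

With these toy numbers the remainder is zero up to the cutoff and grows at higher attendance, which is the shape the method looks for in the real data.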

When this article was being written, the election results were not final and covered 108.6 million registered voters (98%). After this separation, of the 32.1 million votes for UR in this data, the normal votes come to about 16.8 million, while the abnormal, artificially added votes total about 15.2 million.

© JSC "Gazeta.Ru". (1999-2014). Terms of use.  
License № ФС77-28061 27.04.2007