Who voted for each candidate? (Chilean Elections 2017)

Last Sunday Chile elected its president, and many pundits tried to explain how the first-round votes were redistributed in the second round. Since the vote is secret, no one can be proven wrong.

That is why I took a different approach from the usual one: a mathematical model, put together quickly that same Sunday and published on Monday as the article "Estimation of vote distribution in the second round of the Chilean presidential elections 2017". I received a lot of feedback on it, which I have incorporated into this "improved" version.

First, let’s understand the problem. In the first round, there were 8 candidates, and the votes were distributed as follows:

In the second round, between Piñera and Guillier, the votes were distributed as follows:

We know the region, commune, and polling place each vote comes from, all of which is available on the SERVEL website. What everyone is curious about is: how were the votes from the first round distributed in the second round?

For this, I made a mathematical model that led me to the following result (where total_pv represents new voters):

Now I will explain how I arrived at that result:

Methodology

Creating the Dataset:

On the SERVEL website (http://pv.servelelecciones.cl/ and http://www.servelelecciones.cl/), votes are published by polling table, so I built a web scraper to download them and create a dataset (available here) with the following columns:

  • region: Region
  • circ_sen: Senatorial Constituency
  • distrito: District
  • comuna: Commune
  • circ_elec: Electoral Constituency
  • local_vot: Polling Place
  • mesa: Polling Table
  • goic_pv: Number of votes received by Goic in the first round
  • kast_pv: Number of votes received by Kast in the first round
  • pinera_pv: Number of votes received by Piñera in the first round
  • guillier_pv: Number of votes received by Guillier in the first round
  • sanchez_pv: Number of votes received by Sánchez in the first round
  • meo_pv: Number of votes received by MEO in the first round
  • artes_pv: Number of votes received by Artés in the first round
  • navarro_pv: Number of votes received by Navarro in the first round
  • nulos_pv: Number of null votes in the first round
  • blanco_pv: Number of blank votes in the first round
  • noVoto_pv: Number of people who did not vote in the first round
  • pinera_sv: Number of votes received by Piñera in the second round
  • guillier_sv: Number of votes received by Guillier in the second round
  • nulos_sv: Number of null votes in the second round
  • blancos_sv: Number of blank votes in the second round
  • noVoto_sv: Number of people who did not vote in the second round
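As a minimal sketch of how a dataset with these columns might be read and sanity-checked, the Python snippet below parses two toy rows and flags tables whose first-round and second-round totals differ. The numbers are made up for illustration, the helper names (load_rows, unbalanced_tables) are hypothetical, and the balance check assumes that a table's electoral roll is the same in both rounds, which the dataset itself may or may not satisfy.

```python
import csv
import io

# Column names taken from the list above.
FIRST_ROUND = ["goic_pv", "kast_pv", "pinera_pv", "guillier_pv", "sanchez_pv",
               "meo_pv", "artes_pv", "navarro_pv", "nulos_pv", "blanco_pv",
               "noVoto_pv"]
SECOND_ROUND = ["pinera_sv", "guillier_sv", "nulos_sv", "blancos_sv", "noVoto_sv"]

# Toy data with made-up numbers (NOT real results); only two id columns kept.
HEADER = ["region", "mesa"] + FIRST_ROUND + SECOND_ROUND
SAMPLE_CSV = "\n".join([
    ",".join(HEADER),
    "13,1,10,12,60,40,35,8,1,1,2,1,180,85,80,3,2,180",
    "13,2,5,20,70,30,25,6,0,1,1,2,140,100,55,2,1,141",
])


def load_rows(text):
    """Parse CSV text into dicts, converting the vote columns to integers."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = dict(raw)
        for name in FIRST_ROUND + SECOND_ROUND:
            row[name] = int(raw[name])
        rows.append(row)
    return rows


def unbalanced_tables(rows):
    """Return the 'mesa' ids whose first- and second-round totals differ."""
    flagged = []
    for row in rows:
        total_pv = sum(row[name] for name in FIRST_ROUND)
        total_sv = sum(row[name] for name in SECOND_ROUND)
        if total_pv != total_sv:
            flagged.append(row["mesa"])
    return flagged


rows = load_rows(SAMPLE_CSV)
print(unbalanced_tables(rows))
```

In this toy data the second table is off by one voter, so it is the only one flagged. A check like this is useful because a transfer model in which every first-round voter ends up in exactly one second-round category implicitly assumes that the two totals match.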

Model:

In each region, we use the results of each polling table to fit a regression that estimates what percentage of each first-round option's votes went to each second-round option. The models are as follows:

  • pinera_sv = %votes_goic_pinera*goic_pv + %votes_kast_pinera*kast_pv + … + %noVoto_pinera*noVoto_pv
  • guillier_sv = %votes_goic_guillier*goic_pv + %votes_kast_guillier*kast_pv + … + %noVoto_guillier*noVoto_pv
  • nulos_sv = %votes_goic_nulos*goic_pv + %votes_kast_nulos*kast_pv + … + %noVoto_nulos*noVoto_pv
  • blancos_sv = %votes_goic_blancos*goic_pv + %votes_kast_blancos*kast_pv + … + %noVoto_blancos*noVoto_pv
  • noVoto_sv = %votes_goic_noVoto*goic_pv + %votes_kast_noVoto*kast_pv + … + %noVoto_noVoto*noVoto_pv
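Each of the equations above is a weighted sum over the first-round columns of one polling table. The short Python sketch below evaluates one such sum; the rates and counts are hypothetical values chosen for illustration (they are not the fitted percentages), and only three of the eleven first-round columns are included to keep it short.

```python
# Hypothetical transfer rates into pinera_sv, for illustration only
# (these are NOT the fitted values from the post).
rates_to_pinera = {"goic_pv": 0.30, "kast_pv": 0.90, "noVoto_pv": 0.05}

# Made-up first-round counts for a single polling table.
table = {"goic_pv": 10, "kast_pv": 20, "noVoto_pv": 100}


def predict_second_round(rates, first_round_counts):
    """Sum rate * first-round count over every first-round column in `rates`."""
    return sum(rates[name] * first_round_counts[name] for name in rates)


predicted_pinera_sv = predict_second_round(rates_to_pinera, table)
print(predicted_pinera_sv)
```

The regression runs this logic in reverse: the counts are known for every table, and the rates are the unknowns to be estimated.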

But for the model to make sense, we must constrain the estimated percentages. The constraints are as follows:

  • All percentages are non-negative (greater than or equal to 0).
  • For each first-round option, the percentages transferred to the five second-round options must add up to 100%.

We fit the model in R with the CVXR package (code attached at the end).
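To illustrate what such a constrained fit does, here is a small self-contained Python sketch. It is not the R/CVXR code used for this analysis: instead of a convex solver it uses projected gradient descent, a simpler technique swapped in so that the example needs no libraries. The data is synthetic (three first-round options, three second-round options, six tables), and the names project_to_simplex and fit_transfer_matrix are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) == 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:
            theta = t  # keep the threshold of the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def fit_transfer_matrix(X, Y, iterations=2000):
    """Minimise ||X B - Y||^2 with every row of B non-negative and summing to 1.

    X[i][k]: first-round votes for option k at table i.
    Y[i][j]: second-round votes for option j at table i.
    B[k][j]: estimated share of option k's voters who chose option j.
    """
    n, p, q = len(X), len(X[0]), len(Y[0])
    # 1 / trace(X^T X) is a conservative step size (trace >= largest eigenvalue).
    step = 1.0 / sum(x * x for row in X for x in row)
    B = [[1.0 / q] * q for _ in range(p)]
    for _ in range(iterations):
        # Residual R = X B - Y
        R = [[sum(X[i][k] * B[k][j] for k in range(p)) - Y[i][j]
              for j in range(q)] for i in range(n)]
        # Gradient G = X^T R
        G = [[sum(X[i][k] * R[i][j] for i in range(n))
              for j in range(q)] for k in range(p)]
        # Gradient step, then project each row back onto the constraints.
        B = [project_to_simplex([B[k][j] - step * G[k][j] for j in range(q)])
             for k in range(p)]
    return B


# Synthetic example: generate second-round counts from a known transfer matrix.
TRUE_B = [[0.9, 0.0, 0.1],
          [0.0, 1.0, 0.0],
          [0.2, 0.5, 0.3]]
X = [[100, 20, 30], [20, 100, 30], [30, 20, 100],
     [60, 60, 20], [20, 60, 60], [60, 20, 60]]
Y = [[sum(row[k] * TRUE_B[k][j] for k in range(3)) for j in range(3)] for row in X]

B_hat = fit_transfer_matrix(X, Y)
for row in B_hat:
    print([round(value, 3) for value in row])
```

Because the synthetic second-round counts are generated exactly from TRUE_B, the fit should recover it; with real table data the fit is only approximate, and a dedicated solver such as CVXR is the more robust choice.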

Results

The regression was run separately for each region, giving one set of percentages per region. At the national level, the votes of each candidate in the first round were distributed as follows (total_pv represents the percentage of new people who voted in the second round):

Several conclusions follow:

  • Those who voted for Piñera and Guillier repeated their vote.
  • Piñera failed to capture 100% of Kast’s votes, probably due to his shift to the center.
  • A little less than half of Sánchez’s voters did not vote in the second round. This may be because the Frente Amplio called on its supporters not to vote for Piñera, but stopped short of calling on them to vote for Guillier. Another reading is that the Frente Amplio is the part of the left that is tired of the Nueva Mayoría.
  • Artés and Navarro gave all their votes to Guillier.
  • Goic and MEO voters turned out less in the second round, but those who did vote preferred Guillier.
  • Piñera managed to capture more new voters than Guillier.

The same graph, in number of votes instead of percentages, looks as follows (total_pv represents the number of new people who voted in the second round):

It can be seen that the bulk of Piñera’s votes came mainly from his own first-round voters, then from new voters, and finally from Kast’s voters.
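The post does not spell out how the regional percentages were combined into national figures. One natural option, assumed here purely for illustration, is to convert each regional percentage into votes and add the votes up across regions, which amounts to a vote-weighted average. The Python sketch below uses made-up regions and numbers, and the function name aggregate_regions is hypothetical.

```python
def aggregate_regions(regional_rates, regional_votes):
    """Combine per-region transfer rates into national votes and shares.

    regional_rates[region][source][destination]: estimated share (0 to 1).
    regional_votes[region][source]: first-round votes for that source.
    """
    votes = {}
    for region, sources in regional_rates.items():
        for source, destinations in sources.items():
            for destination, rate in destinations.items():
                moved = rate * regional_votes[region][source]
                votes.setdefault(source, {}).setdefault(destination, 0.0)
                votes[source][destination] += moved
    shares = {}
    for source, destinations in votes.items():
        total = sum(destinations.values())
        shares[source] = {d: (v / total if total else 0.0)
                          for d, v in destinations.items()}
    return votes, shares


# Made-up example with two regions and a single first-round option.
regional_rates = {
    "region_a": {"sanchez_pv": {"pinera_sv": 0.1, "guillier_sv": 0.5, "noVoto_sv": 0.4}},
    "region_b": {"sanchez_pv": {"pinera_sv": 0.0, "guillier_sv": 0.6, "noVoto_sv": 0.4}},
}
regional_votes = {
    "region_a": {"sanchez_pv": 1000},
    "region_b": {"sanchez_pv": 3000},
}

votes, shares = aggregate_regions(regional_rates, regional_votes)
print(votes["sanchez_pv"])
print(shares["sanchez_pv"])
```

Weighting by votes means a large region influences the national percentage more than a small one, which a plain average of the regional percentages would not do.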

Now let us look at the votes stacked by origin (total_pv represents the number of new people who voted in the second round):

It can be seen that if the Frente Amplio (Sánchez) voters had turned out for Guillier (doubling the width of the purple band), Guillier might have caught up with Piñera. On the other hand, Piñera could have won without Kast’s votes, but it would have been very tight.

In the end, the election result was due to the new voters (noVoto_pv) and to the fact that the Frente Amplio did not vote in the second round.

Greetings!

PS: The analysis result files are in Presidenciales. You will need to download chromedriver or phantomjs to run the web scraping.

 
