Election prediction using Elo

fun elo modelling prediction

A fun project to predict the 2020 presidential election using weekly average polling data and Elo models

Quang Nguyen


This post is a re-adaptation and R-implementation of my previous post on a fun modeling project using Elo to predict the latest 2020 presidential elections. The previous implementation was done in Julia and can be found in this interactive Pluto.jl document. You can find the source code for the Julia implementation here. For this exercise, first let’s load our libraries:

What is an Elo model?

The elo rating system is a method developed by Arpad Elo in order to rank players by ability in a zero sum game. Elo has been used in various contexts, most famously the way to rank chess players through the UCSF and FIDE systems. Elo has been used in modern contexts as well, from Tinder to World of Warcraft. A really nice attribute about Elo is that a difference in rankings can be translated to a probability. Let’s assume a focal player \(A\) with rank \(R_A\) is facing up against opponent \(B\) with rank \(R_B\). The expected “score” \(E_A\) of player \(A\) in the match can be defined as:

\[E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}} = \frac{1}{1 + 10^{-(R_A - R_B)/400}}\]

Recall that the standard logistic function is \(\frac{1}{1 + e^{-x}}\). You can see that there are tons of similarities between the standard logistic function and the function for the expectation of player \(A\)’s score, with the exception being the exponential constant \(e\). This is because while elo probabilities are log-odds, they are not natural log based but rather based on base 10. There are easy conversions between the base 10 and base \(e\), which allows sports statisticians to perform logistic regression on elo rankings. However, that is a bit beyond the focus of this project.

The elo model updates the ranking of the person after each match. For our focal player \(A\), the rankings are updated as:
\[R_A* = R_A + K \cdot (outcome - E_A)\] where \(K\) is the \(K\)-factor controlling the relative gains and losses of elo points. The variable outcome here is a binary variable (1 if player \(A\) wins). Think of this similar to how you would use a binary cross-entropy loss to train a model that outputs an odds (such as a logistic regression) instead of true binary values. In this case, ranking updates are based on the difference between the true outcome and the expected outcome \(E_A\). Let’s define a couple of functions reflecting the ideas above:

# Calculate win probabilities from two rankings  
win_prob <- function(r1, r2){
  x1 <- 10^((r2 - r1)/400)
  p1 <- 1/(1 + x1)
  p2 <- 1 - p1
  return(list(p1 = p1 , p2 = p2))

# Back transform from probability to difference in rankings 
back_transformation <- function(prob){
  diff <- 400 * log((1/prob) - 1,base = 10)
  # we want difference between the focal person, which means we have to invert this
  return(diff * -1)

The first function, win_prob calculates the win probability given two rankings. It outputs a list with p1 being the win probability of the focal player 1. The second function, back_transformation, takes in the probability and returns a difference in rankings.

To test and see whether our model is correct, let’s compare to a reference. According to this website, a win probability of 0.76 for focal player 1 would be a result of 200 elo difference. Let’s try it out!

Show code
glue("Win probability of 76% converts to an elo difference of {diff}", diff = back_transformation(0.76))
Win probability of 76% converts to an elo difference of 200.240940227674

So our elo formulation so far is correct.

The US presidential election as a pseudo-sports contest

The US presidential election is a surprisingly appropriate context to use elo. In this election, two candidates compete for the “title” of the President of the United States. Polling data is collected regularly, which can act as “matches” where the two candidates can test their “mettle” before the final election day. Furthermore, the structure of the election, a winner-take-all Electoral College system, makes it even closer to being a sports match.

In this instance, elo system is applied to model the probability of winning an election. Weekly average polling results are used as proxy for “games”. After each weekly poll, the candidate gains or lose elo based on whether they win or lose (polling average > 50%) and how much did they win (or lose) by. Some major caveats and assumptions:


“Training” data

Let’s start our modeling exercise by reading in basic polling data from FiveThirtyEight. Here, we simplify things a little bit. First, we do not consider the split votes of the states of Nebraska and Maine, and second, we only focus on the adjusted percentage scores, which usually accounts for polling reliability.

Show code
data <- read.csv(file = "data/presidential_poll_averages_2020.csv")
proc_data <- data %>% as_tibble() %>% mutate(modeldate = as_date(strptime(modeldate, format = "%m/%d/%Y"))) %>% # reformat names
  mutate(candidate_name = case_when(
    candidate_name == "Convention Bounce for Joseph R. Biden Jr." ~ "Joseph R. Biden Jr.",
    candidate_name == "Convention Bounce for Donald Trump" ~ "Donald Trump", 
    TRUE ~ candidate_name
  )) %>% mutate(week = week(modeldate)) %>% filter(!state %in% c("NE-1", "NE-2", "ME-1", "ME-2", "National")) %>% # for simplicity
  group_by(state, week, candidate_name) %>% # get week 
  summarise(poll_avg = mean(pct_trend_adjusted)) # average per week 

Prior information for initial elo seeding

Just having an elo model is not enough! We want to seed our “players” with some elo first. The easiest way is to seed both candidates with an elo of 1000, but that doesn’t account for each candidate’s advantage at each state due to their political affiliation. Here, we decided to take a simple solution and take the election results from the 2016 election. Based on the margin of victory, we will use the back_transformation function to calculate a difference in elo and use that as a seed. For example, if Donald Trump won state X with 76% of the votes, then he would start with 1200 elo compared to Joe Biden’s 1000 elo. As such, we are able to incorporate some prior information. We want polling to be good enough to overcome this inherent advantage.

Show code
# Function to convert abbreviation to state name
get_stname <- function(state_abb){
  if (state_abb == "DC"){
    name <- "District of Columbia"
  } else {
    name <- state.name[grep(state_abb, state.abb)]
  if (rlang::is_empty(name)){
  } else {
# Prior information 
prior <- read.csv(file = "data/returns-2016.csv") %>% as_tibble()

# convert state abbreviation to name, renormalize to two party candidates, then back transform prob to difference in elo
prior <- prior %>% mutate(state = map_chr(state_abbreviation, ~get_stname(.x))) %>% 
  mutate(Clinton = Clinton/(Clinton + Trump), Trump = Trump/(Clinton + Trump)) %>% 
  select(-c(state_abbreviation, Other)) %>% 
  mutate(elo_diff = back_transformation(Clinton))

Incorporating margin of victory

A lot of elo models also incorporate something called the margin of victory. Margin of victory (mov) scales elo gains and losses based on how well the player performed. A knock-out match from an underdog would result in a massive gain in elo, compared to if that match was a close one. We try to incorporate margin of victory into our elo update calculation by modifying the K-factor with a linear adjustment \[K_{eff} = K * mov\] where \[mov\] is the margin of victory. This means that the “winning” candidate will gain points as a proportion of K that is equal to their margin of polling victory. As such, we increase values of K significantly to compensate for the low gains in elo when margins can be 1 percent. This also prevents polling results from driving elo scores to crazy levels.

Taking that into account, here’s out update function.

# Update elo using those rankings  

update_elo <- function(r1, r2, outcome, mov, K=30){
  probs <- win_prob(r1, r2)
  p1 <- probs$p1
  p2 <- probs$p2
  K = K * mov # K-gains as a proportion of margin of victory  
  r1_new <- round(r1 + K*(outcome - p1), digits = 0)
  r2_new <- round(r2 + K*((1 - outcome) - p2), digits = 0)
  return(list(r1 = r1_new, r2 = r2_new))

Getting elo sequence

The general gist of the model would be getting the elo sequence for each state based on weekly average polling, with prior information from the 2016 election. We’re going to choose the Democratic candidate as the focal candidate, with rankings equals r1. With that in mind, a wrapper function that performs this for each state.

get_sequence <- function(proc_data, prior, st, init=1000, K = 50){
  df <- proc_data %>% filter(state == st) # filter by state
  weeks <- unique(df$week) # get all unique weeks 
  elo <- vector(mode = "list", length = length(weeks) + 1) # initialize elo vector
  prior_diff <- prior %>% filter(state == st) %>% pull(elo_diff)
  # if Trump won state then prior_diff < 0 since Democrats are the focal candidate
  if (prior_diff < 0){
    elo[[1]] <- tibble(
      DT_elo = init + abs(prior_diff),
      JB_elo = init
  } else if (prior_diff > 0){
    elo[[1]] <- tibble(
      DT_elo = init,
      JB_elo = init + abs(prior_diff)
  } else {
    elo[[1]] <-  tibble(
      DT_elo = init,
      JB_elo = init
  # iterate 
  for (i in 2:length(elo)){
    elos <- update_elo(r1 = elo[[i-1]]$JB_elo, 
                       r2 = elo[[i-1]]$DT_elo,
                       outcome = df$win[i-1], 
                       mov = df$mov[i-1], K = K)
    elo[[i]] <- tibble(
      DT_elo = elos$r2,
      JB_elo = elos$r1,
  seq <- do.call(rbind, elo)
  weeks <- c(min(weeks)-1, weeks)
  seq <- seq %>% mutate(state = rep(st, length(elo)), week = weeks)
  seq <- seq %>% select(state, week, DT_elo, JB_elo) %>% 
    pivot_longer(c(DT_elo, JB_elo), names_to = "candidate", values_to = "elo") %>% 
    mutate(candidate = recode(candidate, DT_elo = "DT", JB_elo = "JB")) %>% 
    pivot_wider(names_from = candidate, values_from = elo)

The implementation for this is a bit weird, since we have a time 0 where the candidates are seeded. But before we apply this to our data, first let’s process further the basic polling and assigning a win or loss based upon the polling percentages to generate our \(outcome\). If the poll is tied, a random draw from the binomial distribution is performed.

Show code
proc_data <- proc_data %>% 
  mutate(candidate_name = case_when(candidate_name == "Donald Trump" ~ "DT", TRUE ~ "JB")) %>% 
  pivot_wider(names_from = "candidate_name", values_from = "poll_avg") %>% 
  mutate(win = case_when(JB > DT ~ as.integer(1), 
                         JB < DT ~ as.integer(0), 
                         JB == DT ~ rbinom(1,1,0.5))) %>% 
  mutate(mov = abs(DT - JB)/100)

Applying our elo sequences model for all states.

Show code
get_elo_all_states <- function(proc_data, prior, K=50, init=1000){
  states <- unique(proc_data$state)
  data <- vector(mode = "list", length = length(states))
  for (i in seq_len(length(states))){
    data[[i]] <- get_sequence(proc_data, prior, states[i], K = K, init = init)
  return(do.call(dplyr::bind_rows, data))

elo_sequences <- get_elo_all_states(proc_data, prior)
elo_sequences <- elo_sequences %>% rowwise() %>% mutate(prob = win_prob(r1 = JB, r2 = DT)$p1)

Final probabilities using Monte Carlo simulations

To determine the final probability of winning the election, we take our state-level probabilities and simulate binomial draws with our probabilities. This is similar to how FiveThirtyEight performs their final modelling as well. The final win chance is therefore the the number of simulations where a focal candidate (in this case Biden) gains 270 or more Electoral Votes.

Let’s create another wrapper function that wraps all of this together in one go:

elo_predict <- function(proc_data, prior, e_votes, K = 50, init = 1000, n_iter = 1000){
  # first, let's get all elo sequences 
  elo_sequences <- get_elo_all_states(proc_dat = proc_data, prior = prior, K = K, init = init)
  # then, let's get probabilities from them 
  elo_sequences <- get_elo_all_states(proc_data, prior)
  elo_sequences <- elo_sequences %>% rowwise() %>% mutate(prob = win_prob(r1 = JB, r2 = DT)$p1)
  # then, let's retrieve the probabilities by state at the last week 
  # (which.max allows for selecting the most recent week)
  suppressMessages(probs <- elo_sequences %>% group_by(state) %>% summarise(prob = prob[which.max(week)]))
  # merge electoral votes with probability data 
  probs <- left_join(probs, e_votes, by = "state")
  # predict and multiply with votes 
  suppressMessages(preds <- map_dfc(seq_len(n_iter), ~{
    map_int(probs$prob, ~rbinom(1,1,.x)) * probs$votes
  # rename 
  names(preds) <- paste0("sim_",seq_len(n_iter))
  # retrieve state names and swtich columns
  preds <- preds %>% mutate(state = probs$state) %>% select(state, everything())
  return(list(preds = preds, elo_seq = elo_sequences))


Now that we have everything set up, let’s perform some predictions!
We use the default \(K\)-factor of 50 and the initial elo seeding of 1000 with 1000 simulations. Since elo operates mostly from the differences in elo rankings, the initial elo seeding is not as important.

Show code
pred <- elo_predict(proc_data = proc_data, prior = prior, e_votes = electoral_votes, K = 50, init = 1000)