Video Games and Statistics

The video game industry has grown enormously. New genres, business models, types of gamers, and devices have appeared, but one thing has not changed: so-called triple-A games still exist. Triple-A games come from companies with large budgets to invest, and they have to sell many copies to be profitable.

We are talking about games that cost between 10 and 60 million dollars to make, so companies use various (and questionable) techniques to make them profitable: DLCs, season passes, expansions, paid "boosters," and, finally, making sure there are reviews that talk about their games. Obviously, they need reviews that speak well of their games, so I have always believed that reviews are not trustworthy.

Today I decided to check if this is true, and I will do it with the information available on www.metacritic.com. This page is very interesting because it compiles all the reviews made by reviewers and generates an "average score" called "metascore," and also allows users to give their own evaluation of each game, which when averaged is called "userscore." All the information about these scores is available with all the necessary detail to do an analysis and check how trustworthy the reviews are.

Our fundamental assumption is that the userscore is the real evaluation of video games, while the metascore is the score altered by the influences of the mega corporations that make games. In this article we will show how we obtained the information from Metacritic and then how we evaluated and statistically tested the "generosity of reviewers."

The Available Data

As I said before, the site has average scores and their details. For our purposes we will download the following data for PS4 games:

  • MetaScore
  • UserScore
  • Company that publishes the game
  • MetaScore detail (score given by each reviewer)

For example, for the case of FIFA 17, we would see something like this on the page:

From this we generate one record per game with this information (I hid the key I will use to cross-reference with the detail table):

On the other hand, below we can see the detail of the reviews containing the score each reviewer gave to a particular game. In this case, all the reviews for FIFA 17:

Data Extraction

To extract the data we will use Selenium through Python. Selenium is a tool that lets you drive a browser (Firefox, Chrome or PhantomJS) from code written in a language like Python.

To keep things simple, we will extract the data to a set of .csv files, each containing the information available for 100 games.

The code is as follows:

import os

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By

# Selenium 4 removed the find_element_by_* helpers and the PhantomJS driver,
# so elements are located with find_element(By.XPATH, ...) in a regular browser.
driver = webdriver.Chrome()

url_generica = "http://www.metacritic.com/browse/games/score/metascore/all/ps4/filtered?sort=desc&page={}"

def get_text(xpath):
    # Text of the first element that matches the XPath expression.
    return driver.find_element(By.XPATH, xpath).text

def get_game(url):
    # Scrape the summary record (a one-row DataFrame) for a single game page.
    print("GoTo " + url)
    driver.get(url)
    game = get_text('//h1[@class="product_title"]/a[@class="hover_none"]/span[@itemprop="name"]')
    print(game)
    registro = pd.DataFrame({
        "game" : [game],
        "publisher" : [get_text('//*[@itemprop="publisher"]/span[@class="data"]')],
        "devel" : [get_text('//*[@class="summary_detail developer"]/span[@class="data"]')],
        "metascore" : [get_text('//*[@itemprop="ratingValue"]')],
        # Column name kept as-is (typo included) because the R code reads it.
        "userscrore" : [get_text('//div[@class="userscore_wrap feature_userscore"]/a[@class="metascore_anchor"]/div')],
        "metascore_url" : [url + "/critic-reviews"],
        "userscore_url" : [url + "/user-reviews"],
        "game_url" : [url]
    })
    print("Metascore scraped")
    return registro

def get_metascore_detail(url):
    # Scrape one row per critic review: the score and the outlet that gave it.
    driver.get(url)
    print("Scraping details")
    registro = pd.DataFrame({
        "detail_metascore_value" : [x.text for x in driver.find_elements(By.XPATH, '//div[@class = "module reviews_module critic_reviews_module"]//div[@class = "review_grade"]')],
        "detail_metascore_source" : [x.text for x in driver.find_elements(By.XPATH, '//div[@class = "review_critic"]/div[@class="source"]')]
    })
    # Same key as "metascore_url" in get_game, used later to join both tables.
    registro["metascore_url"] = url
    print("Details scraped")
    return registro

# Output folders for the per-page .csv files.
os.makedirs("metacritic", exist_ok=True)
os.makedirs("detail", exist_ok=True)

# Pages are visited from last to first, as in the original pop() loop.
for p_num in reversed(range(8)):
    url = url_generica.format(p_num)
    print(url)
    driver.get(url)
    games_urls = driver.find_elements(By.XPATH, '//*[@class="product_item product_title"]/a')
    games_urls = [l.get_attribute("href") for l in games_urls]

    compiled_metacritic = []
    compiled_detail = []
    for url in reversed(games_urls):
        print("********* NEW SCRAPE ********")
        compiled_metacritic.append(get_game(url))
        compiled_detail.append(get_metascore_detail(url + "/critic-reviews"))
        print("********* END SCRAPE ********")

    # DataFrame.append was removed in pandas 2.0; pd.concat replaces it.
    if compiled_metacritic:
        pd.concat(compiled_metacritic).to_csv("metacritic/metacritic_" + str(p_num) + ".csv", index = False)
        pd.concat(compiled_detail).to_csv("detail/metacritic_detail" + str(p_num) + ".csv", index = False)

driver.quit()
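The scraper saves every score as raw text, and the conversion to numbers only happens later in R. As a side note, here is a minimal sketch of the same cleanup in Python. The function name `parse_score` and its `scale` parameter are hypothetical, and I am assuming that a missing score shows up as non-numeric text (for example "tbd") rather than as an empty field.

```python
from typing import Optional


def parse_score(text: str, scale: float = 1.0) -> Optional[float]:
    """Convert scraped score text to a number, or None if it is not numeric.

    `scale` lets a 0-10 userscore be put on the 0-100 metascore scale
    (scale=10), mirroring the 10 * as.numeric(...) step in the R code.
    """
    try:
        return float(text.strip()) * scale
    except ValueError:
        # Assumed placeholder text such as "tbd" ends up here.
        return None


# Hypothetical raw values, not taken from the scraped files.
meta = parse_score("85")
user = parse_score("8.5", scale=10)
missing = parse_score("tbd")
```

Rows where either value comes back as `None` would then be dropped, which is what `complete.cases` does on the R side.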

Analysis

The first step is to look at the data as a whole, comparing the userscore with the metascore at the aggregate level. As the table below shows, on average the userscore (rescaled from its 0-10 scale to 0-100) is 4.26 points lower than the metascore, which means that users generally tend to be more demanding than reviewers.

average metascore   average userscore   difference
71.53               67.27               4.26

By performing a t-test, we obtain a p-value that allows us to statistically reject the null hypothesis that the means are equal. In other words, there is statistical evidence to assert that the userscore is lower than the metascore.
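To make the test a bit more concrete, here is a rough sketch of the statistic behind a two-sample t-test with unequal variances (Welch's version, which is what R's `t.test` computes by default). It only uses the Python standard library, so it returns the t statistic and the approximate degrees of freedom but not the p-value. The function name `welch_t` and the scores are hypothetical and are not taken from the scraped data.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    # Squared standard error of each sample mean.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical scores, both on a 0-100 scale.
metascores = [80, 75, 90, 68, 72]
userscores = [70, 71, 82, 60, 65]
t, df = welch_t(metascores, userscores)
```

A positive `t` means the first sample has the higher mean. Turning it into a p-value requires the t distribution, which a statistics package such as R provides.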

Now we will perform the same test but for each publisher using R. The code is as follows:

options(stringsAsFactors = FALSE)
Sys.setenv(TZ='GMT')
library(plyr)
library(reshape2)
library(xtable)

read_csv_folder = function(folder){
  files = dir(folder)
  ldply(files, function(x) read.csv(paste0(folder,"/",x)), .progress = "text")
}

metacritic = read_csv_folder("metacritic")
metacritic = transform(metacritic, metascore = as.numeric(metascore), userscrore = 10*as.numeric(userscrore))
metacritic = metacritic[complete.cases(metacritic),]
head(metacritic[,c("game","publisher","metascore","userscrore")])

details = read_csv_folder("detail")
details = transform(details, detail_metascore_value = as.numeric(detail_metascore_value))
details = details[complete.cases(details),]

# General
gral = data.frame(promedio_metascrore = round(mean(metacritic$metascore),2),
                  promedio_userscore = round(mean(metacritic$userscrore),2),
                  diferencia = round(mean(metacritic$metascore),2) - round(mean(metacritic$userscrore),2)
)
write.csv2(gral, "clipboard")

# Publisher: mean scores and Welch t-test (the t.test default) per publisher
metacritic_stats = ddply(metacritic, "publisher", function(x) data.frame(
  metascore = mean(x$metascore),
  userscore = mean(x$userscrore),
  diferencia = mean(x$metascore - x$userscrore),
  # t.test fails for publishers with too few games; record NA in that case
  p_value = tryCatch(t.test(x$metascore, x$userscrore, alternative = "two.sided")$p.value, error = function(e) NA),
  cases = nrow(x)))
# Keep publishers with more than 5 games and count them
metacritic_stats = subset(metacritic_stats, cases > 5)
nrow(metacritic_stats)
# Keep only the publishers whose difference is significant at the 5% level
metacritic_stats = subset(metacritic_stats, p_value <= 0.05)
metacritic_stats = metacritic_stats[order(metacritic_stats$p_value, decreasing = TRUE),]
print(metacritic_stats)
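For readers who do not use R, the grouping part of the per-publisher step above can be sketched in plain Python. This is only an illustration under my own assumptions: the function name `publisher_gaps`, the row layout, and the sample rows are hypothetical, and the t-test itself is left out.

```python
from collections import defaultdict
from statistics import mean


def publisher_gaps(rows, min_cases=6):
    """Average metascore-minus-userscore gap per publisher.

    rows: iterable of (publisher, metascore, userscore), scores on 0-100.
    Publishers with fewer than `min_cases` games are dropped; the default
    of 6 corresponds to the `cases > 5` filter in the R code.
    """
    diffs = defaultdict(list)
    for publisher, metascore, userscore in rows:
        diffs[publisher].append(metascore - userscore)
    return {p: (mean(d), len(d)) for p, d in diffs.items() if len(d) >= min_cases}


# Hypothetical rows: "Publisher A" has 6 games, "Publisher B" only 3.
rows = [("Publisher A", 80, 70)] * 6 + [("Publisher B", 90, 95)] * 3
gaps = publisher_gaps(rows)
```

Each value is a pair with the mean gap and the number of games, so a positive first element means reviewers scored that publisher's games higher than users did.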

In the following table we can see that when performing the same t-test for publishers with more than 5 games, 7 of the 26 that meet the condition have a significant p-value (0.05 or less). This allows us to conclude that their games have a metascore that is statistically higher than the userscore; in other words, something makes reviewers like these games more than the people who play them... strange, maybe beta games are better than final versions? Haha.

Which publishers have more "generous" reviewers? To find out, we will take the difference between the metascore and the userscore for each publisher-reviewer pair with at least 5 reviews and test whether its mean is 0. If the null hypothesis is rejected, we can affirm that the value is different from 0, and if it is also positive we count that reviewer as generous. Then we will see, for each company, what percentage of its reviewers are generous. The result in the table below shows that the companies from the list above have a very good relationship with reviewers. For example, 100% of the reviewers who cover EA Sports and Konami games give statistically higher scores to those games than users do.

comp = merge(metacritic, details, by = "metascore_url", all.x = T)

# Reviewer with largest error
# Note: "error" uses each game's aggregate metascore (not detail_metascore_value),
# averaged over the games that the reviewer covered for that publisher.
reviewer_stats = ddply(comp, c("publisher", "detail_metascore_source"), function(x) data.frame(
  error = mean(x$metascore - x$userscrore),
  # One-sample t-test of the differences against 0; NA when the test fails
  p_value = tryCatch(t.test(x$metascore - x$userscrore)$p.value, error = function(e) NA),
  cases = nrow(x)))
# Reviewers evaluated per publisher: those with at least 5 reviews
reviewer_stats = subset(reviewer_stats, cases >= 5)
reviewer_stats_numero_evaluado = ddply(reviewer_stats, "publisher", function(x) data.frame(total_reviewers = nrow(x)))
# "Generous" reviewers: significant at the 5% level with a positive difference
reviewer_stats = subset(reviewer_stats, p_value <= 0.05 & error > 0)
reviewer_stats_numero_afectado = ddply(reviewer_stats, "publisher", function(x) data.frame(reviewers_generosos = nrow(x)))
# Publishers without generous reviewers get 0 instead of NA
reviewer_stats = merge(reviewer_stats_numero_evaluado, reviewer_stats_numero_afectado, all.x = TRUE)
reviewer_stats[is.na(reviewer_stats)] = 0
reviewer_stats$porcentaje_generoso = round(100 * reviewer_stats$reviewers_generosos / reviewer_stats$total_reviewers)
arrange(reviewer_stats, -porcentaje_generoso)
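The counting logic at the end of the R code above can be sketched in plain Python as follows. This assumes the per-reviewer statistics (p-value, mean error, number of cases) have already been computed elsewhere. The function name `generous_share`, the dictionary keys, and the sample values are hypothetical.

```python
from collections import Counter


def generous_share(stats, alpha=0.05, min_cases=5):
    """Percentage of generous reviewers per publisher.

    stats: iterable of dicts with keys "publisher", "p_value", "error", "cases".
    A reviewer counts as generous when the test is significant (p <= alpha)
    and the mean metascore-minus-userscore error is positive.
    """
    total, generous = Counter(), Counter()
    for s in stats:
        if s["cases"] < min_cases:
            continue  # too few reviews to evaluate this reviewer
        total[s["publisher"]] += 1
        p = s["p_value"]
        if p is not None and p <= alpha and s["error"] > 0:
            generous[s["publisher"]] += 1
    return {pub: round(100 * generous[pub] / n) for pub, n in total.items()}


# Hypothetical per-reviewer statistics, not real results.
stats = [
    {"publisher": "Publisher A", "p_value": 0.01, "error": 6.0, "cases": 8},
    {"publisher": "Publisher A", "p_value": 0.40, "error": 1.5, "cases": 7},
    {"publisher": "Publisher B", "p_value": 0.03, "error": -4.0, "cases": 5},
    {"publisher": "Publisher C", "p_value": 0.01, "error": 9.0, "cases": 3},
]
shares = generous_share(stats)
```

In this made-up example, "Publisher C" is left out because its only reviewer has fewer than 5 reviews, and "Publisher B" gets 0% because its significant difference is negative.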

Conclusions

We can conclude that there is indeed a relationship between reviewers and publishers that can confuse us when choosing games. Considering that games cost approximately 60 dollars, I would recommend waiting to see what other users say before buying a game, especially for large companies such as: Sierra Games, Konami, Ubisoft, Electronic Arts, EA Sports, Zen Studios, and Activision.

I hope you liked it, and feel free to share it! Greetings!

Attachments: bd_juegos
