Movies make us laugh, cry, and some… sleep, so I decided to do a small analysis on 2016 movies.
As in Video Games and Data Science, we scraped www.metacritic.com to build a database containing the following information for each movie:
- Country of Origin
- Genres
- Language
- Metascore (average score from reviews)
- Production Company
- Rating
- Duration
- Summary
- Userscore (average score assigned by users)
The database is available for download at the end of the article.
Descriptive Analysis
Out of the 610 movies available on the site, 596 were downloaded correctly, but only 398 have all the data, so we will work with a sample corresponding to about 65% of Metacritic's 2016 movies (398 of 610).
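The "only movies with all the data" filter can be sketched as follows. This is only an illustration under assumptions: the field names and the two toy records are invented here and are not the actual scraped database.

```python
# Hypothetical field names, mirroring the list of attributes above.
REQUIRED_FIELDS = (
    "country", "genres", "language", "metascore", "company",
    "rating", "duration", "summary", "userscore",
)


def is_complete(movie):
    """Return True when every required field is present and non-empty."""
    return all(movie.get(field) not in (None, "", []) for field in REQUIRED_FIELDS)


def complete_cases(movies):
    """Keep only the movies that have all the data."""
    return [movie for movie in movies if is_complete(movie)]


# Toy records, invented for illustration only.
movies = [
    {"country": "USA", "genres": ["Drama"], "language": "English",
     "metascore": 70, "company": "Studio A", "rating": "PG-13",
     "duration": 110, "summary": "...", "userscore": 65},
    {"country": "USA", "genres": ["Comedy"], "language": "English",
     "metascore": 55, "company": "Studio B", "rating": "R",
     "duration": 95, "summary": "...", "userscore": None},  # missing userscore
]

sample = complete_cases(movies)
print(f"{len(sample)} of {len(movies)} movies have all the data")
```

A score of 0 is treated as valid data here; only missing or empty values cause a movie to be dropped.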
The numeric variables are userscore, metascore, and the difference between them (metascore – userscore), called bonus (how much score the reviews gave away). As we can see in the following table, which also reports a Shapiro test for each variable, we treat the 3 variables as following a normal distribution, which is important for building models. The first thing we detect is that, on average, the reviews give 4.70 points less than the users (metascore 59.23 versus userscore 63.93), which is the opposite of what happens in video games.
| variable | average | stddev | shapiro |
|---|---|---|---|
| bonus | -4.70 | 13.43 | 0.05 |
| userscore | 63.93 | 14.15 | 0.00 |
| metascore | 59.23 | 18.14 | 0.00 |
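As a rough sketch of how the average and stddev columns could be computed, the snippet below uses Python's standard `statistics` module on toy score pairs. The numbers are invented, both scores are assumed to be on a 0-100 scale as in the table, and the use of the sample standard deviation is an assumption, since the article does not say which variant was used. A normality check (for example `scipy.stats.shapiro`) would need SciPy and is left out.

```python
from statistics import mean, stdev

# Toy (metascore, userscore) pairs, invented for illustration only.
scores = [(80, 70), (55, 65), (40, 60), (62, 58)]


def bonus(metascore, userscore):
    """Bonus as defined in the article: metascore minus userscore."""
    return metascore - userscore


def describe(values):
    """Return (average, sample standard deviation), rounded to 2 decimals."""
    return round(mean(values), 2), round(stdev(values), 2)


bonuses = [bonus(m, u) for m, u in scores]
columns = {
    "bonus": bonuses,
    "userscore": [u for _, u in scores],
    "metascore": [m for m, _ in scores],
}

for name, values in columns.items():
    average, deviation = describe(values)
    print(f"{name}: average={average}, stddev={deviation}")
```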
Histogram of the variables:

Genre Analysis
Let’s look at the people’s favorite movies. The following table is sorted in descending order of average userscore for each genre (remember that each movie can have more than one genre). We immediately detect that movies with a cultural connotation (Documentary, Music, Musical, History, Biography) have the highest userscores, and the bonus given by critics is above the overall average of -4.70. On the other hand, the genres people rate worst are Horror, Thriller, Sci-Fi, Comedy, and Action.
Interestingly, apart from Drama (by far the most frequent genre, with 231 movies), Comedy, Thriller, and Action are the most frequent categories, while movies with a cultural connotation are among the least frequent.
| genre | frequency | userscore | metascore | bonus |
|---|---|---|---|---|
| Documentary | 27 | 71.30 | 76.70 | 5.41 |
| Music | 14 | 70.57 | 67.93 | -2.64 |
| Musical | 10 | 70.50 | 69.20 | -1.30 |
| History | 27 | 68.11 | 66.26 | -1.85 |
| Romance | 56 | 67.36 | 61.89 | -5.46 |
| Biography | 38 | 67.32 | 63.47 | -3.84 |
| War | 14 | 66.79 | 59.00 | -7.79 |
| Sport | 9 | 66.00 | 58.78 | -7.22 |
| Animation | 23 | 65.91 | 60.43 | -5.48 |
| Drama | 231 | 65.75 | 61.80 | -3.95 |
| Fantasy | 39 | 65.10 | 55.36 | -9.74 |
| Crime | 52 | 65.02 | 57.13 | -7.88 |
| Mystery | 37 | 64.27 | 54.27 | -10.00 |
| Adventure | 70 | 63.66 | 54.93 | -8.73 |
| Family | 33 | 63.12 | 56.45 | -6.67 |
| Western | 6 | 61.83 | 56.83 | -5.00 |
| Action | 78 | 61.72 | 50.27 | -11.45 |
| Comedy | 129 | 61.64 | 56.99 | -4.64 |
| Sci-Fi | 39 | 61.59 | 51.64 | -9.95 |
| Thriller | 119 | 61.47 | 53.85 | -7.62 |
| Horror | 52 | 59.48 | 52.13 | -7.35 |
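A table like the one above could be built along these lines. This is a minimal sketch on invented toy data, with hypothetical field names; the key point is that a movie with several genres is counted once in each of them.

```python
from collections import defaultdict
from statistics import mean

# Toy movies, invented for illustration only.
movies = [
    {"genres": ["Documentary", "Music"], "userscore": 72, "metascore": 78},
    {"genres": ["Action", "Thriller"], "userscore": 60, "metascore": 50},
    {"genres": ["Action", "Comedy"], "userscore": 64, "metascore": 52},
]


def genre_table(movies):
    """Aggregate per genre and sort by descending average userscore."""
    by_genre = defaultdict(list)
    for movie in movies:
        # A movie with several genres contributes to every one of them.
        for genre in movie["genres"]:
            by_genre[genre].append(movie)

    rows = []
    for genre, members in by_genre.items():
        user = mean(m["userscore"] for m in members)
        meta = mean(m["metascore"] for m in members)
        rows.append({
            "genre": genre,
            "frequency": len(members),
            "userscore": user,
            "metascore": meta,
            "bonus": meta - user,  # metascore minus userscore
        })
    return sorted(rows, key=lambda row: row["userscore"], reverse=True)


rows = genre_table(movies)
for row in rows:
    print(row)
```

Because movies are counted once per genre, the frequencies add up to more than the number of movies in the sample.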
Conclusion
Drama aside, the most frequent genres are the ones people like least, while the least frequent are the best received. It’s interesting that the film industry doesn’t seem to do this kind of analysis and question what kind of movies to make.
But for now, when you look at the movie listings and nothing catches your attention, remember this article, laugh on the inside and Netflix’n chill.
