Movies 2016

Movies make us laugh, cry, and some… put us to sleep, so I decided to do a small analysis of 2016 movies.
As in Video Games and Data Science, we scraped www.metacritic.com to build a database with the following information for each movie:

  • Country of Origin
  • Genres
  • Language
  • Metascore (average score from reviews)
  • Production Company
  • Rating
  • Duration
  • Summary
  • Userscore (average score assigned by users)

The database is available for download at the end of the article.

Descriptive Analysis

Of the 610 movies available on the site, 596 were downloaded correctly, but only 398 have complete data, so we will work with a sample of about 65% of the Metacritic movies (398 of 610).
The numeric variables are userscore, metascore, and the difference between them (metascore - userscore), called bonus: how many points the critics gave away relative to users. The following table shows the average and standard deviation of each variable, together with the result of a Shapiro-Wilk normality test. Reading the shapiro column as p-values, userscore and metascore (0.00) depart from a normal distribution and bonus (0.05) is borderline, which is worth keeping in mind when building models. The first thing we detect is that, on average, critics score movies 4.70 points lower than users (59.23 versus 63.93), the opposite of what happens in video games.

variable    average   stddev   shapiro
bonus         -4.70    13.43      0.05
userscore     63.93    14.15      0.00
metascore     59.23    18.14      0.00
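As a rough illustration of how a table like this can be computed, here is a minimal sketch using only the Python standard library. The records, the field names and the helper functions (add_bonus, describe) are hypothetical, invented for this example, and the sketch assumes userscore has already been rescaled to the same 0-100 range as metascore.

```python
from statistics import mean, stdev

# Hypothetical records: these scores are invented for illustration only,
# they are not rows from the scraped database.
movies = [
    {"title": "Movie A", "metascore": 81, "userscore": 74},
    {"title": "Movie B", "metascore": 45, "userscore": 62},
    {"title": "Movie C", "metascore": 60, "userscore": 60},
    {"title": "Movie D", "metascore": 38, "userscore": 55},
]


def add_bonus(rows):
    """Return copies of the rows with bonus = metascore - userscore."""
    return [{**r, "bonus": r["metascore"] - r["userscore"]} for r in rows]


def describe(rows, column):
    """Average and sample standard deviation of one numeric column."""
    values = [r[column] for r in rows]
    return {"average": mean(values), "stddev": stdev(values)}


scored = add_bonus(movies)
summary = {c: describe(scored, c) for c in ("bonus", "userscore", "metascore")}

for name, stats in summary.items():
    print(f"{name:<10} {stats['average']:>7.2f} {stats['stddev']:>7.2f}")
```

Because bonus is a per-movie difference, its average always equals the average metascore minus the average userscore, which is a quick sanity check on any such table. For the normality column, SciPy's scipy.stats.shapiro returns a test statistic and a p-value if that library is available; it is left out here to keep the sketch dependency-free.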

Histogram of the variables:

 

Genre Analysis

Let’s look at people’s favorite movies. The following table is sorted in descending order of average userscore per genre (remember that each movie can have more than one genre). We immediately see that movies with a cultural connotation (Documentary, Music, Musical, History, Biography) are among the best rated by users, and their bonus is above the overall average of -4.70; Documentary is the only genre that critics score higher than users. At the other end, the genres users rate worst are Horror, Thriller, Sci-Fi, Comedy and Action.
Interestingly, apart from Drama (by far the most frequent genre, with 231 movies), Comedy, Thriller and Action are the most frequent categories, while movies with a cultural connotation are among the least frequent.

genre         frequency   userscore   metascore    bonus
Documentary          27       71.30       76.70     5.41
Music                14       70.57       67.93    -2.64
Musical              10       70.50       69.20    -1.30
History              27       68.11       66.26    -1.85
Romance              56       67.36       61.89    -5.46
Biography            38       67.32       63.47    -3.84
War                  14       66.79       59.00    -7.79
Sport                 9       66.00       58.78    -7.22
Animation            23       65.91       60.43    -5.48
Drama               231       65.75       61.80    -3.95
Fantasy              39       65.10       55.36    -9.74
Crime                52       65.02       57.13    -7.88
Mystery              37       64.27       54.27   -10.00
Adventure            70       63.66       54.93    -8.73
Family               33       63.12       56.45    -6.67
Western               6       61.83       56.83    -5.00
Action               78       61.72       50.27   -11.45
Comedy              129       61.64       56.99    -4.64
Sci-Fi               39       61.59       51.64    -9.95
Thriller            119       61.47       53.85    -7.62
Horror               52       59.48       52.13    -7.35
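Since each movie can belong to several genres, a table like the one above counts a movie once for every genre it lists. The following is a minimal sketch of that aggregation, assuming each record stores its genres as a list; the sample rows, the field names and the genre_table helper are hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows, invented for illustration (not taken from the database).
movies = [
    {"genres": ["Drama", "Romance"], "userscore": 70, "metascore": 66},
    {"genres": ["Action", "Thriller"], "userscore": 58, "metascore": 47},
    {"genres": ["Drama"], "userscore": 64, "metascore": 62},
    {"genres": ["Action"], "userscore": 62, "metascore": 51},
]


def genre_table(rows):
    """One summary row per genre, sorted by average userscore, descending."""
    by_genre = defaultdict(list)
    for row in rows:
        for genre in row["genres"]:  # a movie counts once per genre it lists
            by_genre[genre].append(row)

    table = []
    for genre, members in by_genre.items():
        user = mean(m["userscore"] for m in members)
        meta = mean(m["metascore"] for m in members)
        table.append({
            "genre": genre,
            "frequency": len(members),
            "userscore": user,
            "metascore": meta,
            "bonus": meta - user,  # same sign convention as the article
        })
    return sorted(table, key=lambda r: r["userscore"], reverse=True)


table = genre_table(movies)
for r in table:
    print(f"{r['genre']:<10} {r['frequency']:>3} {r['userscore']:>6.2f} "
          f"{r['metascore']:>6.2f} {r['bonus']:>7.2f}")
```

One consequence of this counting scheme is that the frequencies add up to more than the number of movies, so the genre rows are not independent samples.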

Conclusion

With the exception of Drama, the most frequent genres (Comedy, Thriller, Action) are among the ones people like least, while the least frequent tend to be the best received. It makes you wonder whether the film industry does this kind of analysis when deciding what kind of movies to make.
But for now, when you look at the movie listings and nothing catches your eye, remember this article, laugh to yourself and Netflix ’n chill.
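For readers who want to put a number on the relationship between how frequent a genre is and how well users rate it, one option is a Spearman rank correlation over the genre table. This is only a sketch of one possible check, not part of the original analysis: the frequency and userscore values are copied from the genre table above, while the helper functions (average_ranks, pearson, spearman) are written here for illustration.

```python
from statistics import mean

# (genre, frequency, average userscore), copied from the genre table above.
GENRES = [
    ("Documentary", 27, 71.30), ("Music", 14, 70.57), ("Musical", 10, 70.50),
    ("History", 27, 68.11), ("Romance", 56, 67.36), ("Biography", 38, 67.32),
    ("War", 14, 66.79), ("Sport", 9, 66.00), ("Animation", 23, 65.91),
    ("Drama", 231, 65.75), ("Fantasy", 39, 65.10), ("Crime", 52, 65.02),
    ("Mystery", 37, 64.27), ("Adventure", 70, 63.66), ("Family", 33, 63.12),
    ("Western", 6, 61.83), ("Action", 78, 61.72), ("Comedy", 129, 61.64),
    ("Sci-Fi", 39, 61.59), ("Thriller", 119, 61.47), ("Horror", 52, 59.48),
]


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


freq = [f for _, f, _ in GENRES]
user = [u for _, _, u in GENRES]
rho = spearman(freq, user)
print(f"Spearman rho (frequency vs userscore) = {rho:.2f}")
```

A value close to -1 would support the idea that frequent genres are the least liked, while a value close to 0 would suggest the pattern is driven by a handful of genres. With only 21 genres, and movies counted in several of them, the result should be read as a rough indication at most.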

Attachments
