{"id":3291,"date":"2017-05-23T18:00:20","date_gmt":"2017-05-23T16:00:20","guid":{"rendered":"https:\/\/geekosas.com\/?p=3291"},"modified":"2026-05-23T18:00:56","modified_gmt":"2026-05-23T16:00:56","slug":"movies-2016","status":"publish","type":"post","link":"https:\/\/geekosas.com\/index.php\/2017\/05\/23\/movies-2016\/","title":{"rendered":"Movies 2016"},"content":{"rendered":"<p>Movies make us laugh, cry, and some&#8230; sleep, so I decided to do a small analysis of 2016 movies.<br \/>\nAs with <a href=\"https:\/\/www.geekosas.com\/index.php\/2017\/02\/20\/video-juegos-y-data-science\/\" target=\"_blank\" rel=\"noopener noreferrer\">Video Games and Data Science<\/a>, we scraped <a href=\"http:\/\/www.metacritic.com\/\">www.metacritic.com<\/a> to build a database with the following information for each movie:<\/p>\n<ul>\n<li>Country of Origin<\/li>\n<li>Genres<\/li>\n<li>Language<\/li>\n<li>Metascore (average score from critic reviews)<\/li>\n<li>Production Company<\/li>\n<li>Rating<\/li>\n<li>Duration<\/li>\n<li>Summary<\/li>\n<li>Userscore (average score assigned by users)<\/li>\n<\/ul>\n<p>The database is available for download at the end of the article.<\/p>\n<h3>Descriptive Analysis<\/h3>\n<p>Out of the 610 movies available on the site, 596 were downloaded correctly, but only 398 have all the data, so we will work with a sample corresponding to about 65% of the Metacritic movies.<br \/>\nThe numeric variables are userscore, metascore and the difference between them (metascore &#8211; userscore), called bonus (how many extra points the critics gave compared with users). The following table shows the average, the standard deviation and the Shapiro-Wilk normality test for each one. Reading the shapiro column as a p-value, only bonus (0.05) is borderline compatible with a normal distribution, while userscore and metascore are not, which is important to keep in mind when building models. The first thing we detect is that, in general, users give 4.70 points more than the reviews, which is the opposite of what <a href=\"http:\/\/www.geekosas.com\/index.php\/2017\/02\/20\/video-juegos-y-data-science\/\" target=\"_blank\" rel=\"noopener noreferrer\">happens in video 
games<\/a>.<\/p>\n<table border=\"1\">\n<tbody>\n<tr>\n<th>variable<\/th>\n<th>average<\/th>\n<th>stddev<\/th>\n<th>shapiro<\/th>\n<\/tr>\n<tr>\n<td>bonus<\/td>\n<td align=\"right\">-4.70<\/td>\n<td align=\"right\">13.43<\/td>\n<td align=\"right\">0.05<\/td>\n<\/tr>\n<tr>\n<td>userscore<\/td>\n<td align=\"right\">63.93<\/td>\n<td align=\"right\">14.15<\/td>\n<td align=\"right\">0.00<\/td>\n<\/tr>\n<tr>\n<td>metascore<\/td>\n<td align=\"right\">59.23<\/td>\n<td align=\"right\">18.14<\/td>\n<td align=\"right\">0.00<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Histogram of the variables:<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"2149\" data-permalink=\"https:\/\/geekosas.com\/index.php\/es\/2017\/04\/23\/peliculas-2016\/histogramas\/\" data-orig-file=\"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2017\/03\/histogramas.png?fit=1000%2C400&amp;ssl=1\" data-orig-size=\"1000,400\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"histogramas\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2017\/03\/histogramas.png?fit=1000%2C400&amp;ssl=1\" class=\"alignnone wp-image-2149\" src=\"https:\/\/i0.wp.com\/www.geekosas.com\/wp-content\/uploads\/2017\/03\/histogramas-300x120.png?resize=1030%2C412\" alt=\"\" width=\"1030\" height=\"412\" srcset=\"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2017\/03\/histogramas.png?resize=300%2C120&amp;ssl=1 300w, 
https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2017\/03\/histogramas.png?resize=768%2C307&amp;ssl=1 768w, https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2017\/03\/histogramas.png?w=1000&amp;ssl=1 1000w\" sizes=\"auto, (max-width: 1000px) 100vw, 1000px\" \/><\/p>\n<p>&nbsp;<\/p>\n<h3>Genre Analysis<\/h3>\n<p>Let&#8217;s look at people&#8217;s favorite movies. The following table is sorted in descending order by the average userscore of each genre (remember that each movie can have more than one genre). We immediately detect that movies with a cultural connotation (Documentary, Music, Musical, History, Biography) are among the highest scored, and that the bonus given to them by critics is above the overall average of -4.70. On the other hand, the genres users rate lowest are Horror, Thriller, Sci-Fi, Comedy and Action.<br \/>\nInterestingly, after Drama (231 movies), Comedy, Thriller and Action are the most frequent genres, while movies with a cultural connotation are among the least frequent.<\/p>\n<table border=\"1\">\n<tbody>\n<tr>\n<th>genre<\/th>\n<th>frequency<\/th>\n<th>userscore<\/th>\n<th>metascore<\/th>\n<th>bonus<\/th>\n<\/tr>\n<tr>\n<td>Documentary<\/td>\n<td align=\"right\">27<\/td>\n<td align=\"right\">71.30<\/td>\n<td align=\"right\">76.70<\/td>\n<td align=\"right\">5.41<\/td>\n<\/tr>\n<tr>\n<td>Music<\/td>\n<td align=\"right\">14<\/td>\n<td align=\"right\">70.57<\/td>\n<td align=\"right\">67.93<\/td>\n<td align=\"right\">-2.64<\/td>\n<\/tr>\n<tr>\n<td>Musical<\/td>\n<td align=\"right\">10<\/td>\n<td align=\"right\">70.50<\/td>\n<td align=\"right\">69.20<\/td>\n<td align=\"right\">-1.30<\/td>\n<\/tr>\n<tr>\n<td>History<\/td>\n<td align=\"right\">27<\/td>\n<td align=\"right\">68.11<\/td>\n<td align=\"right\">66.26<\/td>\n<td align=\"right\">-1.85<\/td>\n<\/tr>\n<tr>\n<td>Romance<\/td>\n<td align=\"right\">56<\/td>\n<td align=\"right\">67.36<\/td>\n<td align=\"right\">61.89<\/td>\n<td align=\"right\">-5.46<\/td>\n<\/tr>\n<tr>\n<td>Biography<\/td>\n<td align=\"right\">38<\/td>\n<td 
align=\"right\">67.32<\/td>\n<td align=\"right\">63.47<\/td>\n<td align=\"right\">-3.84<\/td>\n<\/tr>\n<tr>\n<td>War<\/td>\n<td align=\"right\">14<\/td>\n<td align=\"right\">66.79<\/td>\n<td align=\"right\">59.00<\/td>\n<td align=\"right\">-7.79<\/td>\n<\/tr>\n<tr>\n<td>Sport<\/td>\n<td align=\"right\">9<\/td>\n<td align=\"right\">66.00<\/td>\n<td align=\"right\">58.78<\/td>\n<td align=\"right\">-7.22<\/td>\n<\/tr>\n<tr>\n<td>Animation<\/td>\n<td align=\"right\">23<\/td>\n<td align=\"right\">65.91<\/td>\n<td align=\"right\">60.43<\/td>\n<td align=\"right\">-5.48<\/td>\n<\/tr>\n<tr>\n<td>Drama<\/td>\n<td align=\"right\">231<\/td>\n<td align=\"right\">65.75<\/td>\n<td align=\"right\">61.80<\/td>\n<td align=\"right\">-3.95<\/td>\n<\/tr>\n<tr>\n<td>Fantasy<\/td>\n<td align=\"right\">39<\/td>\n<td align=\"right\">65.10<\/td>\n<td align=\"right\">55.36<\/td>\n<td align=\"right\">-9.74<\/td>\n<\/tr>\n<tr>\n<td>Crime<\/td>\n<td align=\"right\">52<\/td>\n<td align=\"right\">65.02<\/td>\n<td align=\"right\">57.13<\/td>\n<td align=\"right\">-7.88<\/td>\n<\/tr>\n<tr>\n<td>Mystery<\/td>\n<td align=\"right\">37<\/td>\n<td align=\"right\">64.27<\/td>\n<td align=\"right\">54.27<\/td>\n<td align=\"right\">-10.00<\/td>\n<\/tr>\n<tr>\n<td>Adventure<\/td>\n<td align=\"right\">70<\/td>\n<td align=\"right\">63.66<\/td>\n<td align=\"right\">54.93<\/td>\n<td align=\"right\">-8.73<\/td>\n<\/tr>\n<tr>\n<td>Family<\/td>\n<td align=\"right\">33<\/td>\n<td align=\"right\">63.12<\/td>\n<td align=\"right\">56.45<\/td>\n<td align=\"right\">-6.67<\/td>\n<\/tr>\n<tr>\n<td>Western<\/td>\n<td align=\"right\">6<\/td>\n<td align=\"right\">61.83<\/td>\n<td align=\"right\">56.83<\/td>\n<td align=\"right\">-5.00<\/td>\n<\/tr>\n<tr>\n<td>Action<\/td>\n<td align=\"right\">78<\/td>\n<td align=\"right\">61.72<\/td>\n<td align=\"right\">50.27<\/td>\n<td align=\"right\">-11.45<\/td>\n<\/tr>\n<tr>\n<td>Comedy<\/td>\n<td align=\"right\">129<\/td>\n<td align=\"right\">61.64<\/td>\n<td 
align=\"right\">56.99<\/td>\n<td align=\"right\">-4.64<\/td>\n<\/tr>\n<tr>\n<td>Sci-Fi<\/td>\n<td align=\"right\">39<\/td>\n<td align=\"right\">61.59<\/td>\n<td align=\"right\">51.64<\/td>\n<td align=\"right\">-9.95<\/td>\n<\/tr>\n<tr>\n<td>Thriller<\/td>\n<td align=\"right\">119<\/td>\n<td align=\"right\">61.47<\/td>\n<td align=\"right\">53.85<\/td>\n<td align=\"right\">-7.62<\/td>\n<\/tr>\n<tr>\n<td>Horror<\/td>\n<td align=\"right\">52<\/td>\n<td align=\"right\">59.48<\/td>\n<td align=\"right\">52.13<\/td>\n<td align=\"right\">-7.35<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3>Conclusion<\/h3>\n<p>Several of the most frequent genres (Comedy, Thriller, Action) are the ones users rate lowest, while the less frequent, culturally oriented genres are the best received. It&#8217;s interesting that the film industry doesn&#8217;t seem to do this kind of analysis and question what kind of movies to make.<br \/>\nBut for now, when you look at the movie listings and nothing catches your attention, remember this article, laugh on the inside and Netflix and chill.<\/p>\n<h3>Attachments<\/h3>\n<ul>\n<li><a href=\"https:\/\/www.geekosas.com\/wp-content\/uploads\/2017\/03\/bd_peliculas.csv\">bd_peliculas<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<div class=\"mh-excerpt\"><p>Movies make us laugh, cry, and some&#8230; sleep, so I decided to do a small analysis of 2016 movies. 
As with Video Games and Data <a class=\"mh-excerpt-more\" href=\"https:\/\/geekosas.com\/index.php\/2017\/05\/23\/movies-2016\/\" title=\"Movies 2016\">[&#8230;]<\/a><\/p>\n<\/div>","protected":false},"author":1,"featured_media":2162,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[1],"tags":[],"class_list":["post-3291","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-sin-categoria"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2017\/03\/cine.jpg?fit=339%2C295&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p8vjqF-R5","jetpack-related-posts":[{"id":3319,"url":"https:\/\/geekosas.com\/index.php\/2018\/05\/23\/have-video-games-gotten-worse\/","url_meta":{"origin":3291,"position":0},"title":"Have video games gotten worse?","author":"Daniel Fischer","date":"2018-05-23","format":false,"excerpt":"Introduction \/ Abstract A data scientist is one who manages to make data speak to them; it is basically a conversation, where you ask questions and the data answers. In this notebook I want to share my latest conversation with this dataset of scores assigned to different video games. 
The\u2026","rel":"","context":"In &quot;Sin categor\u00eda&quot;","block_context":{"text":"Sin categor\u00eda","link":"https:\/\/geekosas.com\/index.php\/category\/sin-categoria\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2018\/04\/consoles-800x491.jpg?fit=800%2C491&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2018\/04\/consoles-800x491.jpg?fit=800%2C491&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2018\/04\/consoles-800x491.jpg?fit=800%2C491&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2018\/04\/consoles-800x491.jpg?fit=800%2C491&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":3285,"url":"https:\/\/geekosas.com\/index.php\/2017\/05\/23\/video-games-and-statistics\/","url_meta":{"origin":3291,"position":1},"title":"Video Games and Statistics","author":"Daniel Fischer","date":"2017-05-23","format":false,"excerpt":"The video game industry has grown exponentially. New game genres, new business models, new types of gamers, and new devices for playing have been created, but what hasn't changed is that so\u2011called triple\u2011A games still exist. 
Triple\u2011A games are characterized by belonging to companies with a large budget to invest\u2026","rel":"","context":"In &quot;Sin categor\u00eda&quot;","block_context":{"text":"Sin categor\u00eda","link":"https:\/\/geekosas.com\/index.php\/category\/sin-categoria\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/www.geekosas.com\/wp-content\/uploads\/2017\/02\/fifa-300x121.jpg?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.geekosas.com\/wp-content\/uploads\/2017\/02\/fifa-300x121.jpg?resize=350%2C200 1x, https:\/\/i0.wp.com\/www.geekosas.com\/wp-content\/uploads\/2017\/02\/fifa-300x121.jpg?resize=525%2C300 1.5x, https:\/\/i0.wp.com\/www.geekosas.com\/wp-content\/uploads\/2017\/02\/fifa-300x121.jpg?resize=700%2C400 2x"},"classes":[]},{"id":3296,"url":"https:\/\/geekosas.com\/index.php\/2017\/05\/23\/i-will-teach-an-r-course\/","url_meta":{"origin":3291,"position":2},"title":"I will teach an R course.","author":"Daniel Fischer","date":"2017-05-23","format":false,"excerpt":"The course will be at Microsoft Chile on September 22: The R Intensive is an event designed for those who have data analysis and modeling needs in their work and want to gain in 1 day the theoretical and practical knowledge to start solving their analytical challenges with this tool.\u2026","rel":"","context":"In &quot;Sin categor\u00eda&quot;","block_context":{"text":"Sin categor\u00eda","link":"https:\/\/geekosas.com\/index.php\/category\/sin-categoria\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2016\/11\/RStudio-Ball.png?fit=1000%2C1000&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2016\/11\/RStudio-Ball.png?fit=1000%2C1000&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2016\/11\/RStudio-Ball.png?fit=1000%2C1000&ssl=1&resize=525%2C300 1.5x, 
https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2016\/11\/RStudio-Ball.png?fit=1000%2C1000&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":3223,"url":"https:\/\/geekosas.com\/index.php\/2020\/05\/14\/insert-records-into-the-database-at-full-speed\/","url_meta":{"origin":3291,"position":3},"title":"Insert Records into the Database at Full Speed","author":"Daniel Fischer","date":"2020-05-14","format":false,"excerpt":"ETL tools are very useful for performing automated and recurring data transformation processes; they are characterized by performing three tasks: (E) Extract: Connect to one or more sources and extract data. (T) Transform: Transform or manipulate the data. (L) Load: Load the transformed data into the final repository. That is\u2026","rel":"","context":"In &quot;Sin categor\u00eda&quot;","block_context":{"text":"Sin categor\u00eda","link":"https:\/\/geekosas.com\/index.php\/category\/sin-categoria\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2021\/09\/patch.png?fit=728%2C380&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2021\/09\/patch.png?fit=728%2C380&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2021\/09\/patch.png?fit=728%2C380&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2021\/09\/patch.png?fit=728%2C380&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":3339,"url":"https:\/\/geekosas.com\/index.php\/2019\/05\/23\/gender-pay-gap-in-technology\/","url_meta":{"origin":3291,"position":4},"title":"Gender Pay Gap in Technology","author":"Daniel Fischer","date":"2019-05-23","format":false,"excerpt":"The Gender Pay Gap is the difference that exists on average in the salaries of Men vs. Women. Today there are people who attribute this to discrimination, while others say it is due to the decisions that men on average make versus those of women. 
Since both opinions have merit,\u2026","rel":"","context":"In &quot;Sin categor\u00eda&quot;","block_context":{"text":"Sin categor\u00eda","link":"https:\/\/geekosas.com\/index.php\/category\/sin-categoria\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2019\/02\/GenderPayGap-201803070107196681-20180404082357920.jpg?fit=619%2C413&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2019\/02\/GenderPayGap-201803070107196681-20180404082357920.jpg?fit=619%2C413&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2019\/02\/GenderPayGap-201803070107196681-20180404082357920.jpg?fit=619%2C413&ssl=1&resize=525%2C300 1.5x"},"classes":[]},{"id":3274,"url":"https:\/\/geekosas.com\/index.php\/2016\/05\/23\/segment-customers-step-by-step\/","url_meta":{"origin":3291,"position":5},"title":"Segment customers step by step","author":"Daniel Fischer","date":"2016-05-23","format":false,"excerpt":"Previously I wrote about neural networks (click here to see it). Neural networks and all other \"supervised methods\" are used when you have a sample of values to predict. 
But when you know what you want to achieve but do not have a sample of the value to predict, the\u2026","rel":"","context":"In &quot;Sin categor\u00eda&quot;","block_context":{"text":"Sin categor\u00eda","link":"https:\/\/geekosas.com\/index.php\/category\/sin-categoria\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2016\/05\/kmenas6.png?fit=620%2C539&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2016\/05\/kmenas6.png?fit=620%2C539&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2016\/05\/kmenas6.png?fit=620%2C539&ssl=1&resize=525%2C300 1.5x"},"classes":[]}],"_links":{"self":[{"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/posts\/3291","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/comments?post=3291"}],"version-history":[{"count":1,"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/posts\/3291\/revisions"}],"predecessor-version":[{"id":3292,"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/posts\/3291\/revisions\/3292"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/media\/2162"}],"wp:attachment":[{"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/media?parent=3291"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/categories?post=3291"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/tags?post=3291"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":tr
ue}]}}