{"id":3235,"date":"2020-05-23T12:03:16","date_gmt":"2020-05-23T10:03:16","guid":{"rendered":"https:\/\/geekosas.com\/?p=3235"},"modified":"2026-05-23T19:29:11","modified_gmt":"2026-05-23T17:29:11","slug":"best-practices-when-programming-with-code","status":"publish","type":"post","link":"https:\/\/geekosas.com\/index.php\/2020\/05\/23\/best-practices-when-programming-with-code\/","title":{"rendered":"Best Practices When Programming (with code)"},"content":{"rendered":"<h1>Best Practices When Programming (with code)<\/h1>\n<p>Many enjoy programming and solving algorithmic mazes in their favorite language, but what nobody likes is diving into old code or, even worse, code written by someone else.<br \/>\nI&#8217;ve had to do it several times. Sometimes it was bad, other times worse, and I often ended up rewriting the parts that made my eyes bleed. That&#8217;s how I learned some best practices for programming.<\/p>\n<p><strong>Warning:<\/strong> This comes from a Data Scientist&#8217;s perspective, so these cases generally apply to ETL and Machine Learning models.<\/p>\n<hr \/>\n<h2>\ud83d\udcad Think Before You Code<\/h2>\n<p>I&#8217;ve often seen people receive a task, sit down and start pressing keys to accomplish what was requested, and that generally produces very bad code.<\/p>\n<p>The lines of code reflect the logic the developer used to solve the problem, so if that logic isn&#8217;t clear to them, the code will be a difficult thread to follow.<\/p>\n<p>Thinking the logic through first will help you name variables, write comments and divide the code into parts.<\/p>\n<hr \/>\n<h2>\u25b6\ufe0f Code Should Run Sequentially<\/h2>\n<p>With the rise of interactive languages such as Python, R and Julia, I&#8217;ve often seen scripts that must be run in pieces, something like: run lines 10 to 35, then run lines 1 and 2, then take the result from that line and paste it into line 
40.<\/p>\n<p>That&#8217;s not programming, it&#8217;s spreadsheet-ing in R or Python.<\/p>\n<hr \/>\n<h2>\ud83d\udd04 If It&#8217;s an ETL, Give It an ETL Structure<\/h2>\n<p>Try to use predefined structures. When processing data, many of your processes will be ETL, so if possible follow these steps:<\/p>\n<ol>\n<li>Environment configuration script<\/li>\n<li>Libraries<\/li>\n<li>Load Data<\/li>\n<li>Manipulate data<\/li>\n<li>Save the result<\/li>\n<\/ol>\n<hr \/>\n<h2>\u2702\ufe0f Divide the Code Into Parts<\/h2>\n<p>Try to have each script execute one particular task and save its result to some storage for the next stage. The only thing consecutive stages should have in common is that each one reads the data the previous one saved.<\/p>\n<hr \/>\n<h2>\ud83c\udff7\ufe0f Give Variables Good Names<\/h2>\n<p>Good names will help you when you return to the code in the future:<\/p>\n<ol>\n<li>Name functions as <strong>verbs<\/strong> that describe what they do<\/li>\n<li>Name objects with <strong>nouns<\/strong> that describe them<\/li>\n<li>If you have an object that represents a big apple, name it <code>big_apple<\/code>. If you don&#8217;t like that standard, use another like <code>bigApple<\/code>, but be <strong>consistent<\/strong><\/li>\n<li>If two tables each have a column holding the same value, give those columns the <strong>same name<\/strong><\/li>\n<\/ol>\n<pre><code class=\"language-python\"># \u274c Bad\ndef data(x):\n    return x * 0.16\n\nt = data(100)\n\n# \u2705 Good\ndef calculate_tax(price):\n    return price * 0.16\n\ntotal_tax = calculate_tax(100)<\/code><\/pre>\n<hr \/>\n<h2>\ud83d\udd24 Don&#8217;t Use Non-ASCII Characters<\/h2>\n<p>Don&#8217;t create variable names containing characters like <code>\u00f1<\/code> or <code>\u00e1\u00e9\u00ed\u00f3\u00fa<\/code>, or even spaces. 
At some point it will be a problem.<\/p>\n<pre><code class=\"language-python\"># \u274c Bad\na\u00f1o_fiscal = 2024\npr\u00f3ximo_a\u00f1o = 2025\n\n# \u2705 Good\nfiscal_year = 2024\nnext_year = 2025<\/code><\/pre>\n<hr \/>\n<h2>\ud83d\uddc2\ufe0f Use Version Control<\/h2>\n<p>You don&#8217;t want to lose all your work, and at some point you will make a mistake and want to go back. Keep your work in a version control system such as <strong>Git<\/strong>; there are very good online courses.<\/p>\n<pre><code class=\"language-bash\"># Basic commands\ngit init\ngit add .\ngit commit -m &quot;Add tax calculation function&quot;\n# assumes a remote named origin is already configured\ngit push origin main<\/code><\/pre>\n<hr \/>\n<h2>\ud83d\udcca Work with Data in Tidy Format<\/h2>\n<p>Modern data manipulation tools are designed to work with data in tidy format, also known as &quot;long&quot; tables.<\/p>\n<p><a href=\"https:\/\/en.wikipedia.org\/wiki\/Tidy_data\">\ud83d\udcd6 Tidy Data &#8211; Wikipedia<\/a><\/p>\n<p>\u274c Bad &#8211; Wide format<\/p>\n<table>\n<thead>\n<tr>\n<th>person<\/th>\n<th>weight_2020<\/th>\n<th>weight_2021<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Juan<\/td>\n<td>70<\/td>\n<td>72<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\u2705 Good &#8211; Tidy\/Long format<\/p>\n<table>\n<thead>\n<tr>\n<th>person<\/th>\n<th>year<\/th>\n<th>weight<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Juan<\/td>\n<td>2020<\/td>\n<td>70<\/td>\n<\/tr>\n<tr>\n<td>Juan<\/td>\n<td>2021<\/td>\n<td>72<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<hr \/>\n<h2>\ud83c\udfa8 Data Formatting Goes at the End<\/h2>\n<p>Keep objects and variables in the language&#8217;s native format while you work, and leave formatting for the human eye until the end.<\/p>\n<pre><code class=\"language-python\">price = 100  # example price\n\n# \u274c Bad - Formatting too early\ntax_rate = &quot;16%&quot;\nresult = float(tax_rate.replace(&quot;%&quot;,&quot;&quot;)) \/ 100 * price  # unnecessary conversion\n\n# \u2705 Good - Work in native format\nTAX_RATE = 0.16\nresult = TAX_RATE * 
price\nprint(f&quot;Tax ({TAX_RATE * 100:.0f}%): {result:.2f}&quot;)  # format only at display time<\/code><\/pre>\n<p>Same with dates:<\/p>\n<pre><code class=\"language-python\">from datetime import date\n\n# \u2705 Work as Date object\ntransaction_date = date(2024, 1, 15)\n\n# Format only when displaying\nprint(transaction_date.strftime(&quot;%d-%m-%Y&quot;))  # 15-01-2024<\/code><\/pre>\n<hr \/>\n<h2>\ud83d\udccb Generate an Execution Log<\/h2>\n<p>Wrapping the process in a <code>try<\/code> and <code>catch<\/code> lets you generate a log, for example in R:<\/p>\n<pre><code class=\"language-r\">#!\/bin\/Rscript\nsource(&quot;\/opt\/config\/init.R&quot;) # script containing log function and several other things\ntryCatch({\n  # generally... process identified by destination table\n  param = c(\n    proc = &#039;process name&#039;\n  )\n\n  ### do what&#039;s necessary here....\n\n  write_log(param[[&#039;proc&#039;]], message = file)\n\n}, error = function(e) {\n  write_log(param[[&#039;proc&#039;]], exit_code = -1, message = paste(e$message, collapse = &quot; &quot;))\n  print(e$message)\n  }\n)<\/code><\/pre>\n<p>Or in Python:<\/p>\n<pre><code class=\"language-python\">import logging\n\nlogging.basicConfig(filename=&#039;execution.log&#039;, level=logging.INFO)\n\ntry:\n    # main process\n    logging.info(&quot;Process started successfully&quot;)\n\nexcept Exception as e:\n    # logging.exception also records the traceback\n    logging.exception(&quot;Error in process: %s&quot;, e)\n    raise<\/code><\/pre>\n<hr \/>\n<h2>\u2699\ufe0f Parameters Don&#8217;t Belong in the Code<\/h2>\n<p>Code can have many parameters, but execution parameters, such as the &quot;execution date&quot; or the input file path, should be supplied from outside the code.<\/p>\n<p>Changing an execution parameter should never require modifying the code. 
These parameters should be passed in from the command line or through some web interface:<\/p>\n<p>In R:<\/p>\n<pre><code class=\"language-r\">args = commandArgs(trailingOnly = TRUE)<\/code><\/pre>\n<p>In Python:<\/p>\n<pre><code class=\"language-python\">import sys\nargs = sys.argv[1:]\ninput_path = args[0]  # first command-line argument<\/code><\/pre>\n<p>Even better, using <code>argparse<\/code> in Python:<\/p>\n<pre><code class=\"language-python\">import argparse\n\nparser = argparse.ArgumentParser()\nparser.add_argument(&#039;--date&#039;, default=&#039;2024-01-01&#039;, help=&#039;Execution date&#039;)\nparser.add_argument(&#039;--input&#039;, required=True, help=&#039;Input file path&#039;)\nargs = parser.parse_args()\n\nprint(f&quot;Processing file: {args.input} for date: {args.date}&quot;)<\/code><\/pre>\n<hr \/>\n<h2>\ud83e\udd14 Final Question \ud83d\ude0a<\/h2>\n<p>What did you think? Can you think of any more?<br \/>\nLeave them in the comments.<\/p>\n<p>Cheers! \ud83d\ude80<\/p>\n","protected":false},"excerpt":{"rendered":"<div class=\"mh-excerpt\"><p>Best Practices When Programming (with code) Many enjoy programming and solving algorithmic mazes in their favorite language, but what nobody likes is diving into old <a class=\"mh-excerpt-more\" href=\"https:\/\/geekosas.com\/index.php\/2020\/05\/23\/best-practices-when-programming-with-code\/\" title=\"Best Practices When Programming (with 
code)\">[&#8230;]<\/a><\/p>\n<\/div>","protected":false},"author":1,"featured_media":3123,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[1],"tags":[],"class_list":["post-3235","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-sin-categoria"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2022\/05\/logo.png?fit=1280%2C640&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p8vjqF-Qb","jetpack-related-posts":[{"id":3243,"url":"https:\/\/geekosas.com\/index.php\/2026\/05\/09\/comparison-between-julia-python-and-r\/","url_meta":{"origin":3235,"position":0},"title":"Comparison between Julia, Python, and R","author":"Daniel Fischer","date":"2026-05-09","format":false,"excerpt":"The discussion about which language is best for data analysis can lead to conversations more passionate than topics like religion or politics. But as Data Scientists we must focus on empirical evidence; the dimensions for comparison are many: Community, Performance, Editors, Package Manager, Code Encapsulation, etc. 
I have evaluated several\u2026","rel":"","context":"In &quot;Sin categor\u00eda&quot;","block_context":{"text":"Sin categor\u00eda","link":"https:\/\/geekosas.com\/index.php\/category\/sin-categoria\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2019\/09\/external-content.duckduckgo.com_.jpg?fit=474%2C312&ssl=1&resize=350%2C200","width":350,"height":200},"classes":[]},{"id":3216,"url":"https:\/\/geekosas.com\/index.php\/2026\/05\/16\/3216\/","url_meta":{"origin":3235,"position":1},"title":"Sudoku","author":"Daniel Fischer","date":"2026-05-16","format":false,"excerpt":"The first question on my first optimization exam was to formulate a mathematical model for solving a Sudoku puzzle. Well, at that moment I made a huge mistake: I included nonlinear constraints, which resulted in a score of 0. And since this was one of my favorite subjects, I\u2019ve always\u2026","rel":"","context":"In &quot;Sin categor\u00eda&quot;","block_context":{"text":"Sin categor\u00eda","link":"https:\/\/geekosas.com\/index.php\/category\/sin-categoria\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":3328,"url":"https:\/\/geekosas.com\/index.php\/2018\/05\/23\/videogames-presentation-spanish\/","url_meta":{"origin":3235,"position":2},"title":"Videogames Presentation (spanish)","author":"Daniel Fischer","date":"2018-05-23","format":false,"excerpt":"https:\/\/www.youtube.com\/watch?v=rrWCckdfO38 code:\u00a0https:\/\/github.com\/danielfm123\/userchile_metacritic","rel":"","context":"In &quot;Sin categor\u00eda&quot;","block_context":{"text":"Sin categor\u00eda","link":"https:\/\/geekosas.com\/index.php\/category\/sin-categoria\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2018\/10\/Mario64_1.jpg?fit=610%2C343&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2018\/10\/Mario64_1.jpg?fit=610%2C343&ssl=1&resize=350%2C200 1x, 
https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2018\/10\/Mario64_1.jpg?fit=610%2C343&ssl=1&resize=525%2C300 1.5x"},"classes":[]},{"id":3302,"url":"https:\/\/geekosas.com\/index.php\/2017\/05\/23\/sparse-matrix-in-r\/","url_meta":{"origin":3235,"position":3},"title":"Sparse Matrix in R","author":"Daniel Fischer","date":"2017-05-23","format":false,"excerpt":"The values 0 in matrices are very frequent, especially in dummy variables, so in R there is a package called Matrix which allows creating sparse matrices, in other words, matrices that do not use memory when an element's value is 0. But skipping the zeros is computationally complex, so it\u2026","rel":"","context":"In &quot;Sin categor\u00eda&quot;","block_context":{"text":"Sin categor\u00eda","link":"https:\/\/geekosas.com\/index.php\/category\/sin-categoria\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/www.geekosas.com\/wp-content\/uploads\/2017\/10\/mb-1024x1024.png?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.geekosas.com\/wp-content\/uploads\/2017\/10\/mb-1024x1024.png?resize=350%2C200 1x, https:\/\/i0.wp.com\/www.geekosas.com\/wp-content\/uploads\/2017\/10\/mb-1024x1024.png?resize=525%2C300 1.5x, https:\/\/i0.wp.com\/www.geekosas.com\/wp-content\/uploads\/2017\/10\/mb-1024x1024.png?resize=700%2C400 2x"},"classes":[]},{"id":3326,"url":"https:\/\/geekosas.com\/index.php\/2018\/05\/23\/write-in-aws-redshift-at-fill-speed\/","url_meta":{"origin":3235,"position":4},"title":"Write in AWS Redshift at Fill Speed!","author":"Daniel Fischer","date":"2018-05-23","format":false,"excerpt":"As many know, Redshift is a fork of Postgres made by Amazon to provide a Data Warehouse service. The big difference between these two products is that the former is a columnar and compressed database, while the latter is not. 
Columnar databases are very fast for aggregations and joins, but\u2026","rel":"","context":"In &quot;Sin categor\u00eda&quot;","block_context":{"text":"Sin categor\u00eda","link":"https:\/\/geekosas.com\/index.php\/category\/sin-categoria\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2018\/08\/RtoRedshift.png?fit=600%2C300&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2018\/08\/RtoRedshift.png?fit=600%2C300&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2018\/08\/RtoRedshift.png?fit=600%2C300&ssl=1&resize=525%2C300 1.5x"},"classes":[]},{"id":3241,"url":"https:\/\/geekosas.com\/index.php\/2019\/05\/23\/how-many-people-have-to-get-infected-to-end-covid-19\/","url_meta":{"origin":3235,"position":5},"title":"How many people have to get infected to end COVID-19?","author":"Daniel Fischer","date":"2019-05-23","format":false,"excerpt":"Well, given that apparently people do not get sick twice, the easy answer should be EVERYONE, but it's not that simple... Let me explain... each sick person infects healthy people. 
Currently there are 30,000 sick and 4,000 infected in the last day (approximately), so we can say that daily we\u2026","rel":"","context":"In &quot;Sin categor\u00eda&quot;","block_context":{"text":"Sin categor\u00eda","link":"https:\/\/geekosas.com\/index.php\/category\/sin-categoria\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2020\/05\/covid.jpg?fit=1170%2C700&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2020\/05\/covid.jpg?fit=1170%2C700&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2020\/05\/covid.jpg?fit=1170%2C700&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2020\/05\/covid.jpg?fit=1170%2C700&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/geekosas.com\/wp-content\/uploads\/2020\/05\/covid.jpg?fit=1170%2C700&ssl=1&resize=1050%2C600 3x"},"classes":[]}],"_links":{"self":[{"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/posts\/3235","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/comments?post=3235"}],"version-history":[{"count":1,"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/posts\/3235\/revisions"}],"predecessor-version":[{"id":3236,"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/posts\/3235\/revisions\/3236"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/media\/3123"}],"wp:attachment":[{"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/media?parent=3235"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/geekosas
.com\/index.php\/wp-json\/wp\/v2\/categories?post=3235"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/geekosas.com\/index.php\/wp-json\/wp\/v2\/tags?post=3235"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}