{"id":4059,"date":"2020-10-09T13:26:13","date_gmt":"2020-10-09T11:26:13","guid":{"rendered":"https:\/\/immune.institute\/?p=4059"},"modified":"2020-10-09T13:26:13","modified_gmt":"2020-10-09T11:26:13","slug":"introduccion-al-machine-learning","status":"publish","type":"post","link":"https:\/\/immune.institute\/en\/blog\/introduccion-al-machine-learning\/","title":{"rendered":"Introduction to Machine Learning"},"content":{"rendered":"<p>Nowadays, it's rare to find someone who hasn't heard of Machine Learning. They may not realise it, but they have used an application or virtual assistant that relies on it at some point. To give a brief introduction to Machine Learning and demystify some phrases that are often repeated around ML, we held a session at IMMUNE on this topic, which is so fashionable today.<\/p>\n<h2><b>It is the ability of machines to learn and improve from data without being explicitly programmed.<\/b><\/h2>\n<p>Depending on how deep you want to go, you can find different variations of the same definition. If you are looking for an informal definition:<\/p>\n<p style=\"text-align: center;\"><b>It is making predictions from data.<\/b><\/p>\n<p>If you dig a little deeper, you can find a slightly more formal definition:<\/p>\n<p style=\"text-align: center;\"><b>The construction of a statistical model that approximates the \u201cunderlying\u201d distribution from which the data has been drawn.<\/b><\/p>\n<p>But wait! There's more! 
You can even state the definition more formally using mathematics.<\/p>\n<p style=\"text-align: center;\"><b>A training dataset:<\/b><\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" class=\"size-full wp-image-8194 aligncenter\" src=\"https:\/\/principal.immune.institute\/wp-content\/uploads\/2020\/10\/conjunto-de-datos.png\" alt=\"\" width=\"272\" height=\"30\"><br \/>\n<b>A hypothesis class H:<\/b><\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" class=\"size-full wp-image-8196 aligncenter\" src=\"https:\/\/principal.immune.institute\/wp-content\/uploads\/2020\/10\/hypothesis.png\" alt=\"\" width=\"144\" height=\"31\"><br \/>\n<b>An objective function and an optimisation method:<\/b><b><br \/>\n<\/b><img decoding=\"async\" class=\"size-full wp-image-8198 aligncenter\" src=\"https:\/\/principal.immune.institute\/wp-content\/uploads\/2020\/10\/objective.png\" alt=\"\" width=\"157\" height=\"24\"><br \/>\n<b>Putting it all together, we get a mapping:<\/b><\/p>\n<p><img decoding=\"async\" class=\"size-full wp-image-8199 aligncenter\" src=\"https:\/\/principal.immune.institute\/wp-content\/uploads\/2020\/10\/mapping.png\" alt=\"\" width=\"256\" height=\"41\"><\/p>\n<p><b>Division of ML problems<\/b><\/p>\n<p>We usually come across the typical division into <i>Supervised <\/i>or <i>Unsupervised Learning. 
<\/i>However, there are more ways to divide Machine Learning problems, and which one we talk about depends on your problem.<\/p>\n<p style=\"text-align: center;\"><b>Supervised Learning | Unsupervised Learning<\/b><b><br \/>\n<\/b><b>Parametric Models | Non-parametric Models<\/b><b><br \/>\n<\/b><b>Modelling Approach | Optimisation Techniques<\/b><\/p>\n<p>When working with ML problems, a question might arise before we begin: <b>Which is more important, drawing conclusions from data or making very good predictions?<\/b><\/p>\n<p><img decoding=\"async\" class=\"size-full wp-image-8200 aligncenter\" src=\"https:\/\/principal.immune.institute\/wp-content\/uploads\/2020\/10\/inference.png\" alt=\"\" width=\"605\" height=\"244\"><\/p>\n<p>That question is entirely valid and, in fact, a logical one to ask; it is known as <i>interpretability \u2013 predictions. <\/i>When we talk about inference, we usually mean drawing clear conclusions from the data, such as how variable Y is affected by X. When we talk about predictions, on the other hand, we mean obtaining an accurate output from our model. They are two opposing viewpoints, but in practice a mixture of both is usually used.<\/p>\n<p><img decoding=\"async\" class=\"size-full wp-image-8201 alignright\" src=\"https:\/\/principal.immune.institute\/wp-content\/uploads\/2020\/10\/flexibility.png\" alt=\"\" width=\"628\" height=\"358\"><\/p>\n<p>This makes us realise that some models are more easily interpretable than others. For example, linear regressions are very easy to <b><i>interpret <\/i><\/b>but, in exchange, they are not very <b><i>flexible <\/i><\/b>since they only generate linear functions. Polynomial functions, on the other hand, are more <b><i>flexible<\/i><\/b>, as they can generate a larger number of \u201cshapes\u201d, but they are more complicated to interpret.<\/p>\n<p><b>But... 
why does Machine Learning work?<\/b><\/p>\n<p>Basically, Machine Learning works because we have an enormous amount of data (Big Data) alongside the mathematics that lies behind each model. <i>The Law of Large Numbers<\/i> addresses exactly this: in short, it says that the more data we have, the closer we get to the original data distribution, which means our model tends to be better.<\/p>\n<p><b>Machine Learning in Industry<\/b><b><br \/>\n<\/b><\/p>\n<p>When a company tries to implement Machine Learning models in its projects, it may encounter several issues. Here are some of the most common ones:<\/p>\n<p><b>1. Running very powerful models<\/b><b><br \/>\n<\/b>Sometimes there is a lack of resources to run them, and the cost of keeping a very powerful model running 24 hours a day is very high, which not all companies can afford. Sometimes, it's simply a problem of how to adapt models (BERT, GPT-2, ...) to your use case.<\/p>\n<p><b>2. Model deployments<\/b><b><br \/>\n<\/b>Deploying Machine Learning models is not a trivial task; it's part of the end-to-end process of any ML project and can sometimes be challenging. This can be due to a lack of resources or because of project requirements you need to meet (latency, availability, etc.).<\/p>\n<p><b>3. Data<\/b><b><br \/>\n<\/b>Data is a fundamental part of ML; however, there are sometimes restrictions on its use. These restrictions are entirely necessary because they are what allow us to protect the user, and as data scientists we must promote that philosophy. Other times, there is simply not enough data governance, meaning data is not being valued within the company, and it is complicated to make use of it.<\/p>\n<p>In summary, in this session <a href=\"https:\/\/www.linkedin.com\/in\/alejandro-diaz-santos-8aab812a\/\" target=\"_blank\" rel=\"noopener\">Alejandro Diaz<\/a> gave us a brief introduction to Machine Learning and helped demystify some of the comments surrounding it. 
If you'd like more webinars like this, let us know. You can also find more information about our programmes <a href=\"https:\/\/immune.institute\/en\/data-science?utm_campaign=MDS2021_2&amp;utm_source=Embajador\">here<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Nowadays, it is rare to find anyone who hasn't heard of Machine Learning. They may not realise it, but they have used some application or virtual assistant at some point. With the aim of giving a short introduction to machine learning and demystifying some phrases that are often repeated around [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":8202,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"ai_generated_summary":"","footnotes":""},"categories":[1],"tags":[],"class_list":["post-4059","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog"],"acf":[],"_links":{"self":[{"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/posts\/4059","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/comments?post=4059"}],"version-history":[{"count":0,"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/posts\/4059\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/media\/8202"}],"wp:attachment":[{"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/media?parent=4059"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/categories?post=4059"}
,{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/tags?post=4059"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}