{"id":3944,"date":"2020-09-15T18:00:56","date_gmt":"2020-09-15T16:00:56","guid":{"rendered":"https:\/\/immune.institute\/?p=3944"},"modified":"2020-09-15T18:00:56","modified_gmt":"2020-09-15T16:00:56","slug":"extranos-datasets-para-machine-learning","status":"publish","type":"post","link":"https:\/\/immune.institute\/en\/blog\/extranos-datasets-para-machine-learning\/","title":{"rendered":"Strange datasets for Machine Learning"},"content":{"rendered":"<h3><b><span style=\"color: #ffffff;\">A review of unusual datasets for your models<\/span><\/b><\/h3>\n<p><span style=\"font-weight: 400;\">When starting out in machine learning, people typically use datasets such as <\/span><a href=\"https:\/\/en.wikipedia.org\/wiki\/MNIST_database\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">MNIST<\/span><\/a><span style=\"font-weight: 400;\">,<\/span><a href=\"https:\/\/en.wikipedia.org\/wiki\/Iris_flower_data_set\" target=\"_blank\" rel=\"noopener\"> <span style=\"font-weight: 400;\">Iris<\/span><\/a><span style=\"font-weight: 400;\">, or <\/span><a href=\"http:\/\/qwone.com\/~jason\/20Newsgroups\/\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">20 newsgroups<\/span><\/a><span style=\"font-weight: 400;\">, among others&#8230; But there are hundreds of strange and interesting datasets to be found online. At the <\/span><a href=\"https:\/\/immune.institute\/en\/?utm_campaign=IMMUNE&amp;utm_source=Embajador\"><b>Immune Technology Institute<\/b><\/a><span style=\"font-weight: 400;\">, we asked our teachers to compile a list of the strangest datasets they have encountered. <\/span><b>Here we go!<\/b><\/p>\n<h3><span style=\"color: #ffffff;\"><b>Price of marijuana<\/b><\/span><\/h3>\n<p><span style=\"font-weight: 400;\">This is a repository that contains a record of <\/span><b>marijuana prices over the years<\/b><span style=\"font-weight: 400;\">, which vary considerably from one state to another. 
But the real question is how the data was obtained&#8230;<\/span><\/p>\n<p><a href=\"https:\/\/principal.immune.institute\/wp-content\/uploads\/2020\/09\/frysuspicious-1.svg\"><img decoding=\"async\" class=\"size-full wp-image-8209 aligncenter\" role=\"img\" src=\"https:\/\/principal.immune.institute\/wp-content\/uploads\/2020\/09\/frysuspicious-1.svg\" alt=\"\" width=\"300\" height=\"225\"><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Although it may seem like a useless dataset, it is quite relevant today, since <\/span><b>many countries are considering legalising marijuana.<\/b><\/p>\n<h3><span style=\"color: #ffffff;\"><b>What is the optimal size for a chopstick?<\/b><\/span><\/h3>\n<p><span style=\"font-weight: 400;\">If you've never wondered what the optimal size of chopsticks is, don't worry: someone else already has. A team of researchers evaluated the effects of chopstick length on the eating performance of adults and children. 
As part of the study, they created <\/span><a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/15676839\/\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">this dataset<\/span><\/a><span style=\"font-weight: 400;\"> to find the optimal chopstick length.<\/span><\/p>\n<p><img decoding=\"async\" class=\"size-full wp-image-8210 aligncenter\" src=\"https:\/\/principal.immune.institute\/wp-content\/uploads\/2020\/09\/pexels-foodie-factor-539430-1024x683-1.jpeg\" alt=\"\" width=\"1024\" height=\"683\" srcset=\"https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/pexels-foodie-factor-539430-1024x683-1.jpeg 1024w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/pexels-foodie-factor-539430-1024x683-1-256x171.jpeg 256w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/pexels-foodie-factor-539430-1024x683-1-512x342.jpeg 512w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/pexels-foodie-factor-539430-1024x683-1-768x512.jpeg 768w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/pexels-foodie-factor-539430-1024x683-1-18x12.jpeg 18w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">They concluded that food-pinching performance was significantly affected by the length of the chopsticks. The researchers suggested that families with children should provide chopsticks of <\/span><b>240 and 180 mm in length<\/b><span style=\"font-weight: 400;\">, for adults and children respectively. Restaurants should provide chopsticks <\/span><b>210 mm long<\/b><span style=\"font-weight: 400;\"> to find a balance between ergonomics and price.<\/span><\/p>\n<h3><span style=\"color: #ffffff;\"><b>Images of rice grains<\/b><\/span><\/h3>\n<p><a href=\"https:\/\/archive.ics.uci.edu\/ml\/datasets\/Rice+%28Cammeo+and+Osmancik%29\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">This dataset<\/span><\/a><span style=\"font-weight: 400;\"> contains more than 3500 images of rice grains of two different species. 
Several features were extracted from each grain, such as:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">The longest line that can be drawn on the grain of rice.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">The shortest line that can be drawn on the grain of rice.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">The perimeter of each grain.<\/span><\/li>\n<\/ul>\n<h3><span style=\"color: #ffffff;\"><b>Popular dog names in Sweden<\/b><\/span><\/h3>\n<p><span style=\"font-weight: 400;\">Did you know that the most popular dog name in Sweden is Molly?<\/span><\/p>\n<p><img decoding=\"async\" class=\"size-full wp-image-8211 aligncenter\" src=\"https:\/\/principal.immune.institute\/wp-content\/uploads\/2020\/09\/image6-1024x247-1.png\" alt=\"\" width=\"1024\" height=\"247\" srcset=\"https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image6-1024x247-1.png 1024w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image6-1024x247-1-256x62.png 256w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image6-1024x247-1-512x124.png 512w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image6-1024x247-1-768x185.png 768w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image6-1024x247-1-18x4.png 18w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">This dataset lists the most popular dog names in Sweden in 2018 by number of animals. Bella was the second most popular name, with almost six thousand animals, followed by Charlie with approximately 4600 animals.<\/span><\/p>\n<h3><span style=\"color: #ffffff;\"><b>Flags<\/b><\/span><\/h3>\n<p><span style=\"font-weight: 400;\">I'm pretty sure Sheldon will love this one. 
<\/span><a href=\"https:\/\/archive.ics.uci.edu\/ml\/datasets\/Flags\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">This dataset<\/span><\/a><span style=\"font-weight: 400;\"> contains the flags and details of various countries, such as:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">The religion of each country.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">The predominant colour of the flag.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Whether the flag contains a crescent moon, or sun or star symbols.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Whether it contains an eagle, a tree, &#8230;<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">It might be interesting to try to predict the religion of a country from its size and the colours of its flag.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Sometimes it is also interesting to see how people extract relationships from data where none are visible to the naked eye. <\/span><a href=\"http:\/\/www.tylervigen.com\/spurious-correlations\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">This website<\/span><\/a><span style=\"font-weight: 400;\"> is an expert at finding correlations where no one else can find them, for example:<\/span><\/p>\n<h4><span style=\"color: #ffffff;\"><b>Cheese consumption vs. 
number of people who died from entanglement in their bed sheets<\/b><\/span><\/h4>\n<p><img decoding=\"async\" class=\"size-full wp-image-8212 aligncenter\" src=\"https:\/\/principal.immune.institute\/wp-content\/uploads\/2020\/09\/image7-1024x403-1.png\" alt=\"\" width=\"1024\" height=\"403\" srcset=\"https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image7-1024x403-1.png 1024w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image7-1024x403-1-256x101.png 256w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image7-1024x403-1-512x202.png 512w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image7-1024x403-1-768x302.png 768w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image7-1024x403-1-18x7.png 18w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<h4><span style=\"color: #ffffff;\"><b>PhDs in mathematics vs. stored uranium in US nuclear power plants.<\/b><\/span><\/h4>\n<p><img decoding=\"async\" class=\"size-full wp-image-8214 aligncenter\" src=\"https:\/\/principal.immune.institute\/wp-content\/uploads\/2020\/09\/image2-1-1024x403-1.png\" alt=\"\" width=\"1024\" height=\"403\" srcset=\"https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image2-1-1024x403-1.png 1024w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image2-1-1024x403-1-256x101.png 256w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image2-1-1024x403-1-512x202.png 512w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image2-1-1024x403-1-768x302.png 768w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image2-1-1024x403-1-18x7.png 18w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<h4><span style=\"color: #ffffff;\"><b>Total revenue generated by arcades vs. 
computer science PhDs<\/b><\/span><\/h4>\n<p><img decoding=\"async\" class=\"size-full wp-image-8215 aligncenter\" src=\"https:\/\/principal.immune.institute\/wp-content\/uploads\/2020\/09\/image5-1-1024x403-1.png\" alt=\"\" width=\"1024\" height=\"403\" srcset=\"https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image5-1-1024x403-1.png 1024w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image5-1-1024x403-1-256x101.png 256w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image5-1-1024x403-1-512x202.png 512w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image5-1-1024x403-1-768x302.png 768w, https:\/\/immune.institute\/wp-content\/uploads\/2020\/09\/image5-1-1024x403-1-18x7.png 18w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">You can discover new correlations on this website. Share your results with <\/span><a href=\"https:\/\/twitter.com\/immuneinstitute?lang=es\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">us<\/span><\/a><span style=\"font-weight: 400;\">!<\/span><\/p>\n<h3><span style=\"color: #ffffff;\"><b>Who are we?<\/b><\/span><\/h3>\n<p><span style=\"font-weight: 400;\">At <\/span><a href=\"https:\/\/immune.institute\/en\/?utm_campaign=IMMUNE&amp;utm_source=Embajador\"><b>Immune Technology Institute<\/b><\/a><span style=\"font-weight: 400;\"> we try to apply and teach the most advanced technology in the field of computing. In addition, we love to share knowledge, as we believe that is when it becomes powerful.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">So if you want to learn how to develop real-world applications or handle large amounts of data, you may be interested in our <\/span><a href=\"https:\/\/bit.ly\/2E3fz8h\" target=\"_blank\" rel=\"noopener\"><b>Master of Data Science<\/b><\/a><span style=\"font-weight: 400;\">. 
It is a programme aimed at professionals who want to specialise in Data Science and learn the main techniques of data mining, data analysis and <\/span><b>Artificial Intelligence<\/b><span style=\"font-weight: 400;\">, and how to apply them in different industries.&nbsp;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">On <\/span><b>24 September<\/b><span style=\"font-weight: 400;\"> we will hold an online information session with the director of the master's degree, <\/span><b>Monica Villas<\/b><span style=\"font-weight: 400;\">. <\/span><b>IMMUNE <\/b><span style=\"font-weight: 400;\">can help you boost your career through its <\/span><span style=\"font-weight: 400;\">partner companies<\/span><span style=\"font-weight: 400;\"> and <\/span><span style=\"font-weight: 400;\">contacts with recruiters and industry professionals<\/span><span style=\"font-weight: 400;\">. You can register <\/span><a href=\"https:\/\/bit.ly\/3j9vWio\" target=\"_blank\" rel=\"noopener\"><b>HERE<\/b><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<h3><span style=\"color: #ffffff;\"><b>Wait, one more thing &#8211; Datathon<\/b><\/span><\/h3>\n<p><span style=\"font-weight: 400;\">Want to be a data scientist through and through? Sign up for the <\/span><a href=\"https:\/\/bit.ly\/3kk7RWF\" target=\"_blank\" rel=\"noopener\"><b>Datathon<\/b><\/a><span style=\"font-weight: 400;\"> organised by <\/span><a href=\"https:\/\/bit.ly\/3hvwVrX\" target=\"_blank\" rel=\"noopener\"><b>Immune Technology Institute<\/b><\/a><span style=\"font-weight: 400;\"> in cooperation with <\/span><b>Spanish Startups<\/b><span style=\"font-weight: 400;\"> on 19 September. It will be an online event featuring top data experts and a <\/span><b>great challenge<\/b><span style=\"font-weight: 400;\"> to test your knowledge. 
<\/span><b>There is a prize!<\/b> <b>You can register <\/b><a href=\"https:\/\/bit.ly\/3kk7RWF\" target=\"_blank\" rel=\"noopener\"><b>HERE<\/b><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p style=\"text-align: right;\"><span style=\"font-weight: 400;\">This article was written by:<\/span><a href=\"https:\/\/medium.com\/u\/3b43171da13b?source=post_page-----da0503717a62----------------------\" target=\"_blank\" rel=\"noopener\"> <span style=\"font-weight: 400;\">Alejandro Diaz Santos<\/span><\/a><span style=\"font-weight: 400;\"> (<\/span><a href=\"https:\/\/www.linkedin.com\/in\/alejandro-diaz-santos-8aab812a\/\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">LinkedIn<\/span><\/a><span style=\"font-weight: 400;\">,<\/span><a href=\"https:\/\/github.com\/alejandrods\" target=\"_blank\" rel=\"noopener\"> <span style=\"font-weight: 400;\">GitHub<\/span><\/a><span style=\"font-weight: 400;\">) for IMMUNE Technology Institute.<\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>A review of unusual datasets for your models. When starting out in machine learning, people typically use datasets such as MNIST, Iris, or 20 newsgroups, among others&#8230; But there are hundreds of strange and interesting datasets to be found online. 
At the Immune Technology Institute, we asked our [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":8210,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_crdt_document":"","footnotes":""},"categories":[1],"tags":[],"class_list":["post-3944","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog"],"acf":[],"_links":{"self":[{"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/posts\/3944","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/comments?post=3944"}],"version-history":[{"count":0,"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/posts\/3944\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/media\/8210"}],"wp:attachment":[{"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/media?parent=3944"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/categories?post=3944"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/immune.institute\/en\/wp-json\/wp\/v2\/tags?post=3944"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}