When you start out in machine learning, you typically work with datasets such as MNIST, Iris, or 20 Newsgroups. But there are hundreds of rare and interesting datasets to be found online. At Immune Technology Institute we asked our teachers to list the strangest datasets they have encountered. Here we go!
This is a repository that records marijuana prices over the years, which vary quite a lot from state to state. The question here is how the data was obtained...
Although it may seem like a useless dataset, it can be very relevant in the times we live in, since many countries are considering legalising marijuana.
If you've never wondered what the optimal size of chopsticks is, don't worry: someone else already has. A team of researchers evaluated the effects of chopstick length on the eating performance of adults and children, and created this dataset to find the optimal length.
They concluded that the process of pinching food was significantly affected by the length of the chopsticks. The researchers suggested that families with children should provide chopsticks of both 240 mm and 180 mm in length. Restaurants should provide chopsticks 210 mm long to strike a balance between ergonomics and price.
This dataset contains more than 3500 images of rice grains of two different species. Different properties were extracted from each rice grain, such as:
Did you know that the most popular dog name in Sweden is Molly?
This dataset lists the most popular dog names in Sweden in 2018 by number of animals. Bella was the second most popular name, with almost six thousand animals, followed by Charlie with approximately 4600 animals.
I'm pretty sure Sheldon will love this dataset. It contains the flags and details of several countries, such as:
It might be interesting to try to predict the religion of a country by its size and the colours of its flag.
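To give an idea of what such an experiment could look like, here is a minimal sketch of a nearest-neighbour classifier over flag colours. Everything in it is an assumption made for illustration: the feature order, the rows in `TOY_FLAGS`, and the labels `group_A` and `group_B` are invented placeholders, not values from the real dataset, and country size is left out to keep the example short.

```python
from collections import Counter

# Hypothetical feature order (1 = colour present on the flag):
# (has_red, has_green, has_blue, has_gold, has_white)
# The rows below are invented placeholders, NOT real countries.
TOY_FLAGS = [
    ((1, 0, 1, 0, 1), "group_A"),
    ((1, 0, 1, 0, 0), "group_A"),
    ((1, 0, 0, 0, 1), "group_A"),
    ((0, 1, 0, 1, 0), "group_B"),
    ((1, 1, 0, 1, 0), "group_B"),
    ((0, 1, 0, 1, 1), "group_B"),
]


def hamming(a, b):
    """Count the positions where two equal-length feature tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def predict(train, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    nearest = sorted(train, key=lambda row: hamming(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(predict(TOY_FLAGS, (1, 0, 1, 1, 1)))
    print(predict(TOY_FLAGS, (0, 1, 0, 1, 0)))
```

With the real data you would load the actual columns instead of `TOY_FLAGS` and hold out part of the countries to check how often the prediction is right.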
Sometimes it is also interesting to see how people extract relationships in data where they are not visible to the naked eye. This website is an expert at finding correlations where no one else can find them, for example:
You can discover new correlations on this website. Share your results with us!
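Why is it so easy to find these correlations? One common reason is that any two series that both grow over time will look related. The sketch below illustrates this with two made-up yearly series (the numbers are invented for the example, not taken from the website): their raw Pearson correlation is very high, but it fades once the shared upward trend is removed by looking at year-to-year changes. The helper names `pearson` and `differences` are our own.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def differences(xs):
    """Year-to-year changes, a simple way to remove a steady trend."""
    return [b - a for a, b in zip(xs, xs[1:])]


# Invented data: two unrelated quantities that both happen to grow each year.
noise_a = [0, 2, -1, 1, -2, 0, 2, -1, 1, -2]
noise_b = [1, 1, -1, -1, 1, 1, -1, -1, 1, 1]
series_a = [100 + 5 * t + noise_a[t] for t in range(10)]
series_b = [20 + 2 * t + noise_b[t] for t in range(10)]

raw_r = pearson(series_a, series_b)
detrended_r = pearson(differences(series_a), differences(series_b))

if __name__ == "__main__":
    print(f"raw correlation:       {raw_r:.2f}")
    print(f"after removing trend:  {detrended_r:.2f}")
```

A high correlation between two trending series is therefore weak evidence on its own; comparing the changes rather than the levels is one quick sanity check.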
At Immune Technology Institute we try to apply and teach the most advanced technology in the field of computing. We also love to share knowledge, as we believe that is when it becomes powerful.
So if you want to learn how to develop real-world applications or handle large amounts of data, you may be interested in our Master of Data Science. It is a programme aimed at professionals who want to specialise in Data Science and learn the main techniques of data mining, data analysis, and Artificial Intelligence, and how to apply them in different industries.
On 24 September we will hold an online information session with the director of the master's degree, Monica Villas. IMMUNE can help you boost your career through its partner companies and its contacts with recruiters and industry professionals. You can register HERE.
Want to be a data scientist through and through? Sign up for the Datathon organised by Immune Technology Institute in cooperation with Spanish Startups on 19 September. It will be an online event featuring top data experts and a great challenge to test your knowledge. There is a prize, too! You can register HERE.
This article was written by Alejandro Diaz Santos (LinkedIn, GitHub) for IMMUNE Technology Institute.