What is Machine Learning?
A straightforward definition of Machine Learning lies in its name: making computers learn how to make predictions, make decisions, or find connections. In the ML world, these three types of tasks are called regression, classification, and clustering, respectively. Many problems can be reduced to such tasks.
Source: Moogsoft – Understanding the Machine Learning in AIOps
Quick examples:
- Regression: Predicting the number of pieces that will leave the warehouse this month
- Classification: Deciding whether an image represents a flower or a tree
- Clustering: Regrouping documents based on their topics
Programs or systems that perform these kinds of tasks are called models. They are built by extrapolating from data of any kind: bits, numbers, texts, images, sounds, videos… Anything that can be fed to a computer.
The quality and quantity of the data both matter. Blurry images make it harder to tell flowers from trees, and too few examples make it hard for the model to generalize properly to new examples.
Finally, this data can come either from your company’s internal systems or from external sources. To forecast inventory needs, for example, the required data has most likely been collected by your ERP or sits somewhere on your servers. For many other problems, plenty of open datasets are available online.
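To make the idea of a model concrete, here is a minimal regression sketch in Python. It fits a straight line to a few past months of warehouse shipments and extrapolates to the next month. The numbers and the `fit_line` helper are made up for illustration, and a straight line is only one of many possible models.

```python
# Made-up history: (month index, pieces shipped). Purely illustrative numbers.
history = [(1, 100), (2, 120), (3, 140), (4, 160)]


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Learning" here means estimating the two parameters from past data.
slope, intercept = fit_line(history)

# Extrapolate to month 5, which the model has never seen.
forecast_month_5 = slope * 5 + intercept
print(f"Forecast for month 5: {forecast_month_5:.0f} pieces")
```

With this toy history the shipments grow by 20 pieces per month, so the fitted line forecasts 180 pieces for month 5.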
How does it work?
The machine needs to learn somehow, and it does so in a rather intuitive way: by learning from examples. Before diving in, however, one quick distinction needs to be made.
In classification and regression problems, you know the answer for past examples of the question the machine is learning to answer. You can check your past inventory levels or look at the images directly, for example. This is called supervised learning.
On the other hand, in clustering tasks, which belong to unsupervised learning, the groups you want to form are not known beforehand. Hence, you cannot give concrete examples of how to perform the grouping and have to rely on the characteristics of the data instead. Multiple grouping solutions exist, and only some of them are useful for the application you have in mind. You could group documents solely by their length, but this would not help if you wanted articles grouped by topic, as is often the case for recommender systems.
How supervised learning works:
When you were taught to perform supervised tasks (classifying or predicting things) such as telling trees from flowers, you were told many times: “this is a flower!”, “this is a big tree!” and so on. One way to see what happened is that you had data (images coming from your eyes) and labels (the answers: tree or flower), and this allowed you to build a model that, given a new image of a flower or a tree, can decide which is which.
For some tasks, however, you were not given just raw data (images in this example) with the corresponding labels. You were also taught to pay attention to specific details. You know that if the plant you are looking at is 20 meters high, it is unlikely to be a flower. These characteristics that help you decide are called features in ML jargon. They are attributes of the data you want to make decisions about, and they are correlated with the labels.
What is beautiful about ML (compared to traditional, rule-based programming) is that it mimics real learning. When designing a classifier or a predictive algorithm, you do not need to explicitly write down all the rules on which the decision should be based. There is no need to envision every possible case and encode it by hand, which in most cases would be practically impossible anyway.
As if you were teaching the task to a person, you need to explain which features are important, make examples available, and then let the student (the computer) train itself on the task. To check that the computer is learning properly, you test it on new examples it has not seen during training. If it performs well on those, you have your model!
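The height example can be sketched in a few lines of Python. The measurements below are hypothetical, and the rule learned here (the midpoint between the tallest flower and the shortest tree) is a deliberately simple stand-in that only works when the two classes do not overlap on this feature.

```python
# Hypothetical labeled examples: (height in meters, label).
examples = [
    (0.3, "flower"), (0.5, "flower"), (1.2, "flower"),
    (4.0, "tree"), (12.0, "tree"), (20.0, "tree"),
]


def learn_height_threshold(labeled_heights):
    """Learn a cut-off halfway between the tallest flower and the shortest tree."""
    tallest_flower = max(h for h, label in labeled_heights if label == "flower")
    shortest_tree = min(h for h, label in labeled_heights if label == "tree")
    return (tallest_flower + shortest_tree) / 2


def classify(height, threshold):
    """Use the single feature 'height' to decide between the two labels."""
    return "tree" if height > threshold else "flower"


threshold = learn_height_threshold(examples)
print(classify(20.0, threshold))  # a 20-meter plant
print(classify(0.4, threshold))   # a 40-centimeter plant
```

The threshold is not written by hand: it is derived from the examples, which is what makes height a useful feature here.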
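The train-then-test workflow described above could look like the following minimal sketch. It uses a 1-nearest-neighbor classifier, chosen only because it is short: "training" amounts to storing the examples, and a new plant gets the label of the most similar stored one. The two features (height in meters, stem diameter in centimeters) and all values are assumptions made up for illustration.

```python
import math

# Hypothetical features: (height in meters, stem diameter in centimeters).
train = [
    ((0.3, 0.5), "flower"), ((0.6, 0.8), "flower"), ((1.0, 1.0), "flower"),
    ((5.0, 20.0), "tree"), ((12.0, 45.0), "tree"), ((20.0, 80.0), "tree"),
]

# New examples the model has never seen, kept aside to check what it learned.
test = [
    ((0.4, 0.6), "flower"), ((15.0, 60.0), "tree"),
    ((0.8, 1.2), "flower"), ((7.0, 25.0), "tree"),
]


def predict(features, training_set):
    """Return the label of the training example closest to `features`."""
    nearest = min(training_set, key=lambda example: math.dist(features, example[0]))
    return nearest[1]


def accuracy(test_set, training_set):
    """Fraction of test examples whose predicted label matches the true label."""
    correct = sum(predict(f, training_set) == label for f, label in test_set)
    return correct / len(test_set)


score = accuracy(test, train)
print(f"Accuracy on unseen examples: {score:.0%}")
```

On real data the score is rarely perfect, and a low score on the test examples is the signal that better features or more training examples are needed.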
How unsupervised learning works:
Because of their subjective nature, unsupervised problems cannot be solved in a single “right way”. There is no clear right or wrong, because there are no labels. However, when the data is collected and processed so that it contains relevant information, valuable insights can be extracted and meaningful groups (clusters) or connections formed.
To do so, the computer measures differences between the data points (here, each point represents a document). For example, when grouping documents, one feature could be the document length and another could be the writing style. The computer then computes the mathematical distance between documents: two short, sports-related news articles will be close together and far from a group of long essays on medieval literature. The model then groups documents that are close together according to this distance metric, allowing you to easily retrieve related documents.
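As a rough sketch of this distance-based grouping, the snippet below runs a small k-means clustering, one common clustering algorithm among many, on four hypothetical documents. The document names, the numeric "style score" and the choice of two clusters are all assumptions made for illustration.

```python
import math

# Hypothetical documents: name -> (length in thousands of words, style score in [0, 1]).
documents = {
    "match_report": (0.3, 0.2),
    "transfer_news": (0.4, 0.3),
    "chivalry_essay": (8.0, 0.9),
    "troubadour_essay": (9.5, 0.8),
}


def mean_point(points):
    """Coordinate-wise average of a list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, iterations=10):
    """Group points into k clusters; returns one cluster index per point."""
    centroids = list(points[:k])  # naive, deterministic initialization
    assignment = []
    for _ in range(iterations):
        # Assign each point to its closest centroid (Euclidean distance).
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its members (unchanged if it has none).
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = mean_point(members)
    return assignment


names = list(documents)
labels = k_means([documents[name] for name in names], k=2)

clusters = {}
for name, label in zip(names, labels):
    clusters.setdefault(label, []).append(name)
print(clusters)
```

Note that no labels were provided: the two groups emerge only from the distances. In practice, features measured on very different scales are usually rescaled first, otherwise the largest one (here, length) dominates the distance.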
Conclusion
This was a brief introduction to the world of ML and the different types of problems it can tackle.
If you want to learn more about specific applications that could be used for your business or want to have them developed by a team of Switzerland’s top talents, do not hesitate to contact us. We provide consulting services and implement such solutions for our clients.
Finally, to read more articles like this one, please follow Visium.