Machine Learning Tutorial Part 1: Introduction


Machine Learning is a term that is used often and just as often misinterpreted.

Certain movies create unrealistic expectations by showing extraordinary feats achieved with the help of Machine Learning. It is important to understand what Machine Learning can and cannot help us achieve, and that it is a branch of Artificial Intelligence used to reach specific goals.

It can be understood as a process where data is provided to the machine and no explicit rules are defined for it to follow. The machine simply learns from the data it has been given; there are no hardcoded ‘if’ and ‘else’ statements.

Machine Learning algorithms extract patterns from data and learn from them, much as humans learn from experience.

Machine Learning algorithms can be classified into different types of learning based on the type of input they are supplied. They can also be classified by what they output, how they learn, and which modules can be used to implement them.

Machine Learning is widely used for applications such as data mining, computer vision, natural language processing, search engines, credit card fraud detection, speech and handwriting recognition, strategy games, robotics, and much more.

Machine Learning Terminologies

  • Batch size: The number of examples that are processed together in one group/batch during training.  
  • Categorical data: Data whose features take values from a fixed set of discrete categories (for example, a ‘colour’ feature that can only be red, green or blue).  
  • Data Analysis: This process helps in gaining a better understanding of the data with the help of sampling, measuring and visualizing it.  
  • Discrete feature: A feature whose possible values come from a fixed set that wouldn’t change.  
  • Epoch: One full training pass over the entire dataset. The number of training steps in one epoch equals the total number of examples in the dataset divided by the batch size (illustrated in the sketch after this list).  
  • Feature: An input variable, or an entry from the dataset, that the model uses to make predictions.  
  • Feature engineering: The process used to determine which attributes of the data would be useful and help in training the model better, thereby helping make better predictions. Once the right/relevant attributes are selected, they are converted into features.  
  • Feature extraction: It is just another name for feature engineering. It could also refer to extracting intermediate features which need to be passed as inputs to other machine learning models.  
  • Feature set: The set of features on which the machine learning model is trained. 
  • Feature vector: The feature values of a single example arranged as a list, which is what is passed as input to the machine learning model. 
  • Fine tuning: The process of further adjusting the parameters of an already trained model so that it makes better predictions and gives better outputs. 
  • Ground truth: The correct value or answer to the task or problem at hand. 
  • Heuristics: A quick solution to the problem at hand. It might not be optimal or well planned, but can be thought of as a quick fix. 
  • Hyperparameter: A setting, such as the learning rate or batch size, that is chosen before training begins and tuned between training runs rather than learned from the data. 
  • Instance: It is a synonym for ‘example’. It refers to a single row of a dataset. 
  • Label: This term belongs to the context of ‘supervised learning’, since models that implement supervised algorithms learn from labelled examples. The label is the answer or result attached to an example: the value the model is trained to predict. 
  • Labelled example: An example in the dataset that contains both features and a label. 
  • Loss: It is useful in determining how good or bad the model is. This is done by defining a ‘loss’ function that essentially tells how far the predicted values are from the original ones (illustrated in the sketch after this list). 
  • Loss curve: It helps in determining how well the model fits the data, and whether the model overfits, underfits, or converges. 
  • Model: It represents what the machine learning system has learnt: the patterns and relationships extracted from the input data supplied to it. 
  • Model capacity: It indicates how complex a problem the machine learning system can learn: the more complicated the problems the system can learn, the higher its capacity. Capacity generally increases with the number of parameters in the model. 
  • Model training: The process of finding the model that best fits the data so that it gives good predictions. 
  • Model function: A function that implements one of the processes involved in machine learning: training, evaluation or inference. 
  • Noise: Data that shouldn’t be part of the dataset but is, and that gets in the way of effective training. It can arise because a device mis-measured something or a human made a mistake while recording values or labelling the data. 
  • Normalization: The process of converting a range of values into a standardized range, such as -1 to +1 or 0 to 1. Data can be normalized with the help of subtraction and division (illustrated in the sketch after this list). 
  • Numerical data: Features in a dataset that are represented as integers or real-valued numbers. They are also known as continuous features. When a feature is represented as a numerical value, it can be inferred that the values have a mathematical relationship with each other. 
  • Objective: The goal the machine learning algorithm is trying to achieve. It can also be defined as a metric that the algorithm tries to optimize so that it yields better predictions. 
  • Outliers: Values that fall well outside the range of the other values present in the same dataset. 
  • Parameter: A variable of the model, such as a weight, whose value the machine learning system learns during training. 
  • Pipeline: The infrastructure surrounding a machine learning algorithm. This includes gathering data, placing it into separate files, training on it, and exporting the resulting models to a production environment. 
  • Prediction: The output of a model after it has been trained on input data. 
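
To make a few of these terms concrete, here is a minimal Python sketch using NumPy and a small invented dataset (all numbers are made up for illustration). It ties together batch size, epoch, normalization and loss:

```python
import numpy as np

# Invented dataset: 1,000 examples, each with a single numeric feature.
features = np.random.uniform(low=50.0, high=250.0, size=1000)

# Batch size: how many examples are processed together in one training step.
batch_size = 32

# One epoch is a full pass over the dataset, so the number of training
# steps per epoch is the dataset size divided by the batch size.
steps_per_epoch = int(np.ceil(len(features) / batch_size))
print(steps_per_epoch)  # 32 steps: 31 full batches plus one partial batch

# Normalization: shift and scale the values into the range 0 to 1
# using subtraction and division.
normalized = (features - features.min()) / (features.max() - features.min())
print(normalized.min(), normalized.max())  # 0.0 1.0

# Loss: a single number that says how far predictions are from the ground
# truth. Mean squared error is one common choice.
ground_truth = np.array([3.0, -0.5, 2.0])
predictions = np.array([2.5, 0.0, 2.0])
mse_loss = np.mean((predictions - ground_truth) ** 2)
print(mse_loss)  # approximately 0.167
```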

Demystifying Machine Learning

We all use the term Machine Learning to mean that machines learn by themselves from the data supplied to them. But let us dig a little deeper and understand what it actually is, and why it should be used (not everywhere, but only when certain conditions are fulfilled). In simple words, machine learning can be considered the art of teaching machines to learn from the data they have been supplied with (by humans).

It is equally important to know where machine learning sits in the hierarchy: it is a subfield of Artificial Intelligence (AI) in which a human does not explicitly write the rules the machine learns from. Humans are responsible for providing the data from which the machine learns.

The term Machine Learning has been hyped up to the point that people assume it can be used to replace humans. That is not true. Machines are made to help humans do tasks better and in much less time, thereby saving resources.  

Visualize it this way  

Humans need to learn the process of addition and then add the numbers; if the numbers are very large, a human takes more time. Computers, on the other hand, are simply programmed with how addition works, and whether the numbers are large or not, they give the result in a fraction of a second.  

Apply the same idea to a slightly more complicated task  

Suppose a person receives 100 spam emails every day and just 20 important ones. What is the general intuition? The person goes through the subject of each email, but the subject can be misleading. Assume the worst-case scenario (the subject line is misleading): the person has to open every email and scan through the contents to know whether it is an important mail or just an advertisement/promotion/spam. 

Now the same person decides to write a program that detects spam emails for them. They scan through the 100 spam emails; find recurring words, sentences, word counts and the way these words are phrased; and write rules which state that when such a condition is met, the email is to be classified as spam. This works just fine. 
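
To picture what such hand-written rules look like, here is a hypothetical Python sketch; the keywords and thresholds are invented purely for illustration and are not taken from any real spam filter:

```python
# Hypothetical hand-written spam rules; every phrase and threshold
# below is invented purely for illustration.
SPAM_PHRASES = {"free", "winner", "lottery", "click here", "limited offer"}

def is_spam(subject: str, body: str) -> bool:
    text = (subject + " " + body).lower()
    hits = sum(1 for phrase in SPAM_PHRASES if phrase in text)
    # Rule 1: two or more suspicious phrases means spam.
    if hits >= 2:
        return True
    # Rule 2: an excessive number of exclamation marks means spam.
    if body.count("!") > 5:
        return True
    # Anything the rules don't cover is treated as important.
    return False

print(is_spam("You are a lottery winner!", "Click here to claim"))  # True
print(is_spam("Team meeting", "Moved to 3 pm, please confirm"))     # False
```

Every new kind of spam that slips past these checks forces yet another rule to be bolted on, which leads directly to the problem described next. 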

Now assume a spam email arrives that is completely different and doesn’t follow the rules the person wrote. That spam email would be classified as an important email, and the person would need to rewrite the rules or add another set of rules to identify such spam emails. 

How long will this go on? 

This is where Machine Learning algorithms come into play: to be specific, classification algorithms, which help classify whether an email the user received is spam or important. 

Your next question would be: what is the input data to such a classification algorithm? It is the user’s feedback about emails. When an email comes in, the user has the option to categorize it as spam or important. After the user has done this enough times, the classifier learns to categorize spam emails; in the best cases, about 99 percent of emails are classified properly. This way, the user doesn’t need to scan through every email to check its importance. When the classifier can’t confidently categorize an email as spam or important, it simply puts the email in the ‘important’ folder so that the user doesn’t miss anything important. 
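
As a rough illustration of what the learned alternative looks like, here is a minimal sketch using scikit-learn (an assumed dependency) and a tiny invented set of user-labelled emails; a real mail service would train on a far larger collection of labelled messages:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny invented training set: the text of each email plus the label
# the user gave it when marking it as spam or important.
emails = [
    "Congratulations, you are a lottery winner, click here",
    "Limited offer, claim your free prize now",
    "Meeting moved to 3 pm, please confirm",
    "Quarterly report attached for your review",
]
labels = ["spam", "spam", "important", "important"]

# Turn each email into a feature vector of word counts, then fit a
# Naive Bayes classifier on the labelled examples.
vectorizer = CountVectorizer()
features = vectorizer.fit_transform(emails)
classifier = MultinomialNB()
classifier.fit(features, labels)

# Classify a new, unseen email; no hand-written rule is involved.
new_email = ["You are the winner of a free prize, click here"]
print(classifier.predict(vectorizer.transform(new_email)))  # expected: ['spam']
```

When the user marks more emails, the same fit step is simply repeated on the larger labelled set; no rules ever have to be rewritten by hand. 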

The above explained example is a simple use case of machine learning, which is where it all started. 

When should Machine Learning be used instead of writing logic for programs to perform tasks? 

  • When there is a requirement to write thousands to millions of rules for a specific task: Instead of writing/designing all that logic, a machine learning algorithm can be implemented, which avoids writing the rules by hand. 
  • When new data comes in very often and the rules have to be rewritten, changed or added to: Instead of changing the rules every time new data arrives, relevant machine learning algorithms can be used, since they have the ability to adapt to new data. 
  • When the amount of data is huge and useful insights need to be extracted from it: The traditional approach wouldn’t work; this is when machine learning tools can be used to visualize large amounts of data, predict values for the near future and help businesses take important decisions. 

Ready to upskill and advance your career? LenovoPRO Community has partnered with Knowledgehut to provide valuable learning resources to our members, including this tutorial series on Machine Learning! 

Want to continue this course and become a master in Machine Learning? Click here to complete the entire tutorial on Knowledgehut.


Want to continue this course on Knowledgehut? Click here for the next chapter.

About Knowledgehut: KnowledgeHut is a global ed-tech company, helping organizations and professionals unlock productive excellence through skills development. 

If you’re looking to upskill to advance your career, visit Knowledgehut today – plus, LenovoPRO Community members get 20% off your first course on the site (use promo code LENOVO)


What are your thoughts on machine learning? Do you have any familiarity with it when it comes to your business practices?

Leave your comments below to kick off the conversation!
