Introduction
This is the first post in a series I am planning to write on various topics in Machine Learning. The idea is to revisit concepts I have learned before and present them in an easy-to-understand way. In this post, I dive into the history of Machine Learning and explain its essence using definitions from various sources.
History
We start this journey all the way back in 1822. Charles Babbage, often called the father of modern computing, proposed a machine called the “Difference Engine”. It was designed to compute values of polynomial functions, and it automated the calculation by working purely through addition, eliminating the need to perform multiplication or division. This was a major breakthrough in automation and computing, but the Difference Engine was still limited in what it could do.
In 1837, he proposed a successor called the “Analytical Engine”, which had the components found in modern computers, such as an Arithmetic Logic Unit (ALU), integrated memory, and control flow (conditional branching and looping). Although the first computers were built almost a century later, they were based directly on this architecture. The Analytical Engine could be programmed using punch cards. Ada Lovelace, a mathematician who corresponded with Babbage during the development of the Analytical Engine, created the first known program: an algorithm to calculate Bernoulli numbers. She is regarded as the first computer programmer, even though no programming language existed when she wrote her program.
In the field of Machine Learning, the initial research that in some sense simulated learning was mostly statistical. One of the most prominent results is Bayes' theorem, which lets us compute conditional probabilities. A major breakthrough came in 1943, when Warren McCulloch and Walter Pitts described the structure of a mathematical neuron, now known as the McCulloch-Pitts model of a neuron. The first Machine Learning program was written by Arthur Samuel at IBM in the 1950s: a program that learned to play checkers. Samuel is also credited with coining the term “Machine Learning”. The next major breakthrough was the Perceptron, proposed by Frank Rosenblatt in 1957, a model used for binary classification.
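For reference, Bayes' theorem expresses the conditional probability of an event A given B in terms of the reverse conditional:

$$ P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} $$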
Further breakthroughs followed, such as backpropagation, recurrent neural networks, and reinforcement learning. Then, in 1997, IBM's Deep Blue beat the reigning World Chess Champion, Garry Kasparov, at a game of chess. Although this was a major milestone, many skeptics dismissed chess as a closed-domain game with very few constraints and rules compared to what general intelligence requires. Still, with recent advances in the field, Machine Learning has come a long way. Now let's look at what the term actually stands for.
What is Machine Learning?
A trite definition of ML, as hinted above, is the imitation of human learning. But the truth is we are very far from completely simulating the human brain, which makes the popular dystopia of machines taking over the world a false alarm!
If we look at how machine learning differs from traditional computer programming, the idea becomes a bit clearer. Since the beginning of computing, the goal has been to automate repetitive tasks. With traditional computer programs, we solve a problem using a predefined series of steps; the solution relies on domain knowledge and on the design of the algorithm.
Machine Learning, on the other hand, tries to create or modify the algorithm based on the data it learns from. One helpful way to think about the process, given by UC Berkeley, breaks it into three components (a code sketch of the full loop follows the list):
A decision process: The model processes the input data based on its current parameters and produces an output that is its “best guess” given what it currently understands. We call it a guess because the model attaches a probability to its output, for example: “I'm 80% certain that this is an image of a cat.”
An error function: This is a feedback loop that measures the “goodness”, or accuracy, of the guess and feeds the error back to the model.
An updating or optimization process: This is where the learning happens, as the model tweaks its parameters so that the error at the next step is smaller than it is now.
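To make these three components concrete, here is a minimal sketch of the loop in plain Python. Everything in it (the toy data, the single-parameter model, and the learning rate) is an illustrative assumption, not a reference implementation:

```python
# Toy data: y = 2x, which the model should learn to approximate.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = 0.0              # the single parameter the model can tweak
learning_rate = 0.01

for step in range(100):
    # 1. Decision process: produce a best guess from the current parameter.
    guesses = [w * x for x in xs]

    # 2. Error function: measure how far the guesses are from the truth
    #    (mean squared error here).
    error = sum((g - y) ** 2 for g, y in zip(guesses, ys)) / len(xs)

    # 3. Update / optimization: nudge the parameter in the direction
    #    that reduces the error (gradient descent on w).
    gradient = sum(2 * (g - y) * x for g, y, x in zip(guesses, ys, xs)) / len(xs)
    w -= learning_rate * gradient

print(f"learned w = {w:.3f}")  # approaches 2.0 as the loop runs
```

Running this prints a value of w close to 2.0: the decision process guesses, the error function scores the guesses, and the update step tweaks w until the guesses line up with the data.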
All three steps have many nuances, which I will try to explain in upcoming posts. This post was meant to give a brief introduction to machine learning while abstracting away the complex details.
I hope you found this article interesting. Until next time. Cheers!