The Nature of Machine Learning

Pierre Michel Hardy
17 min read · Dec 3, 2020

The things we think about when we think about machine learning, through the holy trinity of ML — data, model, and hardware.

An Instagram story I shared after finishing a coding test. The code includes data wrangling and machine learning training.

This is the second part of my three-part blog series about what the f*ck AI really is. It is a reflection on my journey to get my master’s degree in business intelligence. If you wish to read the first part (the introduction), click here.

First, we will talk about the nature of machine learning projects. I consider these to be its holy trinity: data, model, and hardware. Admittedly, while I was studying, the coursework focused only on the input data and machine learning models. However, we had to cut some assignments short because our laptops simply could not handle the computations. So, indirectly, we also learned how much raw computing power specific models need.

A brief note on terminology before I continue: many people use the terms Machine Learning (ML) and Artificial Intelligence (AI) interchangeably. Technically, there is a difference, though not everyone agrees on it. In general, my cohort and I understand ML to be a subset of AI. In practice, though? We barely have time to care. When we are in a project to map a service and build the corresponding AI or ML component, we go straight into talking about which algorithm our model uses, how it will crunch the input data, and whatnot. The terms AI and ML are mainly used to dazzle investors or business people who are not as well-versed in data analytics. Some may see this as a way of fooling them, but not so! It is merely a way to simplify our proposals so that we can explain our ideas better.

Trinity One: Data

First is data, the lifeblood of any data-centric project. One ground truth to establish is that ML models are garbage in, garbage out. This means that the quality of the model depends directly on the quality of the data you feed it. In general, excellent-quality data trumps a deluge of mixed-quality data. I personally think that this is the most crucial part of the AI trinity (this includes the methods for collecting said data, such as sensors). For example, the dominance of China in facial recognition systems does not come from the ML models they use. In fact, they use standard open-source algorithms that are available for free online. What…

Pierre Michel Hardy

A Eurasian business intelligence analyst and smart service designer aiming to contribute his economics and data skills to the betterment of our burning world.