The difference between Big Data, Data Mining, Machine Learning, Deep Learning and Data Science in a simple way.

AndReda Mind
May 13, 2021
Big Data

In this digital world, everyone leaves a trace.

We interact daily with a growing number of internet-connected devices, and they record vast amounts of data about us.

There’s even a name for it: Big Data.

Ernst and Young offers the following definition: “Big Data refers to the dynamic, large and disparate volumes of data being created by people, tools, and machines. It requires new, innovative, and scalable technology to collect, host, and analytically process the vast amount of data gathered in order to derive real-time business insights that relate to consumers, risk, profit, performance, productivity management, and enhanced shareholder value.”

There is no single definition of Big Data, but certain elements are common across the different definitions: velocity, volume, variety, veracity, and value. These are the V’s of Big Data. Let’s look at each one:

The V’s of Big Data

1 Velocity: the speed at which data accumulates. Data is being generated extremely fast, in a process that never stops. Near-real-time or real-time streaming, local, and cloud-based technologies can process information very quickly.

2 Volume: the scale of the data, or the increase in the amount of data stored.

3 Variety: the diversity of the data and its sources.

4 Veracity: the quality and origin of data, and its conformity to facts and accuracy.

The data attributes that matter here include consistency, completeness, integrity, and ambiguity.

5 Value: our ability and need to turn data into value.
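To make the veracity attributes a little more concrete, here is a minimal Python sketch that measures just one of them, completeness, as the share of records with every required field filled in. The function name `completeness`, the field names, and the sample records are all invented for illustration; real data quality checks usually cover much more than this.

```python
def completeness(records, required_fields):
    """Return the fraction of records that have a non-empty value
    in every required field (0.0 for an empty list of records)."""
    if not records:
        return 0.0
    complete = sum(
        all(rec.get(field) not in (None, "") for field in required_fields)
        for rec in records
    )
    return complete / len(records)


# Hypothetical customer records, made up for this example.
records = [
    {"id": 1, "email": "a@example.com", "country": "EG"},
    {"id": 2, "email": "", "country": "EG"},               # missing email
    {"id": 3, "email": "c@example.com", "country": None},  # missing country
    {"id": 4, "email": "d@example.com", "country": "US"},
]

score = completeness(records, ["id", "email", "country"])
print(f"Completeness: {score:.0%}")
```

Here two of the four records are fully filled in, so half of the data passes this particular check.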

The goal of working with data is to make better business decisions, often as part of a digital transformation.

Digital transformation is the integration of digital technology into all areas of a business, resulting in fundamental changes in how the business operates and the value it delivers to its customers.

Data mining

Data mining is the process of automatically searching and analyzing data to discover previously unknown patterns. It involves preprocessing the data to prepare it and transforming it into an appropriate format.

After that, insights and patterns are mined and extracted using various tools and techniques ranging from simple data visualization tools to machine learning and statistical models.
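The two stages described above (prepare the data, then mine it for patterns) can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the shopping baskets are made up, the helper names `preprocess` and `frequent_pairs` are hypothetical, and counting item pairs is just one very basic pattern-mining technique among many.

```python
from collections import Counter
from itertools import combinations


def preprocess(raw_baskets):
    """Normalize item names and drop blanks and duplicates in each basket."""
    cleaned = []
    for basket in raw_baskets:
        items = {item.strip().lower() for item in basket if item.strip()}
        if items:
            cleaned.append(sorted(items))
    return cleaned


def frequent_pairs(baskets, min_count):
    """Count how often each pair of items is bought together and keep
    only the pairs seen at least min_count times."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(basket, 2))
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Hypothetical raw purchase data with inconsistent spelling and a blank entry.
raw = [
    ["Bread", "milk ", "eggs"],
    ["bread", "Milk"],
    ["milk", "eggs", ""],
    ["Bread", "butter", "milk"],
]

baskets = preprocess(raw)
patterns = frequent_pairs(baskets, min_count=2)
print(patterns)
```

With this sample data, the pairs that survive the threshold are bread with milk and eggs with milk, a pattern that is hard to see in the messy raw lists.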

Machine learning (a subset of AI)

It uses computer algorithms to analyze data and make intelligent decisions based on what they have learned, without being explicitly programmed.
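As a small illustration of "learning without being explicitly programmed", the Python sketch below fits a straight line to example data with ordinary least squares and then predicts a value it has never seen. The rule connecting hours to scores is never written in the code; it is estimated from the examples. The data set and the function name `fit_line` are assumptions made up for this example.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept of y = slope * x + intercept
    using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: hours studied and the resulting exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

slope, intercept = fit_line(hours, scores)

# Use the learned relationship on an input that was not in the training data.
predicted = slope * 6 + intercept
print(f"Predicted score for 6 hours: {predicted:.1f}")
```

The model here is deliberately tiny, but the idea is the same for larger ones: parameters are adjusted to fit the data, then reused to make decisions about new inputs.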

Deep learning (a subset of machine learning)

It uses layered neural networks to simulate human decision-making. Deep learning algorithms can label and categorize information and identify patterns. It is what enables AI systems to continuously learn on the job and improve the quality and accuracy of results by determining whether decisions were correct.
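To show what "layered" means in practice, here is a minimal Python sketch of a network with one hidden layer that computes XOR (output 1 when exactly one input is 1). A single layer of this kind cannot compute XOR, which is why stacking layers matters. Note the assumption: the weights below are picked by hand to keep the example short, whereas a real deep learning system would learn them from data, and would typically have many more layers and units.

```python
def step(z):
    """Simple activation: the unit fires (1) only if its input is positive."""
    return 1 if z > 0 else 0


def layer(inputs, weights, biases):
    """Compute the outputs of one layer: each unit takes a weighted sum
    of all inputs, adds its bias, and applies the activation."""
    return [
        step(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


# Hand-picked weights (an assumption for illustration, not learned).
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]  # unit 1 acts like OR, unit 2 like AND
HIDDEN_B = [-0.5, -1.5]
OUT_W = [[1.0, -1.0]]                # fires when OR is true but AND is not
OUT_B = [-0.5]


def xor_net(x1, x2):
    """Pass the inputs through the hidden layer, then the output layer."""
    hidden = layer([x1, x2], HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUT_W, OUT_B)[0]


results = {(a, b): xor_net(a, b) for a in (0, 1) for b in (0, 1)}
print(results)
```

The hidden layer turns the raw inputs into intermediate features, and the output layer combines those features into the final decision.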

Neural networks

A neural network in AI is a collection of small computing units called neurons that take incoming data and learn to make decisions over time. Neural networks are often many layers deep and are the reason deep learning algorithms become more efficient as the data sets increase in volume.
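A single neuron "learning to make decisions over time" can be sketched with the classic perceptron update rule: after every wrong answer, the weights are nudged toward the correct one. The Python example below is a simplified illustration, assuming a made-up task (the logical AND of two inputs) and hypothetical function names; real networks combine many neurons and use more sophisticated training.

```python
def predict(weights, bias, inputs):
    """One neuron: weighted sum of the inputs plus bias, then a threshold."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_neuron(samples, epochs=20, lr=1.0):
    """Train the neuron with the perceptron rule: for each example,
    shift the weights and bias in proportion to the error."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias


# Training examples for logical AND: output 1 only when both inputs are 1.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_neuron(and_samples)
answers = [predict(weights, bias, inputs) for inputs, _ in and_samples]
print(answers)
```

The neuron starts out knowing nothing (all weights zero) and gets every decision right after a handful of passes over the examples.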

Data Science

It’s the process and method for extracting knowledge and insights from large volumes of disparate data. It’s an interdisciplinary field involving mathematics, statistical analysis, data visualization, machine learning, and more. It’s what makes it possible for us to make sense of information, see patterns, find meaning in large volumes of data, and use it to make decisions that drive business.
