Reputation: 10342
I am currently learning data mining and I have the following questions.
Upvotes: 1
Views: 2273
Reputation: 11
Although Data Mining and Machine Learning overlap, we can distinguish between them. Put simply: Data Mining searches for patterns in order to predict and/or describe large amounts of data, while Machine Learning goes further and uses these patterns to learn. Both are based on Statistics.
Upvotes: 1
Reputation: 1724
A comprehensive answer was already given by @SpeedBirdNine. As a side note:
Regarding your last question: in my opinion, any meaningful research either applies statistical methods to big data, which is where DM/ML comes in handy, or applies a DM/ML method that is itself built on classical statistics. Every DM/ML research project falls into one of these two categories, so statistics is never excluded, least of all when the goal is to come up with a novel DM/ML algorithm to analyze, cluster, or classify big data.
Upvotes: 0
Reputation: 4676
Data mining is the process of extracting useful information from data, such as patterns, trends, customer or user behavior, likes and dislikes, etc. This involves the use of algorithms related to Artificial Intelligence and statistics.
Wikipedia's definition of Data Mining is:
Data Mining (the analysis step of the Knowledge Discovery in Databases process,[1] or KDD), a relatively young and interdisciplinary field of computer science,[2][3] is the process of discovering new patterns from large data sets involving methods from statistics and artificial intelligence but also database management. In contrast to for example machine learning, the emphasis lies on the discovery of previously unknown patterns as opposed to generalizing known patterns to new data.
Machine Learning involves making the computer "learn" that behavior, trend, etc., and act accordingly. For example, in credit card fraud detection, the computer "learns" the behavior of a customer, and if something strange occurs (such as a transaction involving a very high amount), it flags that transaction as potential fraud.
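To make the fraud example concrete, here is a minimal sketch in Python. It is only an illustration under simple assumptions, not how real fraud systems work: the "learned behavior" is just the mean and standard deviation of a customer's past amounts, and a transaction is flagged when it lies more than a chosen number of standard deviations above that mean. The function names (`learn_profile`, `is_suspicious`), the threshold of 3, and the sample amounts are all made up for this example.

```python
from statistics import mean, stdev


def learn_profile(amounts):
    """Summarize past transaction amounts as (mean, standard deviation)."""
    return mean(amounts), stdev(amounts)


def is_suspicious(amount, profile, threshold=3.0):
    """Flag an amount that is more than `threshold` standard deviations above the mean."""
    mu, sigma = profile
    if sigma == 0:
        # Every past amount was identical, so any different amount is unusual.
        return amount != mu
    return (amount - mu) / sigma > threshold


# Hypothetical purchase history for one customer (illustrative numbers only).
history = [23.5, 41.0, 18.2, 35.9, 27.4, 30.1, 22.8, 39.5]
profile = learn_profile(history)

print(is_suspicious(32.0, profile))   # an ordinary purchase
print(is_suspicious(950.0, profile))  # a very high amount
```

With this history the first call is not flagged and the second one is. The "learning" step here is the statistics computed from past data; the decision is then applied to a transaction the program has not seen before.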
Wikipedia's definition of machine learning is:
Machine learning, a branch of artificial intelligence, is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as from sensor data or databases. Machine Learning is concerned with the development of algorithms allowing the machine to learn via inductive inference based on observing data that represents incomplete information about statistical phenomenon. Classification which is also referred to as pattern recognition, is an important task in Machine Learning, by which machines “learn” to automatically recognize complex patterns, to distinguish between exemplars based on their different patterns, and to make intelligent decisions.
Machine Learning uses Data Mining to learn the pattern, behavior, trend, etc., because Data Mining is the way of extracting this information from a set of data. Data Mining and Machine Learning both use Statistics to make decisions. So yes, statistics is involved and is very important in both Data Mining and Machine Learning.
Upvotes: 4
Reputation: 500923
There tends to be a lot of overlap between what different people call machine learning, data mining and statistics. The very definitions of the terms would depend on whom you ask.
Here is a nice overview, with lots of great links.
Upvotes: 3