What Is Data Mining?
Data mining refers to extracting, or "mining," knowledge from large amounts of data. The term is actually a misnomer, since what is mined is
knowledge rather than the data itself. Thus, data mining would have been more appropriately named knowledge mining, a name that emphasizes
extracting knowledge from large amounts of data.
It is the computational process of discovering patterns in large data
sets involving methods at the intersection of artificial intelligence, machine
learning, statistics, and database systems. The overall goal of the data mining
process is to extract information from a data set and transform it into an
understandable structure for further use.
The key properties of data mining are:
• Automatic discovery of patterns
• Prediction of likely outcomes
• Creation of actionable information
• Focus on large datasets and databases
The Scope of Data Mining:
Data mining derives its name from the similarities between searching for valuable business information in a large database (for example, finding linked
products in gigabytes of store scanner data) and mining a mountain for a vein
of valuable ore. Both processes require either sifting through an immense
amount of material or intelligently probing it to find exactly where the value
resides. Given databases of sufficient size and quality, data mining technology
can generate new business opportunities by providing these capabilities:
Automated prediction of trends and behaviors.
Data mining automates the process of finding predictive information in large
databases. Questions that traditionally required extensive hands-on analysis
can now be answered quickly, directly from the data. A typical example of a
predictive problem is targeted marketing: data mining uses data on past
promotional mailings to identify the targets most likely to maximize return on
investment in future mailings. Other predictive problems include forecasting
bankruptcy and other forms of default, and identifying segments of a
population likely to respond similarly to given events.
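As a rough illustration of the targeted-marketing idea, the sketch below ranks customer segments by how often they responded to past mailings. The segment names and the records are made up for the example, and a real system would use a proper predictive model rather than raw response rates.

```python
from collections import defaultdict

# Hypothetical past-mailing records: (customer segment, did the customer respond?)
past_mailings = [
    ("young_urban", True), ("young_urban", False), ("young_urban", True),
    ("retired", False), ("retired", False), ("retired", True),
    ("family", True), ("family", True), ("family", True), ("family", False),
]

def response_rates(records):
    """Return {segment: fraction of past mailings that got a response}."""
    sent = defaultdict(int)
    responded = defaultdict(int)
    for segment, did_respond in records:
        sent[segment] += 1
        if did_respond:
            responded[segment] += 1
    return {segment: responded[segment] / sent[segment] for segment in sent}

def rank_segments(records):
    """Return segments ordered from highest to lowest historical response rate."""
    rates = response_rates(records)
    return sorted(rates, key=rates.get, reverse=True)

ranking = rank_segments(past_mailings)
print(ranking)
```

The first segment in `ranking` is the one that responded most often in the past, which makes it a natural first candidate for the next mailing.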
Automated discovery of previously unknown patterns.
Data mining tools sweep through
databases and identify previously hidden patterns in one step. An example of
pattern discovery is the analysis of retail sales data to identify seemingly
unrelated products that are often purchased together. Other pattern discovery
problems include detecting fraudulent credit card transactions and identifying
anomalous data that could represent data entry keying errors.
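The retail example above can be sketched in a few lines: count how often each pair of products appears in the same basket and keep the pairs that show up in a large enough share of transactions. The baskets and the `min_support` threshold are invented for illustration; production tools use more scalable algorithms such as Apriori or FP-growth.

```python
from collections import Counter
from itertools import combinations

# Hypothetical store scanner data: one set of products per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]

def frequent_pairs(transactions, min_support):
    """Return {(item_a, item_b): support} for pairs bought together in at
    least `min_support` (a fraction between 0 and 1) of all transactions."""
    counts = Counter()
    for basket in transactions:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}

pairs = frequent_pairs(baskets, min_support=0.5)
for pair, support in sorted(pairs.items()):
    print(pair, support)
```

With this toy data, beer and diapers end up among the frequent pairs even though nothing about the products themselves suggests a link.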
Tasks of Data Mining
Data mining involves six common classes of tasks:
• Anomaly detection (outlier/change/deviation detection) – The identification
of unusual data records that might be interesting, or of data errors that
require further investigation.
• Association rule learning (dependency modeling) – Searches for relationships
between variables. For example, a supermarket might gather data on customer
purchasing habits. Using association rule learning, the supermarket can
determine which products are frequently bought together and use this
information for marketing purposes. This is sometimes referred to as market
basket analysis.
• Clustering – The task of discovering groups and structures in the data that
are in some way or another "similar", without using known structures in the
data.
• Classification – The task of generalizing known structures to apply to new
data. For example, an e-mail program might attempt to classify an e-mail as
"legitimate" or as "spam".
• Regression – Attempts to find a function that models the data with the least
error.
• Summarization – Providing a more compact representation of the data set,
including visualization and report generation.
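To make the regression task concrete, here is a minimal sketch that fits a straight line to a handful of points by ordinary least squares, that is, by choosing the slope and intercept that minimize the sum of squared errors. The function name `fit_line` and the data are assumptions made for the example, and the points are deliberately noise-free so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)

# Use the fitted function to predict an unseen value.
prediction = slope * 6 + intercept
```

On real data the points would not lie exactly on a line, and the fitted function would only approximate them.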
Data Mining Example:
Consider the marketing head of a telecom service provider who
wants to increase revenues from long-distance services. For a high ROI on his sales
and marketing efforts, customer profiling is important. He has a vast data pool
of customer information such as age, gender, income, and credit history, but it is impossible to determine the characteristics of people who prefer long-distance
calls through manual analysis. Using data mining techniques, he may uncover
patterns linking heavy long-distance callers to their characteristics.
For
example, he might learn that his best customers are married females between the
ages of 45 and 54 who make more than $80,000 per year. Marketing efforts can then be
targeted to that demographic.
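One simple way to approximate this kind of profiling is to compare how common each customer segment is among heavy long-distance users with how common it is overall (often called lift). The sketch below does that for a tiny, invented customer table; the field layout, the 300-minute threshold, and the function name `segment_lift` are all assumptions for illustration.

```python
from collections import Counter

# Hypothetical customers: (age band, gender, long-distance minutes per month)
customers = [
    ("45-54", "F", 420), ("45-54", "F", 380), ("45-54", "F", 350),
    ("45-54", "M", 120), ("35-44", "M", 300), ("35-44", "M", 100),
    ("35-44", "F", 80), ("25-34", "F", 60), ("25-34", "M", 90),
]

def segment_lift(records, threshold):
    """Return {segment: lift}, where lift is the segment's share among heavy
    users (minutes >= threshold) divided by its share among all customers."""
    all_counts = Counter((age, gender) for age, gender, _ in records)
    heavy_counts = Counter(
        (age, gender) for age, gender, minutes in records if minutes >= threshold
    )
    n_all = len(records)
    n_heavy = sum(heavy_counts.values())
    if n_heavy == 0:
        return {}  # No heavy users at this threshold, so nothing to profile.
    lift = {}
    for segment, count in all_counts.items():
        share_all = count / n_all
        share_heavy = heavy_counts[segment] / n_heavy
        lift[segment] = share_heavy / share_all
    return lift

lift = segment_lift(customers, threshold=300)
best_segment = max(lift, key=lift.get)
print(best_segment, lift[best_segment])
```

A lift well above 1 means the segment is over-represented among heavy users, which is the kind of pattern the marketing head could act on.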
What is the role of data mining?
Data mining is the
process of finding anomalies, patterns, and correlations within large data sets to
predict outcomes. Using a broad range of techniques, you can use this
information to increase revenues, cut costs, improve customer relationships,
reduce risks and more.