What is data mining

1. What is data mining? In your answer, address the following:
a) Is it just another hype?
b) Is it a simple transformation or application of technology developed from databases, statistics,
machine learning, and pattern recognition?
c) We have presented a view that data mining is the result of the evolution of database
technology. Do you think that data mining is also the result of the evolution of machine
learning research? Can you present such views based on the historical progress of this
discipline? Do the same for the fields of statistics and pattern recognition.
d) Describe the steps (1-2 lines per step) involved in data mining when viewed as a process of
knowledge discovery.
2. How is a data warehouse different from a database? How are they similar?
3. Define each of the following data mining functionalities: characterization, discrimination,
association and correlation analysis, classification, regression, clustering, and outlier analysis. Give
examples of each data mining functionality, using a real-life database that you are familiar with.
4. Present an example where data mining is crucial to the success of a business. What data mining
functionalities does this business need (e.g., think of the kinds of patterns that could be mined)?
5. Describe three challenges to data mining regarding data mining methodology and user interaction.
6. What are the major challenges of mining a huge amount of data (such as billions of tuples) in
comparison with mining a small amount of data (such as a data set of a few hundred tuples)?
7. Briefly describe the following advanced database systems and applications: object-relational
databases, spatial databases, text databases, multimedia databases, the World Wide Web.
8. Outliers are often discarded as noise. However, one person’s garbage could be another’s treasure.
For example, exceptions in credit card transactions can help us detect the fraudulent use of credit
cards. Taking fraud detection as a model, provide three similar examples in which outliers are
important to detect.
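
As a hint for question 8, one possible way to flag unusual transaction amounts is the modified z-score, which uses the median and the median absolute deviation (MAD) so that a single extreme value does not distort the baseline. The sketch below is only an illustration: the function name flag_outliers, the sample amounts, and the 3.5 cutoff are assumptions and not part of the exercise.

```python
from statistics import median


def flag_outliers(amounts, threshold=3.5):
    """Return the amounts whose modified z-score exceeds the threshold.

    Hypothetical helper for illustration; the 3.5 cutoff is a common
    rule of thumb, not a value given in the exercise.
    """
    center = median(amounts)
    # Median absolute deviation: a spread estimate that is robust to outliers.
    mad = median([abs(x - center) for x in amounts])
    if mad == 0:
        # At least half the values equal the median; the score is undefined.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    # under a normal distribution.
    return [x for x in amounts if 0.6745 * abs(x - center) / mad > threshold]


# Made-up card transaction amounts; only the last one is far from the rest.
transactions = [25, 30, 22, 28, 31, 27, 950]
print(flag_outliers(transactions))
```

In a fraud setting the flagged amount is the item of interest rather than noise to be discarded, which is the point the question makes.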