
Data mining and warehousing mcq
1. What is true about data mining?
- Data Mining is defined as the procedure of extracting information from huge sets of data
- Data mining also involves other processes such as Data Cleaning, Data Integration, Data Transformation
- Data mining is the procedure of mining knowledge from data.
- All of the above
All of the above
2. How many categories of functions are involved in Data Mining?
- 1
- 2
- 3
- 4
2
3. The mapping or classification of a class with some predefined group or class is known
as?
- Data Characterization
- Data Discrimination
- Data Set
- Data Sub Structure
Data Discrimination
4. The analysis performed to uncover interesting statistical correlations between
associated-attribute-value pairs is called?
- Mining of Association
- Mining of Clusters
- Mining of Correlations
- None of the above
Mining of Correlations
5. ______ may be defined as the data objects that do not comply with the general
behavior or model of the data available.
- Outlier Analysis
- Evolution Analysis
- Prediction
- Classification
Outlier Analysis
6. The issue of “efficiency and scalability of data mining algorithms” comes under?
A. Mining Methodology and User Interaction Issues
B. Performance Issues
C. Diverse Data Types Issues
D. None of the above
Performance Issues
7. To integrate heterogeneous databases, how many approaches are there in Data
Warehousing?
- 1
- 2
- 3
- 4
2
8. Which of the following is a correct advantage of the Update-Driven Approach in Data Warehousing?
A. This approach provides high performance.
B. The data can be copied, processed, integrated, annotated, summarized and
restructured in the semantic data store in advance.
C. Both A and B
D. None Of the above
Both A and B
9. What is the use of data cleaning?
A. to remove the noisy data
B. correct the inconsistencies in data
C. transformations to correct the wrong data.
D. All of the above
All of the above
10. Data Mining System Classification consists of?
A. Database Technology
B. Machine Learning
C. Information Science
D. All of the above
All of the above
11. Which of the following is a good alternative to the star schema?
- snow flake schema
- star schema
- star snow flake schema
- fact constellation
fact constellation
12. Patterns that can be discovered from a given database are of which type?
- More than one type
- Multiple type always
- One type only
- No specific type
More than one type
13. Background knowledge is…
- It is a form of automatic learning.
- A neural network that makes use of a hidden layer
- The additional acquaintance used by a learning algorithm to facilitate the learning process
- None of these
The additional acquaintance used by a learning algorithm to facilitate the learning process
14. Which of the following is true for Classification?
- subdivision of a set
- A measure of the accuracy
- The task of assigning a classification
- All of these
subdivision of a set
15. Data mining is?
- time variant non-volatile collection of data
- The actual discovery phase of a knowledge discovery process
- The stage of selecting the right data
- None of these
The actual discovery phase of a knowledge discovery process
16. ______ is not a data mining functionality.
A) Clustering and Analysis
B) Selection and interpretation
C) Classification and regression
D) Characterization and Discrimination
Selection and interpretation
17. Data mining can also be applied to which of the following other forms of data?
a) Data streams & Sequence data
b) Networked data
c) Text & Spatial data
d) All of these
All of these
18. ______ is the output of KDD.
a) Query
b) Useful Information
c) Data
d) information
Useful Information
19. What is noise?
a) A component of a network
b) In the context of KDD and data mining, random errors in the data
c) One of the defining aspects of a data warehouse
d) None of these
In the context of KDD and data mining, random errors in the data
20. Firms that are engaged in sentiment mining are analyzing data collected from?
A. social media sites.
B. in-depth interviews.
C. focus groups.
D. experiments.
social media sites.
21. Which of the following forms of data mining assigns records to one of a
predefined set of classes?
(A). Classification
(B). Clustering
(C). Both A and B
(D). None
Classification
22. The learning which is used to find the hidden pattern in unlabeled data is called?
(A). Unsupervised learning
(B). Supervised learning
(C). Reinforcement learning
Unsupervised learning
23. Self-organizing maps are an example of which type of learning?
(A). Reinforcement learning
(B). Supervised learning
(C). Unsupervised learning
(D). Missing data imputation
Unsupervised learning
24. In the example of predicting the number of babies from the size of the stork population, the number of babies is the?
(A). feature
(B). outcome
(C). attribute
(D). observation
outcome
25. Which of the following does not belong to data mining?
(A). Knowledge extraction
(B). Data transformation
(C). Data exploration
(D). Data archaeology
Data transformation
26. The learning which is used for inferring a model from labeled training data is
called?
(A). Unsupervised learning
(B). Reinforcement learning
(C). Supervised learning
(D). Missing data imputation
Supervised learning
27. Which of the following is the right approach to Data Mining?
(A). Infrastructure, exploration, analysis, exploitation, interpretation
(B). Infrastructure, exploration, analysis, interpretation, exploitation
(C). Infrastructure, analysis, exploration, interpretation, exploitation
(D). None of these
Infrastructure, exploration, analysis, interpretation, exploitation
28. Which of the following terms is used as a synonym for data mining?
(A). knowledge discovery in databases
(B). data warehousing
(C). regression analysis
(D). parallel processing in databases
knowledge discovery in databases
29. ______ is an essential process where intelligent methods are applied to extract data patterns.
A) Data Warehousing
B) Data Mining
C) Data Base
D) Data Structure
Data Mining
30. Data mining requires
- Large quantities of operational data stored over a period of time
- Lots of tactical data
- Several tape drives to store archival data
- Large mainframe computers
Large quantities of operational data stored over a period of time
31. Data by itself is not useful unless
- It is massive
- It is processed to obtain information
- It is collected as a raw data from diverse sources
- It is properly stated
It is processed to obtain information
32. Which of the following is NOT an example of ordinal attributes?
- Zip codes
- Ordered numbers
- Ascending or descending names
- Military ranks
Zip codes
33. In an asymmetric attribute
- Order of values is important
- All values are equal
- Only non-zero value is important
- Range of values is important
Only non-zero value is important
34. Identify the example of Nominal attribute
- Temperature
- Mass
- Salary
- Gender
Gender
35. Which of the following is not a data pre-processing method?
- Data Visualization
- Data Discretization
- Data Cleaning
- Data Reduction
Data Visualization
36. Correlation analysis is used for __
- Handling missing values
- Identifying redundant attributes
- Handling different data formats
- Eliminating noise
Identifying redundant attributes
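As a rough illustration of question 36, the sketch below computes the Pearson correlation coefficient between two numeric attributes. The attribute names and values are hypothetical; a coefficient close to +1 or -1 suggests that one attribute carries largely the same information as the other and may be redundant.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant attribute")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: the same height stored in two different units.
height_cm = [150, 160, 170, 180, 190]
height_in = [h / 2.54 for h in height_cm]

r = pearson(height_cm, height_in)
print(f"correlation = {r:.3f}")
```

Because the second attribute is just a rescaled copy of the first, the coefficient comes out at (approximately) 1, so one of the two columns could be dropped during data integration.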
37. ______combines data from multiple sources into a coherent store
- Data Characterization
- Data Classification
- Data Integration
- Data Selection
Data Integration
38. Which of the following is/are attribute subset selection criteria?
- Forward selection
- Backward elimination
- Decision tree induction
- All of the above
All of the above
39. Data mining can also be applied to other forms of data such as:
i) Data streams
ii) Sequence data
iii) Networked data
iv) Text data
v) Spatial data
A) i, ii, iii and v only
B) ii, iii, iv and v only
C) i, iii, iv and v only
D) All i, ii, iii, iv and v
All i, ii, iii, iv and v
40. ____ normalization is not very effective at handling outliers
- Min max
- Z Score
- Decimal Scaling
- None of the above
Min max
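To see why min-max normalization copes poorly with outliers (question 40), consider the minimal sketch below. The function name `min_max` and the salary figures are made up for illustration; the formula is the usual linear rescaling to a target range.

```python
def min_max(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to the lower bound.
        return [new_min for _ in values]
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]

# Hypothetical salaries (in thousands); 1000 is an outlier.
salaries = [30, 35, 40, 45, 1000]
scaled = min_max(salaries)

for original, value in zip(salaries, scaled):
    print(original, round(value, 4))
```

The single outlier defines the top of the range, so the four ordinary values are squeezed into a tiny interval near 0 and become hard to tell apart.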
41. The full form of KDD is ______
A) Knowledge Database
B) Knowledge Discovery in Databases
C) Knowledge Data House
D) Knowledge Data Definition
Knowledge Discovery in Databases
42. A collection of interesting and useful patterns in a database is called ___
A. knowledge.
B. information.
C. data.
D. algorithm
knowledge.
43. Data ______ is the process of finding a model that describes and distinguishes data classes or concepts.
a) Characterization
b) Mining
c) Clustering
d) Classification
Classification
44. To remove noise and inconsistent data __ is needed
- Data Transformation
- Data Reduction
- Data Integration
- Data Cleaning
Data Cleaning
45. The terms equality and roll up are associated with _
- OLTP
- Visualization
- Data mart
- Decision Tree
Data mart
46. An operational system is which of the following?
A. A system that is used to run the business in real time and is based on historical data.
B. A system that is used to run the business in real time and is based on current
data.
C. A system that is used to support decision making and is based on current data.
D. A system that is used to support decision making and is based on historical data.
A system that is used to run the business in real time and is based on current data.
47. Data warehouse is which of the following?
A. Can be updated by end users.
B. Contains numerous naming conventions and formats.
C. Organized around important subject areas.
D. Contains only current data.
Organized around important subject areas.
48. Data transformation includes which of the following?
A. A process to change data from a detailed level to a summary level
B. A process to change data from a summary level to a detailed level
C. Joining data from one source into various sources of data
D. Separating data from one source into various sources of data
A process to change data from a detailed level to a summary level
49. The ……………… allows the selection of the relevant information necessary for
the data warehouse.
A top-down view
B data warehouse view
C data source view
D business query view
A top-down view
50. Which of the following is not a component of a data warehouse?
A Metadata
B Current detail data
C Lightly summarized data
D Component Key
Component Key
51. Which of the following is not a kind of data warehouse application?
A Information processing
B Analytical processing
C Data mining
D Transaction processing
Transaction processing
52. ___ is not associated with data cleaning process.
- Deduplication
- Domain consistency
- Segmentation
- Disambiguation
Segmentation
53. Dimensionality refers to
- Cardinality of key values in a star schema
- The data that describes the transactions in the fact table
- The level of detail of data that is held in the fact table
- The level of detail of data that is held in the dimension table
The data that describes the transactions in the fact table
54. Expansion for DSS in DW is
- Decisive Strategic System
- Data Support System
- Data Store System
- Decision Support system
Decision Support system
55. Data in a data warehouse
- in a flat file format
- can be normalised but often is not
- must be in normalised form to at least 3NF
- must be in normalised form to at least 2NF
can be normalised but often is not
56. Friendship structure of users in a social networking site can be considered as
an example of ____
- Record data
- Ordered data
- Graph data
- None of the above
Graph data
57. A café owner wanted to compare how much revenue he gained from lattes
across different months of the year. What type of variable is ‘month’?
- Continuous
- Categorical
- Discrete
- Nominal
Categorical
58. An outlier is a _
- Description of records in the data
- Data point which is considered different from other data points
- Record with missing attributes
- Duplicate record
Data point which is considered different from other data points
59. Which of the following operations can be performed on ordinal attributes?
- Distinctness
- Order
- Both of the above
- None of the above
Both of the above
60. Height of a person, can be considered as an attribute of _____type?
- Nominal
- Ordinal
- Interval
- Ratio
Ratio
61. The cosine similarity measure is used to measure _
- The Euclidean distance between vectors
- The Manhattan distance between vectors
- The similarity of documents
- The dissimilarity of vectors
The similarity of documents
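For question 61, the sketch below shows one common way to compute cosine similarity between two documents represented as term-frequency vectors. The vectors are hypothetical; only the direction of each vector matters, not its length.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical term-frequency vectors over the same four-word vocabulary.
doc1 = [2, 1, 0, 1]
doc2 = [4, 2, 0, 2]   # same word proportions as doc1, twice as long
doc3 = [0, 0, 3, 0]   # shares no words with doc1

print(cosine_similarity(doc1, doc2))
print(cosine_similarity(doc1, doc3))
```

Documents with the same word proportions score (approximately) 1 regardless of length, while documents with no words in common score 0.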
62. The formula for computing the dissimilarity between two objects described by categorical variables is (here p is the total number of categorical variables and m is the number of matches):
- D(i, j) = (p - m) / p
- D(i, j) = (p - m) / m
- D(i, j) = (m - p) / p
- D(i, j) = (m - p) / m
D(i, j) = (p - m) / p
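The answer to question 62 is the simple matching dissimilarity: the fraction of categorical variables on which two objects disagree. A minimal sketch follows, using hypothetical objects described by three categorical variables.

```python
def categorical_dissimilarity(obj_i, obj_j):
    """Simple matching dissimilarity: (p - m) / p, where p is the number of
    categorical variables and m is the number of variables on which the
    two objects have the same value."""
    if len(obj_i) != len(obj_j) or len(obj_i) == 0:
        raise ValueError("objects must have the same non-zero number of variables")
    p = len(obj_i)
    m = sum(1 for a, b in zip(obj_i, obj_j) if a == b)
    return (p - m) / p

# Hypothetical objects described by (colour, size, shape).
apple = ("red", "small", "round")
melon = ("red", "large", "round")

d = categorical_dissimilarity(apple, melon)
print(round(d, 3))
```

Here p = 3 and m = 2, so the dissimilarity is 1/3. Identical objects give 0 and objects that differ on every variable give 1.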
63. The Euclidean and Manhattan distances between the objects P (1, 2, 3) and Q (2, 1, 0) are _
- 3.32, 4 respectively
- 3.32, 5 respectively
- 5, 3.32 respectively
- 3.30, 3 respectively
3.32, 5 respectively
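The answer to question 63 can be checked with a short calculation on the two points (1, 2, 3) and (2, 1, 0). The helper names below are illustrative, not taken from any particular library.

```python
import math

def euclidean(p, q):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

p = (1, 2, 3)
q = (2, 1, 0)

# Differences are (-1, 1, 3): squares sum to 11, absolute values sum to 5.
print(round(euclidean(p, q), 2))
print(manhattan(p, q))
```

The Euclidean distance is the square root of 11, about 3.32, and the Manhattan distance is 1 + 1 + 3 = 5.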
64. The main organisational justification for implementing a data warehouse is to
provide
- ETL from operation systems to strategic systems
- Large scale transaction processing
- Storing large volumes of data
- Decision support
Decision support
65. A data warehouse
a. must import data from transactional systems whenever significant changes occur in the
transactional data
b. works on live transactional data to provide up to date and valid results
c. takes regular copies of transaction data
d. takes preprocessed transaction data and stores in a way that is optimised for
analysis
takes preprocessed transaction data and stores in a way that is optimised for analysis
66. Data warehouse contains ________data that is seldom found in the operational environment
- informational
- normalized
- denormalized
- summary
summary
67. In a snowflake schema which of the following types of tables is considered?
- Fact
- Dimension
- Both (a) and (b)
- None of the above
Both (a) and (b)
68. Which of the following statements about data warehouse is true?
- A data warehouse is necessary to all those organisations that are using relational OLTP
- A data warehouse is useful to all organisations that currently use OLTP
- A data warehouse is valuable to the organisations that need to keep an audit trail of their activities
- A data warehouse is valuable only if the organisation has an interest in analysing historical data
A data warehouse is valuable only if the organisation has an interest in analysing historical data
69. When you ____ the data, you are aggregating the data to a higher level
- Slice
- Roll Up
- Roll Down
- Drill Down
Roll Up
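As a small illustration of roll-up (question 69), the sketch below aggregates monthly sales to the quarter level by climbing a month-to-quarter hierarchy. The figures and the `MONTH_TO_QUARTER` mapping are hypothetical.

```python
from collections import defaultdict

# Hypothetical concept hierarchy: month -> quarter.
MONTH_TO_QUARTER = {"Jan": "Q1", "Feb": "Q1", "Mar": "Q1", "Apr": "Q2"}

# Hypothetical fact rows: (month, sales amount).
monthly_sales = [("Jan", 100), ("Feb", 150), ("Mar", 50), ("Apr", 200)]

def roll_up(rows, hierarchy):
    """Aggregate amounts from a lower level (month) to a higher level (quarter)."""
    totals = defaultdict(int)
    for member, amount in rows:
        totals[hierarchy[member]] += amount
    return dict(totals)

quarterly_sales = roll_up(monthly_sales, MONTH_TO_QUARTER)
print(quarterly_sales)
```

Drill-down is the reverse operation: starting from the quarterly totals and returning to the monthly detail.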
70. The process of viewing the cross-tab (Single dimensional) with a fixed value of
one attribute is _
- Slicing
- Dicing
- Pivoting
- Both Slicing and Dicing
Slicing
71. What do data warehouses support?
- OLAP
- OLTP
- OLAP and OLTP
- Operational databases
OLAP
72. A data cube consists of _
- Dimensional data
- Multidimensional data
- No dimensional data
- 1 dimensional data
Multidimensional data
73. Which type of data storage architecture gives fastest performance?
- ROLAP
- MOLAP
- HOLAP
- DOLAP
MOLAP
74. Dissimilarity can be defined as __
- How much certain objects differ from each other
- How similar certain objects are to each other
- Dissimilarities are non negative numbers d(i,j) that are small when i and j are close to each other and that become large when i and j are very different
- Both (a) and (c)
Both (a) and (c)
75. ______supports basic OLAP operations, including slice and dice, drill-down,
roll-up and pivoting
- Information processing
- Analytical processing
- Data processing
- Transaction processing
Analytical processing