Data Warehousing MCQs

Data Warehousing MCQs

Try to answer these 100+ Data Warehousing MCQs and check your understanding of the Data Warehousing subject.
Scroll down and let's begin!

1: Which of the following implements Multidimensional OLAP?

A.   Oracle Essbase

B.   IBM TM1

C.   Microsoft SSAS

D.   All of the Above

2: Which of the following is true about Data Scrubbing?

A.   It is same as data validation

B.   Harmonization of data cannot be considered as Data Scrubbing

C.   It involves data cleansing

D.   Standardization of data cannot be considered as Data Scrubbing

3: Operational Database(OLTP) is based on which of the following?

A.   Snowflake Schema

B.   Entity Relationship Model

C.   Star Schema

D.   Fact Constellation Schema

4: The ‘Slice’ operation deals with?

A.   Selecting the cells of any one dimension of data cube

B.   Merging the cells of all but one dimension

C.   Merging the cells along one dimension

D.   Selecting all but one dimension of the data cube.

5: Which of the following is NOT a benefit of data warehousing?

A.   To maintain data history

B.   To enhance data quality

C.   To collate data from multiple sources into a multiple databases

D.   To integrate data from multiple source systems

6: Where does cleansing take place during the ETL phase of data bound for a MOLAP based data warehouse?

A.   ODS

B.   Cleanroom Table

C.   Staging Area

D.   No distinct cleansing phase, data cleansed during MDX queries

7: Which of the following is true about Specialized SQL servers?

A.   It uses the two level of data storage representation to handle dense and sparse data sets

B.   It provides advanced query language and query processing support for SQL queries over star and snowflake schemas

C.   It facilitates OLTP operations in SQL

D.   It uses array-based multidimensional storage engines

8: A data warehouse is built as a separate repository of data, different from the operational data of an enterprise because?

A.   It is necessary to keep the operational data free of any warehouse operations

B.   It contains data derived from multiple sources

C.   A data warehouse cannot afford to allow corrupted data within it

D.   A data warehouse contains summarized data whereas the operational database contains transactional data

9: What is an example of an OLAP dice operation?

A.   specifying a particular year and region

B.   randomizing the year and region

C.   specifying a particular year

D.   randomizing the year

10: What table is at the center of a star schema?

A.   fact table

B.   core table

C.   metadata table

D.   dimension table

11: Which of the following is TRUE about ROLAP?

A.   Can handle large amounts of data

B.   Performance can be slow

C.   All of the given options are valid

D.   Can leverage functionalities inherent in the relational database

12: Which of the following is VALID about time-variant characteristic of a data warehouse?

A.   All of the given options are valid

B.   Data warehouses contain data that is generally loaded from the operational databases on a regular interval

C.   Time horizon of a data warehouse is significanlty longer than that of operational systems

D.   Data Warehouse maintains both historical and (nearly) current data

13: Which of the following is TRUE about MOLAP?

A.   MOLAP cubes are built for fast data retrieval

B.   MOLAP cubes are optimal for slicing and dicing operations

C.   Data is stored in a multidimensional cube.

D.   All of the given options are valid

14: Which of the following is true about Data mining?

A.   It allows users to analyze data from many different dimensions or angles

B.   All of the given options are valid

C.   It is the process of analyzing data from different perspectives and summarizing it into useful information

D.   It is the process of finding correlations or patterns among various fields

15: Which of the following systems is optimized for multidimensional analysis?

A.   SQL

B.   ETL

C.   OLAP

D.   OLTP

16: Which of the following are valid Logical Extraction methods?

A.   Incremental Extraction

B.   Both Full Extraction and Incremental Extraction

C.   Full Extraction

D.   Online extraction

17: Which of the following is true about Data Warehousing?

A.   All of the given options are true

B.   It is copy of transaction data specifically structured for query and analysis

C.   It is designed to facilitate reporting and analysis

D.   It is a non-volatile time-variant repository

18: Which of the following is true about Snowflake schema?

A.   It is represented by centralized fact tables

B.   Its a logical arrangement of tables in a multidimensional database

C.   All of the above

D.   It is a variation of the star schema

19: What is the process by which raw data is migrated into a data warehouse?

A.   Extract, Transform, Load

B.   Extract Test Language

C.   Export, Transmit, Load

D.   Export, Translate, Load

20: Data Warehouse (OLAP) is based on which of the following?

A.   Star Schema

B.   Snowflake Schema

C.   Fact Constellation Schema

D.   All of given options are valid

21: Which of the following uses the idea of a data cube to represent the dimensions of data ?

A.   OLTP

B.   Relational Database

C.   ODS

D.   OLAP or Multidimensional Database

22: Which system is appropriate for recording customer transactions in real time?

A.   Star Schema

B.   OLTP

C.   OLAP

D.   Snowflake Schema

23: When is it appropriate to use a Joiner Transformation during ETL?

A.   When the two sources have multiple matching columns

B.   When the two sources have a primary-key to foreign-key relationship

C.   All of the Above

D.   When the two sources are heterogeneous

24: What is the "grain" of a fact table?

A.   the most atomic level at which the facts may be defined

B.   the raw data from which the facts are derived

C.   the direction along which additive measures can be combined

D.   the ratio of facts to dimensions

25: Which of the following operations navigate data from less detailed data to highly detailed data?

A.   Drill-up

B.   Roll-down

C.   Roll-up

D.   Drill-down

26: Which MDX operation returns all possible combinations of one or more sets?

A.   Crossjoin

B.   AllMembers

C.   Leaves

D.   Distinct

27: Which language is appropriate for querying a non-relational OLAP database?

A.   Fortran

B.   MDX

C.   SQL

D.   SPSS

28: Modifying an OLAP Cube to view information from only a single year is an example of which operation?

A.   dice

B.   join

C.   pivot

D.   slice

29: Which of the following is an example of a non-aggregatable measure?

A.   transaction volume

B.   gross profits

C.   costs

D.   probability of default

30: Which of the following is true about 'Pivot' operation?

A.   It rotates the data axes in view in order to provide an alternative presentation of data

B.   Two consecutive slice operations in two different dimensions

C.   It is also known as rotation

D.   All of the given options are valid

31: Which of the following is a method for delta detection during the ETL process?

A.   CRC

B.   (both of these choices)

C.   (none of these choices)

D.   audit columns

32: Which line is not a valid MDX comment?

A.   /* this line */

B.   // this line

C.   ## this line

D.   -- this line

33: What is a Data Mart?

A.   an access layer comprising a subset of a data warehouse

B.   an online, open exchange in which organizations can trade business information

C.   a schema that organizes data into facts and dimensions

D.   a central repository where separate organizations can securely backup data

34: Choose the most correct statement:

A.   Star and snowflake schema contains two Fact tables

B.   Snowflake schema contains two Fact tables

C.   Fact Constellation schema contains two Fact tables

D.   Star schema contains two Fact tables

35: Which of the following would not be a part of the fact table of a star schema representing sales data?

A.   Product name when a Product dimension table exists

B.   Store UID when a Store dimension table exists

C.   Units sold

D.   None of the Above

36: Which of the following is not a VALID characteristic of data warehouse?

A.   Subject Oriented

B.   Volatile

C.   Nonvolatile

D.   Integrated

37: Which of the following environments has a higher amount of Joins

A.   Database

B.   OLAP

C.   OLTP

D.   Data Warehousing env

38: Which of the following operation is performed by introducing new dimensions?

A.   Roll-up

B.   Roll-down

C.   Drill-down

D.   Drill-up

39: Which of the following contains historical data?

A.   OLAP

B.   OLTP

C.   Both OLAP and OLTP

D.   Neither OLAP nor OLTP

40: What is an OLAP Cube?

A.   an array in which data is stored and characterized by multiple dimensions

B.   None of the Above

C.   a three-dimensional array for Online Analytical Processing

D.   a dimensional-reduction operation that summarizes data

41: Which of the following can present summary data from an OLAP server?

A.   normalization

B.   All of the Above

C.   pivot tables

D.   primary keys

42: Which of the following enviroments are in third normal form 3NF?

A.   All of the options are correct

B.   OLAP

C.   Data Warehousing env

D.   OLTP

43: Which of the following is true about multidimensional models?

A.   It uses just one level of data storage representation to handle sparse data sets

B.   It uses just one level of data storage representation to handle dense data sets

C.   It uses one level of data storage representation to handle both dense and sparse data sets

D.   It uses two level of data storage representation to handle dense and sparse data sets

44: Which of the following is true about 'Dice' operation?

A.   The Dice operation performs selection of two or more dimension on a given cube

B.   It forms a new sub-cube by selecting one or more dimensions

C.   The Dice operation performs selection of one dimension on a given cube

D.   It navigates the data from less detailed data to highly detailed data

45: Which operation on a multidimensional database is most equivalent to a relational join?

A.   dice

B.   slice

C.   rotating

D.   drill-across

46: In which of the data storing approaches does transaction data get partitioned into facts?

A.   normalized approach

B.   Neither dimensional nor normalized approach

C.   Both dimensional and normalized approach

D.   dimensional approach

47: What Amazon Reshift compression encodings would be most appropriate for the columns (Date, City, Description), respectively?

A.   Delta, Byte-dictionary, LZO

B.   MOSTLY8, Runlength, Raw

C.   Byte-dictionary, LZO, Delta

D.   LZO, Delta, Raw

48: Which of the following is a property of Data Warehouse data?

A.   Read and Write

B.   Read Only

C.   Write Only

D.   Write Deconditional

49: Which of following stages involve data update in the warehouse for every transaction performed on the source data?

A.   Integrated data warehouse

B.   On time data warehouse

C.   Offline operational data warehouse

D.   Offline data warehouse

50: Which of the following is not a type of OLAP?

A.   Relational

B.   Multidimensional

C.   Hybrid

D.   Analytical

51: ‘ROLAP’ is preferred over ‘MOLAP’ in which of the following cases?

A.   A data warehouse and relational database are separable

B.   When there is a requirement to perform complex calculation

C.   When fast data retrieval is required

D.   When slicing and dicing is required

A.   The first layer of the architecture

B.   The Second layer of the architecture

C.   The third layer of the architecture

D.   The fourth layer of the architecture

53: Which of the following operations performs aggregation on a data cube by dimension reduction?

A.   Roll-up

B.   Roll-down

C.   Drill-down

D.   Drill-up

54: In order to populate the data warehouse, which of the following set of operations are appropriate?

A.   Query and update

B.   Insert and delete

C.   Refresh and load

D.   Create and edit

55: Among the following, which is NOT the objective of clustering?

A.   To cover natural groupings

B.   It facilitates OLAP operations

C.   To find consistent and valid organizations of the data

D.   To initiate hypothesis about the data

56: Which of the following is the most appropriate data structure for indexing a MOLAP Cube?

A.   Index table

B.   B-Tree

C.   Bitmap index

D.   Huffman encoding

57: Which of the following applies to Fact Tables?

A.   Partially denormalized

B.   Completely normalized

C.   Completely denormalized

D.   Partially normalized

58: When is a column-based engine a more appropriate architecture for a Data Warehouse than a row-based engine?

A.   most columns in a table will be required for a query

B.   dynamic, unpredictable query patterns

C.   small number of known indexes to be created, static query patterns

D.   a column-based engine is only appropriate for OLTP applications

59: Which of the following is not a property of a perspective?

A.   Can provide business-specific viewpoints of a cube

B.   Can securely restrict access to objects in a cube

C.   Can hide any number of dimensions from a cube

D.   Can display hierarchies from a cube

60: What is the principal disadvantage of Type 3 Slowly Changing Data management?

A.   Does not track historical data

B.   Stores only a finite number of historical changes

C.   Significantly increases storage requirements per record

D.   Updates obsolete data with new values infrequently

61: Which of the following is an appropriate method for mitigating the database explosion associated with sparse dimensions?

A.   populate the sparse dimension with filler values

B.   apply Type 6 updating to the sparse dimension

C.   isolate the sparse dimension in a separate cube

D.   create a composite to stand-in for the sparse dimension

62: Which of the following is not TRUE about subject oriented characteristics of a data warehouse?

A.   Data that are never deleted once they have been added

B.   Data should not change once it is entered into the warehouse

C.   Data is manipulated to provide information about a particular subjec

D.   Data warehouse has a defined scope

63: Which of the following is not a valid context for the MDX From clause?

A.   a perspective (subset of measures and dimensions

B.   a cube

C.   a subcube (nested MDX query)

D.   None of the Above

64: An ETL Joiner Transformation is performed on Source A with 10,000 rows and Source B with 25,000 rows. Which source should be the master?

A.   None of the Above

B.   Source B to prevent rows from being omitted

C.   Source A to speed up the join process

D.   There is no master-detail distinction, Joiner Transformations gives both sources parity

65: Which of the following is not an advisable implementation of division in MDX?

A.   measures.[Value A]/measures.[Value B]

B.   Divide(measures.[Value A]/measures.[Value B], null)

C.   None of the Above

D.   iff(measures.[Value A]=0, null, measures.[Value A]/measures.[Value B])

66: What is true of querying normalized data in a relational database for the purposes of business analysis?

A.   All of the Above

B.   Aggregating relevant data requires the creation of new primary keys

C.   Aggregating relevant data requires multiple table joins

D.   The data must first be placed into a data warehouse via the ETL process

67: Which of the following environments has more indexes?

A.   Data Warehousing env

B.   OLTP

C.   OLAP

D.   Database

68: What are the following is not a requirement of First Normal Form?

A.   Separate tables for each group of related data with a unique identifier for each row

B.   No columns that are not dependent on a primary key

C.   No duplicate columns within a single table

D.   None of the Above

69: What is a slice in the context of Amazon Redshift?

A.   A logical partition of the data layer shared between distributed compute nodes

B.   A parallel operation isolating a single layer of a Redshift cube

C.   A partition allowing each of core of a multi-processor node to process the node's workload in parallel

D.   A client-facing partition that isolates different parts of the data warehouse for different types of users in an organization

70: Without deliberate security measures, which of the following MDX clauses is susceptible to injection attack?

A.   FROM

B.   WITH

C.   All of the Above

D.   WHERE

71: Where is the OLAP page pool stored in memory in an Oracle Database 10g and later?

A.   Private SQL Area

B.   User Global Area

C.   Program Global Area

D.   SQL Work Area

72: What property do Router Transformations and Filter Transformations have in common?

A.   All of the Above

B.   will not block input rows

C.   single input group transformation

D.   single output group transformation

73: Online extraction process falls under which type of data extraction method?

A.   Full Extraction

B.   Logical Extraction

C.   Physical Extraction

D.   Incremental Extraction

74: Which of the following is a required property of information passed to the Microsoft Sequence Clustering Algorithm?

A.   sequence information stored as a nested table

B.   normalized data with key sequence columns on each dimension table

C.   star schema with key sequence columns on each dimension table

D.   sequence information stored as a flat table

75: A data warehouse is an ‘integrated’ collection of data because?

A.   It contains summarized data

B.   It is a collection of data of derived from multiple sources

C.   It is a relational database

D.   It is a collection of data of different types

76: All else being equal, which of the following elements of a single MDX WHERE clause will evaluate first?

A.   MEMBER [Measures].[A Squared] AS [Measures].[Measure A] * [Measures].[Measure A], SOLVE_ORDER = 2

B.   MEMBER [Measures].[C Less One] AS [Measures].[Measure C] - 1, SOLVE_ORDER = 1

C.   All elements evaluate simultaneously

D.   MEMBER [Measures].[B Twice] AS [Measures].[Measure B] + [Measures].[Measure B], SOLVE_ORDER = 4

77: Which of the following are the modes of OLAP?

A.   MOLAP

B.   ROLAP

C.   KOLAP

78: OLAP queries can be characterised as on-line transactions that do not:

A.   Access small amounts of data

B.   Analyse the relationships between many types of business elements e.g. sales, products, regions, and channels

C.   Compare aggregated data over hierarchical time periods e.g. monthly, quarterly, yearly

D.   Present data in different perspectives e.g. sales by region vs. sales by channels by product within each region

E.   Respond quickly to user requests, so that users can pursue an analytical thought process without being stymied by the system

79: Normalisation is:

A.   The process of organising data in accordance with the rules of a relational database

B.   The process of cleansing the data

C.   The process of integrating the data into the datawarehouse from legacy systems

D.   The process of compressing the data

E.   The process of eliminating invalid data before it is introduced into the data warehouse

80: Which of the following would not be considered as a variable affecting the design of an OLAP system?

A.   Query demand

B.   Source of data

C.   Number of dimensions

D.   Atomic data volume

E.   Data volatility

81: A slice is:

A.   A subset of a multi-dimensional array corresponding to a single value for one or more members of the dimensions not in the subset

B.   A subset of a multi-dimensional array corresponding to multiple values for one or more members of the dimensions not in the subset

C.   A subset of a multi-dimensional array corresponding to a single value for one or more members of the dimensions in the subset

D.   A subset of a multi-dimensional array corresponding to multiple values for one or more members of the dimensions in the subset

E.   A subset of a multi-dimensional array not corresponding to a single value for one member of the dimensions not in the subset

82: Which of the following techniques can be used to improve query performance?

A.   Denormalization

B.   Partitioning

C.   Summarization

D.   Denormalization and Partitioning

E.   Denormalization, Partitioning and Summarization

83: The term OLAP was coined by:

A.   Date

B.   Codd

C.   IBM

D.   Oracle

E.   Microsoft

84: The main objects used by OLAP programs are:

A.   Multidimensional cubes

B.   Metadata

C.   RDBMS tables

D.   Fact tables

E.   Pivot tables

85: Granularity refers to the:

A.   Validity of the data stored in a data warehouse

B.   The level of detail of the facts stored in a data warehouse

C.   The timeliness of the data stored in a data warehouse

D.   The redundancy of the data stored in a data warehouse

E.   Compactness of the data stored in a data warehouse

86: Which of the following queries would be correlated with a Data warehouse?

A.   What is the current account balance of this customer?

B.   How many customers have not paid their balances on time?

C.   What is the total number of customers in the middle region?

D.   hich product line sells best in middle region and how does this correlate to demographic data?

E.   Which customer makes the maximum purchases?

87: Data Volatility describes:

A.   The degree to which data and data structures change over time

B.   The redundancy of the data

C.   The volume of the data

D.   The compactness of the data

E.   The validity of the data

88: A multi-dimensional data set is sparse if:

A.   The data to be analysed is less in volume

B.   If a relatively high percentage of the possible combinations (intersections) of the members from the data set's dimensions contain missing data

C.   If a relatively high percentage of the possible combinations (intersections) of the members from the data set's dimensions contain invalid data

D.   If a relatively high percentage of the possible combinations (intersections) of the members from the data set's dimensions contain valid data

E.   If a relatively high percentage of the possible combinations (intersections) of the members from the data set's dimensions contain outdated data

89: Normalization applied to the dimension tables of a star schema is known as:

A.   Snowflaking

B.   Synchronization

C.   Slicing and Dicing

D.   Replication

E.   Data transformation

90: Replication refers to the:

A.   Physical copying of data from one database to another

B.   Cleansing of the data

C.   Integration of data from various sources into the data warehouse

D.   Analysis of the data

E.   Recovery of data

91: ROLAP stands for:

A.   Recyclic On-line Analytical Processing

B.   Relational On-line Analytical Processing

C.   Reduced On-line Analytical Processing

D.   Rotated On-line Analytical Processing

E.   Redundant On-line Analytical Processing

92: In a star schema, a table which contains data about one of the dimensions is called a:

A.   Fact table

B.   Meta table

C.   Data Dictionary

D.   Pivot table

E.   Dimension table

93: Which of the following would not be an application of Data Mining in the banking field?

A.   Detect patterns of fraudulent credit card use

B.   Ascertaining the number of transactions made in a day

C.   Determine credit card spending by customer groups

D.   Find hidden correlation between different financial indicators

E.   Predict the customers likely to change their credit card affiliation

94: HOLAP stands for:

A.   Hierarchical On-line Analytical Processing

B.   Hybrid On-line Analytical Processing

C.   Horizontal On-line Analytical Processing

D.   Hyper On-line Analytical Processing

E.   HyperCube On-line Analytical Processing

95: A structure that stores multi-dimensional information, having one cell for each possible combination of dimensions is known as:

A.   Table

B.   Section

C.   Partition

D.   Cube

E.   Repository

96: Which Data Mining function/technique is used to analyse a collection of records over a period of time?

A.   Classification

B.   Associations

C.   Sequential/Temporal patterns

D.   Clustering

E.   Segmentation

97: Which technique of Data Mining involves developing mathematical structures with the ability to learn?

A.   Clustering and Segmentation

B.   Neural Networks

C.   Fuzzy Logic

D.   Linear Regression Analysis

E.   Rule based Analysis

98: A datawarehouse should be able to implement advanced query functionality. This means :

A.   The RDBMS must provide a complete set of analytic operations including core sequential and statistical operations

B.   The RDBMS must not have any architectural limitations

C.   The RDBMS server must support hundreds, even thousands, of concurrent users while maintaining acceptable query performance

D.   Query performance must not be dependent on the size of the database, but rather on the complexity of the query

E.   The warehouse must ensure local consistency, global consistency, and referential integrity

99: The modification of data as it is moved into the data warehouse is:

A.   Data Transformation

B.   Replication

C.   Synchronization

D.   Data migration

E.   Normalization

100: Which of the following statements is incorrect regarding Data Mining?

A.   It is the process of turning data into information

B.   It is a collection of many techniques

C.   It is a replacement for OLAP

D.   It is based on machine generated hypothesis

E.   It is used in Decision Support, Prediction, Forecasting and Estimation