Try to answer these 100+ Data Warehousing MCQs and check your understanding of Data Warehousing.
Scroll down and let's begin!
A. Oracle Essbase
B. IBM TM1
C. Microsoft SSAS
D. All of the Above
A. It is the same as data validation
B. Harmonization of data cannot be considered Data Scrubbing
C. It involves data cleansing
D. Standardization of data cannot be considered Data Scrubbing
A. Snowflake Schema
B. Entity Relationship Model
C. Star Schema
D. Fact Constellation Schema
A. Selecting the cells of any one dimension of data cube
B. Merging the cells of all but one dimension
C. Merging the cells along one dimension
D. Selecting all but one dimension of the data cube.
A. To maintain data history
B. To enhance data quality
C. To collate data from multiple sources into multiple databases
D. To integrate data from multiple source systems
A. ODS
B. Cleanroom Table
C. Staging Area
D. No distinct cleansing phase; data is cleansed during MDX queries
A. It uses two levels of data storage representation to handle dense and sparse data sets
B. It provides advanced query language and query processing support for SQL queries over star and snowflake schemas
C. It facilitates OLTP operations in SQL
D. It uses array-based multidimensional storage engines
A. It is necessary to keep the operational data free of any warehouse operations
B. It contains data derived from multiple sources
C. A data warehouse cannot afford to allow corrupted data within it
D. A data warehouse contains summarized data whereas the operational database contains transactional data
A. specifying a particular year and region
B. randomizing the year and region
C. specifying a particular year
D. randomizing the year
A. fact table
B. core table
C. metadata table
D. dimension table
A. Can handle large amounts of data
B. Performance can be slow
C. All of the given options are valid
D. Can leverage functionalities inherent in the relational database
A. All of the given options are valid
B. Data warehouses contain data that is generally loaded from the operational databases on a regular interval
C. The time horizon of a data warehouse is significantly longer than that of operational systems
D. Data Warehouse maintains both historical and (nearly) current data
A. MOLAP cubes are built for fast data retrieval
B. MOLAP cubes are optimal for slicing and dicing operations
C. Data is stored in a multidimensional cube.
D. All of the given options are valid
A. It allows users to analyze data from many different dimensions or angles
B. All of the given options are valid
C. It is the process of analyzing data from different perspectives and summarizing it into useful information
D. It is the process of finding correlations or patterns among various fields
A. SQL
B. ETL
C. OLAP
D. OLTP
A. Incremental Extraction
B. Both Full Extraction and Incremental Extraction
C. Full Extraction
D. Online extraction
A. All of the given options are true
B. It is copy of transaction data specifically structured for query and analysis
C. It is designed to facilitate reporting and analysis
D. It is a non-volatile time-variant repository
A. It is represented by centralized fact tables
B. It is a logical arrangement of tables in a multidimensional database
C. All of the above
D. It is a variation of the star schema
A. Extract, Transform, Load
B. Extract Test Language
C. Export, Transmit, Load
D. Export, Translate, Load
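For context, ETL stands for Extract, Transform, Load: data is extracted from source systems, transformed (cleansed, standardized, aggregated), and loaded into the warehouse. A minimal sketch of the load step in SQL, assuming hypothetical staging and fact tables:

    -- Load: move transformed rows from a staging table into the warehouse fact table
    INSERT INTO fact_sales (date_key, product_key, store_key, units_sold)
    SELECT s.date_key, s.product_key, s.store_key, s.units_sold
    FROM staging_sales s
    WHERE s.units_sold IS NOT NULL;  -- a simple cleansing filter applied during the load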
A. Star Schema
B. Snowflake Schema
C. Fact Constellation Schema
D. All of the given options are valid
A. OLTP
B. Relational Database
C. ODS
D. OLAP or Multidimensional Database
A. Star Schema
B. OLTP
C. OLAP
D. Snowflake Schema
A. When the two sources have multiple matching columns
B. When the two sources have a primary-key to foreign-key relationship
C. All of the Above
D. When the two sources are heterogeneous
A. the most atomic level at which the facts may be defined
B. the raw data from which the facts are derived
C. the direction along which additive measures can be combined
D. the ratio of facts to dimensions
A. Drill-up
B. Roll-down
C. Roll-up
D. Drill-down
A. Crossjoin
B. AllMembers
C. Leaves
D. Distinct
A. Fortran
B. MDX
C. SQL
D. SPSS
A. dice
B. join
C. pivot
D. slice
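By way of illustration, a slice fixes one member of a single dimension and keeps the remaining dimensions intact; in relational terms it is roughly a filter on one dimension column. An illustrative SQL analogue, with hypothetical table and column names:

    -- Rough SQL analogue of an OLAP slice: fix the time dimension at one member (2020)
    SELECT product_key, store_key, SUM(units_sold) AS units_sold
    FROM fact_sales
    WHERE sales_year = 2020
    GROUP BY product_key, store_key;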
A. transaction volume
B. gross profits
C. costs
D. probability of default
A. It rotates the data axes in view in order to provide an alternative presentation of data
B. Two consecutive slice operations in two different dimensions
C. It is also known as rotation
D. All of the given options are valid
A. CRC
B. (both of these choices)
C. (none of these choices)
D. audit columns
A. /* this line */
B. // this line
C. ## this line
D. -- this line
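For reference, SQL uses a leading double hyphen for single-line comments and /* ... */ for block comments; MDX accepts the same two styles plus //. A short SQL illustration (table name is hypothetical):

    -- this line is a single-line comment
    SELECT COUNT(*) FROM dim_customer;  /* block comments use slash-asterisk delimiters */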
A. an access layer comprising a subset of a data warehouse
B. an online, open exchange in which organizations can trade business information
C. a schema that organizes data into facts and dimensions
D. a central repository where separate organizations can securely backup data
A. Star and snowflake schemas contain two Fact tables
B. Snowflake schema contains two Fact tables
C. Fact Constellation schema contains two Fact tables
D. Star schema contains two Fact tables
A. Product name when a Product dimension table exists
B. Store UID when a Store dimension table exists
C. Units sold
D. None of the Above
A. Subject Oriented
B. Volatile
C. Nonvolatile
D. Integrated
A. Database
B. OLAP
C. OLTP
D. Data Warehousing environment
A. Roll-up
B. Roll-down
C. Drill-down
D. Drill-up
A. OLAP
B. OLTP
C. Both OLAP and OLTP
D. Neither OLAP nor OLTP
A. an array in which data is stored and characterized by multiple dimensions
B. None of the Above
C. a three-dimensional array for Online Analytical Processing
D. a dimensional-reduction operation that summarizes data
A. normalization
B. All of the Above
C. pivot tables
D. primary keys
A. All of the options are correct
B. OLAP
C. Data Warehousing environment
D. OLTP
A. It uses just one level of data storage representation to handle sparse data sets
B. It uses just one level of data storage representation to handle dense data sets
C. It uses one level of data storage representation to handle both dense and sparse data sets
D. It uses two levels of data storage representation to handle dense and sparse data sets
A. The Dice operation performs selection of two or more dimensions on a given cube
B. It forms a new sub-cube by selecting one or more dimensions
C. The Dice operation performs selection of one dimension on a given cube
D. It navigates the data from less detailed data to highly detailed data
A. dice
B. slice
C. rotating
D. drill-across
A. normalized approach
B. Neither dimensional nor normalized approach
C. Both dimensional and normalized approach
D. dimensional approach
A. Delta, Byte-dictionary, LZO
B. MOSTLY8, Runlength, Raw
C. Byte-dictionary, LZO, Delta
D. LZO, Delta, Raw
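These options list Amazon Redshift column compression encodings. An encoding is declared per column with the ENCODE keyword, roughly as sketched below (table and columns are illustrative):

    CREATE TABLE fact_clicks (
      click_id   BIGINT       ENCODE delta,     -- delta encoding suits steadily increasing values
      country    CHAR(2)      ENCODE bytedict,  -- byte-dictionary suits low-cardinality columns
      payload    VARCHAR(256) ENCODE lzo,       -- general-purpose LZO compression
      clicked_at TIMESTAMP    ENCODE raw        -- raw means no compression
    );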
A. Read and Write
B. Read Only
C. Write Only
D. Write Deconditional
A. Integrated data warehouse
B. On time data warehouse
C. Offline operational data warehouse
D. Offline data warehouse
A. Relational
B. Multidimensional
C. Hybrid
D. Analytical
A. A data warehouse and relational database are separable
B. When there is a requirement to perform complex calculation
C. When fast data retrieval is required
D. When slicing and dicing is required
A. The first layer of the architecture
B. The Second layer of the architecture
C. The third layer of the architecture
D. The fourth layer of the architecture
A. Roll-up
B. Roll-down
C. Drill-down
D. Drill-up
A. Query and update
B. Insert and delete
C. Refresh and load
D. Create and edit
A. To uncover natural groupings
B. It facilitates OLAP operations
C. To find consistent and valid organizations of the data
D. To initiate hypotheses about the data
A. Index table
B. B-Tree
C. Bitmap index
D. Huffman encoding
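Bitmap indexes are well suited to low-cardinality columns that are filtered heavily in warehouse queries. In Oracle, for example, one can be created along these lines (names are illustrative):

    -- Bitmap index on a low-cardinality foreign-key column of a fact table
    CREATE BITMAP INDEX ix_fact_sales_store
      ON fact_sales (store_key);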
A. Partially denormalized
B. Completely normalized
C. Completely denormalized
D. Partially normalized
A. most columns in a table will be required for a query
B. dynamic, unpredictable query patterns
C. small number of known indexes to be created, static query patterns
D. a column-based engine is only appropriate for OLTP applications
A. Can provide business-specific viewpoints of a cube
B. Can securely restrict access to objects in a cube
C. Can hide any number of dimensions from a cube
D. Can display hierarchies from a cube
A. Does not track historical data
B. Stores only a finite number of historical changes
C. Significantly increases storage requirements per record
D. Updates obsolete data with new values infrequently
A. populate the sparse dimension with filler values
B. apply Type 6 updating to the sparse dimension
C. isolate the sparse dimension in a separate cube
D. create a composite to stand-in for the sparse dimension
A. Data that are never deleted once they have been added
B. Data should not change once it is entered into the warehouse
C. Data is manipulated to provide information about a particular subject
D. Data warehouse has a defined scope
A. a perspective (a subset of measures and dimensions)
B. a cube
C. a subcube (nested MDX query)
D. None of the Above
A. None of the Above
B. Source B to prevent rows from being omitted
C. Source A to speed up the join process
D. There is no master-detail distinction; the Joiner Transformation gives both sources parity
A. measures.[Value A]/measures.[Value B]
B. Divide(measures.[Value A]/measures.[Value B], null)
C. None of the Above
D. iff(measures.[Value A]=0, null, measures.[Value A]/measures.[Value B])
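Whichever MDX form the question intends, the usual guard against division by zero in plain SQL is NULLIF, which turns a zero divisor into NULL so the division yields NULL instead of an error (illustrative names):

    -- Returns NULL instead of failing when value_b = 0
    SELECT value_a / NULLIF(value_b, 0) AS ratio
    FROM measure_values;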
A. All of the Above
B. Aggregating relevant data requires the creation of new primary keys
C. Aggregating relevant data requires multiple table joins
D. The data must first be placed into a data warehouse via the ETL process
A. Data Warehousing environment
B. OLTP
C. OLAP
D. Database
A. Separate tables for each group of related data with a unique identifier for each row
B. No columns that are not dependent on a primary key
C. No duplicate columns within a single table
D. None of the Above
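As a small illustration of these normalization rules, repeating customer attributes are moved out of the orders table into their own table keyed by a unique identifier (hypothetical schema):

    CREATE TABLE customer (
      customer_id   INT PRIMARY KEY,            -- unique identifier for each row
      customer_name VARCHAR(100),
      city          VARCHAR(100)
    );
    CREATE TABLE orders (
      order_id    INT PRIMARY KEY,
      customer_id INT REFERENCES customer (customer_id),  -- no duplicated customer columns here
      order_date  DATE
    );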
A. A logical partition of the data layer shared between distributed compute nodes
B. A parallel operation isolating a single layer of a Redshift cube
C. A partition allowing each core of a multi-processor node to process the node's workload in parallel
D. A client-facing partition that isolates different parts of the data warehouse for different types of users in an organization
A. FROM
B. WITH
C. All of the Above
D. WHERE
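All three keywords introduce query clauses in both SQL and MDX (WITH for named calculations or subqueries, FROM for the source, WHERE for restrictions). A compact SQL illustration with hypothetical names:

    WITH recent AS (                            -- WITH defines a named subquery
      SELECT * FROM fact_sales WHERE date_key >= 20240101
    )
    SELECT product_key, SUM(units_sold) AS units_sold
    FROM recent                                 -- FROM names the data source
    WHERE store_key = 42                        -- WHERE restricts the rows
    GROUP BY product_key;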
A. Private SQL Area
B. User Global Area
C. Program Global Area
D. SQL Work Area
A. All of the Above
B. will not block input rows
C. single input group transformation
D. single output group transformation
A. Full Extraction
B. Logical Extraction
C. Physical Extraction
D. Incremental Extraction
A. sequence information stored as a nested table
B. normalized data with key sequence columns on each dimension table
C. star schema with key sequence columns on each dimension table
D. sequence information stored as a flat table
A. It contains summarized data
B. It is a collection of data derived from multiple sources
C. It is a relational database
D. It is a collection of data of different types
A. MEMBER [Measures].[A Squared] AS [Measures].[Measure A] * [Measures].[Measure A], SOLVE_ORDER = 2
B. MEMBER [Measures].[C Less One] AS [Measures].[Measure C] - 1, SOLVE_ORDER = 1
C. All elements evaluate simultaneously
D. MEMBER [Measures].[B Twice] AS [Measures].[Measure B] + [Measures].[Measure B], SOLVE_ORDER = 4
A. MOLAP
B. ROLAP
C. KOLAP
A. Access small amounts of data
B. Analyse the relationships between many types of business elements e.g. sales, products, regions, and channels
C. Compare aggregated data over hierarchical time periods e.g. monthly, quarterly, yearly
D. Present data in different perspectives e.g. sales by region vs. sales by channels by product within each region
E. Respond quickly to user requests, so that users can pursue an analytical thought process without being stymied by the system
A. The process of organising data in accordance with the rules of a relational database
B. The process of cleansing the data
C. The process of integrating the data into the data warehouse from legacy systems
D. The process of compressing the data
E. The process of eliminating invalid data before it is introduced into the data warehouse
A. Query demand
B. Source of data
C. Number of dimensions
D. Atomic data volume
E. Data volatility
A. A subset of a multi-dimensional array corresponding to a single value for one or more members of the dimensions not in the subset
B. A subset of a multi-dimensional array corresponding to multiple values for one or more members of the dimensions not in the subset
C. A subset of a multi-dimensional array corresponding to a single value for one or more members of the dimensions in the subset
D. A subset of a multi-dimensional array corresponding to multiple values for one or more members of the dimensions in the subset
E. A subset of a multi-dimensional array not corresponding to a single value for one member of the dimensions not in the subset
A. Denormalization
B. Partitioning
C. Summarization
D. Denormalization and Partitioning
E. Denormalization, Partitioning and Summarization
A. Date
B. Codd
C. IBM
D. Oracle
E. Microsoft
A. Multidimensional cubes
B. Metadata
C. RDBMS tables
D. Fact tables
E. Pivot tables
A. Validity of the data stored in a data warehouse
B. The level of detail of the facts stored in a data warehouse
C. The timeliness of the data stored in a data warehouse
D. The redundancy of the data stored in a data warehouse
E. Compactness of the data stored in a data warehouse
A. What is the current account balance of this customer?
B. How many customers have not paid their balances on time?
C. What is the total number of customers in the middle region?
D. Which product line sells best in the middle region and how does this correlate to demographic data?
E. Which customer makes the maximum purchases?
A. The degree to which data and data structures change over time
B. The redundancy of the data
C. The volume of the data
D. The compactness of the data
E. The validity of the data
A. The data to be analysed is small in volume
B. If a relatively high percentage of the possible combinations (intersections) of the members from the data set's dimensions contain missing data
C. If a relatively high percentage of the possible combinations (intersections) of the members from the data set's dimensions contain invalid data
D. If a relatively high percentage of the possible combinations (intersections) of the members from the data set's dimensions contain valid data
E. If a relatively high percentage of the possible combinations (intersections) of the members from the data set's dimensions contain outdated data
A. Snowflaking
B. Synchronization
C. Slicing and Dicing
D. Replication
E. Data transformation
A. Physical copying of data from one database to another
B. Cleansing of the data
C. Integration of data from various sources into the data warehouse
D. Analysis of the data
E. Recovery of data
A. Recyclic On-line Analytical Processing
B. Relational On-line Analytical Processing
C. Reduced On-line Analytical Processing
D. Rotated On-line Analytical Processing
E. Redundant On-line Analytical Processing
A. Fact table
B. Meta table
C. Data Dictionary
D. Pivot table
E. Dimension table
A. Detect patterns of fraudulent credit card use
B. Ascertaining the number of transactions made in a day
C. Determine credit card spending by customer groups
D. Find hidden correlation between different financial indicators
E. Predict the customers likely to change their credit card affiliation
A. Hierarchical On-line Analytical Processing
B. Hybrid On-line Analytical Processing
C. Horizontal On-line Analytical Processing
D. Hyper On-line Analytical Processing
E. HyperCube On-line Analytical Processing
A. Table
B. Section
C. Partition
D. Cube
E. Repository
A. Classification
B. Associations
C. Sequential/Temporal patterns
D. Clustering
E. Segmentation
A. Clustering and Segmentation
B. Neural Networks
C. Fuzzy Logic
D. Linear Regression Analysis
E. Rule based Analysis
A. The RDBMS must provide a complete set of analytic operations including core sequential and statistical operations
B. The RDBMS must not have any architectural limitations
C. The RDBMS server must support hundreds, even thousands, of concurrent users while maintaining acceptable query performance
D. Query performance must not be dependent on the size of the database, but rather on the complexity of the query
E. The warehouse must ensure local consistency, global consistency, and referential integrity
A. Data Transformation
B. Replication
C. Synchronization
D. Data migration
E. Normalization
A. It is the process of turning data into information
B. It is a collection of many techniques
C. It is a replacement for OLAP
D. It is based on machine-generated hypotheses
E. It is used in Decision Support, Prediction, Forecasting and Estimation