Data Analytics MCQs

Try to answer these 100+ Data Analytics MCQs and check your understanding of the Data Analytics subject.
Scroll down and let's begin!

1:

The values of X and Y are given in figure-1 of the image. Choose the correct value of 2X - 5Y from figure-2.

A.  

A

B.  

B

C.  

C

D.  

D

2: Which of the following types of time series analysis aims at separating periodic or cyclical components in a time series?

A.   Explanative analysis

B.   Spectral analysis

C.   Forecasting

D.   Descriptive analysis
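
As an illustration of what separating a periodic component can look like in practice, here is a minimal sketch using the periodogram from base R's `stats` package. The synthetic series and the names `time_index`, `x`, `sp`, and `peak_freq` are assumptions made for this example only.

```r
# Synthetic series with a known 12-step cycle (illustrative data).
time_index <- 1:120
x <- sin(2 * pi * time_index / 12)

# Raw periodogram; plot = FALSE returns the estimates without drawing.
sp <- spec.pgram(x, taper = 0, detrend = FALSE, plot = FALSE)

# The frequency with the largest spectral density marks the dominant cycle.
peak_freq <- sp$freq[which.max(sp$spec)]
print(1 / peak_freq)  # dominant period, about 12 time steps
```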

3: With respect to the Microsoft sequence clustering algorithm, which of the following options is the correct syntax of the PredictCaseLikelihood (DMX) function?

A.   PredictCaseLikelihood()

B.   PredictCaseLikelihood([NORMALIZED|NONNORMALIZED])

C.   PredictCaseLikelihood(, [])

D.   PredictCaseLikelihood(, [))

4: Which of the following is the correct syntax of the PredictVariance (DMX) prediction function used in Microsoft logistic regression algorithm?

A.   PredictVariance( I )

B.   PredictVariance()

C.   PredictVariance()

D.   PredictVariance()

5: Which of the following options represent(s) the correct application of association rule mining?

A.   Catalog design

B.   Basket data analysis

C.   Cross-marketing

D.   Loss-leader analysis

E.   All of the above

F.   None of the above

6: Which of the following options is/are the correct application(s) of text mining?

A.   It can automatically process messages and emails.

B.   It can investigate competitors by crawling their web sites.

C.   It can analyze open-ended survey responses.

D.   It can analyze warranty or insurance claims.

E.   All of the above.

7:

Select the value of X given in figure-1 from the options given in figure-2.

A.  

A

B.  

B

C.  

C

D.  

D

8:


Consider the matrix Z given in figure-1 of the image. Using matrix methods, find the 1x3 vector.

A.  

A

B.  

B

C.  

C

D.  

D

9:

For what purpose is the following R function run?

print(getwd())


A.  

To get and print the current working directory. 

B.  

To get and print all working directories.

C.  

To count and print all working directories.

D.  

To print the location of a working directory.
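
For reference, a minimal sketch of how `getwd()` behaves. Note that the function must be called with parentheses; the variable name `wd` is illustrative.

```r
# getwd() returns the current working directory as a single character string.
wd <- getwd()
print(wd)

# Without parentheses, getwd refers to the function object itself,
# so print(getwd) would show the function definition instead of a path.
print(is.function(getwd))
```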

10: With respect to the Microsoft neural network algorithm, which of the following options is the neuron type that represents predictable attribute values for a data mining model?

A.   Input neuron

B.   Hidden neuron

C.   Output neuron

D.   None of the above

11: Which of the following options is/are correct about the Microsoft naive bayes algorithm?

A.   It is used for calculating the conditional probability between input and predictable columns and it assumes that the columns are independent.

B.   It is used for performing automatic feature selection to limit the number of values that are considered when building a model.

C.   It is provided by Microsoft SQL Server analysis services for use in predictive modeling.

D.   It is used for considering each pair of input attribute values and output attribute values.

E.   All of the above.

12: Which of the following options is correct about the logistic regression technique?

A.   It is used for encouraging group effect in case of highly correlated variables.

B.   It is used for finding the probability of event=Success and event=Failure.

C.   It is used for adding and removing predictors as needed for each step.

D.   It is used for penalizing the absolute size of the regression coefficients.

13: In data mining, which of the following options is correct about the regression algorithm?

A.   It is used for predicting one or more continuous numeric variables; for example, profit or loss that is based on other attributes in a dataset.

B.   It is used for finding correlations between different attributes in a dataset.

C.   It is used for dividing data into groups or clusters of items that have similar properties.

D.   It is used for summarizing frequent sequences or episodes in data; for example, a series of log events preceding machine maintenance.

14: As per the Microsoft association rules model, which of the following options is the correct viewer tab that combines information about itemsets and their relative value?

A.   Itemsets

B.   Dependency Network

C.   Rules

D.   None of the above

15: Which of the following statements is correct about the intervention analysis type of the time series analysis?

A.   It is used for finding whether an event can lead to a change in a time series.

B.   It is used for finding a trend or pattern in a time series through the use of graphs or other tools.

C.   It is used extensively in budgeting, which is based on historical trends.

D.   It is used for studying the cross correlation between two time series and their dependence on another.

16: Which of the following is the correct syntax of the PredictAssociation prediction function used in the Microsoft association rule algorithm?

A.  

PredictAssociation(<NodeID>)

B.  

PredictAssociation(<cluster column reference>, [<predicted state>])

C.  

PredictAssociation(<scalar column reference>)

D.  

PredictAssociation(<table column reference>, option1, option2, n ...)

17: Which of the following is the correct default value of the MAXIMUM_ITEMSET_SIZE parameter, which is used with the Microsoft association rules algorithm?

A.   10

B.   3

C.   1

D.   0.4

18: With respect to advanced statistics, which of the following options is the correct syntax of the glm() function?

A.   glm(formula, family=familytype(link=linkfunction), data=)

B.   glm(formula, data=, method=,control=)

C.   glm(vector, start=, end=, frequency=)

D.   glm(bootobject, conf=, type=)
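
To make the `formula`, `family`, and `data` arguments concrete, here is a small sketch that fits a logistic regression with `glm()`. The choice of the built-in `mtcars` data and of the variables `am` and `wt` is an assumption for illustration only.

```r
# Logistic regression on the built-in mtcars data (an illustrative choice):
# model transmission type (am, coded 0/1) as a function of car weight (wt).
fit <- glm(am ~ wt, family = binomial(link = "logit"), data = mtcars)

# One intercept and one slope on the log-odds scale.
print(coef(fit))
```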

19:

Find the output of the following R programming language code.

z1 <- c(7,5,8,4,4,16)

z2 <- c(9,6)

add.result <- z1+z2

print(add.result)

sub.result <- z1-z2

print(sub.result)


A.  

[1] 16 11 8 4 4 16

[1] 2 -1 8 4 4 16


B.  

[1] 15 11 17 13 13 25

[1] -2 -1 2 -2 -2 10


C.  

[1] 16 11 17 10 13 22

[1] -2 -1 -1 -2 -5 10


D.  

[1] 15 10 7 4 8 14

[1] -1 -2 1 6 -4 -3
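
The behaviour being tested here is vector recycling. The sketch below assumes the two results are meant to be the element-wise sum and difference of `z1` and `z2`; the shorter vector is repeated to match the longer one.

```r
z1 <- c(7, 5, 8, 4, 4, 16)
z2 <- c(9, 6)

# z2 is recycled to 9, 6, 9, 6, 9, 6 before the arithmetic is applied.
add.result <- z1 + z2
sub.result <- z1 - z2

print(add.result)  # [1] 16 11 17 10 13 22
print(sub.result)  # [1] -2 -1 -1 -2 -5 10
```

No warning is raised because the length of `z1` (6) is a multiple of the length of `z2` (2).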


20:

Find the output of the following code of the R programming language.

z1 <- c(4,3,TRUE,2+6i)

z2 <- c(4,7,TRUE,2+7i)

print(z1&z2)


A.  

[1] TRUE TRUE TRUE TRUE

B.  

[1] TRUE FALSE TRUE FALSE

C.  

[1] FALSE TRUE FALSE TRUE

D.  

[1] FALSE FALSE FALSE FALSE
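
A short sketch of the element-wise AND operator, assuming the snippet is meant to combine two vectors named `z1` and `z2`. Mixing numbers, a logical, and a complex value coerces each vector to complex, and any non-zero element counts as TRUE.

```r
z1 <- c(4, 3, TRUE, 2+6i)
z2 <- c(4, 7, TRUE, 2+7i)

# & compares the vectors position by position.
result <- z1 & z2
print(result)  # [1] TRUE TRUE TRUE TRUE
```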

21:

What will be the output of the following R code?

c(4,7,TRUE,3+7i) -> v1

c(9,6,FALSE,3+7i) ->> v2

print(v1)

print(v2)


A.  

[1] 4+0i 4+1i 7+0i 3+7i

[1] 9+0i 9+1i 6+0i 3+7i


B.  

[1] 4+0i 7+0i 1+0i 3+7i

[1] 9+0i 6+0i 0+0i 3+7i


C.  

[1] 4+0i 7+7i 1+1i 3+7i

[1] 9+0i 9+1i 6+6i 3+7i


D.  

[1] 4+4i 7+7i 1+1i 3+7i

[1] 9+9i 6+6i 1+1i 3+7i
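
Two things are at work in this question: the right-pointing assignment operators `->` and `->>`, and type coercion inside `c()`. The sketch below runs the same statements; because each vector contains a complex value, every element is promoted to complex.

```r
# -> assigns in the current environment; ->> assigns in an enclosing
# (here, the global) environment. At top level both behave the same.
c(4, 7, TRUE, 3+7i) -> v1
c(9, 6, FALSE, 3+7i) ->> v2

print(v1)  # [1] 4+0i 7+0i 1+0i 3+7i
print(v2)  # [1] 9+0i 6+0i 0+0i 3+7i
```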


22: Which of the following is the correct syntax of the command that will verify the installation of the xlsx package and load the library into R workspace?

A.   grepl.any(installed.packages("xlsx")) library("xlsx")

B.   any(grepl("xlsx",installed.packages())) library("xlsx")

C.   any.grepl(xlsx,installed.packages()) library(xlsx)

D.   grepl(any(installed.packages(xlsx))) library(xlsx)
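
A hedged sketch of the check-then-load pattern. The `grepl()` test searches every cell of the installed-packages matrix, so it also matches names such as "openxlsx"; the exact row-name test shown alongside it is an alternative added here for comparison, and the variable names are illustrative.

```r
pkgs <- installed.packages()

# Pattern match over the whole matrix: TRUE if any entry contains "xlsx".
loose_match <- any(grepl("xlsx", pkgs))

# Stricter alternative: is there a package named exactly "xlsx"?
exact_match <- "xlsx" %in% rownames(pkgs)

print(c(loose = loose_match, exact = exact_match))

# require() returns FALSE instead of stopping if the package cannot be loaded.
if (exact_match) {
  require("xlsx", character.only = TRUE)
}
```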

23: As per the Microsoft sequence clustering algorithm, which of the following options is the correct syntax of the Cluster (DMX) prediction function?

A.   Cluster()

B.   Cluster([])

C.   Cluster()

D.   Cluster([])

24:

In the given image, which set of vectors is linearly independent?


A.  

A

B.  

B

C.  

C

D.  

D

25: Which of the following text mining techniques can be used for finding groups of documents with similar content?

A.   Clustering

B.   Categorization

C.   Visualization

D.   Information extraction

26:

What will be the output of the following code of the R programming language?

a <- c(9,0,FALSE,2+9i)

b <- c(8,0,TRUE,2+7i)

print(a|b)


A.  

[1] FALSE TRUE FALSE FALSE

B.  

[1] TRUE TRUE TRUE FALSE 

C.  

[1] TRUE FALSE TRUE TRUE 

D.  

[1] FALSE FALSE FALSE TRUE
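
A minimal sketch of the element-wise OR operator, assuming the snippet is meant to print `a | b`. A position is TRUE when at least one of the two elements is non-zero.

```r
a <- c(9, 0, FALSE, 2+9i)
b <- c(8, 0, TRUE, 2+7i)

# Position 2 is the only one where both elements are zero.
result <- a | b
print(result)  # elements: TRUE FALSE TRUE TRUE
```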

27:

Find the output of the following R programming language code.

a <- c(7,5,FALSE,4+4i)

b <- c(6,0,TRUE,4+7i)

print(a&&b)


A.  

[1] FALSE

B.  

[1] FALSE TRUE

C.  

[1] FALSE FALSE

D.  

[1] TRUE
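
The answer to this question depends on the R version. Older releases evaluated `&&` on the first element of each vector only, while R 4.3.0 and later raise an error when either operand has length greater than one. The sketch below, with illustrative variable names, makes the first-element comparison explicit so that it runs on any version, and shows the element-wise `&` for contrast.

```r
a <- c(7, 5, FALSE, 4+4i)
b <- c(6, 0, TRUE, 4+7i)

# Scalar form: compare only the first elements, stated explicitly.
first_only <- a[1] && b[1]
print(first_only)  # [1] TRUE

# Element-wise form, for comparison.
elementwise <- a & b
print(elementwise)  # elements: TRUE FALSE FALSE TRUE
```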

28: In SQL Server data mining, which of the following algorithm types predicts one or more discrete variables that are based on other attributes in a dataset?

A.   Segmentation algorithm

B.   Classification algorithm

C.   Sequence analysis algorithm

D.   Association algorithm

29:

What will the following R code do?

mydata$v2 <- mydata$v4 <- NULL


A.  

It will replace the value of variable v2 with v4 and will delete the variable v4.

B.  

It will replace the value of variable v4 with v2 and will delete the variable v2.

C.  

It will delete the variables v2 and v4.

D.  

None of the above.
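
A small sketch of what the chained assignment does, using a made-up data frame `mydata` with four columns. Assigning NULL to a data frame column removes it, and the chain applies that to both columns.

```r
# Illustrative data frame; the column values are arbitrary.
mydata <- data.frame(v1 = 1:3, v2 = 4:6, v3 = 7:9, v4 = 10:12)

# Evaluated right to left: v4 is set to NULL, then v2 is set to NULL.
mydata$v2 <- mydata$v4 <- NULL

print(names(mydata))  # [1] "v1" "v3"
```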

30: In data mining, which of the following options is the correct syntax for association?

A.   match associations [as pattern_name] analyze {measure(s) }

B.   mine associations [as pattern_name] analyze classifying_attribute_or_dimension

C.   mine associations [as [pattern_name]] {matching {metapattern}}

D.   mine associations [as pattern_name] analyze prediction_attribute_or_dimension {set [attribute_or_dimension_i= value_i}]

31:

Choose True or False.

Text mining is used in spam filtering, content enrichment and contextual advertising.


A.  

True 

B.  

False

32:

A user wants to read and print the contents of a CSV file named myexample.csv that is present in the current working directory. Which of the following is the correct syntax of the command that the user should execute to accomplish this task?


A.  

data <- read(myexample.csv)

print(data) 


B.  

data <- read.file("myexample.csv")
print(data)  

C.  

data <- read.csv("myexample.csv")
print(data)   

D.  

data <- read.data(myexample.csv)
print(data)
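
To try the CSV-reading call without needing an existing file, the sketch below first writes a small file to a temporary directory and then reads it back with `read.csv()`. The file contents and the `path` variable are assumptions for illustration.

```r
# Create a small CSV file in a temporary location (illustrative contents).
path <- file.path(tempdir(), "myexample.csv")
write.csv(data.frame(id = 1:3, score = c(90, 85, 77)), path, row.names = FALSE)

# read.csv() takes the file name as a quoted string and returns a data frame.
data <- read.csv(path)
print(data)
```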

33: Which of the following regression techniques attempts to maximize prediction power with the minimum number of predictor variables?

A.   Stepwise regression

B.   Polynomial regression

C.   Linear regression

D.   Logistic regression
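
As a rough illustration of adding or dropping predictors one step at a time, here is a sketch using `step()` from base R, which selects terms by AIC. The `mtcars` data and the starting set of predictors are assumptions for this example.

```r
# Start from a model with several candidate predictors.
full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# Backward elimination: drop terms while doing so lowers the AIC.
reduced <- step(full, direction = "backward", trace = 0)

print(formula(reduced))
```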

34: Which of the following is the correct syntax of the PredictSupport (DMX) prediction function used with Microsoft linear regression algorithm?

A.   PredictSupport(, [])

B.   PredictSupport( l )

C.   PredictSupport( I )

D.   PredictSupport( l )

35: Which of the following statements is correct about the Predictable column supported by the Microsoft linear regression algorithm?

A.   It supports the cyclical, key and table content types.

B.   It supports the key, table and ordered content types.

C.   It supports the continuous, key and table content types.

D.   It supports the continuous, cyclical and ordered content types.

36:

Using the following information, find the correct syntax of the R function used for creating binary files.

Assume object as the binary file to be written, n as the number of bytes and con as the connection object.


A.  

writeBin(object, n, con)  

B.  

writeBin(object)  

C.  

writeBin(object, n) 

D.  

writeBin(object, con)
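
A minimal round trip with `writeBin()` and `readBin()`, writing to a temporary file. The file path and the integer payload are assumptions for illustration; the point is that the object comes first and the connection second.

```r
path <- tempfile(fileext = ".bin")

# Open a binary write connection and write five integers to it.
con <- file(path, "wb")
writeBin(1:5, con)
close(con)

# Read the same five integers back from a binary read connection.
con <- file(path, "rb")
values <- readBin(con, what = integer(), n = 5)
close(con)

print(values)  # [1] 1 2 3 4 5
```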

37: Which of the following statements is correct about the PREDICTION_SMOOTHING parameter used in the Microsoft time series algorithm?

A.   It specifies how a model should be mixed for optimizing forecasting.

B.   It specifies which algorithm to use for analysis and prediction.

C.   It specifies a numeric value between 0 and 1 that detects periodicity.

D.   It specifies the minimum number of time slices that are required to generate a split in each time series tree.

38:

Find the output of the following code of the R programming language.

lista <- list(5:7)

print(lista)

listb <- list(12:14)

print(listb)

x1 <- unlist(lista)

x2 <- unlist(listb)

print(x1)

print(x2)

r <- x1+x2

print(r)


A.  

[[1]]

[1] 5 6 7

[[1]]

[1] 12 13 14

[1] 5 6 7

[1]  12 13 14

[1]  17 19 21 


B.  

[[1]]

[1] 5 6 7

[[1]]

[1] 12 13 14

[1] 5 6 7

[1]  12 13 14

[1]  15 16 17


C.  

[[1]]

[1] 5 6 7

[[1]]

[1] 12 13 14

[1] 5 6 7

[1]  12 13 14

[1] 24 25 26


D.  

[[1]]

[1] 5 6 7

[[1]]

[1] 12 13 14

[1] 5 6 7

[1]  12 13 14

[1]  11 12 13
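
The key step in this question is `unlist()`, which flattens each one-element list into a plain vector so that `+` can work element by element. A condensed sketch, assuming the lists are named `lista` and `listb`:

```r
lista <- list(5:7)
listb <- list(12:14)

# Lists cannot be added directly; unlist() turns them into integer vectors.
x1 <- unlist(lista)
x2 <- unlist(listb)

r <- x1 + x2
print(r)  # [1] 17 19 21
```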


39: Which of the following is the correct default value for the INSTABILITY_SENSITIVITY parameter used with the Microsoft time series algorithm?

A.   0.6

B.   0.1

C.   10

D.   1

40: Which of the following is the correct syntax of the command used for merging two data frames, myFrame1 and myFrame2, by ID and Country?

A.   total <- merge(data myFrame1 with myFrame2, by=c(ID,Country))

B.   total <- merge(data myFrame1,data myFrame2,by=c("ID","Country"))

C.   total <- merge(data by=c("ID","Country") for myFrame1, myFrame2)

D.   total <- merge(data for myFrame1, myFrame2,by=c(ID,Country))
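
The options above are written as templates, so they are not runnable as-is. In an actual call the two data frames are passed directly as the first two arguments. The sketch below uses made-up contents for `myFrame1` and `myFrame2`.

```r
# Illustrative data frames sharing the key columns ID and Country.
myFrame1 <- data.frame(ID = c(1, 2, 3),
                       Country = c("US", "IN", "DE"),
                       Sales = c(10, 20, 30))
myFrame2 <- data.frame(ID = c(1, 2, 4),
                       Country = c("US", "IN", "FR"),
                       Cost = c(4, 9, 12))

# By default merge() keeps only rows whose key values appear in both frames.
total <- merge(myFrame1, myFrame2, by = c("ID", "Country"))
print(total)
```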

41:

From figure-2 of the given image, select the option representing the inverse of the matrix given in figure-1.


A.  

A

B.  

B

C.  

C

D.  

D

42:

Which of the following options represent correct applications of time series analysis?

i) Yield Projections

ii) Workload Projections

iii) Census Analysis

iv) Inventory Studies


A.  

Only options i) and ii)

B.  

Only options ii) and iv)

C.  

Only options i), ii) and iv)

D.  

Only options ii), iii) and iv)

E.  

All options i), ii), iii) and iv)

43: Which of the following is the correct syntax for the PredictAdjustedProbability (DMX) prediction function used with the Microsoft association rules algorithm?

A.   PredictAdjustedProbability(. [))

B.   PredictAdjustedProbability()

C.   PredictAdjustedProbability()

D.   PredictAdjustedProbability()

44: With respect to advanced statistics, which of the following options is correct about the arima() function?

A.   It can be used to produce an unrotated principal component analysis.

B.   It can be used to produce maximum likelihood factor analysis.

C.   It can be used to bootstrap the structural equation model.

D.   It can be used to fit an autoregressive integrated moving average model.
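
For a concrete picture of `arima()`, here is a small sketch that fits a first-order autoregressive model. The built-in `lh` series and the order `c(1, 0, 0)` are illustrative choices, not part of the question.

```r
# order = c(p, d, q): 1 autoregressive term, no differencing, no moving-average term.
fit <- arima(lh, order = c(1, 0, 0))
print(fit)
```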

45: In data mining, which of the following options is correct about the F-score measure for text retrieval?

A.   F-score = recall - precision + (recall x precision) / 9

B.   F-score = recall + precision - (recall x precision) / 7

C.   F-score = recall x precision / (recall + precision) / 2

D.   F-score = recall / precision x (recall - precision) / 5
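
The F-score is commonly defined as the harmonic mean of precision and recall, that is, their product divided by their average. The sketch below assumes that reading; the function name `f_score` and the sample values are hypothetical.

```r
# Harmonic mean of precision and recall.
f_score <- function(precision, recall) {
  (recall * precision) / ((recall + precision) / 2)
}

# Example: precision 0.5 and recall 1.0 give 0.5 / 0.75.
example <- f_score(0.5, 1.0)
print(example)  # about 0.667
```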

46: Which of the following is the default value of the parameter HISTORICAL_MODEL_GAP used in Microsoft time series algorithm?

A.   10

B.   1

C.   0

D.   5

47: Which of the following advanced statistics techniques is used for identifying latent variables that form groups?

A.   Regression analysis

B.   ANOVA

C.   Factor analysis

D.   Logistic regression

48: In data mining, which of the following options correctly defines Precision, which is used for assessing the quality of text retrieval?

A.   Precision = |{Relevant} ∩ {Retrieved}| / |{Retrieved}|

B.   Precision = |{Retrieved} ∪ {F-score}| + |{F-score}|

C.   Precision = |{Recall} / {F-score}| x |{Recall}|

D.   Precision = |{F-score} x {Recall}| - |{F-score}|

49:

Which of the given options will be the output of the following code when it is executed in R?

var <- c(8,4,NA,12)

mean(var, na.rm=TRUE)


A.  

[1] 2

B.  

[1] 4

C.  

[1] 8

D.  

[1] 10

E.  

The code will throw an error.
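
A short sketch of how `na.rm` changes the result of `mean()`, assuming the vector is `c(8, 4, NA, 12)`. The names `with_na` and `without_na` are illustrative.

```r
var <- c(8, 4, NA, 12)

# With the missing value left in, the mean is itself missing.
with_na <- mean(var)
print(with_na)  # [1] NA

# na.rm = TRUE drops the NA first, leaving (8 + 4 + 12) / 3.
without_na <- mean(var, na.rm = TRUE)
print(without_na)  # [1] 8
```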

50: Which of the following text retrieval measures is the percentage of documents, which are relevant to the query and were actually retrieved?

A.   Precision

B.   Recall

C.   F-score

D.   None of the above
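
Precision and recall are easy to confuse, so here is a minimal sketch that computes both from two sets of document IDs. The IDs and variable names are made up for illustration.

```r
# Documents that are truly relevant, and documents the system returned.
relevant  <- c("d1", "d2", "d3", "d4")
retrieved <- c("d2", "d3", "d5")

# Documents that are both relevant and retrieved.
hits <- intersect(relevant, retrieved)

# Precision: share of retrieved documents that are relevant (2 of 3).
precision <- length(hits) / length(retrieved)

# Recall: share of relevant documents that were retrieved (2 of 4).
recall <- length(hits) / length(relevant)

print(c(precision = precision, recall = recall))
```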