In association rules, given items X and Y, what does lift measure?

A.

Percentage of transactions that contain an itemset with X

B.

Percentage of transactions with X that also contain Y

C.

Difference in the probability of X and Y appearing together compared with expectations as if they were statistically independent

D.

How many times more often X and Y occur together than expected if they were statistically independent, expressed as a ratio
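
For reference, lift is commonly computed from supports as support(X and Y) / (support(X) * support(Y)). The sketch below is a minimal illustration of that formula; the function name `lift` and the four toy transactions are made up for this example and do not come from any exam material.

```python
def lift(transactions, x, y):
    """Return support(X and Y) / (support(X) * support(Y))."""
    n = len(transactions)
    support_x = sum(1 for t in transactions if x in t) / n
    support_y = sum(1 for t in transactions if y in t) / n
    support_xy = sum(1 for t in transactions if x in t and y in t) / n
    if support_x == 0 or support_y == 0:
        raise ValueError("lift is undefined when X or Y never occurs")
    return support_xy / (support_x * support_y)


# Hypothetical toy data: each transaction is a set of items.
transactions = [
    {"bread", "butter"},
    {"bread", "butter"},
    {"bread"},
    {"milk"},
]

lift_value = lift(transactions, "bread", "butter")
print(lift_value)
```

A value above 1 suggests the two items co-occur more often than independence would predict, a value near 1 suggests no association, and a value below 1 suggests they tend not to appear together.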

Which analytic technique would be appropriate to estimate home sale price in U.S. dollars as a function of square footage, number of bedrooms, and lot size?

A.

Time series analysis

B.

Linear regression

C.

Naive Bayesian classification

D.

K-means clustering
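
As a rough illustration of fitting a continuous outcome such as price to a numeric input, here is a minimal ordinary least squares sketch with a single predictor. The three data points and the helper name `fit_simple_ols` are hypothetical; a model with square footage, bedrooms, and lot size together would need a multivariate solver, which is omitted here for brevity.

```python
def fit_simple_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: square footage and sale price in U.S. dollars.
square_feet = [1000.0, 1500.0, 2000.0]
sale_price = [200000.0, 300000.0, 400000.0]

intercept, slope = fit_simple_ols(square_feet, sale_price)
predicted = intercept + slope * 1800.0  # estimate for an 1800 sq ft home
print(intercept, slope, predicted)
```

The fitted slope is read as the estimated change in price per additional square foot, under the assumption that the relationship is linear.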

You build a decision tree to classify five different types of customers based on their browsing history from a sample of 500. The resulting decision tree has 17 layers. One of the leaf nodes has only three customers.

What do you conclude?

A.

The decision tree needs to be rebuilt without the three customers

B.

The decision tree needs to be rebuilt to see if the results change

C.

The sample size is too small, so the classes may not be accurate

D.

Due to the large number of layers, there may be an overfitting problem

How should project results be communicated to executives and the project sponsor?

A.

Focus on business outcomes and benefits

B.

Demonstrate your technical prowess to establish credibility

C.

Provide model performance visualizations

D.

Emphasize coding details and technical requirements

What converts SQL-like commands into either Tez, Spark, or MapReduce jobs that are submitted to the Hadoop cluster?

A.

Pig

B.

HBase

C.

Hive

D.

Mahout

In the data preparation phase of the data analytics lifecycle, what does the term “data conditioning” refer to?

A.

Building training and testing datasets

B.

Identifying relationships and correlations among variables

C.

Deploying the model and monitoring its performance

D.

Cleaning the data, normalizing datasets, and performing transformations

What data asset is an example of quasi-structured data?

A.

Excel file

B.

Clickstream data

C.

Relational database table

D.

Comma-separated value file

Refer to the exhibit.

To predict whether or not a customer will renew their annual property insurance policy, an insurance company built and operationalized a naïve Bayes classification model. In the model, there are two class labels, renewal and non-renewal, that are assigned to each customer based on their attributes.

A subset of the key attributes, their values, and corresponding conditional probabilities are provided in the exhibit.

A customer has the following attributes:

● Age is greater than 65 years

● Owns their own home

● Renewal month is August

If 20% of customers do not renew the policy every year, what is the score for a renewal in the naïve Bayesian model for the customer described above?

A.

0.0022

B.

0.0027

C.

0.0270

D.

0.0216
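
The general shape of the calculation can be sketched as follows: a naïve Bayes score for a class is the class prior multiplied by the conditional probability of each observed attribute value given that class. With two classes and a 20% non-renewal rate, the renewal prior is 0.8. The exhibit is not reproduced here, so the three conditional probabilities below are placeholder values chosen only to show the arithmetic; they are not the exhibit's values, and the printed result is not the answer to the question.

```python
import math


def naive_bayes_score(prior, conditionals):
    """Unnormalized naive Bayes score: prior * product of P(attribute | class)."""
    return prior * math.prod(conditionals)


# Prior for renewal follows from the stated 20% non-renewal rate.
prior_renewal = 1.0 - 0.20

# PLACEHOLDER values, NOT taken from the exhibit:
# P(age > 65 | renewal), P(owns home | renewal), P(month = August | renewal)
hypothetical_conditionals = [0.5, 0.5, 0.1]

score = naive_bayes_score(prior_renewal, hypothetical_conditionals)
print(round(score, 4))
```

To classify the customer, the same score would be computed for the non-renewal class and the larger of the two scores would determine the label.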

In ANOVA, what is the null hypothesis for k population means?

A.

All population means are equal to each other

B.

At least two population means are equal

C.

At least two population means are not equal

D.

At most k-1 population means are equal
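
For context, one-way ANOVA tests its null hypothesis with an F statistic, the ratio of between-group variance to within-group variance. The sketch below computes that ratio by hand; the function name `one_way_f_statistic` and the three small groups are hypothetical and serve only to show the calculation.

```python
def one_way_f_statistic(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares, with k - 1 degrees of freedom.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares, with n_total - k degrees of freedom.
    ss_within = sum(
        (value - m) ** 2 for g, m in zip(groups, group_means) for value in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical samples from k = 3 populations.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat = one_way_f_statistic(groups)
print(f_stat)
```

A large F statistic relative to the F distribution with (k - 1, n - k) degrees of freedom is evidence against the null hypothesis.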

What is part of the model output for a linear regression?

A.

The assignment of each input datum to a cluster

B.

Coefficients indicating relative impact of the input variables on the outcome

C.

The set of all rules X -> Y with minimum support and confidence

D.

Probability score for each possible class label