In which phase of the data analytics lifecycle do Data Scientists spend the most time in a project?

A.

Discovery

B.

Data Preparation

C.

Model Building

D.

Communicate Results

Projecting a multi-dimensional dataset onto which vector yields the greatest variance?

A.

first principal component

B.

first eigenvector

C.

not enough information given to answer

D.

second eigenvector

E.

second principal component
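As a rough illustration of the idea behind this question, the sketch below builds a small synthetic 2-D dataset (the data and all function names are assumptions made for this example, not part of the question) and compares the variance of the data projected onto the two principal directions. It uses the closed-form angle of the major axis of a 2 x 2 covariance matrix rather than a linear algebra library.

```python
import math
import random

def covariance_2d(points):
    """Sample covariance entries (sxx, sxy, syy) of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy

def principal_axes_2d(sxx, sxy, syy):
    """Unit vectors along the first and second principal directions."""
    # Angle of the direction that maximizes projected variance.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    pc1 = (math.cos(theta), math.sin(theta))
    pc2 = (-math.sin(theta), math.cos(theta))  # orthogonal to pc1
    return pc1, pc2

def projected_variance(points, v):
    """Sample variance of the data projected onto unit vector v."""
    scores = [p[0] * v[0] + p[1] * v[1] for p in points]
    m = sum(scores) / len(scores)
    return sum((s - m) ** 2 for s in scores) / (len(scores) - 1)

# Synthetic, correlated data (an assumption for illustration only).
random.seed(0)
points = []
for _ in range(1000):
    x = random.gauss(0.0, 3.0)
    points.append((x, 0.5 * x + random.gauss(0.0, 1.0)))

sxx, sxy, syy = covariance_2d(points)
pc1, pc2 = principal_axes_2d(sxx, sxy, syy)
var_pc1 = projected_variance(points, pc1)
var_pc2 = projected_variance(points, pc2)
print("variance along first principal direction: ", var_pc1)
print("variance along second principal direction:", var_pc2)
```

In this sketch the two projected variances add up to the total variance of the data, and the first direction carries the larger share.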

Of all the smokers in a particular district, 40% prefer brand A and 60% prefer brand B. Of those smokers who prefer brand A, 30% are female, and of those who prefer brand B, 40% are female. What is the probability that a randomly selected smoker prefers brand A, given that the person selected is female?

Which of the following is the best way to solve this problem?

A.

Bayes' Theorem

B.

Poisson Distribution

C.

Binomial Distribution

D.

None of the above
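The conditional probability asked for in this question can be worked out with a few lines of arithmetic. The sketch below is one way to do it, using only the percentages stated in the question; the variable names are chosen for this example.

```python
from fractions import Fraction

# Figures taken directly from the question.
p_a = Fraction(40, 100)          # P(prefers brand A)
p_b = Fraction(60, 100)          # P(prefers brand B)
p_f_given_a = Fraction(30, 100)  # P(female | brand A)
p_f_given_b = Fraction(40, 100)  # P(female | brand B)

# Law of total probability: P(female).
p_f = p_a * p_f_given_a + p_b * p_f_given_b

# Conditional probability: P(brand A | female).
p_a_given_f = p_a * p_f_given_a / p_f

print(p_a_given_f)  # 1/3
```

Using `Fraction` keeps the result exact, so the answer comes out as 1/3 rather than a rounded decimal.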

Select the correct option that applies to L2 regularization.

A.

Computationally efficient due to having analytical solutions

B.

Non-sparse outputs

C.

No feature selection

You have collected hundreds of parameters about thousands of websites, e.g. daily hits, average time on the website, number of unique visitors, number of returning visitors, etc. Now you have to find the most important parameters that best describe a website. Which of the following techniques will you use?

A.

PCA (Principal component analysis)

B.

Linear Regression

C.

Logistic Regression

D.

Clustering

Classification and regression are examples of ___________.

A.

supervised learning

B.

unsupervised learning

C.

Clustering

D.

Density estimation

Marie is getting married tomorrow, at an outdoor ceremony in the desert. In recent years, it has rained only 5 days each year. Unfortunately, the weatherman has predicted rain for tomorrow. When it actually rains, the weatherman correctly forecasts rain 90% of the time. When it doesn't rain, he incorrectly forecasts rain 10% of the time. Which of the following will you use to calculate the probability that it will rain on the day of Marie’s wedding?

A.

Naive Bayes

B.

Logistic Regression

C.

Random Decision Forests

D.

All of the above
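For reference, the probability of rain given the forecast can be computed from the figures in the question by applying Bayes' rule directly. The sketch below assumes a 365-day year, which the question does not state; the variable names are chosen for this example.

```python
from fractions import Fraction

DAYS_PER_YEAR = 365  # assumption: the question does not give the year length

p_rain = Fraction(5, DAYS_PER_YEAR)        # prior: it rains 5 days a year
p_dry = 1 - p_rain
p_forecast_given_rain = Fraction(90, 100)  # correct rain forecast
p_forecast_given_dry = Fraction(10, 100)   # false rain forecast

# Law of total probability: P(rain is forecast).
p_forecast = (p_forecast_given_rain * p_rain
              + p_forecast_given_dry * p_dry)

# Bayes' rule: P(rain | rain is forecast).
p_rain_given_forecast = p_forecast_given_rain * p_rain / p_forecast

print(p_rain_given_forecast)         # 1/9
print(float(p_rain_given_forecast))  # roughly 0.111
```

Even with a 90% accurate forecast, the low base rate of rain keeps the posterior probability at about 11% under these assumptions.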

Refer to the Exhibit.

In the Exhibit, the table shows the values for the input Boolean attributes "A", "B", and "C". It also shows the values for the output attribute "class". Which decision tree is valid for the data?

A.

Tree A

B.

Tree B

C.

Tree C

D.

Tree D

The figure below shows a plot of the data of a data matrix M that is 1000 x 2. Which line represents the first principal component?

A.

yellow

B.

blue

C.

Neither

What is one modeling or descriptive statistical function in MADlib that is typically not provided in a standard relational database?

A.

Expected value

B.

Variance

C.

Linear regression

D.

Quantiles