A data engineering team has created a Structured Streaming pipeline that processes data in micro-batches and populates gold-level tables. The micro-batches are triggered every minute.

A data analyst has created a dashboard based on this gold-level data. The project stakeholders want to see the results in the dashboard updated within one minute or less of new data becoming available within the gold-level tables.

Which of the following cautions should the data analyst share prior to setting up the dashboard to complete this task?

A.

The required compute resources could be costly

B.

The gold-level tables are not appropriately clean for business reporting

C.

The streaming data is not an appropriate data source for a dashboard

D.

The streaming cluster is not fault tolerant

E.

The dashboard cannot be refreshed that quickly

Which statement describes descriptive statistics?

A.

A branch of statistics that uses a variety of data analysis techniques to infer properties of an underlying distribution of probability.

B.

A branch of statistics that uses summary statistics to categorically describe and summarize data.

C.

A branch of statistics that uses summary statistics to quantitatively describe and summarize data.

D.

A branch of statistics that uses quantitative variables that must take on a finite or countably infinite set of values.

Which statement about subqueries is correct?

A.

Subqueries are not available in Databricks SQL

B.

Subqueries can be used like other user-defined functions to transform data into different data types.

C.

Subqueries can retrieve data without requiring the creation of a table or view.

D.

Subqueries can be used like other built-in functions to transform data into different data types.
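For readers who want to see a subquery in action, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in, not Databricks SQL, and the orders table and its rows are hypothetical sample data invented for illustration. The inner SELECT computes an average inline, and that intermediate result is never saved as a table or view.

```python
import sqlite3

# In-memory database with a small, hypothetical "orders" table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ana", 10.0), ("ben", 40.0), ("cy", 100.0)],
)

# The parenthesized inner SELECT is the subquery: it computes the average
# amount on the fly, and the outer query filters against that value.
rows = conn.execute(
    "SELECT customer, amount FROM orders "
    "WHERE amount > (SELECT AVG(amount) FROM orders) "
    "ORDER BY customer"
).fetchall()

print(rows)
conn.close()
```

The same nested SELECT pattern is standard SQL, although exact syntax support can differ between engines.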

Data professionals with varying titles use the Databricks SQL service as the primary touchpoint with the Databricks Lakehouse Platform. However, some users will use other services like Databricks Machine Learning or Databricks Data Science and Engineering.

Which of the following roles uses Databricks SQL as a secondary service while primarily using one of the other services?

A.

Business analyst

B.

SQL analyst

C.

Data engineer

D.

Business intelligence analyst

E.

Data analyst

A data scientist has asked a data analyst to create histograms for every continuous variable in a data set. The data analyst needs to identify which columns are continuous in the data set.

What describes a continuous variable?

A.

A quantitative variable that never stops changing

B.

A quantitative variable that can take on a finite or countably infinite set of values

C.

A quantitative variable that can take on an uncountable set of values

D.

A categorical variable in which the number of categories continues to increase over time

A data analyst needs to use the Databricks Lakehouse Platform to quickly create SQL queries and data visualizations. It is a requirement that the compute resources in the platform can be made serverless, and it is expected that data visualizations can be placed within a dashboard.

Which of the following Databricks Lakehouse Platform services/capabilities meets all of these requirements?

A.

Delta Lake

B.

Databricks Notebooks

C.

Tableau

D.

Databricks Machine Learning

E.

Databricks SQL

A data analyst has created a Query in Databricks SQL, and now they want to create two data visualizations from that Query and add both of those data visualizations to the same Databricks SQL Dashboard.

Which of the following steps will they need to take when creating and adding both data visualizations to the Databricks SQL Dashboard?

A.

They will need to alter the Query to return two separate sets of results.

B.

They will need to add two separate visualizations to the dashboard based on the same Query.

C.

They will need to create two separate dashboards.

D.

They will need to decide on a single data visualization to add to the dashboard.

E.

They will need to copy the Query and create one data visualization per query.

Delta Lake stores table data as a series of data files, but it also stores a lot of other information.

Which of the following is stored alongside data files when using Delta Lake?

A.

None of these

B.

Table metadata, data summary visualizations, and owner account information

C.

Table metadata

D.

Data summary visualizations

E.

Owner account information

In which of the following situations will the mean value and median value of a variable be meaningfully different?

A.

When the variable contains no outliers

B.

When the variable contains no missing values

C.

When the variable is of the boolean type

D.

When the variable is of the categorical type

E.

When the variable contains a lot of extreme outliers
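To explore how the mean and median respond to the shape of the data, here is a small sketch using Python's standard statistics module. The two lists are made-up sample values and mean_median_gap is a hypothetical helper, so treat this as an illustration rather than a general rule.

```python
import statistics

# Hypothetical sample values (illustration only).
typical = [48, 50, 51, 49, 52]
with_outliers = typical + [5000, 9000]  # same data plus two extreme values


def mean_median_gap(values):
    """Return the absolute difference between the mean and the median."""
    return abs(statistics.mean(values) - statistics.median(values))


for name, values in [("typical", typical), ("with_outliers", with_outliers)]:
    print(
        name,
        "mean:", round(statistics.mean(values), 2),
        "median:", statistics.median(values),
        "gap:", round(mean_median_gap(values), 2),
    )
```

In this sample the two extreme values pull the mean far upward while the median barely moves.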

A data analyst has been asked to produce a visualization that shows the flow of users through a website.

Which of the following is used for visualizing this type of flow?

A.

Heatmap

B.

Choropleth

C.

Word Cloud

D.

Pivot Table

E.

Sankey