
A company is collaborating with a partner that does not use Databricks but needs access to a large historical dataset stored in Delta format. The data engineer needs to ensure that the partner can access the data securely, without the need for them to set up an account, and with read-only access.

How should the data be shared?

A.

Share the dataset using Delta Sharing, which allows your partner to access the data using a secure, read-only URL without requiring a Databricks account, ensuring that they cannot modify the data.

B.

Share the dataset using Unity Catalog, ensuring that both teams have full write access to the data within the same organization.

C.

Share the dataset by exporting it to a CSV file and manually transferring the file to the partner's system.

D.

Grant your partner access to your Databricks workspace and assign them full write permissions to the Delta table, enabling them to modify the dataset.

To handle any kind of failure by restarting and/or reprocessing, Structured Streaming must reliably track the exact progress of its processing. Which pair of approaches does Spark use to record the offset range of the data being processed in each trigger?

A.

Checkpointing and Write-ahead Logs

B.

Structured Streaming cannot record the offset range of the data being processed in each trigger.

C.

Replayable Sources and Idempotent Sinks

D.

Write-ahead Logs and Idempotent Sinks

E.

Checkpointing and Idempotent Sinks
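The correct pairing here is checkpointing and write-ahead logs: before processing each trigger, the engine durably records the offset range in a write-ahead log inside the checkpoint location, so a restart can resume (or reprocess) exactly where it left off. The following is a minimal pure-Python sketch of that idea, not Spark's actual implementation; all names and file layouts are illustrative.

```python
import json
import os
import tempfile

# Illustrative sketch: durably record each trigger's offset range in a
# write-ahead log (one JSON file per batch) inside a checkpoint directory
# BEFORE processing, so recovery knows what to replay after a failure.

def record_offsets(checkpoint_dir, batch_id, start, end):
    """Write the offset range for a batch to the WAL before processing it."""
    path = os.path.join(checkpoint_dir, f"{batch_id}.json")
    with open(path, "w") as f:
        json.dump({"batch_id": batch_id, "start": start, "end": end}, f)

def last_recorded_offsets(checkpoint_dir):
    """On restart, read the most recent WAL entry to know where to resume."""
    entries = sorted(os.listdir(checkpoint_dir))
    if not entries:
        return None
    with open(os.path.join(checkpoint_dir, entries[-1])) as f:
        return json.load(f)

checkpoint_dir = tempfile.mkdtemp()
record_offsets(checkpoint_dir, 0, start=0, end=100)
record_offsets(checkpoint_dir, 1, start=100, end=250)

# After a simulated crash, recovery sees batch 1's range and can replay it.
print(last_recorded_offsets(checkpoint_dir))
```

In real Structured Streaming this is what the `checkpointLocation` option provides; paired with replayable sources and idempotent sinks, it yields end-to-end exactly-once guarantees, but the mechanism the question asks about is checkpointing plus write-ahead logs.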

What is the functionality of Auto Loader in Databricks?

A.

Auto Loader automatically ingests and processes new files from cloud storage, handling batch data with support for schema evolution.

B.

Auto Loader automatically ingests and processes new files from cloud storage, handling only streaming data with no support for schema evolution.

C.

Auto Loader automatically ingests and processes new files from cloud storage, handling batch and streaming data with no support for schema evolution.

D.

Auto Loader automatically ingests and processes new files from cloud storage, handling both batch and streaming data with support for schema evolution.

A data engineer is onboarding a new Bronze ingestion pipeline in Databricks with Unity Catalog. The team wants Databricks to handle storage layout, apply platform optimizations over time, and simplify lifecycle management so that when a table is dropped, its underlying data is also cleaned up according to Databricks-managed retention policies.

Which table type should the data engineer create for these ingestion tables?

A.

Managed tables so that Unity Catalog manages both metadata and underlying data lifecycle

B.

External tables with a LOCATION pointing to an external volume for full control of file layout

C.

Foreign tables federated from an external catalog to delegate optimization to the source system

D.

Temporary views over files to avoid table-level governance and lifecycle coupling

Which of the following must be specified when creating a new Delta Live Tables pipeline?

A.

A key-value pair configuration

B.

The preferred DBU/hour cost

C.

A path to cloud storage location for the written data

D.

A location of a target database for the written data

E.

At least one notebook library to be executed

A data engineer needs to conduct Exploratory Data Analysis (EDA) on data residing in a database within the company’s custom-defined cloud network. The data engineer is using SQL for this task.

Which type of SQL Warehouse will enable the data engineer to process large numbers of queries quickly and cost-effectively?

A.

All-purpose compute cluster

B.

Pro SQL Warehouse

C.

SQL Serverless Warehouse

D.

Classic SQL Warehouse

A data analyst has created a Delta table sales that is used by the entire data analysis team. They want help from the data engineering team to implement a series of tests to ensure the data is clean. However, the data engineering team uses Python for its tests rather than SQL.

Which of the following commands could the data engineering team use to access sales in PySpark?

A.

SELECT * FROM sales

B.

There is no way to share data between PySpark and SQL.

C.

spark.sql( " sales " )

D.

spark.delta.table( " sales " )

E.

spark.table( " sales " )

A data engineer is processing ingested streaming tables and needs to filter out NULL values in the order_datetime column from the raw streaming table orders_raw and store the results in a new table orders_valid using DLT.

Which code snippet should the data engineer use?

[Code snippet options A-D were shown as images and are not reproduced in the text.]

A.

Option A

B.

Option B

C.

Option C

D.

Option D
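Whichever lettered snippet is correct, the intended pattern reads the raw stream and drops rows whose order_datetime is NULL; in DLT this is typically done with a filter on the streaming read or an expectation such as @dlt.expect_or_drop. The following is a plain-Python analogue of that filter, with made-up sample rows, to show the transformation being asked about:

```python
# Plain-Python analogue of the DLT step: keep only rows whose
# order_datetime is present. The sample rows below are made up.
orders_raw = [
    {"order_id": 1, "order_datetime": "2024-01-05T10:00:00"},
    {"order_id": 2, "order_datetime": None},
    {"order_id": 3, "order_datetime": "2024-01-06T12:30:00"},
]

# Equivalent to filtering on "order_datetime IS NOT NULL" in SQL/DLT.
orders_valid = [row for row in orders_raw if row["order_datetime"] is not None]

print([row["order_id"] for row in orders_valid])  # → [1, 3]
```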

What is stored in a Databricks customer's cloud account?

A.

Data

B.

Cluster management metadata

C.

Databricks web application

D.

Notebooks

Which of the following SQL keywords can be used to convert a table from a long format to a wide format?

A.

PIVOT

B.

CONVERT

C.

WHERE

D.

TRANSFORM

E.

SUM
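PIVOT is the keyword that turns distinct values of a column into new columns, i.e., reshapes long data into wide data. The following is a small pure-Python sketch of that reshape on made-up sales rows; the table and column names are illustrative, not from the question:

```python
from collections import defaultdict

# Long format: one row per (product, quarter) pair.
long_rows = [
    {"product": "A", "quarter": "Q1", "amount": 10},
    {"product": "A", "quarter": "Q2", "amount": 20},
    {"product": "B", "quarter": "Q1", "amount": 5},
]

# Wide format: one entry per product, one column per quarter, mirroring
# SQL along the lines of:
#   SELECT * FROM t PIVOT (SUM(amount) FOR quarter IN ('Q1', 'Q2'))
wide = defaultdict(dict)
for row in long_rows:
    wide[row["product"]][row["quarter"]] = row["amount"]

print(dict(wide))  # → {'A': {'Q1': 10, 'Q2': 20}, 'B': {'Q1': 5}}
```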