You are responsible for managing Cloud Storage buckets for a research company. Your company has well-defined data tiering and retention rules. You need to optimize storage costs while meeting your data retention needs. What should you do?

A.

Configure the buckets to use the Archive storage class.

B.

Configure a lifecycle management policy on each bucket to downgrade the storage class and remove objects based on age.

C.

Configure the buckets to use the Standard storage class and enable Object Versioning.

D.

Configure the buckets to use the Autoclass feature.
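
For reference, the kind of lifecycle management policy that option B describes pairs an action with an age condition. The sketch below only builds such a configuration as JSON. The helper name `build_lifecycle_config`, the NEARLINE target class, and the 30-day and 365-day thresholds are illustrative assumptions, not values from the question, and the exact file layout a given tool expects should be checked against the Cloud Storage documentation.

```python
import json


def build_lifecycle_config(downgrade_after_days, delete_after_days,
                           target_class="NEARLINE"):
    """Build a lifecycle configuration with two age-based rules.

    Hypothetical helper: the thresholds and target class are placeholders.
    """
    if delete_after_days <= downgrade_after_days:
        raise ValueError("objects must be downgraded before they are deleted")
    return {
        "lifecycle": {
            "rule": [
                {
                    # Move objects to a colder storage class once they reach this age.
                    "action": {"type": "SetStorageClass",
                               "storageClass": target_class},
                    "condition": {"age": downgrade_after_days},
                },
                {
                    # Remove objects once the retention period has passed.
                    "action": {"type": "Delete"},
                    "condition": {"age": delete_after_days},
                },
            ]
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_lifecycle_config(30, 365), indent=2))
```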

You are developing a data ingestion pipeline to load small CSV files into BigQuery from Cloud Storage. You want to load these files upon arrival to minimize data latency. You want to accomplish this with minimal cost and maintenance. What should you do?

A.

Use the bq command-line tool within a Cloud Shell instance to load the data into BigQuery.

B.

Create a Cloud Composer pipeline to load new files from Cloud Storage to BigQuery and schedule it to run every 10 minutes.

C.

Create a Cloud Run function to load the data into BigQuery that is triggered when data arrives in Cloud Storage.

D.

Create a Dataproc cluster to pull CSV files from Cloud Storage, process them using Spark, and write the results to BigQuery.

Your organization needs to implement near real-time analytics for thousands of events arriving each second in Pub/Sub. The incoming messages require transformations. You need to configure a pipeline that processes, transforms, and loads the data into BigQuery while minimizing development time. What should you do?

A.

Use a Google-provided Dataflow template to process the Pub/Sub messages, perform transformations, and write the results to BigQuery.

B.

Create a Cloud Data Fusion instance and configure Pub/Sub as a source. Use Data Fusion to process the Pub/Sub messages, perform transformations, and write the results to BigQuery.

C.

Load the data from Pub/Sub into Cloud Storage using a Cloud Storage subscription. Create a Dataproc cluster, use PySpark to perform transformations in Cloud Storage, and write the results to BigQuery.

D.

Use Cloud Run functions to process the Pub/Sub messages, perform transformations, and write the results to BigQuery.

You work for a financial organization that stores transaction data in BigQuery. Your organization has a regulatory requirement to retain data for a minimum of seven years for auditing purposes. You need to ensure that the data is retained for seven years using an efficient and cost-optimized approach. What should you do?

A.

Create a partition by transaction date, and set the partition expiration policy to seven years.

B.

Set the table-level retention policy in BigQuery to seven years.

C.

Set the dataset-level retention policy in BigQuery to seven years.

D.

Export the BigQuery tables to Cloud Storage daily, and enforce a lifecycle management policy that has a seven-year retention rule.

You need to design a data pipeline that ingests data from CSV, Avro, and Parquet files into Cloud Storage. The data includes raw user input. You need to remove all malicious SQL injections before storing the data in BigQuery. Which data manipulation methodology should you choose?

A.

EL

B.

ELT

C.

ETL

D.

ETLT

Your organization uses Dataflow pipelines to process real-time financial transactions. You discover that one of your Dataflow jobs has failed. You need to troubleshoot the issue as quickly as possible. What should you do?

A.

Set up a Cloud Monitoring dashboard to track key Dataflow metrics, such as data throughput, error rates, and resource utilization.

B.

Create a custom script to periodically poll the Dataflow API for job status updates, and send email alerts if any errors are identified.

C.

Navigate to the Dataflow Jobs page in the Google Cloud console. Use the job logs and worker logs to identify the error.

D.

Use the gcloud CLI tool to retrieve job metrics and logs, and analyze them for errors and performance bottlenecks.

Your organization uses a BigQuery table that is partitioned by ingestion time. You need to remove data that is older than one year to reduce your organization’s storage costs. You want to use the most efficient approach while minimizing cost. What should you do?

A.

Create a scheduled query that periodically runs an UPDATE statement in SQL that sets the “deleted” column to “yes” for data that is more than one year old. Create a view that filters out rows that have been marked deleted.

B.

Create a view that filters out rows that are older than one year.

C.

Require users to specify a partition filter using the ALTER TABLE statement in SQL.

D.

Set the table partition expiration period to one year using the ALTER TABLE statement in SQL.
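
For reference, the statement that option D refers to can be written with the `partition_expiration_days` table option. The sketch below only builds the SQL text. The helper name `partition_expiration_ddl` and the table name `mydataset.events` are hypothetical, and 365 days is used as an approximation of "one year".

```python
def partition_expiration_ddl(table, days):
    """Return an ALTER TABLE statement that sets partition expiration.

    Hypothetical helper: it formats the statement and does not run it.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    return (
        f"ALTER TABLE `{table}` "
        f"SET OPTIONS (partition_expiration_days = {days})"
    )


if __name__ == "__main__":
    # Placeholder table name, chosen for illustration only.
    print(partition_expiration_ddl("mydataset.events", 365))
```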

Your organization sends IoT event data to a Pub/Sub topic. Subscriber applications read and perform transformations on the messages before storing them in the data warehouse. During particularly busy times when more data is being written to the topic, you notice that the subscriber applications are not acknowledging messages within the deadline. You need to modify your pipeline to handle these activity spikes and continue to process the messages. What should you do?

A.

Retry messages until they are acknowledged.

B.

Implement flow control on the subscribers.

C.

Forward unacknowledged messages to a dead-letter topic.

D.

Seek back to the last acknowledged message.

You work for a retail company that collects customer data from various sources:

    Online transactions: Stored in a MySQL database

    Customer feedback: Stored as text files on a company server

    Social media activity: Streamed in real time from social media platforms

You need to design a data pipeline to extract and load the data into the appropriate Google Cloud storage system(s) for further analysis and ML model training. What should you do?

A.

Copy the online transactions data into Cloud SQL for MySQL. Import the customer feedback into BigQuery. Stream the social media activity into Cloud Storage.

B.

Extract and load the online transactions data into BigQuery. Load the customer feedback data into Cloud Storage. Stream the social media activity by using Pub/Sub and Dataflow, and store the data in BigQuery.

C.

Extract and load the online transactions data, customer feedback data, and social media activity into Cloud Storage.

D.

Extract and load the online transactions data into Bigtable. Import the customer feedback data into Cloud Storage. Store the social media activity in Cloud SQL for MySQL.

You want to process and load a daily sales CSV file stored in Cloud Storage into BigQuery for downstream reporting. You need to quickly build a scalable data pipeline that transforms the data while providing insights into data quality issues. What should you do?

A.

Create a batch pipeline in Cloud Data Fusion by using a Cloud Storage source and a BigQuery sink.

B.

Load the CSV file as a table in BigQuery, and use scheduled queries to run SQL transformation scripts.

C.

Load the CSV file as a table in BigQuery. Create a batch pipeline in Cloud Data Fusion by using a BigQuery source and sink.

D.

Create a batch pipeline in Dataflow by using the Cloud Storage CSV file to BigQuery batch template.