Spring Sale Special - Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: sntaclus

One of your models is trained using data provided by a third-party data broker. The data broker does not reliably notify you of formatting changes in the data. You want to make your model training pipeline more robust to issues like this. What should you do?

A.

Use TensorFlow Data Validation to detect and flag schema anomalies.

B.

Use TensorFlow Transform to create a preprocessing component that will normalize data to the expected distribution, and replace values that don’t match the schema with 0.

C.

Use tf.math to analyze the data, compute summary statistics, and flag statistical anomalies.

D.

Use custom TensorFlow functions at the start of your model training to detect and flag known formatting errors.

You work for a biotech startup that is experimenting with deep learning ML models based on properties of biological organisms. Your team frequently works on early-stage experiments with new architectures of ML models, and writes custom TensorFlow ops in C++. You train your models on large datasets and large batch sizes. Your typical batch size has 1024 examples, and each example is about 1 MB in size. The average size of a network with all weights and embeddings is 20 GB. What hardware should you choose for your models?

A.

A cluster with 2 n1-highcpu-64 machines, each with 8 NVIDIA Tesla V100 GPUs (128 GB GPU memory in total), and a n1-highcpu-64 machine with 64 vCPUs and 58 GB RAM

B.

A cluster with 2 a2-megagpu-16g machines, each with 16 NVIDIA Tesla A100 GPUs (640 GB GPU memory in total), 96 vCPUs, and 1.4 TB RAM

C.

A cluster with an n1-highcpu-64 machine with a v2-8 TPU and 64 GB RAM

D.

A cluster with 4 n1-highcpu-96 machines, each with 96 vCPUs and 86 GB RAM

You work for an auto insurance company. You are preparing a proof-of-concept ML application that uses images of damaged vehicles to infer damaged parts Your team has assembled a set of annotated images from damage claim documents in the company ' s database The annotations associated with each image consist of a bounding box for each identified damaged part and the part name. You have been given a sufficient budget to tram models on Google Cloud You need to quickly create an initial model What should you do?

A.

Download a pre-trained object detection mode! from TensorFlow Hub Fine-tune the model in Vertex Al Workbench by using the annotated image data.

B.

Train an object detection model in AutoML by using the annotated image data.

C.

Create a pipeline in Vertex Al Pipelines and configure the AutoMLTrainingJobRunOp compon it to train a custom object detection model by using the annotated image data.

D.

Train an object detection model in Vertex Al custom training by using the annotated image data.

You have a demand forecasting pipeline in production that uses Dataflow to preprocess raw data prior to model training and prediction. During preprocessing, you employ Z-score normalization on data stored in BigQuery and write it back to BigQuery. New training data is added every week. You want to make the process more efficient by minimizing computation time and manual intervention. What should you do?

A.

Normalize the data using Google Kubernetes Engine

B.

Translate the normalization algorithm into SQL for use with BigQuery

C.

Use the normalizer_fn argument in TensorFlow ' s Feature Column API

D.

Normalize the data with Apache Spark using the Dataproc connector for BigQuery

You work on a data science team at a bank and are creating an ML model to predict loan default risk. You have collected and cleaned hundreds of millions of records worth of training data in a BigQuery table, and you now want to develop and compare multiple models on this data using TensorFlow and Vertex AI. You want to minimize any bottlenecks during the data ingestion state while considering scalability. What should you do?

A.

Use the BigQuery client library to load data into a dataframe, and use tf.data.Dataset.from_tensor_slices() to read it.

B.

Export data to CSV files in Cloud Storage, and use tf.data.TextLineDataset() to read them.

C.

Convert the data into TFRecords, and use tf.data.TFRecordDataset() to read them.

D.

Use TensorFlow I/O’s BigQuery Reader to directly read the data.

Your team is working on an NLP research project to predict political affiliation of authors based on articles they have written. You have a large training dataset that is structured like this:

You followed the standard 80%-10%-10% data distribution across the training, testing, and evaluation subsets. How should you distribute the training examples across the train-test-eval subsets while maintaining the 80-10-10 proportion?

A)

B)

C)

D)

A.

Option A

B.

Option B

C.

Option C

D.

Option D

You are the Director of Data Science at a large company, and your Data Science team has recently begun using the Kubeflow Pipelines SDK to orchestrate their training pipelines. Your team is struggling to integrate their custom Python code into the Kubeflow Pipelines SDK. How should you instruct them to proceed in order to quickly integrate their code with the Kubeflow Pipelines SDK?

A.

Use the func_to_container_op function to create custom components from the Python code.

B.

Use the predefined components available in the Kubeflow Pipelines SDK to access Dataproc, and run the custom code there.

C.

Package the custom Python code into Docker containers, and use the load_component_from_file function to import the containers into the pipeline.

D.

Deploy the custom Python code to Cloud Functions, and use Kubeflow Pipelines to trigger the Cloud Function.