Summer Sale Special - Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: sntaclus

A company needs to analyze a large dataset that is stored in Amazon S3 in Apache Parquet format. The company wants to use one-hot encoding for some of the columns.

The company needs a no-code solution to transform the data. The solution must store the transformed data back to the same S3 bucket for model training.

Which solution will meet these requirements?

A.

Configure an AWS Glue DataBrew project that connects to the data. Use the DataBrew interactive interface to create a recipe that performs the one-hot encoding transformation. Create a job to apply the transformation and write the output back to an S3 bucket.

B.

Use Amazon Athena SQL queries to perform the one-hot encoding transformation.

C.

Use an AWS Glue ETL interactive notebook to perform the transformation.

D.

Use Amazon Redshift Spectrum to perform the transformation.

An ML engineer wants to use, prepare, and load data from Amazon S3 for analytics. The ML engineer must run an extract, transform, and load (ETL) job to discover the schema of the data and to store the metadata.

Which solution will meet these requirements with the LEAST manual effort?

A.

Use AWS Glue to run the ETL job. Use the job to discover the schema and to store the associated metadata in the AWS Glue Data Catalog.

B.

Create an Amazon SageMaker Data Wrangler flow to run the ETL job. Use the job to discover the schema and to store the associated metadata in an S3 bucket.

C.

Create an ETL pipeline by using Amazon Athena integrated with AWS Step Functions. Use the pipeline to run the ETL job to discover the schema and to store the associated metadata in an S3 bucket.

D.

Launch an Amazon EC2 instance that includes the scikit-learn library to run the ETL job. Use the job to discover the schema and to store the associated metadata in Amazon Redshift.

An ML engineer is using Amazon Quick Suite (previously known as Amazon QuickSight) anomaly detection to detect very high or very low machine operating temperatures compared to normal. The ML engineer sets the Severity parameter to Low and above. The ML engineer sets the Direction parameter to All.

What effect will the ML engineer observe in the anomaly detection results if the ML engineer changes the Direction parameter to Lower than expected?

A.

Increased anomaly identification frequency and increased recall

B.

Decreased anomaly identification frequency and decreased recall

C.

Increased anomaly identification frequency and decreased recall

D.

Decreased anomaly identification frequency and increased recall

A company uses an ML model to recommend videos to users. The model is deployed on Amazon SageMaker AI. The model performed well initially after deployment, but the model ' s performance has degraded over time.

Which solution can the company use to identify model drift in the future?

A.

Create a monitoring job in SageMaker Model Monitor. Then create a baseline from the training dataset.

B.

Create a baseline from the training dataset. Then create a monitoring job in SageMaker Model Monitor.

C.

Create a baseline by using a built-in rule in SageMaker Clarify. Monitor the drift in Amazon CloudWatch.

D.

Retrain the model on new data. Compare the retrained model ' s performance to the original model ' s performance.

A company collects customer data daily and stores it as compressed files in an Amazon S3 bucket partitioned by date. Each month, analysts process the data, check data quality, and upload results to Amazon QuickSight dashboards.

An ML engineer needs to automatically check data quality before the data is sent to QuickSight, with the LEAST operational overhead.

Which solution will meet these requirements?

A.

Run an AWS Glue crawler monthly and use AWS Glue Data Quality rules to check data quality.

B.

Run an AWS Glue crawler and create a custom AWS Glue job with PySpark to evaluate data quality.

C.

Use AWS Lambda with Python scripts triggered by S3 uploads to evaluate data quality.

D.

Send S3 events to Amazon SQS and use Amazon CloudWatch Insights to evaluate data quality.

A company uses a hybrid cloud environment. A model that is deployed on premises uses data in Amazon 53 to provide customers with a live conversational engine.

The model is using sensitive data. An ML engineer needs to implement a solution to identify and remove the sensitive data.

Which solution will meet these requirements with the LEAST operational overhead?

A.

Deploy the model on Amazon SageMaker. Create a set of AWS Lambda functions to identify and remove the sensitive data.

B.

Deploy the model on an Amazon Elastic Container Service (Amazon ECS) cluster that uses AWS Fargate. Create an AWS Batch job to identify and remove the sensitive data.

C.

Use Amazon Macie to identify the sensitive data. Create a set of AWS Lambda functions to remove the sensitive data.

D.

Use Amazon Comprehend to identify the sensitive data. Launch Amazon EC2 instances to remove the sensitive data.

A company has deployed an ML model that detects fraudulent credit card transactions in real time in a banking application. The model uses Amazon SageMaker Asynchronous Inference. Consumers are reporting delays in receiving the inference results.

An ML engineer needs to implement a solution to improve the inference performance. The solution also must provide a notification when a deviation in model quality occurs.

Which solution will meet these requirements?

A.

Use SageMaker real-time inference for inference. Use SageMaker Model Monitor for notifications about model quality.

B.

Use SageMaker batch transform for inference. Use SageMaker Model Monitor for notifications about model quality.

C.

Use SageMaker Serverless Inference for inference. Use SageMaker Inference Recommender for notifications about model quality.

D.

Keep using SageMaker Asynchronous Inference for inference. Use SageMaker Inference Recommender for notifications about model quality.

An ML engineer at a credit card company built and deployed an ML model by using Amazon SageMaker AI. The model was trained on transaction data that contained very few fraudulent transactions. After deployment, the model is underperforming.

What should the ML engineer do to improve the model’s performance?

A.

Retrain the model with a different SageMaker built-in algorithm.

B.

Use random undersampling to reduce the majority class and retrain the model.

C.

Use Synthetic Minority Oversampling Technique (SMOTE) to generate synthetic minority samples and retrain the model.

D.

Use random oversampling to duplicate minority samples and retrain the model.

A government agency is conducting a national census to assess program needs by area and city. The census form collects approximately 500 responses from each citizen. The agency needs to analyze the data to extract meaningful insights. The agency wants to reduce the dimensions of the high-dimensional data to uncover hidden patterns.

Which solution will meet these requirements?

A.

Use the principal component analysis (PCA) algorithm in Amazon SageMaker AI.

B.

Use the t-Distributed Stochastic Neighbor Embedding (t-SNE) algorithm in Amazon SageMaker AI.

C.

Use the k-means algorithm in Amazon SageMaker AI.

D.

Use the Random Cut Forest (RCF) algorithm in Amazon SageMaker AI.

A bank needs to use Amazon SageMaker AI to create an ML model to determine which customers qualify for a new product. The bank must use algorithms that SageMaker AI directly supports. The model must be explainable to the bank ' s regulators.

Which modeling approach will meet these requirements?

A.

Train the model by using the Object2Vec algorithm.

B.

Train the model by using the linear learner algorithm.

C.

Train a neural network.

D.

Train the model by using the k-means algorithm.