Your dependent variable Y is a count, ranging from 0 to infinity. Because Y is approximately log-normally distributed, you decide to log-transform the data prior to performing a linear regression.
What should you do before log-transforming Y?
An AI practitioner incorporates risk considerations into a deployment plan and decides to log and store historical predictions for potential, future access requests.
Which ethical principle is this an example of?
You have a dataset with many features that you are using to classify a dependent variable. Because the sample size is small, you are worried about overfitting. Which algorithm is ideal to prevent overfitting?
Which two encoders can be used to transform categorical data into numerical features? (Select two.)
Your dependent variable data is a proportion. The observed range of your data is 0.01 to 0.99. The instrument used to generate the dependent variable data is known to generate low quality data for values close to 0 and close to 1. A colleague suggests performing a logit-transformation on the data prior to performing a linear regression. Which of the following is a concern with this approach?
Definition of logit-transformation
If p is the proportion: logit(p)=log(p/(l-p))
For each of the last 10 years, your team has been collecting data from a group of subjects, including their age and numerous biomarkers collected from blood samples. You are tasked with creating a prediction model of age using the biomarkers as input. You start by performing a linear regression using all of the data over the 10-year period, with age as the dependent variable and the biomarkers as predictors.
Which assumption of linear regression is being violated?
Which of the following regressions will help when there is the existence of near-linear relationships among the independent variables (collinearity)?
When should the model be retrained in the ML pipeline?
In addition to understanding model performance, what does continuous monitoring of bias and variance help ML engineers to do?
Which of the following is the definition of accuracy?