The selection criterion used in the forward selection method in the GLMSELECT procedure is:
A financial services manager wants to assess the probability that certain clients will default on their Home Equity Line of Credit (HELOC). A former employee left the code listed below.
The training data set is named HELOC, while a similar data set of more recent clients is named RECENT_HELOC.
Which SAS data steps will calculate the predicted probability of default on recent clients? (Choose two.)
Refer to the exhibit:
Based upon the comparative ROC plot for two competing models, which is the champion model and why?
The following LOGISTIC procedure output analyzes the relationship between a binary response and an ordinal predictor variable, wrist_size Using reference cell coding, the analyst selects Large (L) as the reference level.
What is the estimated logit for a person with large wrist size?
Click the calculator button to display a calculator if needed.
PROC GLMSELECT was used for building a model predicting the natural log of a baseball player's salary from certain performance and longevity statistics. The model used backward elimination using SBC as its selection criterion. The sequence of steps is summarized in the graphic shown below:
At Step 9 number of at bats (nAtBat) was removed from the model.
Why was it removed?
Refer to the exhibit.
These graphs were created using the GLM procedure with the plots(only)=diagnostics option.
Which plot do you use to identify influential observations?
In order to perform honest assessment on a predictive model, what is an acceptable division between training, validation, and testing data?
Refer to the ROC curve:
As you move along the curve, what changes?
Assume a $10 cost for soliciting a non-responder and a $200 profit for soliciting a responder. The logistic regression model gives a probability score named P_R on a SAS data set called VALID. The VALID data set contains the responder variable Pinch, a 1/0 variable coded as 1 for responder. Customers will be solicited when their probability score is more than 0.05.
Which SAS program computes the profit for each customer in the data set VALID?