DP-100: Designing and Implementing a Data Science Solution on Azure

Question 31

You are using automated machine learning, and you want to determine the influence of features on the predictions made by the best model produced by the automated machine learning experiment.
What must you do when configuring the automated machine learning experiment?
Whitelist only tree-based algorithms.
Enable featurization.
Enable model explainability.

Answer is Enable model explainability. To generate model explanations when using automated machine learning, you must enable model explainability.
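As a sketch, enabling this in the Azure ML SDK looks like the following (the dataset, label column, and compute target names are assumptions; the relevant setting is model_explainability=True):

```python
from azureml.train.automl import AutoMLConfig

automl_config = AutoMLConfig(
    task="classification",
    training_data=train_ds,        # assumed registered TabularDataset
    label_column_name="label",     # assumed label column name
    primary_metric="AUC_weighted",
    compute_target="cpu-cluster",  # assumed compute target name
    model_explainability=True,     # generate explanations for the best model
)
```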

Question 32

You want to create an explainer that applies the most appropriate SHAP model explanation algorithm based on the type of model.
What kind of explainer should you create?
Mimic
Tabular
Permutation Feature Importance

Answer is Tabular. A TabularExplainer applies the most appropriate SHAP explanation algorithm for the type of model.
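A minimal sketch of creating a TabularExplainer with the azureml-interpret / interpret-community package (the model, training data, feature names, and class labels are placeholders for your own):

```python
from interpret.ext.blackbox import TabularExplainer

tab_explainer = TabularExplainer(
    model,                   # assumed trained scikit-learn style model
    X_train,                 # assumed training features
    features=feature_names,  # assumed list of feature names
    classes=class_labels,    # assumed class labels (classification task)
)

# global feature importance across a test set
explanation = tab_explainer.explain_global(X_test)
print(explanation.get_feature_importance_dict())
```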

Question 33

You want to include model explanations in the logged details of your training experiment.
What must you do in your training script?
Use the Run.log_table method to log feature importance for each feature.
Use the ExplanationClient.upload_model_explanation method to upload the explanation created by an Explainer.
Save the explanation created by an Explainer in the ./outputs folder.

Answer is Use the ExplanationClient.upload_model_explanation method to upload the explanation created by an Explainer.

To include an explanation in the run details, the training script must use the ExplanationClient.upload_model_explanation method to upload the explanation created by an Explainer.
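In the training script this looks roughly as follows (here `explanation` is assumed to be the object returned by an Explainer's explain_global or explain_local call earlier in the script):

```python
from azureml.core.run import Run
from azureml.interpret import ExplanationClient

run = Run.get_context()
client = ExplanationClient.from_run(run)
# uploads the explanation so it appears in the run's Explanations tab
client.upload_model_explanation(explanation, comment="Global explanation")
```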

Question 34

You have deployed a model as a real-time inferencing service in an Azure Kubernetes Service (AKS) cluster.
What must you do to capture and analyze telemetry for this service?
Enable application insights.
Implement inference-time model interpretability.
Move the AKS cluster to the same region as the Azure Machine Learning workspace.

Answer is Enable application insights. To enable telemetry analysis through Application Insights, you must enable Application Insights for the service.
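A sketch of enabling this for an existing deployed service (the workspace config and service name are assumptions):

```python
from azureml.core import Workspace
from azureml.core.webservice import AksWebservice

ws = Workspace.from_config()
service = AksWebservice(workspace=ws, name="my-aks-service")  # assumed name
service.update(enable_app_insights=True)  # turn on telemetry collection
```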

Question 35

You want to include custom information in the telemetry for your inferencing service, and analyze it using Application Insights.
What must you do in your service's entry script?
Use the Run.log method to log the custom metrics.
Save the custom metrics in the ./outputs folder.
Use a print statement to write the metrics in the STDOUT log.

Answer is Use a print statement to write the metrics in the STDOUT log. To include custom metrics, add print statements to the scoring script so that the custom information is written to the STDOUT log.
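A minimal sketch of such an entry script. In a deployed service, init() would load the registered model (for example with joblib and Model.get_model_path); a stub predictor stands in here so the flow is self-contained:

```python
import json

def init():
    global model
    model = lambda rows: [sum(row) for row in rows]  # stub model (assumption)

def run(raw_data):
    data = json.loads(raw_data)["data"]
    predictions = model(data)
    # print output is captured in the service's STDOUT log, and with
    # Application Insights enabled it surfaces as trace telemetry
    print(f"request_rows: {len(data)}")
    print(f"predictions: {predictions}")
    return json.dumps({"result": predictions})
```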

Question 36

You have trained a model using a dataset containing data that was collected last year. As this year progresses, you will collect new data.
You want to track any changing data trends that might affect the performance of the model.
What should you do?
Collect the new data in a new version of the existing training dataset, and profile both datasets.
Collect the new data in a separate dataset and create a Data Drift Monitor with the training dataset as a baseline and the new dataset as a target.
Replace the training dataset with a new dataset that contains both the original training data and the new data.

Answer is Collect the new data in a separate dataset and create a Data Drift Monitor with the training dataset as a baseline and the new dataset as a target.

To track changing data trends, create a data drift monitor that uses the training data as a baseline and the new data as a target.
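With the azureml-datadrift package this can be sketched as follows (the monitor name, dataset variables, and compute target are assumptions):

```python
from azureml.datadrift import DataDriftDetector

# baseline_ds: registered training dataset; target_ds: dataset collecting new data
monitor = DataDriftDetector.create_from_datasets(
    ws, "model-data-drift",
    baseline_ds, target_ds,
    compute_target="cpu-cluster",  # assumed compute target name
    frequency="Week",              # how often to compare target to baseline
)
```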

Question 37

You are creating a data drift monitor.
You want to automatically notify the data science team if a significant change in data distribution is detected.
What must you do?
Define an AlertConfiguration and set a drift_threshold value.
Set the latency of the data drift monitor to allow time for data scientists to review the new data.
Register the training dataset with the model, including the email address of the data science team as a tag.

Answer is Define an AlertConfiguration and set a drift_threshold value. To notify operators about data drift, create an AlertConfiguration with the email address to be notified, and a drift threshold that defines the level of change that triggers a notification.
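A sketch of wiring the alert into the monitor (the email address, names, and threshold value are placeholders):

```python
from azureml.datadrift import AlertConfiguration, DataDriftDetector

alert = AlertConfiguration(email_addresses=["datascience-team@example.com"])
monitor = DataDriftDetector.create_from_datasets(
    ws, "model-data-drift",
    baseline_ds, target_ds,
    frequency="Week",
    drift_threshold=0.3,  # level of change that triggers a notification
    alert_config=alert,
)
```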

Question 38

You plan to build a team data science environment. Data for training models in machine learning pipelines will be over 20 GB in size.
You have the following requirements:
Personal devices must support updating machine learning pipelines when connected to a network.
You need to select a data science environment.
Which environment should you use?
Azure Machine Learning Service
Azure Machine Learning Studio
Azure Databricks
Azure Kubernetes Service (AKS)

Answer is Azure Machine Learning Service.

Azure Machine Learning service supports building and updating machine learning pipelines from personal devices connected to a network, and it can handle training data larger than 20 GB. The Data Science Virtual Machine (DSVM) is a customized VM image on Microsoft's Azure cloud built specifically for doing data science; it integrates with Azure Machine Learning and supports frameworks such as Caffe2 and Chainer.

References:
https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview

Question 39

You are solving a classification task.
The dataset is imbalanced.
You need to select an Azure Machine Learning Studio module to improve the classification accuracy.
Which module should you use?
Permutation Feature Importance
Filter Based Feature Selection
Fisher Linear Discriminant Analysis
Synthetic Minority Oversampling Technique (SMOTE)

Answer is Synthetic Minority Oversampling Technique (SMOTE)

Use the SMOTE module in Azure Machine Learning Studio (classic) to increase the number of under-represented cases in a dataset used for machine learning. SMOTE is a better way of increasing the number of rare cases than simply duplicating existing cases.

You connect the SMOTE module to a dataset that is imbalanced. There are many reasons why a dataset might be imbalanced: the category you are targeting might be very rare in the population, or the data might simply be difficult to collect. Typically, you use SMOTE when the class you want to analyze is under-represented.

Reference:
https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/smote
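The idea behind SMOTE can be illustrated with a small, self-contained sketch: each synthetic sample is an interpolation between a minority-class sample and one of its nearest minority-class neighbours. This is a toy version of the idea, not the Studio module's actual implementation:

```python
import random

def smote(minority, n_synthetic, k=2, seed=0):
    """Toy SMOTE sketch: synthesise minority samples by interpolating
    between a sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest neighbours of base among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neigh = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neigh)))
    return synthetic
```

Because each synthetic point lies on the line segment between two real minority samples, the oversampled class stays inside the region the minority data already occupies.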

Question 40

You are developing a data science workspace that uses an Azure Machine Learning service.
You need to select a compute target to deploy the workspace.
What should you use?
Azure Data Lake Analytics
Azure Databricks
Azure Container Service
Apache Spark for HDInsight

Answer is Azure Container Service

Azure Container Instances (ACI) can be used as a compute target for testing or development. Use ACI for low-scale, CPU-based workloads that require less than 48 GB of RAM.

Reference:
https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where
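A sketch of such a deployment with the SDK (the entry script, environment, model name, and service name are all assumptions):

```python
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

inference_config = InferenceConfig(entry_script="score.py", environment=env)
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

service = Model.deploy(
    ws, "my-aci-service",          # assumed service name
    [Model(ws, "my-model")],       # assumed registered model name
    inference_config,
    deployment_config,
)
service.wait_for_deployment(show_output=True)
```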
