DP-600: Implementing Analytics Solutions Using Microsoft Fabric
82 questions in total
Question 51
You are building a solution by using a Fabric notebook.
You have a Spark DataFrame assigned to a variable named df. The DataFrame returns four columns.
You need to change the data type of a string column named Age to integer. The solution must return a DataFrame that includes all the columns.
How should you complete the code?
Check the answer area
Answer is df.withColumn("Age", col("Age").cast("int")).show()
Note that col must be imported first (from pyspark.sql.functions import col). Because withColumn replaces the existing Age column in place, the returned DataFrame still includes all four columns.
Question 52
You have a Fabric warehouse that contains a table named Staging.Sales. Staging.Sales contains the following columns.
You need to write a T-SQL query that will return data for the year 2023 that displays ProductID and ProductName and has a summarized Amount that is higher than 10,000.
Which query should you use?
SELECT ProductID, ProductName, SUM(Amount) AS TotalAmount FROM Staging.Sales WHERE DATEPART(YEAR,SaleDate) = '2023' GROUP BY ProductID, ProductName HAVING SUM(Amount) > 10000
SELECT ProductID, ProductName, SUM(Amount) AS TotalAmount FROM Staging.Sales GROUP BY ProductID, ProductName HAVING DATEPART(YEAR,SaleDate) = '2023' AND SUM(Amount) > 10000
SELECT ProductID, ProductName, SUM(Amount) AS TotalAmount FROM Staging.Sales WHERE DATEPART(YEAR,SaleDate) = '2023' AND SUM(Amount) > 10000
SELECT ProductID, ProductName, SUM(Amount) AS TotalAmount FROM Staging.Sales WHERE DATEPART(YEAR,SaleDate) = '2023' GROUP BY ProductID, ProductName HAVING TotalAmount > 10000
Answer is
SELECT ProductID, ProductName, SUM(Amount) AS TotalAmount
FROM Staging.Sales
WHERE DATEPART(YEAR,SaleDate) = '2023'
GROUP BY ProductID, ProductName
HAVING SUM(Amount) > 10000
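The query shape can be sanity-checked locally with sqlite3 (the sample rows below are invented, and strftime('%Y', SaleDate) stands in for DATEPART, which sqlite lacks): WHERE filters rows before grouping, while HAVING filters the aggregated groups.

```python
import sqlite3

# Minimal stand-in for Staging.Sales with invented sample data.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE Sales (
        ProductID INT, ProductName TEXT, Amount REAL, SaleDate TEXT)
""")
con.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?, ?)",
    [
        (1, "Bike",   8000, "2023-02-01"),
        (1, "Bike",   7000, "2023-09-10"),  # Bike totals 15,000 in 2023
        (2, "Helmet", 4000, "2023-05-05"),  # Helmet stays under 10,000
        (1, "Bike",  20000, "2022-12-31"),  # excluded by the year filter
    ],
)

rows = con.execute("""
    SELECT ProductID, ProductName, SUM(Amount) AS TotalAmount
    FROM Sales
    WHERE strftime('%Y', SaleDate) = '2023'   -- row filter before grouping
    GROUP BY ProductID, ProductName
    HAVING SUM(Amount) > 10000                -- group filter after aggregation
""").fetchall()

print(rows)
```

Only the Bike group survives: the 2022 row is removed by the WHERE clause before aggregation, and the Helmet group is removed by HAVING afterwards.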
Question 53
You have a data warehouse that contains a table named Stage.Customers. Stage.Customers contains all the customer record updates from a customer relationship management (CRM) system. There can be multiple updates per customer.
You need to write a T-SQL query that will return the customer ID, name, postal code, and the last updated time of the most recent row for each customer ID.
How should you complete the code?
Check the answer section
WITH CUSTOMERBASE AS (
SELECT CustomerID, CustomerName, PostalCode, LastUpdated,
ROW_NUMBER() OVER(PARTITION BY CustomerID ORDER BY LastUpdated DESC) as X
FROM Stage.Customers
)
SELECT CustomerID, CustomerName, PostalCode, LastUpdated
FROM CUSTOMERBASE
WHERE X = 1
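The ROW_NUMBER() dedup pattern can be tried locally with sqlite3 (3.25+ supports window functions; the sample rows below are invented):

```python
import sqlite3

# Toy stand-in for Stage.Customers with invented sample data.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE Customers (
        CustomerID INT, CustomerName TEXT, PostalCode TEXT, LastUpdated TEXT)
""")
con.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?, ?)",
    [
        (1, "Ana", "1000", "2023-01-01"),
        (1, "Ana", "2000", "2023-06-01"),  # most recent row for customer 1
        (2, "Ben", "3000", "2023-03-15"),
    ],
)

rows = con.execute("""
    WITH CUSTOMERBASE AS (
        SELECT CustomerID, CustomerName, PostalCode, LastUpdated,
               ROW_NUMBER() OVER(
                   PARTITION BY CustomerID
                   ORDER BY LastUpdated DESC) AS X
        FROM Customers
    )
    SELECT CustomerID, CustomerName, PostalCode, LastUpdated
    FROM CUSTOMERBASE
    WHERE X = 1
    ORDER BY CustomerID
""").fetchall()

print(rows)  # one row per customer: the latest update for each
```

Partitioning by CustomerID and ordering by LastUpdated DESC makes X = 1 the most recent row within each customer's partition.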
Question 54
You have a Fabric tenant that contains a machine learning model registered in a Fabric workspace.
You need to use the model to generate predictions by using the PREDICT function in a Fabric notebook.
Which two languages can you use to perform model scoring?
T-SQL
DAX
Spark SQL
PySpark
Answers are:
C. Spark SQL
D. PySpark
Notebooks only accept the Spark languages (PySpark, Spark/Scala, Spark SQL, and SparkR), so of the listed options only Spark SQL and PySpark can run the PREDICT function; T-SQL and DAX are not available in a Fabric notebook.
You are analyzing the data in a Fabric notebook.
You have a Spark DataFrame assigned to a variable named df.
You need to use the Chart view in the notebook to explore the data manually.
Which function should you run to make the data available in the Chart view?
displayHTML
show
write
display
Answer is display
display renders the DataFrame interactively and includes the Chart view, where you can explore the data and inspect summary statistics.
displayHTML is another way to render output, but it displays custom HTML and is separate from the Chart view, so it does not meet the requirements of this question.
You are analyzing customer purchases in a Fabric notebook by using PySpark.
You have the following DataFrames:
transactions: Contains five columns named transaction_id, customer_id, product_id, amount, and date and has 10 million rows, with each row representing a transaction.
customers: Contains customer details in 1,000 rows and three columns named customer_id, name, and country.
You need to join the DataFrames on the customer_id column. The solution must minimize data shuffling.
You write the following code:
from pyspark.sql import functions as F
results =
Which code should you run to populate the results DataFrame?
Answer is transactions.join(F.broadcast(customers), transactions.customer_id == customers.customer_id)
In Apache Spark, broadcasting refers to an optimization technique for join operations. When you join two DataFrames or RDDs and one of them is significantly smaller than the other, Spark can "broadcast" the smaller table to all nodes in the cluster. This approach avoids the need for network shuffles for each row of the larger table, significantly reducing the execution time of the join operation.
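In plain Python, the per-partition work of a broadcast hash join looks roughly like this (toy data, invented names): the small side becomes an in-memory dict shipped to every worker, and each row of the large side is joined by a local lookup, with no shuffle.

```python
# Invented sample data mirroring the two DataFrames in the question.
customers = [
    (1, "Ana", "PT"),
    (2, "Ben", "FR"),
]
transactions = [
    ("t1", 1, "p9", 25.0, "2024-01-02"),
    ("t2", 2, "p3", 10.0, "2024-01-03"),
    ("t3", 1, "p3", 5.0,  "2024-01-04"),
]

# "Broadcast" step: hash the small table by the join key once;
# Spark ships this structure to every executor.
by_customer = {cid: (name, country) for cid, name, country in customers}

# Probe step: each transaction row joins via a local dict lookup,
# so no rows of the large table move across the network.
results = [
    (tid, cid, pid, amount, date, *by_customer[cid])
    for tid, cid, pid, amount, date in transactions
    if cid in by_customer
]
print(len(results))
```

This is why broadcasting only pays off when one side is small (here, 1,000 customer rows versus 10 million transactions): the whole small table must fit in memory on every worker.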
Question 57
You have a Microsoft Power BI report and a semantic model that uses Direct Lake mode.
From Power BI Desktop, you open Performance analyzer as shown in the following exhibit.
Use the drop-down menus to select the answer choice that completes each statement based on the information presented in the graphic.
Check the answer section
Answer is Automatic & DirectQuery
The picture comes from https://learn.microsoft.com/en-us/power-bi/enterprise/directlake-analyze-qp
In this article you can see the model contains Table1 and View1, and Performance analyzer shows:
- The first card is linked to Table1, so Direct Lake is used.
- The second card is linked to View1, so it falls back to DirectQuery (views cannot be served in Direct Lake mode).
Because the model can use both Direct Lake and DirectQuery, you can conclude that the fallback behavior is Automatic.
You have a Microsoft Power BI semantic model.
You plan to implement calculation groups.
You need to create a calculation item that will change the context from the selected date to month-to-date (MTD).
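A typical calculation item for this pattern (assuming a date table named 'Date' with a [Date] column; both names are placeholders) is:

```dax
CALCULATE ( SELECTEDMEASURE (), DATESMTD ( 'Date'[Date] ) )
```

SELECTEDMEASURE() re-evaluates whatever measure is currently in context, and DATESMTD shifts the filter context from the selected date to month-to-date.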
You have a Microsoft Power BI report named Report1 that uses a Fabric semantic model.
Users discover that Report1 renders slowly.
You open Performance analyzer and identify that a visual named Orders By Date is the slowest to render. The duration breakdown for Orders By Date is shown in the following table.
What will provide the greatest reduction in the rendering duration of Report1?
Enable automatic page refresh.
Optimize the DAX query of Orders By Date by using DAX Studio.
Change the visual type of Orders By Date.
Reduce the number of visuals in Report1.
Answer is Reduce the number of visuals in Report1.
While optimizing the DAX query could slightly improve performance, the DAX query duration (27 ms) is already very low. The "Other" duration (1,047 ms) dwarfs the DAX query (27 ms) and the visual display (39 ms) durations combined. In Performance analyzer, "Other" largely represents time a visual spends waiting for other visuals to finish, plus background processing. Reducing the number of visuals in Report1 shortens that queue and therefore gives the greatest reduction in rendering duration.
You have a Microsoft Fabric tenant that contains a dataflow.
You are exploring a new semantic model.
From Power Query, you need to view column information as shown in the following exhibit.
Which three Data view options should you select?
Show column value distribution
Enable details pane
Enable column profile
Show column quality details
Show column profile in details pane
Answers are:
A. Show column value distribution
C. Enable column profile
D. Show column quality details
Show column value distribution: This option provides a visual representation of the distribution of values in each column, which is visible in the exhibit.
Enable column profile: This option displays statistics and other detailed information about each column, including value distribution, which aligns with the data shown in the exhibit.
Show column quality details: This option shows the quality of the data in each column, indicating valid, error, and empty values, as displayed in the exhibit.