DP-203: Data Engineering on Microsoft Azure

Question 281

Suppose you work at a startup with limited funding. Why might you prefer Azure data storage over an on-premises solution?
To ensure you run on a specific brand of hardware, which lets you form a marketing partnership with that hardware vendor.
The Azure pay-as-you-go billing model lets you avoid buying expensive hardware.
To get exact control over the location of your data store.




Answer is "The Azure pay-as-you-go billing model lets you avoid buying expensive hardware."

There are no large, up-front capital expenditures (CapEx) with Azure. You pay monthly for only the services you use (OpEx).

Question 282

Which of the following situations would yield the most benefits from relocating an on-premises data store to Azure?
Unpredictable storage demand that increases and decreases multiple times throughout the year.
Long-term, steady growth in storage demand.
Consistent, unchanging storage demand.




Answer is "Unpredictable storage demand that increases and decreases multiple times throughout the year."

Azure data storage is flexible. You can quickly and easily add or remove capacity. You can increase performance to handle spikes in load or decrease performance to reduce costs. In all cases, you pay for only what you use.

Question 283

Suppose you have two video files stored as blobs. One of the videos is business-critical and requires a replication policy that creates multiple copies across geographically diverse datacenters. The other video is non-critical, and a local replication policy is sufficient.

True or false: To satisfy these constraints, the two blobs will need to be in separate storage accounts.
True
False




Answer is True

The replication policy is a characteristic of the storage account: every object stored in the account uses the same policy. If you need some data to use a geo-replication strategy and other data to use a local replication strategy, you will need two storage accounts.
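To make the account-level nature of replication concrete, here is a minimal Python sketch using the azure-mgmt-storage SDK; the subscription ID, resource group, and account names are hypothetical placeholders. Each account carries exactly one redundancy SKU, so the two videos end up in two accounts:

from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import StorageAccountUpdateParameters, Sku

client = StorageManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Redundancy (the SKU) is set per account, not per blob.
# Business-critical video: geo-redundant storage (GRS) account.
client.storage_accounts.update(
    "my-resource-group", "criticalvideos",   # hypothetical names
    StorageAccountUpdateParameters(sku=Sku(name="Standard_GRS")),
)

# Non-critical video: locally redundant storage (LRS) account.
client.storage_accounts.update(
    "my-resource-group", "noncriticalvideos",
    StorageAccountUpdateParameters(sku=Sku(name="Standard_LRS")),
)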

Question 284

Mike is creating an Azure Data Lake Storage Gen 2 account. He must configure the account to process analytical data workloads with the best performance. Which option should he configure when creating the storage account?
On the Basics tab, set the Performance option to Standard
On the Basics tab, set the Performance option to ON
On the Advanced tab, set the Hierarchical Namespace to enabled




Answer is On the Advanced tab, set the Hierarchical Namespace to enabled

To get the best performance for analytical workloads in Data Lake Storage Gen 2, set the Hierarchical Namespace option to enabled on the Advanced tab when creating the storage account. Setting the Performance option to Standard is incorrect: Performance determines the type of physical storage the account uses, and Standard means magnetic hard drives. Setting Performance to ON is incorrect because ON is not a valid value for that option.
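As a companion to the portal steps, here is a minimal sketch of creating such an account with the azure-mgmt-storage Python SDK; is_hns_enabled is the SDK counterpart of the portal's Hierarchical Namespace option, and all resource names below are hypothetical:

from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import StorageAccountCreateParameters, Sku

client = StorageManagementClient(DefaultAzureCredential(), "<subscription-id>")

# is_hns_enabled=True mirrors Advanced tab > Hierarchical Namespace: enabled,
# which makes a StorageV2 account a Data Lake Storage Gen 2 account.
poller = client.storage_accounts.begin_create(
    "my-resource-group",      # hypothetical resource group
    "mydatalakeaccount",      # hypothetical account name
    StorageAccountCreateParameters(
        sku=Sku(name="Standard_LRS"),
        kind="StorageV2",
        location="eastus",
        is_hns_enabled=True,
    ),
)
account = poller.result()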

Question 285

In which phase of big data processing is Azure Data Lake Storage located?
Ingestion
Store
Model & Serve




Answer is Store

Store is the phase in which Azure Data Lake Storage resides in a big data solution. Ingestion is the phase that typically includes technologies for ingesting data from a source, such as Azure Data Factory. Model & Serve is the phase that includes technologies for presenting data to end users, such as Power BI.

Question 286

Contoso has a Data Lake Storage (Gen 2) account. Which tool would be the most appropriate for a one-time upload of a single file, without installing or configuring any tools?
Azure Data Factory
Azure Storage Explorer
Azure Portal




Answer is Azure Portal

The Azure Portal requires no installation or configuration; it only requires signing in and clicking an upload button to perform the upload. Azure Data Factory must be provisioned and configured first, and Azure Storage Explorer must be installed before it can be used.
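By contrast, a programmatic upload (ruled out here precisely because it requires installing an SDK first) might look like this minimal Python sketch using the azure-storage-file-datalake package; the account URL, file system, and file paths are hypothetical:

from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Upload one local file to a Data Lake Storage Gen 2 account.
service = DataLakeServiceClient(
    account_url="https://mydatalakeaccount.dfs.core.windows.net",  # hypothetical
    credential=DefaultAzureCredential(),
)
file_client = service.get_file_system_client("raw").get_file_client("uploads/report.csv")

with open("report.csv", "rb") as data:
    file_client.upload_data(data, overwrite=True)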

Question 287

Contoso has a Data Lake Storage (Gen 2) account. Which tool would be the most appropriate for moving hundreds of files from Amazon S3 to Azure Data Lake Storage?
Azure Data Factory
Azure Data Catalog
Azure Portal




Answer is Azure Data Factory

Azure Data Factory can efficiently handle the movement of data from Amazon S3 to Azure Data Lake Storage.

Azure Data Catalog is the wrong tool for this job; it is used to document information about data stores.

The Azure Portal does not support copying files from Amazon S3; it only supports manual uploads, which is impractical for hundreds of files.

Question 288

You have an Azure Storage account that contains 100 GB of files. The files contain text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB.
You plan to copy the data from the storage account to an Azure SQL data warehouse.
You need to prepare the files to ensure that the data copies quickly.

Solution: You modify the files to ensure that each row is more than 1 MB.

Does this meet the goal?
Yes
No




Answer is No

Instead, modify the files to ensure that each row is less than 1 MB; PolyBase cannot load rows that contain more than 1,000,000 bytes of data.

References:
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/guidance-for-loading-data
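As an illustration of that guidance, here is a small pre-flight check in Python that flags oversized rows before the copy; the file name, the one-row-per-line layout, and the UTF-8 encoding are assumptions:

# Flag rows whose encoded size exceeds the 1 MB loading guidance.
MAX_ROW_BYTES = 1_000_000

def oversized_rows(path, encoding="utf-8"):
    """Yield (line_number, size_in_bytes) for rows larger than the limit."""
    with open(path, "r", encoding=encoding) as f:
        for lineno, line in enumerate(f, start=1):
            size = len(line.encode(encoding))
            if size > MAX_ROW_BYTES:
                yield lineno, size

for lineno, size in oversized_rows("descriptions.csv"):  # hypothetical file
    print(f"row {lineno}: {size} bytes exceeds the 1 MB guidance")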

Question 289

You have an Azure Storage account that contains 100 GB of files. The files contain text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB.
You plan to copy the data from the storage account to an Azure SQL data warehouse.
You need to prepare the files to ensure that the data copies quickly.

Solution: You copy the files to a table that has a columnstore index.

Does this meet the goal?
Yes
No




Answer is No

A columnstore index does not remove the bottleneck, because PolyBase cannot load rows that contain more than 1,000,000 bytes of data. Instead, modify the files to ensure that each row is less than 1 MB.

References:
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/guidance-for-loading-data

Question 290

Each day, the company plans to store hundreds of files in Azure Blob Storage and Azure Data Lake Storage. The company uses the Parquet format.
You must develop a pipeline that meets the following requirements:
You need to select the appropriate data technology to implement the pipeline.

Which data technology should you implement?
Azure SQL Data Warehouse
HDInsight Apache Storm cluster
Azure Stream Analytics
HDInsight Apache Hadoop cluster using MapReduce
HDInsight Spark cluster




Answer is HDInsight Apache Storm cluster

Storm runs topologies instead of the Apache Hadoop MapReduce jobs that you might be familiar with. Storm topologies are composed of multiple components that are arranged in a directed acyclic graph (DAG). Data flows between the components in the graph. Each component consumes one or more data streams, and can optionally emit one or more streams.

Python can be used to develop Storm components.
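The source only notes that Python is supported; as one hedged example, a minimal Storm bolt written with the third-party streamparse library might look like this:

from streamparse import Bolt

class WordCountBolt(Bolt):
    """A component that consumes a stream of words and emits running counts."""

    def initialize(self, storm_conf, context):
        self.counts = {}

    def process(self, tup):
        word = tup.values[0]
        self.counts[word] = self.counts.get(word, 0) + 1
        # Components consume one or more streams and can emit new ones.
        self.emit([word, self.counts[word]])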

References:
https://docs.microsoft.com/en-us/azure/hdinsight/storm/apache-storm-overview
