Professional Data Engineer on Google Cloud Platform
278 questions in total
Question 231
Each analytics team in your organization is running BigQuery jobs in their own projects. You want to enable each team to monitor slot usage within their projects.
What should you do?
Create a Stackdriver Monitoring dashboard based on the BigQuery metric query/scanned_bytes
Create a Stackdriver Monitoring dashboard based on the BigQuery metric slots/allocated_for_project
Create a log export for each project, capture the BigQuery job execution logs, create a custom metric based on the totalSlotMs, and create a Stackdriver Monitoring dashboard based on the custom metric
Create an aggregated log export at the organization level, capture the BigQuery job execution logs, create a custom metric based on the totalSlotMs, and create a Stackdriver Monitoring dashboard based on the custom metric
Answer is Create a Stackdriver Monitoring dashboard based on the BigQuery metric slots/allocated_for_project
The slots/allocated_for_project metric reports the number of slots allocated to the project at any point in time, which can also be thought of as the number of slots being used by that project.
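For reference, such a dashboard can be created per project from a JSON definition with the gcloud CLI. This is only a minimal sketch; the dashboard name, file name, and chart layout below are illustrative assumptions, not part of the question.

  slots-dashboard.json (minimal dashboard definition charting the per-project slot metric):
  {
    "displayName": "BigQuery slot usage",
    "gridLayout": { "widgets": [ {
      "title": "Slots allocated to this project",
      "xyChart": { "dataSets": [ { "timeSeriesQuery": { "timeSeriesFilter": {
        "filter": "metric.type=\"bigquery.googleapis.com/slots/allocated_for_project\"",
        "aggregation": { "perSeriesAligner": "ALIGN_MEAN" }
      } } } ] }
    } ] }
  }

  # Create the dashboard in the team's own project
  gcloud monitoring dashboards create --config-from-file=slots-dashboard.json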
Question 232
You are migrating your data warehouse to BigQuery. You have migrated all of your data into tables in a dataset. Multiple users from your organization will be using the data. They should only see certain tables based on their team membership.
How should you set user permissions?
Assign the users/groups data viewer access at the table level for each table
Create SQL views for each team in the same dataset in which the data resides, and assign the users/groups data viewer access to the SQL views
Create authorized views for each team in the same dataset in which the data resides, and assign the users/groups data viewer access to the SQL views
Create authorized views for each team in datasets created for each team. Assign the authorized views data viewer access to the dataset in which the data resides. Assign the users/groups data viewer access to the datasets in which the authorized views reside
Answer is Assign the users/groups data viewer access at the table level for each table
BigQuery supports table-level access control, so a user or group can be granted data viewer access on a single table and query only that table; the other tables in the same dataset remain inaccessible to them.
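As a sketch of what table-level access looks like in practice, the bq CLI can attach an IAM binding directly to a table. The project, dataset, table, and group names below are made up, and the exact identifier format may vary slightly between bq versions.

  # Grant a team group read access to one table only; other tables in the
  # dataset are not affected by this binding.
  bq add-iam-policy-binding \
    --member='group:team-analytics@example.com' \
    --role='roles/bigquery.dataViewer' \
    --table=true \
    'my-project:sales_dataset.orders'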
Question 233
You plan to deploy Cloud SQL using MySQL. You need to ensure high availability in the event of a zone failure.
What should you do?
Create a Cloud SQL instance in one zone, and create a failover replica in another zone within the same region.
Create a Cloud SQL instance in one zone, and create a read replica in another zone within the same region.
Create a Cloud SQL instance in one zone, and configure an external read replica in a zone in a different region.
Create a Cloud SQL instance in a region, and configure automatic backup to a Cloud Storage bucket in the same region.
Answer is Create a Cloud SQL instance in one zone, and create a failover replica in another zone within the same region.
The HA (High Availability) configuration, sometimes called a cluster, provides data redundancy. A Cloud SQL instance configured for HA is also called a regional instance and is located in a primary and secondary zone within the configured region. Within a regional instance, the configuration is made up of a primary instance and a standby instance.
Reference:
https://cloud.google.com/sql/docs/mysql/high-availability
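On current Cloud SQL, the standby in the second zone is provisioned by creating the instance as a regional (HA) instance rather than by creating a replica manually; the "failover replica" wording reflects the older MySQL HA setup. A minimal sketch, where the instance name, tier, and region are assumptions:

  # Create a MySQL instance with a primary in one zone and an automatic
  # standby in another zone of the same region (regional / HA configuration).
  gcloud sql instances create my-mysql-instance \
    --database-version=MYSQL_8_0 \
    --tier=db-n1-standard-2 \
    --region=us-central1 \
    --availability-type=REGIONAL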
Question 234
You want to archive data in Cloud Storage. Because some data is very sensitive, you want to use the "Trust No One" (TNO) approach to encrypt your data to prevent the cloud provider staff from decrypting your data.
What should you do?
Use gcloud kms keys create to create a symmetric key. Then use gcloud kms encrypt to encrypt each archival file with the key and unique additional authenticated data (AAD). Use gsutil cp to upload each encrypted file to the Cloud Storage bucket, and keep the AAD outside of Google Cloud.
Use gcloud kms keys create to create a symmetric key. Then use gcloud kms encrypt to encrypt each archival file with the key. Use gsutil cp to upload each encrypted file to the Cloud Storage bucket. Manually destroy the key previously used for encryption, and rotate the key once.
Specify customer-supplied encryption key (CSEK) in the .boto configuration file. Use gsutil cp to upload each archival file to the Cloud Storage bucket. Save the CSEK in Cloud Memorystore as permanent storage of the secret.
Specify customer-supplied encryption key (CSEK) in the .boto configuration file. Use gsutil cp to upload each archival file to the Cloud Storage bucket. Save the CSEK in a different project that only the security team can access.
Answer is Use gcloud kms keys create to create a symmetric key. Then use gcloud kms encrypt to encrypt each archival file with the key and unique additional authenticated data (AAD). Use gsutil cp to upload each encrypted file to the Cloud Storage bucket, and keep the AAD outside of Google Cloud.
The AAD is required along with the key to decrypt the data, so keeping it outside of Google Cloud means the cloud provider's staff cannot decrypt the archives on their own.
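A minimal sketch of this flow with the gcloud and gsutil CLIs; the key ring, key, bucket, and file names are assumptions, and the AAD file would be generated per archive and stored off-cloud.

  # One-time: create a symmetric KMS key
  gcloud kms keyrings create archive-keyring --location=global
  gcloud kms keys create archive-key \
    --keyring=archive-keyring --location=global --purpose=encryption

  # Per file: encrypt with the key plus unique AAD, then upload the ciphertext
  gcloud kms encrypt \
    --key=archive-key --keyring=archive-keyring --location=global \
    --plaintext-file=report.csv \
    --ciphertext-file=report.csv.enc \
    --additional-authenticated-data-file=report.aad
  gsutil cp report.csv.enc gs://my-archive-bucket/

  # Keep report.aad outside of Google Cloud; it is required again at decrypt time.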
Question 235
Which of the following is not possible using primitive roles?
Give a user viewer access to BigQuery and owner access to Google Compute Engine instances.
Give UserA owner access and UserB editor access for all datasets in a project.
Give a user access to view all datasets in a project, but not run queries on them.
Give GroupA owner access and GroupB editor access for all datasets in a project.
Answer is Give a user access to view all datasets in a project, but not run queries on them.
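Primitive (basic) roles are granted at the project level and bundle permissions together, so a fine-grained combination such as "can view data but cannot run queries" cannot be expressed with them. A hedged sketch of how primitive roles are granted, with the project ID and user addresses as placeholders:

  # Owner for UserA, Editor for UserB -- applies to every dataset in the project
  gcloud projects add-iam-policy-binding my-project \
    --member='user:usera@example.com' --role='roles/owner'
  gcloud projects add-iam-policy-binding my-project \
    --member='user:userb@example.com' --role='roles/editor'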
Question 236
Which of the following statements about Legacy SQL and Standard SQL is not true?
Standard SQL is the preferred query language for BigQuery.
If you write a query in Legacy SQL, it might generate an error if you try to run it with Standard SQL.
One difference between the two query languages is how you specify fully-qualified table names (i.e. table names that include their associated project name).
You need to set a query language for each dataset and the default is Standard SQL.
Answer is You need to set a query language for each dataset and the default is Standard SQL.
The query language is not set at the dataset level; it is chosen per query, and the default depends on the interface: the classic BigQuery web UI defaulted to Legacy SQL, whereas the Cloud Console defaults to Standard SQL.
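For example, with the bq CLI the dialect is selected per query with a flag, and the fully-qualified table syntax differs between the two dialects. The project, dataset, and table names below are placeholders.

  # Standard SQL: backtick-quoted project.dataset.table
  bq query --use_legacy_sql=false \
    'SELECT COUNT(*) FROM `my-project.my_dataset.my_table`'

  # Legacy SQL: square-bracket project:dataset.table
  bq query --use_legacy_sql=true \
    'SELECT COUNT(*) FROM [my-project:my_dataset.my_table]'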
Question 237
Which methods can be used to reduce the number of rows processed by BigQuery?
Splitting tables into multiple tables; putting data in partitions
Splitting tables into multiple tables; putting data in partitions; using the LIMIT clause
Putting data in partitions; using the LIMIT clause
Splitting tables into multiple tables; using the LIMIT clause
Answer is Splitting tables into multiple tables; putting data in partitions
The other options include the LIMIT clause, but LIMIT does not reduce the number of rows processed; BigQuery still scans the full table (or the selected partitions) before applying it.
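As a sketch (the dataset, table, and column names are made up), a partitioned table lets BigQuery scan only the partitions selected by the filter, whereas LIMIT is applied after the scan:

  # Create a table partitioned by a DATE column
  bq mk --table --time_partitioning_field=event_date \
    my_dataset.events event_date:DATE,user_id:STRING,amount:FLOAT

  # Only the 2024-01-01 partition is read, so fewer rows are processed
  bq query --use_legacy_sql=false \
    'SELECT COUNT(*) FROM `my_dataset.events` WHERE event_date = "2024-01-01"'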
Question 238
If a dataset contains rows with individual people and columns for year of birth, country, and income, how many of the columns are continuous and how many are categorical?
1 continuous and 2 categorical
3 categorical
3 continuous
2 continuous and 1 categorical
Answer is 2 continuous and 1 categorical
Year of birth and income can each take any numeric value, so they are continuous; country takes one of a finite set of values, so it is categorical.
Question 239
What is the recommended action to do in order to switch between SSD and HDD storage for your Google Cloud Bigtable instance?
create a third instance and sync the data from the two storage types via batch jobs
export the data from the existing instance and import the data into a new instance
run parallel instances where one is HDD and the other is SDD
the selection is final and you must resume using the same storage type
Answer is export the data from the existing instance and import the data into a new instance
When you create a Cloud Bigtable instance and cluster, your choice of SSD or HDD storage for the cluster is permanent. You cannot use the Google Cloud Platform Console to change the type of storage that is used for the cluster. If you need to convert an existing HDD cluster to SSD, or vice-versa, you can export the data from the existing instance and import the data into a new instance.
Alternatively, you can write a Cloud Dataflow or Hadoop MapReduce job that copies the data from one instance to another.
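One way to run the export/import is with the public Dataflow templates; this is only a hedged sketch (the template names and parameters reflect the Bigtable/Avro templates as I recall them, and the project, instance, table, and bucket names are placeholders):

  # Export the table from the HDD instance to Avro files in Cloud Storage
  gcloud dataflow jobs run export-bigtable \
    --gcs-location=gs://dataflow-templates/latest/Cloud_Bigtable_to_GCS_Avro \
    --region=us-central1 \
    --parameters='bigtableProjectId=my-project,bigtableInstanceId=hdd-instance,bigtableTableId=my-table,outputDirectory=gs://my-bucket/bigtable-export,filenamePrefix=my-table-'

  # Import the Avro files into the new SSD instance (the table must already exist there)
  gcloud dataflow jobs run import-bigtable \
    --gcs-location=gs://dataflow-templates/latest/GCS_Avro_to_Cloud_Bigtable \
    --region=us-central1 \
    --parameters='bigtableProjectId=my-project,bigtableInstanceId=ssd-instance,bigtableTableId=my-table,inputFilePattern=gs://my-bucket/bigtable-export/my-table-*'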
Question 240
What is the HBase shell for Cloud Bigtable?
The HBase shell is a GUI based interface that performs administrative tasks, such as creating and deleting tables.
The HBase shell is a command-line tool that performs administrative tasks, such as creating and deleting tables.
The HBase shell is a hypervisor based shell that performs administrative tasks, such as creating and deleting new virtualized instances.
The HBase shell is a command-line tool that performs only user account management functions to grant access to Cloud Bigtable instances.
Answer is The HBase shell is a command-line tool that performs administrative tasks, such as creating and deleting tables.
The Cloud Bigtable HBase client for Java makes it possible to use the HBase shell to connect to Cloud Bigtable.