What is a common oversight when granting Airflow GCP service account permissions to BigQuery, or what specific permission might be missing for query execution?
A user reports their new Airflow DAG task fails with Access Denied: Table rddt-data-prod:my_dataset.my_table: User does not have permission to query table... but they swear the service account airflow-worker@my-project.iam.gserviceaccount.com has roles/bigquery.dataViewer on the dataset.
What is roles/bigquery.jobUser
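The reason this role is the missing piece: reading table data and running a query job are separate permissions. Here is a minimal sketch, assuming a heavily abbreviated role-to-permission map (the real roles carry more permissions than listed) and a hypothetical helper named missing_query_permissions:

```python
# Abbreviated, illustrative subset of each role's permissions (not the full role definitions).
ROLE_PERMISSIONS = {
    "roles/bigquery.dataViewer": {"bigquery.tables.get", "bigquery.tables.getData"},
    "roles/bigquery.jobUser": {"bigquery.jobs.create"},
}

# Running a query needs both: read access to the table and the right to create a job.
QUERY_REQUIREMENTS = {"bigquery.tables.getData", "bigquery.jobs.create"}


def missing_query_permissions(granted_roles):
    """Return the query permissions not covered by the granted roles (hypothetical helper)."""
    granted = set()
    for role in granted_roles:
        granted |= ROLE_PERMISSIONS.get(role, set())
    return QUERY_REQUIREMENTS - granted


print(missing_query_permissions(["roles/bigquery.dataViewer"]))
```

With only roles/bigquery.dataViewer, the sketch reports bigquery.jobs.create as missing. That permission is granted on the project where the query job runs, not on the dataset.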
What is the status of the child resources on:
kind: AirflowWorkloadClaim
name: stg-dev-1-heartbeat
cluster: orch-staging-1
DWCTL Tree AirflowWorkload stg-dev-1-heartbeat -n achilles-dw-controllers
There is a data mismatch between upstream Postgres DB and what’s in BigQuery - why?
Debezium casts integers differently unless explicitly configured otherwise. Update the view to do the proper conversion.
The team who owns BigQuery datasets under the project rddt-dp-data1-prod want to onboard to the DW Platform to manage their Dataset IAM.
Are there any non-compatible ACLs (i.e., authorized datasets) that should be looked at before migrating the project?
Hint: use DWCTL
dwctl inventory bigquery datasets datasets-acl -p rddt-dp-data1-prod
Processing project: rddt-dp-data1-prod
Non compatible ACL found: View Entity
Non compatible ACL found: View Entity
PROJECT            | DATASET                | LEGACY ACL
rddt-dp-data1-prod | decomposed_event_views | YES
rddt-dp-data1-prod | facts_pvt              | YES
Summary:
Total Projects: 1
Total Datasets: 17
Projects with Legacy ACLs: 1
Datasets with Legacy ACLs: 2
Legacy ACL Project Ratio: 100.00%
Legacy ACL Dataset Ratio: 11.76%
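The ratios in the summary follow directly from the counts. This is a small sketch of the arithmetic, not dwctl's actual implementation; the function name legacy_acl_ratio is hypothetical:

```python
def legacy_acl_ratio(legacy_count, total_count):
    """Share of items with legacy ACLs, formatted like the summary lines above."""
    if total_count == 0:
        return "0.00%"
    return f"{100 * legacy_count / total_count:.2f}%"


print(legacy_acl_ratio(2, 17))  # datasets: 2 of 17
print(legacy_acl_ratio(1, 1))   # projects: 1 of 1
```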
Users are wondering why they are not able to create tables under the dataset they gave their AirflowWorkload Write permissions to.
The user confirmed they were able to successfully merge their PR to achilles gitops a few minutes ago. But still can't create the table.
You have already confirmed the CR is valid. The user came back 30 minutes later and confirmed it now works as intended.
Which metrics from our Grafana dashboard can be used to understand why this is occurring? #dw-platform-monitoring
What is 'Seconds Items Stay In Queue'
Additionally: Work Queue Add Rate and Work Queue Depth
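These three metrics are related: when items are added faster than the controller processes them, the queue deepens and each item waits longer. A rough sketch using Little's law (wait = depth / throughput), assuming a steady state where the processing rate roughly equals the add rate. The function name and the numbers are hypothetical, not values from the dashboard:

```python
def estimated_queue_wait_seconds(queue_depth, process_rate_per_second):
    """Approximate how long a newly queued item waits before it is reconciled."""
    if process_rate_per_second <= 0:
        raise ValueError("process_rate_per_second must be positive")
    return queue_depth / process_rate_per_second


# Hypothetical numbers: 900 items queued, 0.5 items processed per second.
wait = estimated_queue_wait_seconds(900, 0.5)
print(f"{wait / 60:.0f} minutes")
```

With these made-up inputs the estimate is 30 minutes, which is the kind of delay a deep work queue can put between a merged change and its reconciliation.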
The pipeline that was set up last week and was working properly stopped working. How do we troubleshoot?
The logs showed a schema mismatch. Follow the runbook for refreshing the schema registry.
A user would like to know what the policy document looks like on a particular BigQuery dataset. Ultimately they want to answer which principals can read from the dataset and which views are authorized on it.
How can the policy document be seen easily?
What does it look like for:
Project: reddit-protected-data
Dataset: events_data_staging
bq show project:dataset
bq show reddit-protected-data:events_data_staging
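To answer the two questions directly, the JSON form of the dataset (for example from bq show --format=prettyjson reddit-protected-data:events_data_staging) can be split into principals by role and authorized views. A minimal sketch, assuming the output follows the BigQuery dataset resource's access list. The sample entries and the helper name summarize_access are hypothetical, not the real policy:

```python
import json

# Keys that identify a principal in a dataset access entry.
PRINCIPAL_KEYS = ("userByEmail", "groupByEmail", "domain", "specialGroup", "iamMember")


def summarize_access(dataset):
    """Split a dataset's access list into principals per role and authorized views."""
    principals_by_role = {}
    authorized_views = []
    for entry in dataset.get("access", []):
        if "view" in entry:
            view = entry["view"]
            authorized_views.append(
                f'{view["projectId"]}.{view["datasetId"]}.{view["tableId"]}'
            )
            continue
        for key in PRINCIPAL_KEYS:
            if key in entry:
                principals_by_role.setdefault(entry.get("role", ""), []).append(entry[key])
    return principals_by_role, authorized_views


# Hypothetical sample data standing in for the bq output.
SAMPLE = json.loads("""
{
  "access": [
    {"role": "OWNER", "specialGroup": "projectOwners"},
    {"role": "READER", "groupByEmail": "analysts@example.com"},
    {"view": {"projectId": "example-project", "datasetId": "example_views", "tableId": "example_view"}}
  ]
}
""")

principals, views = summarize_access(SAMPLE)
print(principals)
print(views)
```

Keep in mind that principals with WRITER or OWNER roles can also read, so the READER group alone does not list everyone with read access.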
Why is the AirflowWorkloadClaim stg-dev-1-heartbeat not ready?
DWCTL tree to see child resources
kubectl get <resource> -o yaml to see the status message
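Once the resource is dumped (JSON output from kubectl works the same way as YAML), the reason is usually in the Ready condition. A minimal sketch, assuming the custom resource follows the common Kubernetes status.conditions convention (type, status, reason, message). The sample resource and the helper name explain_readiness are hypothetical:

```python
def explain_readiness(resource):
    """Return a one-line explanation of the resource's Ready condition."""
    conditions = resource.get("status", {}).get("conditions", [])
    for condition in conditions:
        if condition.get("type") == "Ready":
            if condition.get("status") == "True":
                return "Ready"
            reason = condition.get("reason", "unknown reason")
            message = condition.get("message", "no message")
            return f"Not ready: {reason}: {message}"
    return "No Ready condition reported yet"


# Hypothetical sample; the real status message comes from the cluster.
SAMPLE_RESOURCE = {
    "kind": "AirflowWorkloadClaim",
    "metadata": {"name": "stg-dev-1-heartbeat"},
    "status": {
        "conditions": [
            {
                "type": "Ready",
                "status": "False",
                "reason": "ExampleReason",
                "message": "example message from a child resource",
            }
        ]
    },
}

print(explain_readiness(SAMPLE_RESOURCE))
```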
A user complains that the View and Consolidated table results don’t match - why?
The Consolidated DAG didn't use the proper column partitioning; it needed to be partitioned on two different columns to match the upstream table.
A user is trying to add an iam_binding to a dataset in Terraform. When they applied their Terraform iam_binding, the authorized views lost access.
Why did this happen?
Ex:
Project: reddit-protected-data
Dataset: events_data_staging
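The Terraform dataset access block (which grants authorized views) and iam_binding resources (standard Terraform IAM) are mutually exclusive: applying an iam_binding to the dataset removes the authorized views granted through the access block.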