Commit 281bb98

Fix-input_gcp_projects (#139)
* fix input_gcp_projects
* Refactor project_list macro to improve handling of string inputs and array parsing. Simplified logic for processing project strings and enhanced handling of empty arrays.
* Add documentation
* Add changie
* Use project_by_project_view for integration testing
* add exclude to organization models as not available in test env

---------

Co-authored-by: Christophe Oudar <kayrnt@gmail.com>
1 parent d0682da commit 281bb98

4 files changed: +138 -18 lines changed
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+kind: Fixes
+body: Fix input_gcp_projects parsing and avoid fromjson for dbt_fusion compatibility
+time: 2025-07-19T21:47:03.320443+02:00
+custom:
+  Author: Kayrnt
+  Issue: ""

.github/workflows/pr_run_models.yml

Lines changed: 11 additions & 7 deletions
@@ -17,8 +17,9 @@ env:
   DBT_ENV_SECRET_BIGQUERY_TEST_SERVICE_ACCOUNT: ${{ secrets.DBT_ENV_SECRET_BIGQUERY_TEST_SERVICE_ACCOUNT }}
   DBT_ENV_SECRET_BIGQUERY_TEST_STORAGE_PROJECT: ${{ secrets.DBT_ENV_SECRET_BIGQUERY_TEST_STORAGE_PROJECT }}
   DBT_ENV_SECRET_BIGQUERY_TEST_EXECUTION_PROJECT: ${{ secrets.DBT_ENV_SECRET_BIGQUERY_TEST_EXECUTION_PROJECT }}
-  DBT_ENV_SECRET_BIGQUERY_TEST_LOCATION: ${{ secrets.DBT_ENV_SECRET_BIGQUERY_TEST_LOCATION }}
+  DBT_ENV_SECRET_BIGQUERY_TEST_LOCATION: "us"
   DBT_BQ_MONITORING_GCP_PROJECTS: ${{ secrets.DBT_BQ_MONITORING_GCP_PROJECTS }}
+  DBT_BQ_MONITORING_GOOGLE_INFORMATION_SCHEMA_MODELS_MATERIALIZATION: project_by_project_view
 
 concurrency:
   cancel-in-progress: true
@@ -28,6 +29,9 @@ jobs:
   integration-bigquery:
     name: Run dbt models on BigQuery to test model validity
     runs-on: ubuntu-latest
+
+    env:
+      DBT_COMMON_FLAGS: "--empty --exclude information_schema_streaming_timeline_by_organization information_schema_jobs_timeline_by_organization information_schema_table_storage_by_organization information_schema_table_storage_by_organization information_schema_write_api_timeline_by_organization information_schema_jobs_by_organization information_schema_table_storage_usage_timeline_by_organization"
 
     steps:
       - name: Checkout
@@ -64,34 +68,34 @@ jobs:
       - name: Run all models once
         run: |
           cd integration_tests
-          dbt build -s dbt_bigquery_monitoring --full-refresh --empty
+          dbt build -s dbt_bigquery_monitoring --full-refresh ${{ env.DBT_COMMON_FLAGS }}
 
       - name: Run all models again to test incremental
         run: |
           cd integration_tests
-          dbt build -s dbt_bigquery_monitoring --empty
+          dbt build -s dbt_bigquery_monitoring ${{ env.DBT_COMMON_FLAGS }}
 
       - name: Run all models again with cloud audit logs
         run: |
           cd integration_tests
           DBT_BQ_MONITORING_SHOULD_COMBINE_AUDIT_LOGS_AND_INFORMATION_SCHEMA=true DBT_BQ_MONITORING_GCP_BIGQUERY_AUDIT_LOGS=true \
-          dbt run -s jobs_from_audit_logs+ --full-refresh --empty
+          dbt run -s jobs_from_audit_logs+ --full-refresh ${{ env.DBT_COMMON_FLAGS }}
 
       - name: Run all models again to test incremental with cloud audit logs
         run: |
           cd integration_tests
           DBT_BQ_MONITORING_SHOULD_COMBINE_AUDIT_LOGS_AND_INFORMATION_SCHEMA=true DBT_BQ_MONITORING_GCP_BIGQUERY_AUDIT_LOGS=true \
-          dbt run -s jobs_from_audit_logs+ --empty
+          dbt run -s jobs_from_audit_logs+ ${{ env.DBT_COMMON_FLAGS }}
 
       # Disable until billing is enabled as DML is not available in the free tier
       # - name: Run all models again in project mode
       # run: |
       # cd integration_tests
       # DBT_BQ_MONITORING_GCP_PROJECTS="['${{ secrets.DBT_ENV_SECRET_BIGQUERY_TEST_STORAGE_PROJECT }}']" \
-      # dbt build -s dbt_bigquery_monitoring --full-refresh --empty
+      # dbt build -s dbt_bigquery_monitoring --full-refresh ${{ env.DBT_COMMON_FLAGS }}
 
       # - name: Run all models again in project mode
       # run: |
       # cd integration_tests
       # DBT_BQ_MONITORING_GCP_PROJECTS="['${{ secrets.DBT_ENV_SECRET_BIGQUERY_TEST_STORAGE_PROJECT }}']" \
-      # dbt build -s dbt_bigquery_monitoring --empty
+      # dbt build -s dbt_bigquery_monitoring ${{ env.DBT_COMMON_FLAGS }}

docs/configuration/configuration.md

Lines changed: 38 additions & 0 deletions
@@ -54,6 +54,44 @@ vars:
   input_gcp_projects: [ 'my-gcp-project', 'my-gcp-project-2' ]
 ```
 
+#### Supported Input Formats
+
+The `input_gcp_projects` setting accepts multiple input formats for maximum flexibility:
+
+**1. dbt project variables (dbt_project.yml):**
+```yml
+vars:
+  input_gcp_projects: "single-project" # Single project as string
+  input_gcp_projects: ["project1", "project2"] # Multiple projects as array
+```
+
+**2. CLI variables:**
+```bash
+dbt run --vars '{"input_gcp_projects": "test"}' # Single project
+dbt run --vars '{"input_gcp_projects": ["test1", "test2"]}' # Multiple projects
+```
+
+**3. Environment variables:**
+```bash
+# Single project
+export DBT_BQ_MONITORING_GCP_PROJECTS="single-project"
+
+# Multiple projects with quotes
+export DBT_BQ_MONITORING_GCP_PROJECTS='["project1","project2"]'
+
+# Multiple projects without quotes (also supported)
+export DBT_BQ_MONITORING_GCP_PROJECTS='[project1,project2]'
+
+# Single project in array format
+export DBT_BQ_MONITORING_GCP_PROJECTS='["project1"]'
+export DBT_BQ_MONITORING_GCP_PROJECTS='[project1]'
+
+# Empty array
+export DBT_BQ_MONITORING_GCP_PROJECTS='[]'
+```
+
+All input formats are automatically normalized to an array of project strings internally, so you can use whichever format is most convenient for your setup.
+
 :::warning
 
 When using the "project mode", the package will create intermediate tables to avoid issues from BigQuery when too many projects are used.

macros/project_list.sql

Lines changed: 83 additions & 11 deletions
@@ -1,18 +1,90 @@
-{#-- retrieve the projects list --#}
+{#--
+  Retrieve the projects list from various input formats and always return an array.
+
+  This macro supports multiple input formats and normalizes them to a consistent array output:
+
+  SUPPORTED INPUT FORMATS:
+
+  1. dbt project variables (dbt_project.yml):
+     vars:
+       input_gcp_projects: "single-project" # → ["single-project"]
+       input_gcp_projects: ["project1", "project2"] # → ["project1", "project2"]
+
+  2. CLI variables:
+     --vars '{"input_gcp_projects": "test"}' # → ["test"]
+     --vars '{"input_gcp_projects": ["test1", "test2"]}' # → ["test1", "test2"]
+
+  3. Environment variables:
+     DBT_BQ_MONITORING_GCP_PROJECTS="single-project" # → ["single-project"]
+     DBT_BQ_MONITORING_GCP_PROJECTS='["project1","project2"]' # → ["project1", "project2"]
+     DBT_BQ_MONITORING_GCP_PROJECTS='[project1,project2]' # → ["project1", "project2"] (unquoted)
+     DBT_BQ_MONITORING_GCP_PROJECTS='["project1"]' # → ["project1"]
+     DBT_BQ_MONITORING_GCP_PROJECTS='[project1]' # → ["project1"] (unquoted)
+     DBT_BQ_MONITORING_GCP_PROJECTS='[]' # → []
+
+  4. Mixed quote styles:
+     DBT_BQ_MONITORING_GCP_PROJECTS="['proj1',\"proj2\"]" # → ["proj1", "proj2"]
+
+  EDGE CASES HANDLED:
+  - Empty strings return empty arrays
+  - Whitespace around project names is automatically trimmed
+  - Surrounding quotes (single or double) are automatically removed
+  - Invalid/malformed inputs fallback gracefully
+
+  RETURN VALUE:
+  Always returns an array of project strings
+--#}
 {% macro project_list() %}
   {% set projects = dbt_bigquery_monitoring_variable_input_gcp_projects() %}
+
+  {#-- If it's already a list/array, return it directly --#}
   {% if projects is iterable and projects is not string %}
-      {{ return(projects) }}
-  {#-- check if it's the string and it contains a "," --#}
+    {{ return(projects) }}
+
+  {#-- Handle string inputs --#}
   {% elif projects is string %}
-      {% if projects == '' %}
-          {{ return([]) }}
-      {% else %}
-          {% set projects_replaced = projects | replace("'", '"') %}
-          {% set json = fromjson('{"v":' ~ projects_replaced ~ '}') %}
-          {{ return (json['v']) }}
-      {% endif %}
+    {#-- Empty string case --#}
+    {% if projects == '' %}
+      {{ return([]) }}
+    {% endif %}
+
+    {#-- Check if it looks like an array (starts with [ and ends with ]) --#}
+    {% if projects.startswith('[') and projects.endswith(']') %}
+      {#-- Extract content between brackets --#}
+      {% set inner_content = projects[1:-1].strip() %}
+
+      {#-- Handle empty array --#}
+      {% if inner_content == '' %}
+        {{ return([]) }}
+      {% endif %}
+
+      {#-- Split by comma and process each project --#}
+      {% set project_list = [] %}
+      {% set raw_projects = inner_content.split(',') %}
+
+      {% for raw_project in raw_projects %}
+        {% set cleaned_project = raw_project.strip() %}
+
+        {#-- Remove surrounding quotes if present --#}
+        {% if (cleaned_project.startswith('"') and cleaned_project.endswith('"')) or
+              (cleaned_project.startswith("'") and cleaned_project.endswith("'")) %}
+          {% set cleaned_project = cleaned_project[1:-1] %}
+        {% endif %}
+
+        {#-- Add non-empty projects to the list --#}
+        {% if cleaned_project %}
+          {% set _ = project_list.append(cleaned_project) %}
+        {% endif %}
+      {% endfor %}
+
+      {{ return(project_list) }}
+    {% else %}
+      {#-- Single project string - wrap in array --#}
+      {{ return([projects]) }}
+    {% endif %}
+
+  {#-- Fallback: return empty array for any other type --#}
   {% else %}
-      {{ exceptions.raise_compiler_error('Invalid `input_gcp_projects` variables. Got: ' ~ input_gcp_projects ~ ' but expected form like input_gcp_projects = ["project_1", "project_2"] ') }}
+    {{ return([]) }}
   {% endif %}
 {% endmacro %}
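
The branching in the new `project_list` macro can be easier to follow outside Jinja. Below is a rough Python sketch of the same normalization steps, not code from this commit. The function name `normalize_projects` is hypothetical, and the list/tuple check is an assumption that is narrower than the macro's `is iterable` test.

```python
def normalize_projects(value):
    """Approximate the macro's branches; illustrative only, not part of the package."""
    # Lists pass through unchanged (the macro's `is iterable` test is broader than this check).
    if isinstance(value, (list, tuple)):
        return list(value)
    # Any other non-string input falls back to an empty list.
    if not isinstance(value, str):
        return []
    # Empty string means no projects.
    if value == "":
        return []
    # Array-like string: strip the brackets, split on commas, clean each item.
    if value.startswith("[") and value.endswith("]"):
        inner = value[1:-1].strip()
        if inner == "":
            return []
        projects = []
        for raw in inner.split(","):
            cleaned = raw.strip()
            # Remove one pair of matching surrounding quotes, single or double.
            if (cleaned.startswith('"') and cleaned.endswith('"')) or (
                cleaned.startswith("'") and cleaned.endswith("'")
            ):
                cleaned = cleaned[1:-1]
            # Skip items that end up empty, e.g. from a trailing comma.
            if cleaned:
                projects.append(cleaned)
        return projects
    # Plain string: treat it as a single project name.
    return [value]


print(normalize_projects("[project1, project2]"))   # ['project1', 'project2']
print(normalize_projects("['proj1',\"proj2\"]"))    # ['proj1', 'proj2']
print(normalize_projects("single-project"))         # ['single-project']
```

Because the string is split on commas rather than parsed as JSON, unquoted items such as `[project1,project2]` are accepted, which the previous `fromjson`-based branch would not have parsed as valid JSON.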
