🚀 Getting Started with dbt and Airbyte

Welcome to your modern data stack template! This project demonstrates how to build a scalable data warehouse using:

- Airbyte for data ingestion
- dbt for transformation
- BigQuery for storage and computing

Prerequisites

- VS Code installed
- Turntable.so extension installed in VS Code
- Python installed (3.8 or higher)
- Google Cloud account with BigQuery enabled
- Airbyte instance set up with sources configured

🏗️ Project Structure

📂 models/
├── 📁 staging/              # 🛠️ Raw data standardization
│   ├── 📁 stg_stripe/       # 💳 Payment processing
│   │   ├── 📁 base/         # 📜 Raw JSON parsing
│   │   │   └── 📄 base_stripe__customers.sql
│   │   ├── 📄 stg_stripe__customers.sql
│   │   └── 📄 _stripe_sources.yml
│   │
│   ├── 📁 stg_hubspot/      # 📈 Marketing automation
│   │   ├── 📁 base/
│   │   └── 📄 _hubspot_sources.yml
│   │
│   └── 📁 stg_shopify/      # 🛒 E-commerce platform
│       ├── 📁 base/
│       └── 📄 _shopify_sources.yml
│
├── 📁 intermediate/         # 🔍 Business logic layer
│   ├── 📁 finance/
│   ├── 📁 marketing/
│   └── 📁 sales/
│
└── 📁 marts/                # 📊 Business-specific models
    ├── 📁 core/             # 🔑 Core business entities
    ├── 📁 finance/          # 💰 Finance-specific models
    ├── 📁 marketing/        # 📣 Marketing-specific models
    └── 📁 sales/            # 🛒 Sales-specific models

🔄 Setup Instructions

Clone the Repository:

git clone https://github.com/yourusername/dbt-bigquery-quickstart-project.git
cd dbt-bigquery-quickstart-project

Set Up Environment Variables:

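cp .env.example .env
# Edit .env with your configurations, then load it into your shell
# (dbt reads env_var() values from the process environment, not from .env itself)
set -a; source .env; set +a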

Configure dbt Profile:

mkdir -p ~/.dbt   # create the dbt config directory if it does not exist yet
cp profiles.yml.example ~/.dbt/profiles.yml
# Edit ~/.dbt/profiles.yml with your BigQuery details

Install Dependencies:

pip install dbt-core dbt-bigquery
dbt deps

🔌 Airbyte Integration

This template is designed to work with Airbyte's BigQuery destination. Key points:

Raw Data Structure:
- Airbyte creates raw tables with the prefix _airbyte_raw_
- Data is stored as JSON in the _airbyte_data column
- Each record has an _airbyte_emitted_at timestamp

Base Models:

-- Example: models/staging/stg_stripe/base/base_stripe__customers.sql
select 
    JSON_EXTRACT_SCALAR(_airbyte_data, '$.id') as customer_id,
    JSON_EXTRACT_SCALAR(_airbyte_data, '$.email') as email,
    _airbyte_emitted_at as ingested_at
from {{ source('stripe', '_airbyte_raw_customers') }}
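
To make the base model's behavior concrete, here is a minimal Python sketch that approximates what JSON_EXTRACT_SCALAR does to one raw row. The sample record, the helper name json_extract_scalar, and the restriction to top-level keys of a JSON object are assumptions for illustration; this is not how BigQuery implements the function.

```python
import json
from typing import Any, Optional


def json_extract_scalar(raw_json: str, key: str) -> Optional[str]:
    """Approximate JSON_EXTRACT_SCALAR(raw_json, '$.<key>') for a top-level key.

    Assumes raw_json is a JSON object. Scalars come back as strings;
    objects, arrays, nulls, and missing keys come back as None (SQL NULL).
    """
    value: Any = json.loads(raw_json).get(key)
    if value is None or isinstance(value, (dict, list)):
        return None
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


# Hypothetical raw row, shaped like the Airbyte columns described above.
raw_row = {
    "_airbyte_data": '{"id": "cus_123", "email": "a@example.com", "metadata": {"plan": "pro"}}',
    "_airbyte_emitted_at": "2024-01-01T00:00:00Z",
}

# Same column mapping as the base model: id -> customer_id, email -> email.
base_row = {
    "customer_id": json_extract_scalar(raw_row["_airbyte_data"], "id"),
    "email": json_extract_scalar(raw_row["_airbyte_data"], "email"),
    "ingested_at": raw_row["_airbyte_emitted_at"],
}
print(base_row)
```

Every extracted value is a string, which is why the base layer is also the place to add basic type casts.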

Source Configuration:

# Example: models/staging/stg_stripe/_stripe_sources.yml
version: 2
sources:
  - name: stripe
    database: "{{ env_var('DBT_PROJECT_ID') }}"
    schema: "{{ env_var('AIRBYTE_SCHEMA', 'raw') }}"
    loader: airbyte
    loaded_at_field: _airbyte_emitted_at
    tables:
      - name: _airbyte_raw_customers

🏭 Development Workflow

Set Up Airbyte Source:
- Configure source in Airbyte UI
- Set destination to BigQuery
- Note the destination schema

Update Environment Variables:

DBT_PROJECT_ID=your-project-id
AIRBYTE_SCHEMA=raw
DBT_STAGING_SCHEMA=staging
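
The source configuration reads these variables through env_var(), with a fallback of 'raw' for AIRBYTE_SCHEMA and no fallback for DBT_PROJECT_ID. The sketch below mimics that lookup in Python so you can see which variables are required; the helper names env_var and resolve_source_location are hypothetical stand-ins, not dbt's own code.

```python
import os
from typing import Mapping, Optional


def env_var(name: str, default: Optional[str] = None,
            environ: Mapping[str, str] = os.environ) -> str:
    """Return the variable's value, else the default, else fail loudly."""
    if name in environ:
        return environ[name]
    if default is not None:
        return default
    raise KeyError(f"Env var required but not provided: {name}")


def resolve_source_location(environ: Mapping[str, str] = os.environ) -> str:
    """Build the '<project>.<schema>' location the stripe source would point at."""
    project = env_var("DBT_PROJECT_ID", environ=environ)      # required, no default
    schema = env_var("AIRBYTE_SCHEMA", "raw", environ=environ)  # optional, defaults to 'raw'
    return f"{project}.{schema}"


# Example with an explicit mapping instead of the real environment.
print(resolve_source_location({"DBT_PROJECT_ID": "your-project-id"}))
```

Running a check like this before dbt can surface a missing DBT_PROJECT_ID earlier than a failed dbt invocation would.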

Create Base Models:
- Parse JSON data from Airbyte
- Use JSON_EXTRACT_SCALAR for BigQuery
- Add basic data type conversions

Create Staging Models:
- Add business logic and cleaning
- Implement standard naming
- Add data quality tests

Build Marts:
- Combine data from multiple sources
- Create business-specific views
- Optimize for analysis

📊 Data Quality

Source Freshness:

sources:
  - name: stripe
    freshness:
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
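
As a rough illustration of what these thresholds mean, the Python sketch below classifies a source by the age of its newest _airbyte_emitted_at value. The function name freshness_status and the strict "older than" comparison at the boundaries are assumptions; dbt's own implementation may differ in details.

```python
from datetime import datetime, timedelta, timezone

# Thresholds copied from the freshness block above.
WARN_AFTER = timedelta(hours=12)
ERROR_AFTER = timedelta(hours=24)


def freshness_status(max_loaded_at: datetime, now: datetime) -> str:
    """Classify a source as 'pass', 'warn', or 'error' from its newest record."""
    age = now - max_loaded_at
    if age > ERROR_AFTER:
        return "error"
    if age > WARN_AFTER:
        return "warn"
    return "pass"


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
for hours_old in (6, 13, 25):
    newest = now - timedelta(hours=hours_old)
    print(hours_old, freshness_status(newest, now))
```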

Data Tests:

models:
  - name: stg_stripe__customers
    columns:
      - name: customer_id
        tests:
          - unique
          - not_null
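
The two tests above can be read as simple row-level checks. This Python sketch reproduces their intent on an in-memory list of rows; the sample rows and the helper names are hypothetical, and dbt itself runs these checks as SQL against the warehouse.

```python
from collections import Counter
from typing import Any, Dict, List

Row = Dict[str, Any]


def not_null_failures(rows: List[Row], column: str) -> List[Row]:
    """Rows that would fail not_null: the column is missing or None."""
    return [row for row in rows if row.get(column) is None]


def unique_failures(rows: List[Row], column: str) -> List[Any]:
    """Values that would fail unique: non-null values seen more than once."""
    counts = Counter(row[column] for row in rows if row.get(column) is not None)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical staging rows with one duplicate and one null customer_id.
rows = [
    {"customer_id": "cus_1"},
    {"customer_id": "cus_2"},
    {"customer_id": "cus_2"},
    {"customer_id": None},
]
print(not_null_failures(rows, "customer_id"))
print(unique_failures(rows, "customer_id"))
```

Null values are left to the not_null check, so a single null does not count as a duplicate here.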

🔍 Monitoring

Airbyte Sync Status:
- Check the Airbyte UI for sync status
- Monitor _airbyte_emitted_at for freshness

dbt Run Status:
- Run dbt source freshness to check how recently each source was loaded
- Check model test results

📚 Learning Resources

Airbyte Resources:
- Airbyte Docs
- BigQuery Setup Guide

dbt Resources:
- dbt Docs
- BigQuery Specific Guides

Community:
- dbt Slack
- Airbyte Slack

🆘 Need Help?

- Check error messages in Turntable.so
- Review Airbyte logs for sync issues
- Visit dbt Discourse
- Create an issue in this repository

🤝 Contributing

- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request

📝 License

MIT License - see LICENSE file

BigQuery Setup Guide

For detailed instructions on setting up your BigQuery connection, including OAuth authentication and testing, see our BigQuery Setup Guide.

⭐ Credits & Connect

🚀 About This Repository

This repository is maintained by Matt Strautmann, who works closely with founders and CEOs to put their data to work. Let me help you trust your data, know your customer, and improve your bottom line.

Why Star This Repository?

Starring this repository helps me understand which tools, templates, and projects bring the most value to the community. Your support motivates me to keep producing high-quality content and maintain these resources for everyone!

🌟 Support This Project

If this repository has helped you:

- Give it a ⭐ to show your appreciation!
- Share it with others who might find it useful.

🤝 Connect with Me

I’d love to hear how you’re using this repository or discuss how I can help with your next project. Let’s connect:

LinkedIn: Matt Strautmann
GitHub: Matt Strautmann

matt-strautmann/dbt-bigquery-ecommerce-quickstart
