This project demonstrates an end-to-end data pipeline that integrates weather data with sales trends using Snowflake for data storage and processing. The pipeline is structured into three main phases: Ingestion, Transformation, and Delivery. A Streamlit application is provided to visualize and analyze the processed data.
### Ingestion
- Create necessary databases, schemas, and tables in Snowflake.
- Define external stages and file formats for structured data ingestion.
- Load weather and sales data from an S3 bucket into Snowflake using `COPY INTO` commands.
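The ingestion steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual `copy_into.sql`; the stage, file format, table names, and the bucket path are all placeholders you would adapt to your own account:

```sql
-- Hypothetical names; adjust the schema, bucket path, and table to your setup.
CREATE OR REPLACE FILE FORMAT raw.csv_format
  TYPE = CSV
  FIELD_OPTIONALLY_ENCLOSED_BY = '"'
  SKIP_HEADER = 1;

-- External stage pointing at the S3 bucket holding the source files
CREATE OR REPLACE STAGE raw.weather_stage
  URL = 's3://<your-bucket>/weather/'
  FILE_FORMAT = raw.csv_format;

-- Bulk-load the staged files into a raw table
COPY INTO raw.weather_data
  FROM @raw.weather_stage
  FILE_FORMAT = (FORMAT_NAME = 'raw.csv_format')
  ON_ERROR = 'CONTINUE';
```

`ON_ERROR = 'CONTINUE'` skips malformed rows instead of aborting the load; drop it if you prefer the load to fail fast.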
### Transformation
- Process raw data into structured harmonized and analytics schemas.
- Create views to integrate weather and sales data for city-based trend analysis.
- Implement User-Defined Functions (UDFs) for data conversion, such as Fahrenheit to Celsius and inches to millimeters.
- Aggregate and normalize weather and sales trends, focusing on Hamburg, Germany.
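The conversion UDFs mentioned above can be written as simple SQL UDFs. The function and schema names below are illustrative, not necessarily those used in `transform/udf.sql`:

```sql
-- Hypothetical UDFs; names and numeric precision are assumptions.
CREATE OR REPLACE FUNCTION analytics.fahrenheit_to_celsius(temp_f NUMBER(35, 4))
RETURNS NUMBER(35, 4)
AS
$$
  (temp_f - 32) * 5 / 9
$$;

CREATE OR REPLACE FUNCTION analytics.inch_to_millimeter(inches NUMBER(35, 4))
RETURNS NUMBER(35, 4)
AS
$$
  inches * 25.4
$$;
```

Once created, the functions can be applied directly in views, e.g. `SELECT analytics.fahrenheit_to_celsius(avg_temperature_f) FROM ...`.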
### Delivery
- Expose transformed data through Snowflake views for easy querying.
- Develop a Snowflake Streamlit app to visualize weather and sales trends.
- Use Altair charts in Streamlit to present interactive data insights.
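A delivery view joining the harmonized weather and sales data might look like the sketch below. The table and column names are assumptions for illustration; the project's actual views live in the `transform` scripts:

```sql
-- Hypothetical view; source tables and columns are placeholders.
CREATE OR REPLACE VIEW analytics.hamburg_daily_trends AS
SELECT
    w.observation_date,
    w.city_name,
    AVG(w.avg_temperature_f)  AS avg_temp_f,
    SUM(s.order_total)        AS daily_sales
FROM harmonized.weather w
JOIN harmonized.sales s
  ON  s.city_name        = w.city_name
  AND s.order_date       = w.observation_date
WHERE w.city_name = 'Hamburg'
GROUP BY w.observation_date, w.city_name;
```

Exposing the join as a view keeps the Streamlit app a thin consumer: it only needs a `SELECT * FROM analytics.hamburg_daily_trends` to feed its Altair charts.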
```
📂 weather_sales_pipeline
│-- 📂 ingest                  # SQL scripts for data ingestion
│   │-- copy_into.sql          # SQL script for loading data into Snowflake
│   │-- tasty_bytes.sql        # SQL script for database, schema, and table creation
│-- 📂 transform               # SQL scripts for data transformation
│   │-- hamburg_sales.sql      # SQL script for sales trends in Hamburg, Germany
│   │-- udf.sql                # SQL script for user-defined functions
│   │-- updated_hamburg.sql    # SQL script for applying UDFs and advanced transformations
│-- 📂 delivery                # Streamlit app for data visualization
│   │-- streamlit.py           # Streamlit application code
│-- 📜 README.md               # Project documentation
```
The Streamlit app was created in Snowflake using Python to visualize daily sales and weather trends for Hamburg, Germany.
- Create a Snowflake account.
- Run the SQL scripts in the following order:
  1. `ingest/tasty_bytes.sql` to create the database, schemas, and raw tables.
  2. `ingest/copy_into.sql` to load the weather and sales data into Snowflake.
  3. `transform/hamburg_sales.sql`, `transform/udf.sql`, and `transform/updated_hamburg.sql` to process and aggregate the data.
- Create a Streamlit app in Snowflake and run the `delivery/streamlit.py` script.
- Use the Streamlit dashboard to explore weather and sales trends.
- Modify SQL scripts to adapt the pipeline to new datasets or custom transformations.
- Automate data ingestion using Snowflake Snowpipe.
- Integrate machine learning to predict sales trends based on weather conditions.
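For the Snowpipe idea above, the manual `COPY INTO` can be wrapped in a pipe so new files landing in the bucket are ingested automatically. This is a sketch reusing the hypothetical stage and table names from the ingestion example, not a tested configuration:

```sql
-- Hypothetical pipe; AUTO_INGEST requires S3 event notifications
-- to be configured for the bucket.
CREATE OR REPLACE PIPE raw.weather_pipe
  AUTO_INGEST = TRUE
AS
  COPY INTO raw.weather_data
  FROM @raw.weather_stage
  FILE_FORMAT = (FORMAT_NAME = 'raw.csv_format');
```

With `AUTO_INGEST = TRUE`, Snowflake loads each new file shortly after it arrives, removing the need to rerun the ingestion script by hand.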
Have an opportunity for me, or want to collaborate? Connect with me on LinkedIn.