Retail Marketing Data Pipelines and Data Warehouse

Project Type

Data Warehouse
Data Pipelines
Data Integration
ETL/ELT
Cloud Migration
Scalable Architecture
Analytics Platform

Client Overview

The client is a Texas-based retail company that provides storage solutions. Since 1988, the company has grown into one of the largest in its field by expanding its business and improving customer service.

Summary of Results

  • The client’s data science team successfully integrated Google Analytics 360 data into their analytical models.
  • This data is now accessible through both Google BigQuery and the client’s on-premises data warehouse.
  • The integration provides a strong foundation for future projects.
  • It enables the client to use BigQuery and Analytics 360 for additional use cases.

The Challenges

The client used Google Analytics 360 to improve tracking on their website and mobile app. They needed to share this data across teams for real-time analysis but had difficulty integrating it into their existing systems. They required a reliable pipeline to transfer Analytics 360 data to their on-premises SQL Server data warehouse, one that worked with their current setup and could scale over time. They hired Accropolix to design and implement a solution that met these immediate needs and supported future growth.

The Requirements

The client teamed up with Accropolix to build a new data platform with these main goals:

  • The data science team needed Google Analytics data for their work.
  • This data had to be added to their current SQL Server data warehouse.
  • The new data pipelines needed to handle more data and changing needs.
  • The system needed to be cost-effective and use serverless services to reduce operational overhead.

The Solution

We started by setting up the BigQuery Data Transfer Service to automatically export Google Analytics 360 data into a partitioned BigQuery table every day, so that both intraday and previous-day data were available for analysis. To integrate this data into the client’s on-premises SQL Server data warehouse, we created a series of cloud-based workflows.
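To make the daily layout concrete, here is a minimal sketch of how a downstream job might work out which export tables to read. It assumes the standard Analytics 360 export naming, where each day lands in a date-suffixed table (`ga_sessions_YYYYMMDD`) and the current day's partial data lands in `ga_sessions_intraday_YYYYMMDD`. The project and dataset names are hypothetical and do not come from the client's environment.

```python
from datetime import date, timedelta

# Hypothetical identifiers, used only for illustration.
PROJECT = "example-project"
DATASET = "ga360_export"


def export_table_id(day: date, intraday: bool = False) -> str:
    """Return the fully qualified table ID for one day of exported data."""
    prefix = "ga_sessions_intraday_" if intraday else "ga_sessions_"
    return f"{PROJECT}.{DATASET}.{prefix}{day:%Y%m%d}"


def tables_for_analysis(today: date) -> list[str]:
    """Return yesterday's completed table plus today's intraday table."""
    yesterday = today - timedelta(days=1)
    return [
        export_table_id(yesterday),
        export_table_id(today, intraday=True),
    ]


if __name__ == "__main__":
    for table_id in tables_for_analysis(date.today()):
        print(table_id)
```

Keeping the naming logic in one small function means the rest of the pipeline never has to build table names by hand.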

Rather than relying only on built-in monitoring services, we used Google Cloud Functions and Eventarc to detect new tables as they appeared in BigQuery. Each detection triggered a Cloud Function that flattened and processed the data.
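The detection and flattening step could look roughly like the sketch below. It is not the production code. It assumes the Eventarc trigger delivers a Cloud Audit Logs payload whose `protoPayload.resourceName` has the form `projects/.../datasets/.../tables/...`, and that flattening means emitting one row per hit from the export's repeated `hits` field. The function names and the selected columns are illustrative choices.

```python
import re
from typing import Optional

# Assumed shape of the audit-log resource name for a BigQuery table.
# Only completed daily tables match; intraday tables are ignored.
TABLE_RESOURCE = re.compile(
    r"projects/(?P<project>[^/]+)"
    r"/datasets/(?P<dataset>[^/]+)"
    r"/tables/(?P<table>ga_sessions_\d{8})$"
)


def new_export_table(event_data: dict) -> Optional[str]:
    """Return 'project.dataset.table' if the event refers to a daily export table."""
    resource = event_data.get("protoPayload", {}).get("resourceName", "")
    match = TABLE_RESOURCE.search(resource)
    if match is None:
        return None
    return "{project}.{dataset}.{table}".format(**match.groupdict())


def flatten_query(table_id: str) -> str:
    """Build SQL that returns one row per hit instead of one row per session."""
    return (
        "SELECT\n"
        "  fullVisitorId,\n"
        "  visitId,\n"
        "  date,\n"
        "  hit.hitNumber AS hit_number,\n"
        "  hit.type AS hit_type,\n"
        "  hit.page.pagePath AS page_path\n"
        f"FROM `{table_id}`, UNNEST(hits) AS hit"
    )


if __name__ == "__main__":
    sample_event = {
        "protoPayload": {
            "resourceName": (
                "projects/example-project/datasets/ga360_export"
                "/tables/ga_sessions_20240305"
            )
        }
    }
    table = new_export_table(sample_event)
    if table is not None:
        print(flatten_query(table))
```

In a deployed function, the dictionary would come from the event payload and the generated SQL would be submitted to BigQuery. Both steps are left out here so the sketch runs without cloud credentials.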

To ensure smooth and reliable data movement between systems, we used Apache Airflow for workflow orchestration. This helped us manage the ETL (Extract, Transform, Load) process and keep delivery on schedule. The flattened data was then exported to a time-partitioned Google Cloud Storage bucket. From there, the client’s on-premises systems retrieved the data and loaded it into their SQL Server data warehouse, using custom ETL scripts and Apache NiFi for data transformation.
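As an illustration of the export step, the sketch below builds a BigQuery `EXPORT DATA` statement that writes one day of flattened data under a date-based prefix in Cloud Storage. It assumes that "time-partitioned" means one object prefix per day (written here as `dt=YYYY-MM-DD`) and that files are exported as CSV. The bucket name, prefix layout, and table ID are hypothetical.

```python
from datetime import date

# Hypothetical bucket name, used only for illustration.
BUCKET = "example-ga360-exports"


def partition_uri(day: date, bucket: str = BUCKET) -> str:
    """Return the wildcard URI for one day's export files."""
    return f"gs://{bucket}/ga_sessions_flat/dt={day:%Y-%m-%d}/part-*.csv"


def export_statement(flat_table_id: str, day: date) -> str:
    """Build an EXPORT DATA statement for one day of flattened rows."""
    return (
        "EXPORT DATA OPTIONS(\n"
        f"  uri='{partition_uri(day)}',\n"
        "  format='CSV',\n"
        "  overwrite=true,\n"
        "  header=true\n"
        ") AS\n"
        f"SELECT * FROM `{flat_table_id}`"
    )


if __name__ == "__main__":
    print(export_statement("example-project.ga360_flat.hits_20240305", date(2024, 3, 5)))
```

An orchestrator such as Airflow could run this statement once per day. Because each day has its own prefix, the on-premises loader can pick up a single day's files without scanning the whole bucket.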

This setup provided the client with a scalable and cost-effective solution, making sure the marketing data was easily accessible for real-time analysis and future growth.

About Accropolix

Accropolix is a premier data engineering firm specializing in the design and implementation of advanced data architectures and solutions. We empower businesses to fully leverage their data through our expertise in cloud services, data integration, and analytics.

Our commitment is to deliver customized solutions that enhance operational efficiency and provide actionable insights, utilizing best practices in data engineering to address the unique needs of each client.