• Objective

    The project aimed to create a unified data analytics platform by integrating Nielsen and IQVIA datasets. This enabled accurate and actionable insights into market share, consumer behavior, and purchasing patterns while ensuring data consistency and reliability.

  • Solution

    The Market Share Cockpit was designed to efficiently process and analyze sell-out data. The solution leveraged Azure Data Factory and Snowflake for data staging, processing, and loading, while SQL Server Master Data Services (MDS) was used for master data enrichment. Advanced hierarchical transformations ensured data consistency across multiple aggregation levels, allowing seamless comparisons between external market intelligence and internal ERP reports. Automated data quality checks, including KPI analysis and value frequency monitoring, were implemented to enhance data accuracy and reliability.
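
    To illustrate the value frequency monitoring, a minimal pandas sketch might compare category shares between two consecutive loads and flag large shifts. The column names and the 5% threshold are assumptions, not the production rules:

```python
import pandas as pd

# A minimal sketch of a value-frequency check (column names and the
# threshold are hypothetical): compare the share of each category
# between the previous and the current load and flag large shifts.
def value_frequency_drift(
    prev: pd.Series, curr: pd.Series, threshold: float = 0.05
) -> pd.DataFrame:
    prev_share = prev.value_counts(normalize=True)
    curr_share = curr.value_counts(normalize=True)
    report = pd.concat(
        {"prev_share": prev_share, "curr_share": curr_share}, axis=1
    ).fillna(0.0)
    report["drift"] = (report["curr_share"] - report["prev_share"]).abs()
    return report[report["drift"] > threshold].sort_values("drift", ascending=False)

# Example: flag product categories whose share moved by more than 5 points.
# prev_load and curr_load would be DataFrames read from the staging schema.
# drifted = value_frequency_drift(prev_load["category"], curr_load["category"])
```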

  • Outcome
    • Robust Data Integration: A single, unified platform consolidating Nielsen and IQVIA data.

    • Enhanced Master Data Mapping: Improved alignment of external intelligence with internal categorizations for better decision-making.

    • Automated Adaptability: Dynamic adjustment to evolving data structures for ongoing accuracy.

    • Consistent Hierarchical Data: Ensured alignment of sales data across multiple levels of aggregation.

    • Comprehensive Data Quality Assurance: Integrated validation and monitoring processes for reliable insights.

  • Technology
    • Cloud Storage & Processing: Azure Data Lake, Snowflake

    • ETL & Data Integration: Azure Data Factory, SQL Server Master Data Services

    • Data Quality Checks: Automated KPI analysis, value frequency analysis, and monitoring dashboards

    • Security & Data Transfer: Secure SFTP for Nielsen data, service principal-based ingestion for IQVIA data
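
    As an illustration of the service principal-based ingestion, a Python sketch using azure-identity and azure-storage-file-datalake could look as follows. The tenant, client, account, and path values are placeholders, and the production flow is presumably orchestrated in Azure Data Factory rather than custom code:

```python
from azure.identity import ClientSecretCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Hypothetical service principal and storage details.
credential = ClientSecretCredential(
    tenant_id="<tenant-guid>",
    client_id="<app-registration-guid>",
    client_secret="<secret>",
)
service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",
    credential=credential,
)

# Upload one IQVIA extract into the raw zone of the data lake.
file_client = service.get_file_system_client("raw").get_file_client(
    "iqvia/2024/sellout.csv"
)
with open("sellout.csv", "rb") as data:
    file_client.upload_data(data, overwrite=True)
```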

How does it work?

1. Data Sources
  • Cloud databases
  • On-premises database
  • Excel files with "pretty" formatting
  • CSV files
2. Python Script
  • Processing of Excel files with "pretty" formatting
  • Conversion to *.csv (see the sketch below)
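
A minimal sketch of this step, assuming pandas and a layout where a few decorative title rows precede the real header (the file names and the number of skipped rows are placeholders):

```python
import pandas as pd

# Hypothetical input: a "pretty" Excel report whose real header starts
# after four decorative title rows (adjust skiprows per report layout).
raw = pd.read_excel("nielsen_report.xlsx", sheet_name=0, skiprows=4)

raw = raw.dropna(how="all")          # drop blank spacer rows
raw = raw.dropna(axis=1, how="all")  # drop blank spacer columns

# Normalize column names for the downstream pipeline.
raw.columns = [str(c).strip().lower().replace(" ", "_") for c in raw.columns]

raw.to_csv("nielsen_report.csv", index=False)
```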
3. Linux Pipeline
  • Data filtering
4. Staging
  • Staging schema data load (see the sketch below)
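
One plausible shape for the staging load, assuming the filtered CSV files sit on a Snowflake stage; the connection parameters, stage, and table names are placeholders:

```python
import snowflake.connector

# A minimal sketch (connection parameters, stage, and table names are
# hypothetical): bulk-load the filtered CSV files into the staging schema.
conn = snowflake.connector.connect(
    account="my_account",
    user="etl_user",
    password="***",
    warehouse="ETL_WH",
    database="MARKET_SHARE",
    schema="STAGING",
)
try:
    conn.cursor().execute(
        """
        COPY INTO staging.sellout_raw
        FROM @raw_stage/sellout/
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1 FIELD_OPTIONALLY_ENCLOSED_BY = '"')
        ON_ERROR = 'ABORT_STATEMENT'
        """
    )
finally:
    conn.close()
```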
5. Aggregation / MDS
  • Data aggregation at the month level (see the sketch below)
  • Populating intermediate fact tables
  • Loading MD data marts
  • Data transfer to MDS
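
The month-level aggregation can be expressed as a single INSERT ... SELECT into an intermediate fact table; the schema below is illustrative, not the actual model:

```python
# A minimal sketch (table and column names are hypothetical) of the
# month-level aggregation into an intermediate fact table.
MONTHLY_AGG_SQL = """
INSERT INTO intermediate.fact_sellout_monthly
    (sale_month, product_id, market_id, units, sales_value)
SELECT
    DATE_TRUNC('month', sale_date) AS sale_month,
    product_id,
    market_id,
    SUM(units)       AS units,
    SUM(sales_value) AS sales_value
FROM staging.sellout_raw
GROUP BY 1, 2, 3
"""

def load_monthly_fact(cur) -> None:
    """Populate the monthly fact table; `cur` is an open DB-API cursor,
    e.g. from the Snowflake connection in the staging sketch above."""
    cur.execute(MONTHLY_AGG_SQL)
```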
6. MDS
  • MD enrichment by the user
  • Entry of the master data required for calculations: exchange rates, units, and conversion rates
  • Launching the continuation of the data flow (see the sketch below)
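
If the continuation is an Azure Data Factory pipeline, it could be launched programmatically once MDS entry is complete; the subscription, resource group, factory, and pipeline names below are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

# A minimal sketch (resource names are hypothetical): start the
# downstream ADF pipeline after the user finishes master data entry.
client = DataFactoryManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-guid>",
)
run = client.pipelines.create_run(
    resource_group_name="rg-market-share",
    factory_name="adf-market-share",
    pipeline_name="pl_continue_after_mds",
)
print(f"Started pipeline run {run.run_id}")
```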
7. DWH Loading
  • Calculation and loading of data marts from the fact tables and user-entered MDS data
  • Logging each load, together with any errors that occurred and their causes (see the sketch below)
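
A sketch of the error-aware load logging, assuming a hypothetical etl.load_log table and a DB-API cursor:

```python
# A minimal sketch (the etl.load_log table and its columns are
# hypothetical): run each data-mart load and record the outcome,
# including the failure reason, in a load log.
def run_mart_load(cur, mart_name: str, mart_sql: str) -> None:
    try:
        cur.execute(mart_sql)
        cur.execute(
            "INSERT INTO etl.load_log (mart, status, reason) VALUES (%s, 'OK', NULL)",
            (mart_name,),
        )
    except Exception as exc:
        cur.execute(
            "INSERT INTO etl.load_log (mart, status, reason) VALUES (%s, 'ERROR', %s)",
            (mart_name, str(exc)),
        )
        raise
```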
8. Power BI
  • Power BI dataset refresh (see the sketch below)
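
The refresh can be triggered through the Power BI REST API once the marts are loaded; the workspace and dataset IDs and the token acquisition are placeholders:

```python
import requests

# A minimal sketch (IDs and token are hypothetical): queue a Power BI
# dataset refresh through the REST API after the DWH load completes.
GROUP_ID = "<workspace-guid>"
DATASET_ID = "<dataset-guid>"
token = "<aad-access-token>"  # e.g. obtained via azure-identity

resp = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/groups/{GROUP_ID}"
    f"/datasets/{DATASET_ID}/refreshes",
    headers={"Authorization": f"Bearer {token}"},
    json={"notifyOption": "MailOnFailure"},
    timeout=30,
)
resp.raise_for_status()  # 202 Accepted means the refresh was queued
```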