  • Customer
    The Marketing and Innovation Department of Nutricia Netherlands. Its employees constantly analyze the market and the company's place in it, looking for niches for new products and opportunities to expand the presence of existing ones.
  • Objective

    Market data comes from different sources, with different frequency and level of detail. To analyze market share and other indicators, employees had to collect this data from the different sources anew each time. Our task was to develop an algorithm and automate the process so that the client can load and process all the data "with one button", and to enrich the master data on products and market participants with the categories and values used in the company.

  • Solution
    We developed an automated pipeline that collects data from the required sources, processes it and loads it into the master data (MD) enrichment system. After the user enriches the data, the pipeline loads it into the final storage and from there into PowerBI.
  • Technology

    The solution is built on the customer's main platforms:

    Data processing:

         Informatica PowerCenter
         Python
         Linux commands

    Data Enrichment: SQL Server Master Data Services

    Front End: PowerBI

How does it work?

1
Data Sources
  • cloud databases
  • on-premise database
  • Excel files with "pretty" formatting
  • csv files
2
Python Script
  • processing Excel files with formatting
  • conversion to *.csv
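The Excel cleanup in this step can be sketched as follows. This is a minimal illustration, not the project's actual script: the function names `clean_rows` and `to_csv`, the `header_marker` parameter and the sample sheet are all hypothetical. The rows are assumed to have been read from the workbook already (for example with openpyxl's `ws.iter_rows(values_only=True)`), so the sketch itself needs only the standard library.

```python
import csv
import io


def clean_rows(rows, header_marker):
    """Turn 'pretty' sheet rows into a plain table.

    rows: iterable of tuples of cell values (None for empty cells).
    header_marker: text of the first header cell; rows above it are dropped.
    """
    cleaned = []
    header_seen = False
    last_first_cell = None
    for row in rows:
        cells = list(row)
        # Skip fully empty spacer rows.
        if all(c is None or str(c).strip() == "" for c in cells):
            continue
        if not header_seen:
            # Drop title rows until the real header row appears.
            if cells[0] == header_marker:
                header_seen = True
                cleaned.append(cells)
            continue
        # A vertically merged first column yields None after its first row,
        # so carry the last seen value forward.
        if cells[0] is None:
            cells[0] = last_first_cell
        else:
            last_first_cell = cells[0]
        cleaned.append(cells)
    return cleaned


def to_csv(rows):
    """Serialize rows to CSV text (None becomes an empty field)."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical sheet: a title row, a spacer row, a header and a merged column.
sheet = [
    ("Market report 2023", None, None),
    (None, None, None),
    ("Brand", "Month", "Sales"),
    ("A", "2023-01", 10),
    (None, "2023-02", 12),
    ("B", "2023-01", 7),
]
csv_text = to_csv(clean_rows(sheet, "Brand"))
```

Keeping the cleanup separate from the file reading means the same function can be reused for every formatted source, whatever its sheet layout.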
3
Linux Pipeline
  • Data filtering
4
Staging
  • Staging schema data load
5
Aggregation / MDS
  • Data aggregation at the month level
  • Populating Intermediate Fact Tables
  • Loading MD datamarts
  • Data transfer to MDS
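The month-level aggregation can be illustrated with a small sketch. The source does not show how this step is implemented (it most likely runs inside the customer's ETL platform), so the Python below only demonstrates the logic; the record layout `(product, date, value)` and the function name `aggregate_by_month` are assumptions.

```python
from collections import defaultdict
from datetime import date


def aggregate_by_month(records):
    """Sum values per (product, month) from (product, date, value) records."""
    totals = defaultdict(float)
    for product, day, value in records:
        # Represent the month by its first day.
        month = day.replace(day=1)
        totals[(product, month)] += value
    return dict(totals)


# Hypothetical daily records at a finer grain than the target fact table.
daily = [
    ("A", date(2023, 1, 5), 10.0),
    ("A", date(2023, 1, 20), 5.0),
    ("A", date(2023, 2, 1), 3.0),
    ("B", date(2023, 1, 31), 7.0),
]
monthly = aggregate_by_month(daily)
```

Bringing every source to the same monthly grain is what makes sources with different frequency and level of detail comparable in the intermediate fact tables.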
6
MDS
  • MD enrichment by the user
  • Entry of the MD required for calculations: exchange rates, units, conversion rates
  • Launching the continuation of the data flow
7
DWH Loading
  • Calculation and loading of data marts from fact tables and MDS user data
  • Recording the load log and any errors that occurred, with their causes
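A load log of this kind can be sketched as below. This is an assumption-laden illustration, not the project's implementation: the function `run_with_log`, the step names and the log fields are hypothetical, and the two sample steps merely simulate a successful and a failed load.

```python
from datetime import datetime, timezone


def run_with_log(steps):
    """Run named load steps in order; record status and error reason for each."""
    log = []
    for name, func in steps:
        entry = {"step": name, "started": datetime.now(timezone.utc).isoformat()}
        try:
            entry["rows"] = func()  # each step returns the number of rows loaded
            entry["status"] = "ok"
            entry["reason"] = None
        except Exception as exc:
            # Keep going after a failure, but store why the step failed.
            entry["rows"] = 0
            entry["status"] = "error"
            entry["reason"] = f"{type(exc).__name__}: {exc}"
        log.append(entry)
    return log


def load_sales_mart():
    return 120  # simulated: 120 rows loaded


def load_share_mart():
    raise ValueError("missing conversion rate for unit 'kg'")


load_log = run_with_log([
    ("sales_mart", load_sales_mart),
    ("share_mart", load_share_mart),
])
```

Storing the reason next to each failed step lets the user see, for example, that a data mart could not be calculated because a required master data value was never entered.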
8
PowerBI
  • PowerBI dataset refresh
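The source does not say how the dataset refresh is triggered. One common option is the Power BI REST API, which accepts a POST to the dataset's `refreshes` endpoint. The sketch below only builds such a request and does not send it; the function name `build_refresh_request` and the placeholder IDs and token are hypothetical, and obtaining a valid access token is out of scope here.

```python
import json
import urllib.request


def build_refresh_request(group_id, dataset_id, token):
    """Build (but do not send) a dataset refresh request for the Power BI REST API."""
    url = (
        f"https://api.powerbi.com/v1.0/myorg/groups/{group_id}"
        f"/datasets/{dataset_id}/refreshes"
    )
    body = json.dumps({"notifyOption": "NoNotification"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; real IDs and an access token are required in practice.
req = build_refresh_request("GROUP_ID", "DATASET_ID", "ACCESS_TOKEN")
# urllib.request.urlopen(req) would submit the refresh; it is not called here.
```

Triggering the refresh as the last pipeline step means the reports pick up new data only after the data marts have been fully loaded.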