Data Engineering

for a Trust and Corporate Management Company

Challenge

The client approached us to implement a data engineering platform that collates and transforms data from multiple ERPs, with the following objectives:

Unique Proposition

1. A thin, scalable data engineering layer
2. Designed for the cloud, with pay-per-use pricing to minimize cost
3. Designed so that any micro-strategy can be added on top of the data architecture

The client had an idea, but they needed a technical partner:

1. Who could work in tandem with them to brainstorm, plan, design, develop, and implement a robust platform with all the features they had envisaged
2. With the capacity to scale up resources as their product grows and evolves

Architecture

Azure architecture

Technology Stack

Tech

Gitlab
Jenkins
Azure DevOps
Octopus Deploy
AWS CI/CD Pipelines

Architecture

  • Azure Databricks + Spark: slower start time, and not serverless
  • Similar to EMR, it separates compute from storage; the storage layer is Azure Blob Storage
  • The curated data lake is the staging layer before data is loaded into Azure Data Warehouse
  • Transformations are handled in Spark
  • The transformed data is then pushed into Blob storage, which syncs into Azure Data Warehouse
  • Cluster startup time has to be allowed for when running on spot instances
  • Instances are scaled on demand
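
As a rough illustration of the points above, the sketch below builds a cluster definition that autoscales and runs on spot VMs, plus the Blob storage path a Spark job would write curated data to. It is a minimal sketch, not the project's actual configuration. The storage account, container, cluster name, VM size, runtime label, and worker counts are all hypothetical. The field names follow the Databricks Clusters API as we understand it and should be checked against the current documentation before use.

```python
import json

# Hypothetical names: the storage account and container are placeholders,
# not values taken from the case study.
STORAGE_ACCOUNT = "examplestorage"
CURATED_CONTAINER = "curated"


def blob_path(container: str, account: str, relative_path: str) -> str:
    """Build a wasbs:// URI that Spark can use to read or write Blob storage."""
    cleaned = relative_path.lstrip("/")
    return f"wasbs://{container}@{account}.blob.core.windows.net/{cleaned}"


def cluster_spec(min_workers: int, max_workers: int) -> dict:
    """Return a cluster definition that scales on demand and uses spot VMs."""
    if not 1 <= min_workers <= max_workers:
        raise ValueError("expected 1 <= min_workers <= max_workers")
    return {
        "cluster_name": "erp-transform",        # hypothetical name
        "spark_version": "13.3.x-scala2.12",    # assumed runtime label
        "node_type_id": "Standard_DS3_v2",      # assumed VM size
        # Workers are added or removed between these bounds as load changes.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        "azure_attributes": {
            # Keep the first node on-demand so the cluster survives evictions.
            "first_on_demand": 1,
            # Use spot VMs, falling back to on-demand when none are available.
            "availability": "SPOT_WITH_FALLBACK_AZURE",
            # -1 means: pay up to the on-demand price, never evict on price.
            "spot_bid_max_price": -1,
        },
        # Shut the cluster down when idle so compute is only paid for in use.
        "autotermination_minutes": 20,
    }


spec = cluster_spec(2, 8)
print(json.dumps(spec, indent=2))
print(blob_path(CURATED_CONTAINER, STORAGE_ACCOUNT, "erp/ledger/"))
```

Because the data lives in Blob storage rather than on the cluster, the cluster can be terminated between runs without losing anything. The trade-off is the startup delay noted above each time a new cluster or spot node comes up.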
