BDNav


At CloudZA, we specialize in crafting tailored cloud solutions that drive operational efficiency and innovation. Our collaboration with BDNav, a leading provider of analytics for retail stores, exemplifies our commitment to delivering scalable, cost-effective, and real-time data platforms.


Understanding BDNav's Challenges

BDNav's mission is to empower retail businesses with actionable insights derived from vast amounts of data. However, their existing data infrastructure was predominantly batch-oriented, leading to several challenges:

  • Delayed Data Availability:
    Batch processing introduced significant lags, delaying critical business insights.


  • High Operational Costs:
    The reliance on script-based ETL processes resulted in substantial compute expenses.


  • Data Integrity Concerns:
    The absence of primary keys in their SQL Server databases complicated upsert operations, risking data duplication and inconsistency.


  • Scalability Limitations:
    As data volumes grew, the existing architecture struggled to scale efficiently.

Crafting the Solution

To address these challenges, CloudZA designed and implemented a real-time, cost-efficient data ingestion pipeline leveraging AWS services and open-source technologies:

  • Real-Time Data Streaming:
    We deployed Debezium Server Iceberg to capture change data capture (CDC) events from BDNav's SQL Server databases. This setup streamed data changes in real time directly into Amazon S3.


  • Handling Absence of Primary Keys:
    For tables lacking natural primary keys, we implemented a surrogate key strategy. This approach facilitated efficient upsert operations, ensuring data consistency and integrity.


  • Scalable and Self-Healing Infrastructure:
    By utilising DMS Managed Service, we enabled a fault-tolerant, highly available environment, reducing the need for manual intervention and maintenance.


  • Performance Optimisation:
    Our team fine-tuned the Debezium Server Iceberg configuration, achieving an ingestion rate of approximately 1 million rows per batch every 3 minutes.


  • Monitoring and Observability:
    We integrated Amazon CloudWatch for comprehensive monitoring, setting up metrics and alerts to track ingestion performance, resource utilisation and system
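
To put the quoted ingestion rate in context, the short Python sketch below converts "approximately 1 million rows per batch every 3 minutes" into per-second, per-hour and per-day figures. It assumes the rate is sustained continuously, which the text above does not claim; the function name is purely illustrative.

```python
# Figures quoted in the text: ~1 million rows per batch, one batch every 3 minutes.
ROWS_PER_BATCH = 1_000_000
BATCH_INTERVAL_SECONDS = 3 * 60


def sustained_throughput(rows_per_batch: int, interval_seconds: int) -> dict:
    """Convert a per-batch rate into per-second, per-hour and per-day rates.

    Assumes batches arrive back to back with no idle time (an assumption).
    """
    batches_per_hour = 3600 / interval_seconds
    rows_per_hour = rows_per_batch * batches_per_hour
    return {
        "rows_per_second": rows_per_batch / interval_seconds,
        "rows_per_hour": rows_per_hour,
        "rows_per_day": rows_per_hour * 24,
    }


rates = sustained_throughput(ROWS_PER_BATCH, BATCH_INTERVAL_SECONDS)
for name, value in rates.items():
    print(f"{name}: {value:,.0f}")
```

Under that assumption the pipeline would handle roughly 5,500 rows per second, or about 20 million rows per hour.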

Lifecycle Stages of Implementation
  • 1. Ingestion and Streaming:
    Real-time data capture from SQL Server using Debezium Server Iceberg, with offset management to maintain data consistency.


  • 2. Transformation and Upsertion:
    Application of surrogate keys and direct writing into Iceberg tables on S3, eliminating the need for intermediate ETL layers.


  • 3. Analysis and Visualisation:
    Utilisation of Amazon Athena to query Iceberg tables, providing BDNav's analysts with timely and actionable insights.


  • 4. Monitoring and Performance Tuning:
    Continuous monitoring via CloudWatch, with automated scaling policies to maintain optimal performance and resource utilisation.
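
The surrogate key and upsert steps in stage 2 can be illustrated with a minimal Python sketch. It is not the configuration used in the project: the hashing scheme, the choice of identifying columns (`store_id`, `sku`, `sold_at`) and the in-memory dictionary standing in for an Iceberg table are all assumptions made for illustration. The operation codes `c`, `u`, `d` and `r` are the ones Debezium uses for create, update, delete and snapshot read events.

```python
import hashlib
import json


def surrogate_key(row: dict, key_columns: list) -> str:
    """Derive a deterministic key by hashing the values of the chosen columns.

    The columns must be stable across updates, otherwise an update would
    produce a new key instead of overwriting the existing row.
    """
    payload = json.dumps(
        [row.get(col) for col in key_columns],
        default=str,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def apply_change(table: dict, op: str, row: dict, key_columns: list) -> None:
    """Apply one CDC event to an in-memory table keyed by surrogate key."""
    key = surrogate_key(row, key_columns)
    if op == "d":
        table.pop(key, None)  # delete: remove the row if it exists
    else:
        table[key] = row      # create, update, snapshot read: upsert


# Hypothetical columns assumed to identify a row in a table without a primary key.
KEY_COLUMNS = ["store_id", "sku", "sold_at"]

table = {}
sale = {"store_id": 7, "sku": "A-100", "sold_at": "2024-01-01T10:00:00", "qty": 1}
apply_change(table, "c", sale, KEY_COLUMNS)
apply_change(table, "u", {**sale, "qty": 3}, KEY_COLUMNS)
rows_after_update = list(table.values())

apply_change(table, "d", sale, KEY_COLUMNS)
rows_after_delete = list(table.values())
```

Because the key depends only on the identifying columns, replaying the same event leaves the table unchanged, which is what makes the upsert safe against duplicates.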

Tools and Services Utilised
AWS Services:
  • Amazon S3: Scalable storage for Iceberg tables.

  • Amazon Athena: Serverless querying over Iceberg tables.

  • Amazon CloudWatch: Monitoring and logging.

Open Source Tools:
  • Debezium Server Iceberg: CDC tool for real-time data streaming.

  • Apache Iceberg: High-performance table format for large datasets.

Achieved Outcomes
  • Cost Efficiency: Eliminated the need for AWS Glue ETL jobs, significantly reducing compute costs.
  • Real-Time Insights: Enabled data availability within minutes of changes occurring in the source database, empowering BDNav to provide timely analytics to retail clients.
  • Operational Simplicity: Simplified architecture with reduced maintenance overhead, allowing BDNav's team to focus on delivering value rather than managing infrastructure.
  • Scalability: The solution seamlessly scales with data growth, ensuring consistent performance even as data volumes increase.

Conclusion

Through close collaboration with BDNav, CloudZA delivered a robust, real-time data ingestion and analytics platform tailored to the unique challenges of the retail analytics sector. By harnessing the power of AWS and open-source technologies, we transformed BDNav's data operations, enabling them to offer enhanced insights to their retail clients while optimising costs and operational efficiency.

