Key Responsibilities

Architect Integrated Solutions: Lead the architectural design and implementation across edge devices, cloud infrastructure, and machine learning workflows covering raw-to-gold data layers.
Build and Govern the Data Platform: Manage data ingestion, transformation, and cataloging across medallion architecture zones (Raw, Bronze, Silver, Gold), aligned with Unity Catalog for governance.
Enable Scalable ML Platform: Support ML teams through the development and maintenance of infrastructure for feature storage, model operations, deployment, and monitoring.
Edge Integration and Automation: Design secure and scalable OT-IT integration using Docker, Portainer, RabbitMQ, OPC UA, and edge-to-cloud communication practices (including IDMZ).
Monitor and Optimize Pipelines: Implement real-time monitoring for ETL and ML pipelines using tools such as Prometheus, and optimize workloads for performance and cost efficiency.
Governance and Security Compliance: Enforce data governance standards, including access control, metadata tagging, and enterprise compliance across zones using Unity Catalog and Azure-native tools.
Lead CI/CD Automation: Automate the deployment of platform components and ML workflows using Azure DevOps, GitHub Actions, and self-hosted runners in a monorepo structure.

Technical Expertise

Azure Cloud & DevOps:
Azure Data Factory (ADF) for orchestration
Azure Databricks (ADB) with ML workspace: Feature Store, Model Store
Azure Data Lake Storage (ADLS) using medallion architecture
Azure Event Hubs: topic design, consumer groups, ETL integration
Azure Stream Analytics for real-time telemetry data
Azure Key Vault, App Service, Container Registry (ACR)
Azure IoT Hub for edge device integration
Azure DevOps & GitHub Actions for CI/CD automation
GitHub self-hosted runners for workflow management

Edge and On-Prem Integration:
Edge VM deployment using Docker and Portainer
Messaging via RabbitMQ (read/write from edge)
OPC UA for PLC integration (e.g., FX Filter, NH3 Compressor)
IDMZ architecture for secure edge-to-cloud integration

Machine Learning Platform & MLOps:
End-to-end ML lifecycle: feature engineering, model training, validation, deployment
Monitoring of deployed models at high frequency (e.g., 1-minute intervals)
Cloud vs. edge deployment strategy, cadence management (weekly, monthly, quarterly)
MLflow, ADB ML workspace, monorepo structures for model code

Data Architecture & Integration:
Implementation of medallion architecture
Integration with Unity Catalog for data governance and sharing
Real-time SAP ingestion using CDC tools (e.g., Aecorsoft)
Streaming and API-based data ingestion
Template-driven ingestion and mapping using configurations
Consumption layer design for ML, BI, and operational reporting

Governance & Data Modeling:
Define and implement data governance policies
Scalable data model design for operational analytics and ML feature generation
Metadata tagging, access control, and quality enforcement across data layers