Sr. Data Engineer (Big Data & Analytics Engineering) (Pune)
Pune, India
Posted: yesterday
- Build and maintain robust ETL/ELT pipelines for ingestion, transformation, and aggregation of large-scale datasets on Hadoop and enterprise data platforms.
- Develop high-performance data processing jobs using PySpark/Spark, Python, and SQL (including engines such as Impala where applicable).
- Partner with Product and Analytics stakeholders to translate requirements into reusable, governed data models (facts/dimensions, curated layers, and semantic-ready datasets).
- Implement and automate data quality checks, reconciliation, lineage documentation, and monitoring to ensure trust in downstream analytics and AI use cases.
- Optimize pipeline performance and cost through partitioning strategies, columnar file formats (Parquet, ORC, Delta), compute tuning, caching, and efficient query patterns.
- Contribute to CI/CD for data workflows (testing, code reviews, deployment automation), promoting engineering best practices and maintainable codebases.
- Support data governance, privacy, and security requirements (PII handling, access controls, auditability) in collaboration with platform and risk partners.
- Collaborate with data scientists to publish analysis-ready and ML-ready datasets, including feature generation and repeatable data preparation processes.
- Troubleshoot production issues, participate in on-call/operational rotations, and drive root-cause fixes to improve reliability.
- Communicate data platform capabilities, limitations, and trade-offs clearly to technical and non-technical stakeholders.
- Strong problem-solving skills with the ability to debug complex distributed data issues independently.
- Clear written and verbal communication with both technical engineers and non-technical business stakeholders.
All About You
Technical Skills & Experience
- Strong hands-on experience in data engineering, building production-grade pipelines on big data platforms (Hadoop ecosystem: HDFS, Hive, Impala, YARN, Oozie, and/or cloud data platforms).
- Proficiency in PySpark and Python and strong SQL skills across distributed and relational data stores.
- Experience with orchestration/integration tools such as Apache Airflow, Apache NiFi, Azure Data Factory, Pentaho, or Talend.
- Solid understanding of data modeling, incremental processing patterns (CDC, SCD Type 1/2), and building curated datasets for analytics and reporting.
- Experience with cloud services (Azure/AWS/GCP) for data lakes, compute, and storage is preferred.
- Proficiency in columnar and open table formats: Parquet, ORC, Delta Lake, Apache Iceberg, or Apache Hudi.
- Strong knowledge of distributed computing patterns: partitioning, bucketing, broadcast joins, shuffle optimization.
- Working knowledge of DevOps/CI-CD practices: version control (Git), automated testing, release pipelines, and observability.
- Solid problem-solving skills with the ability to debug complex data issues and communicate clearly with technical and non-technical stakeholders.
- Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.
- 5+ years of relevant experience in data engineering or big data analytics engineering (flexible based on depth of expertise).
GenAI / LLM Data Enablement (Preferred)
- Experience preparing curated, governed datasets (including semi-structured/unstructured) for AI/GenAI consumption with attention to privacy, quality, and reproducibility.

**Corporate Security Responsibility**
All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization and, therefore, it is expected that every person working for, or on behalf of, Mastercard is responsible for information security and must:
+ Abide by Mastercard's security policies and practices;
+ Ensure the confidentiality and integrity of the information being accessed;
+ Report any suspected information security violation or breach; and
+ Complete all periodic mandatory security trainings in accordance with Mastercard's guidelines.

Apply on Kit Job: kitjob.in/job/4n8q38
Company name: Mastercard
Job position: Sr. Data Engineer (Big Data & Analytics Engineering) (Pune)
Sr. Data Engineer (Big Data & Analytics Engineering) (Pune) has been posted in the Pune Information Technology category on Locanto.