About
Highly skilled Azure and AWS Data Engineer with over 2 years of experience in designing, developing, and optimizing robust data pipelines and ETL processes. Proven ability to reduce data processing time by 25% and successfully migrate large-scale data solutions, leveraging expertise in Azure Databricks, Azure Data Factory, AWS Glue, and Apache Airflow to drive efficient data management and analytics.
Work
Cognizant Technology Solutions
AWS Data Engineer
Bangalore, Karnataka, India
Summary
Currently serving as an AWS Data Engineer, responsible for importing and managing new client data products into AWS Athena and Starburst platforms.
Highlights
Developed and optimized SQL and PySpark scripts for importing new client data products into AWS Athena and Starburst platforms, enhancing data accessibility.
Automated data workflows by implementing Apache Airflow DAGs to trigger and execute scripts on AWS Glue instances, improving operational efficiency (a minimal sketch follows this list).
Designed and created robust staging and reporting tables for new data streams, ensuring consistent and reliable data flow to business dashboards.
Proactively monitored and debugged AWS Glue jobs, including fine-tuning worker node requirements to enhance job performance and stability.
Utilized GitHub for collaborative development, ensuring version control and streamlined team coordination across data engineering initiatives.
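Illustrative only: a minimal sketch of the kind of Airflow-to-Glue automation described above. All names (DAG id, schedule, Glue job, bucket) are hypothetical; it assumes Airflow 2.4+ with the apache-airflow-providers-amazon package installed.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

with DAG(
    dag_id="client_data_import",      # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    run_import = GlueJobOperator(
        task_id="run_glue_import",
        job_name="import_client_products",            # hypothetical existing Glue job
        script_args={"--source_bucket": "s3://client-raw-data"},
        region_name="us-east-1",
        wait_for_completion=True,     # Airflow polls until the Glue run finishes
    )
```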
Cognizant Technology Solutions
Azure Data Engineer
Bangalore, Karnataka, India
Summary
As an Azure Data Engineer, led the data lake migration from Hadoop to Azure, designing and implementing data pipelines for large-scale data processing.
Highlights
Engineered and deployed robust data pipelines to ingest raw data from diverse sources (RDBMS, On-Prem) into ADLS, achieving a 25% reduction in data processing time.
Developed and implemented ETL processes for large raw datasets from ADLS into stage tables, integrating 50+ data quality checks via the IIG Framework to ensure data integrity.
Successfully migrated over 10,000 lines of consumption/gold-layer table code and complex Hadoop SAS programs into optimized Databricks Notebooks, improving data accessibility and performance.
Designed and implemented robust Delta Lake architectures utilizing external and managed tables, significantly enhancing data lake efficiency and query performance (see the sketch after this list).
Utilized Azure DevOps to facilitate seamless team collaboration, version control, and CI/CD for data engineering initiatives, improving development lifecycle efficiency.
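A minimal sketch of the managed vs. external Delta table distinction mentioned above, as it would appear in a Databricks notebook (where `spark` is predefined). Storage paths, schema, and table names are hypothetical.

```python
# Read raw data from an (assumed) ADLS Gen2 container.
df = spark.read.format("parquet").load(
    "abfss://raw@datalake.dfs.core.windows.net/sales/"
)

# Managed table: Databricks owns both metadata and data files;
# dropping the table deletes the underlying data.
df.write.format("delta").mode("overwrite").saveAsTable("gold.sales_managed")

# External table: data lives at an explicit ADLS path; dropping the
# table removes only the metadata and leaves the files intact.
(df.write.format("delta")
    .mode("overwrite")
    .option("path", "abfss://gold@datalake.dfs.core.windows.net/sales/")
    .saveAsTable("gold.sales_external"))
```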
Cognizant Technology Solutions
Intern
Bagalkot, Karnataka, India
Summary
Completed an internship focused on big data architecture, transforming airline data for dashboard visualization.
Highlights
Completed intensive training in big data architecture, PySpark, and Hive, building foundational expertise in distributed data processing and warehousing concepts.
Executed an end-to-end project to transform raw airline data using PySpark, processing and storing it in Hive tables for dashboard visualization (a minimal sketch follows this list).
Applied theoretical knowledge to practical data manipulation tasks, demonstrating early proficiency in developing data solutions for analytical consumption.
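A minimal sketch of the PySpark-to-Hive flow described above; file paths, column names, and table names are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("airline_etl")
    .enableHiveSupport()          # enables writing to Hive-managed tables
    .getOrCreate()
)

raw = spark.read.option("header", True).csv("/data/raw/airlines.csv")

# Example transformations: normalize carrier codes, flag late arrivals.
clean = (
    raw
    .withColumn("carrier", F.upper(F.col("carrier")))
    .withColumn("is_delayed", F.col("arr_delay").cast("int") > 15)
)

# Persist as a Hive table for downstream dashboard queries.
clean.write.mode("overwrite").saveAsTable("analytics.airline_flights")
```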
Education
Basaveshwar Engineering College (Autonomous)
Bachelor of Engineering
Electronics and Communication
Skills
Azure Cloud
Azure Databricks, Azure Data Factory, ADLS (Azure Data Lake Storage) Gen2, Azure Virtual Machines, Azure Key Vault.
Azure Databricks
Apache Spark Framework, PySpark, Spark SQL, Python, Unity Catalog, Delta Lakehouse, External/Managed Tables, Performance Tuning.
Azure Data Factory
Data Flows, Data Pipeline Design, Triggers, ETL/ELT Development.
AWS Cloud
Glue Jobs, Athena, S3 Storage.
Other Technical Skills
Apache Airflow, CI/CD, Data Modeling, Data Warehousing, Medallion Architecture, Data Migration, Big Data Processing, Data Quality Assurance, Metadata Management, Data Governance, Problem-Solving.