👋

Hi, I am Raj Kumar, a Cloud, AI & Data Engineer with 7+ years of experience. I architect secure cloud foundations and build intelligent data solutions, bridging the gap between robust data engineering and generative AI.

About me

I'm a Cloud, AI & Data Engineer with 7+ years of experience building secure, scalable data platforms on Microsoft Azure and AWS. I specialize in turning raw, complex data into trusted, analytics-ready assets that enterprises can act on with confidence.

My core strength is end-to-end data engineering. I design and orchestrate pipelines using Azure Data Factory, Databricks, Synapse, and Microsoft Fabric, applying Medallion Architecture and Delta Lake to build lakehouses that are clean, performant, and governed. My daily toolkit includes PySpark, Spark SQL, Python, and SQL — paired with Snowflake for multi-cloud flexibility and Terraform for infrastructure-as-code deployments. I've built systems that handle real-time streaming with Kafka and deliver business insights through Power BI.

What sets me apart is that I don't stop at pipelines. I extend data platforms into intelligent systems using Azure AI Foundry and RAG-based architectures, building production-grade GenAI solutions that sit on top of the reliable data foundations I've already engineered.

I recently completed my Master's in Computer Science at Concordia University Chicago (Dec 2025). I'm open to relocation and ready to bring this full-stack data engineering perspective to a team solving hard problems at scale.

Outside of work, I'm a morning person. I enjoy playing soccer, going to the gym, watching movies, and spending time with my friends. I also love making coffee and lattes.

My projects

DisputeAI

An intelligent financial dispute resolution system using Agentic AI and RAG on Azure to automate case analysis and generate citation-backed recommendations.

  • Azure AI Foundry
  • Azure OpenAI
  • RAG
  • Azure AI Search
  • LangChain
  • Agentic AI
  • Python
View on GitHub→

Snowflake AI Sales Insights

An AI-powered sales analytics platform using Snowflake Cortex Analyst with a natural-language Streamlit chat interface for querying live sales data.

  • Snowflake
  • Cortex Analyst
  • Streamlit
  • Snowpark
  • Python
  • SQL
  • Data Science
View on GitHub→

Credit Fraud Detection

A production-grade credit card fraud detection ML pipeline with PySpark feature engineering, model training, and a Power BI monitoring dashboard.

  • Azure Databricks
  • PySpark
  • Scikit-learn
  • XGBoost
  • Azure ML
  • MLflow
  • Power BI
View on GitHub→

Bicycle Sales Analytics

An end-to-end data pipeline with Snowflake data warehousing and Sigma Computing dashboards, following Medallion architecture for bicycle sales and accessories data.

  • Snowflake
  • SQL Server
  • Docker
  • Sigma Computing
  • Star Schema
  • Medallion Architecture
View on GitHub→

Earthquake Analysis

A data engineering platform analyzing global earthquake data using Microsoft Fabric Lakehouse and Azure Databricks with interactive Power BI dashboards.

  • Microsoft Fabric
  • Azure Databricks
  • Delta Lake
  • PySpark
  • Python
  • Pandas
  • Power BI
View on GitHub→

AWS VPC & EC2 Infrastructure

A hands-on AWS networking project demonstrating custom VPC setup with public/private subnets, NAT Gateway, Bastion Host, and secure EC2 access patterns.

  • AWS VPC
  • EC2
  • NAT Gateway
  • Bastion Host
  • Security Groups
  • Route Tables
View on GitHub→

DITA

An end-to-end Azure pipeline that ingests data from an on-premises source, transforms it with data engineering tools, and analyzes it through visualization tools.

  • MS SQL Server
  • Azure Data Lake
  • Data Factory
  • Databricks
  • Synapse Analytics
  • Power BI
View on GitHub→

Product Sales Analytics

An interactive Power BI report that visualizes sales performance using the AdventureWorks database.

  • Power Query
  • Power BI
  • M language
  • DAX
View on GitHub→

Supply Chain Analytics

An end-to-end analytics pipeline on Azure Databricks processing supply chain and sales data using Medallion architecture with Delta Lake.

  • Databricks
  • PySpark
  • SQL
  • Delta Lake
  • Time Travel
  • Multi Hop
  • Unity Catalog
View on GitHub→

My skills

My experience

My education

My certifications

Microsoft Power BI Data Analyst Associate
Microsoft PL-300 - November 2024
Microsoft Azure Data Engineer Associate
Microsoft DP-203 - October 2024
AWS Developer Associate
Amazon Web Services - July 2025
Microsoft Azure AI Fundamentals
Microsoft AI-900 - June 2024
Microsoft Azure Data Fundamentals
Microsoft DP-900 - June 2024
Microsoft Azure Fundamentals
Microsoft AZ-900 - January 2024
Databricks Generative AI
Databricks - November 2024
Databricks Fundamentals
Databricks - October 2024
Azure Databricks Platform Architect
Databricks - January 2025
AWS Cloud Practitioner
Amazon Web Services - April 2025
Databricks AI Agent Fundamentals
Databricks - May 2026

Contact me

Please contact me directly at manalarajkumar.rm@gmail.com or through this form.