Vishal Verma

About Me

I am a Research Engineer and ML Leader specializing in causal inference, distributed ML systems, and LLM infrastructure. With over 10 years of experience spanning industry and academia, I drive applied AI innovation at the intersection of rigorous research and scalable engineering.

My research focuses on causal representation learning—uncovering latent causal structures from observational data to enable more robust, interpretable, and trustworthy machine learning systems. I am particularly interested in how causal abstractions can improve generalization, fairness, and out-of-distribution robustness in modern ML pipelines.

Currently, I am advised by Prof. Kun Zhang (CMU/MBZUAI) and Prof. Zhijing Jin (Max Planck Institute), working on the foundations of causal discovery and representation learning. Previously, I led data science initiatives at Dream11 as Lead Data Scientist, where I built production ML systems serving millions of users.

News

Oct 2025

Open-Sourced Causal Agent Library

Released a fully automated causal effect estimation library powered by large language models.

Sep 2025

Paper Accepted at NeurIPS 2025

"Causal AI Scientist: Facilitating Causal Data Science with Large Language Models" accepted at a NeurIPS 2025 workshop.

Aug 2025

Paper Accepted at COLM 2025

"Causal AI Scientist: Facilitating Causal Data Science with Large Language Models" accepted at the Conference on Language Modeling (COLM 2025).

Sep 2023

Contributed to EconML Open Source

Contributed distributed computing capabilities that scale the OrthoLearner algorithms in Microsoft's EconML library using the Ray framework.

Aug 2023

Two Papers Accepted at ACM AIMLSystems 2023

Papers on the FENCE fraud detection system and distributed causal algorithms accepted at the ACM AIMLSystems conference.

Skills & Expertise

Languages
ML/AI
Frameworks
MLOps & Cloud
Data & Monitoring
Certifications

Work Experience

Jan 2025 – Present

Research Engineer

CMU-CLeaR Group & MBZUAI (Causal ML), Pittsburgh, PA
  • Architected and launched a causal discovery platform, reducing the hypothesis-to-result cycle by 50% and improving adoption by 35%
  • Co-led development of the Causal for Education platform with MBZUAI and KBZA, implementing causal graph algorithms for student performance prediction and academic pathway simulation
  • Guided 40+ M.S. students through the complete ML development lifecycle and the building of distributed systems

Sep 2019 – Jan 2025

Lead Machine Learning Engineer

Dream11, Mumbai, India
  • Directed a cross-functional team of 4+ engineers to build an AI-Coach system using research agents and LLMs, delivering actionable player analysis and strategic match guidance
  • Achieved 150% faster runtime by designing and building a scalable causal inference platform that runs existing causal inference algorithms in a distributed manner
  • Applied uplift modeling techniques to minimize churn rates and improve promotional ROI through intelligent ranking algorithms
  • Achieved 200% faster root-cause identification by implementing an RCA detection algorithm based on causal discovery
  • Reduced resource under-utilization by 60% by building an LSTM-based concurrency prediction model so that app services are provisioned for anticipated user demand
  • Designed and built scalable real-time and batch fraud detection ML systems using a connected-component algorithm and a graph database, making detection and blocking up to 150% faster

Dec 2017 – Mar 2019

Machine Learning Engineer

Kohl's (consultant), San Jose, CA
  • Collaborated with Kohl's data science team to develop a GCP-based data science framework that uses machine learning models (e.g., logistic regression, random forest) to analyze customer affinity for product signals; the work reached over 50 million U.S. customers through better recommendations, discount optimization, and an improved search experience

Jun 2016 – Sep 2019

Sr. Data Engineer

Exadatum Software Services Pvt. Ltd, Pune, India
  • Developed a versatile, self-serve tool on Apache Spark for batch, streaming, and machine learning use cases, with data visualization support, tailored for data engineers, data scientists, and business analysts
  • Built a data quality framework covering ML use cases such as anomaly detection and trend detection

Publications

2023

FENCE: Fairplay Ensuring Network Chain Entity for Real-Time Multiple ID Detection at Scale In Fantasy Sports

ACM AIMLSystems'23

Upreti, Akriti; Verma, Vishal; Kothari, Kartavya; Thukral, Utkarsh

2023

Accelerating Causal Algorithms for Industrial-scale Data: A Distributed Computing Approach with Ray Framework

ACM AIMLSystems'23

Verma, Vishal; Reddy, Vinod; Ravi, Jaiprakash

Get In Touch

Location

Pittsburgh, Pennsylvania