Based in: Doha, Qatar
Discipline: Software · Data · Strategy
Status: Graduating '26

Mohammad
Khan,
where code meets strategy.

Software, data, and strategy. Trained at Carnegie Mellon University in Qatar, shipping hybrid ML systems, database architectures, and the occasional open-source contribution to Apache Lucene.

02 About
[Photo: Mohammad Khan in Pittsburgh]

I work at the seam between code, data, and the messy parts of a business.

I'm a senior at Carnegie Mellon University in Qatar, majoring in General Studies with a minor in Business Administration. My coursework leans into machine learning, databases, and software engineering on one side, and strategy, marketing, and operations on the other, and I try to keep one foot firmly in each world.

Recent work spans a Ridge regression model for dengue fever forecasting (selected from four candidates after time-aware cross-validation), a hybrid TF-IDF + BERT classifier over 500K+ Reddit posts for mental-health query answering, a contribution to Apache Lucene's merge scheduler (mentored by Amazon engineers), and client-facing consulting projects on pricing and positioning for the Tepper side of campus.

I spent a semester on exchange in Pittsburgh, took a Global Learning Trip to Morocco, and generally enjoy work that happens at the intersection of things: disciplines, languages, geographies.

500K+ Reddit posts modeled
0.67 R² on dengue forecasting
4 languages (Py, SQL, C, R)
3 continents studied
03 Selected Work

Things I've built,
shipped, or helped design.

2024 to 2025 · 7 selected

№ 01 / Open Source

SharedMergeScheduler for Apache Lucene.

Summer contribution to Apache Lucene, mentored by Amazon engineers. Designed a shared merge scheduler that coordinates merges across multiple IndexWriters, with global executor sharing, task tracking, and graceful shutdown for multi-tenant workloads. Reviewed through GitHub PRs against Lucene's core architecture.

Java Lucene Concurrency Open Source
View the PR
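The coordination idea described above (one executor shared by several writers, per-writer task tracking, and a shutdown that waits for in-flight work) can be sketched in a few lines. This is a conceptual analog in Python, not Lucene's Java API and not the code from the PR; `SharedMergePool`, `fake_merge`, and the writer ids are hypothetical names chosen for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedMergePool:
    """Hypothetical stand-in: one worker pool shared by many writers."""

    def __init__(self, max_workers=2):
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()
        self._tasks = {}      # writer id -> list of futures it submitted
        self._closed = False

    def submit(self, writer_id, merge_fn, *args):
        """Queue a merge for one writer on the shared executor."""
        with self._lock:
            if self._closed:
                raise RuntimeError("pool is shut down")
            future = self._executor.submit(merge_fn, *args)
            self._tasks.setdefault(writer_id, []).append(future)
            return future

    def pending_for(self, writer_id):
        """Count this writer's merges that have not finished yet."""
        with self._lock:
            futures = self._tasks.get(writer_id, [])
            return sum(1 for f in futures if not f.done())

    def shutdown(self):
        """Refuse new work, then wait for merges already in flight."""
        with self._lock:
            self._closed = True
        self._executor.shutdown(wait=True)


def fake_merge(segment_sizes):
    # Placeholder for real merge work: just combine the sizes.
    return sum(segment_sizes)


pool = SharedMergePool(max_workers=2)
f1 = pool.submit("writer-a", fake_merge, [1, 2, 3])
f2 = pool.submit("writer-b", fake_merge, [10, 20])
pool.shutdown()
results = (f1.result(), f2.result())
```

Both writers draw from the same two worker threads, which is the point of sharing: the thread budget is set once for the process instead of once per writer.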
№ 02 / Data Science

Forecasting dengue fever cases from climate data.

Practical Data Science (67-364) with Peidi Dong. Trained and compared four regression models on 18 years of weekly dengue data from San Juan and Iquitos. Engineered seasonal, lagged, and rolling features; used TimeSeriesSplit to avoid temporal leakage. Ridge selected as final model (RMSE 6.89, R² 0.67).

Python scikit-learn Time Series Ridge
Full report
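To make "lagged and rolling features" and "avoiding temporal leakage" concrete, here is a minimal standard-library sketch. It is not the project's code: the case counts are made up, the helper names are hypothetical, and the split function is a hand-rolled expanding-window scheme in the same spirit as scikit-learn's TimeSeriesSplit rather than a call to it.

```python
def add_lag_and_rolling(cases, lag=1, window=3):
    """Build ((lagged, rolling_mean), target) rows from past values only."""
    rows = []
    start = max(lag, window)
    for t in range(start, len(cases)):
        lagged = cases[t - lag]
        # The slice stops at t, so the current week's value is never included.
        rolling_mean = sum(cases[t - window:t]) / window
        rows.append(((lagged, rolling_mean), cases[t]))
    return rows


def expanding_window_splits(n_samples, n_splits):
    """Yield (train_idx, test_idx); every test fold follows its train fold."""
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_end = n_samples - (n_splits - k + 1) * fold
        train_idx = list(range(train_end))
        test_idx = list(range(train_end, train_end + fold))
        yield train_idx, test_idx


# Made-up weekly counts, for illustration only.
weekly_cases = [4, 5, 4, 3, 6, 2, 4, 5, 10, 6, 8, 2]
rows = add_lag_and_rolling(weekly_cases, lag=1, window=3)
splits = list(expanding_window_splits(len(rows), n_splits=3))
```

Each fold trains on everything before a cutoff and tests on the weeks right after it, so a model is never scored on data older than what it was fitted on.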
№ 03 / Machine Learning

Query answering for mental-health posts.

Trained logistic-regression, random-forest, and XGBoost classifiers on 500K+ Reddit posts using a hybrid TF-IDF + BERT embedding pipeline. Built a concern-level scoring system from rule-based logic and model confidence. Presented to faculty.

Python BERT XGBoost NLP
Case study
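One way a concern-level score could combine rule-based logic with model confidence is a weighted blend followed by thresholds. The sketch below is an assumption about the general shape only: the rule terms, weights, thresholds, and function names are all hypothetical, and the project's actual rules are not described here.

```python
# Hypothetical rule terms and weights, for illustration only.
RULE_TERMS = {"hopeless": 0.4, "panic": 0.3, "can't sleep": 0.2}


def rule_score(text):
    """Sum the weights of matched terms, capped at 1.0."""
    lowered = text.lower()
    total = sum(weight for term, weight in RULE_TERMS.items() if term in lowered)
    return min(total, 1.0)


def concern_level(text, model_confidence, rule_weight=0.5):
    """Blend the rule score with classifier confidence, then bucket it."""
    if not 0.0 <= model_confidence <= 1.0:
        raise ValueError("model_confidence must be in [0, 1]")
    score = rule_weight * rule_score(text) + (1 - rule_weight) * model_confidence
    # Assumed cut-offs; real thresholds would be tuned on labeled data.
    if score >= 0.6:
        return "high", score
    if score >= 0.3:
        return "medium", score
    return "low", score


flagged = concern_level("I feel hopeless and I can't sleep", model_confidence=0.9)
routine = concern_level("Any tips for exam week?", model_confidence=0.2)
```

Keeping the rule score and the model confidence as separate inputs makes it easy to see which signal pushed a post into a higher bucket.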
№ 04 / Databases

14-table PostgreSQL system for a loyalty app.

Designed inheritance relationships, foreign-key constraints, triggers, and functional-dependency analysis to BCNF. Wrote 10 Python user stories with JOINs, aggregations, and window functions.

PostgreSQL Python BCNF
Schema and queries
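A user story that combines a JOIN, an aggregation, and a window function might look like the sketch below. It uses Python's built-in sqlite3 as a stand-in so the example runs without a PostgreSQL server; the two tables, their columns, and the sample rows are hypothetical and are not taken from the 14-table schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE purchase (
    purchase_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    points      INTEGER NOT NULL
);
INSERT INTO customer VALUES (1, 'Amal'), (2, 'Bilal'), (3, 'Chen');
INSERT INTO purchase (customer_id, points) VALUES
    (1, 50), (1, 30), (2, 120), (3, 20), (3, 10);
""")

# User story (hypothetical): "As a manager, I want a points leaderboard."
query = """
WITH totals AS (
    SELECT c.name AS name, SUM(p.points) AS total_points
    FROM customer AS c
    JOIN purchase AS p ON p.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name
)
SELECT name,
       total_points,
       RANK() OVER (ORDER BY total_points DESC) AS points_rank
FROM totals
ORDER BY points_rank;
"""
leaderboard = conn.execute(query).fetchall()
conn.close()
```

The query text itself uses only standard SQL, so the same statement should run on PostgreSQL against equivalent tables.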
№ 05 / Consulting

Strategic Management client project: customer segmentation that informed the strategy.

Tepper course 70-437. Conducted behavioral analysis to identify priority shopper segments and their implications for growth. Refined scope, prioritized strategic questions against client constraints, and delivered recommendations the team could act on.

Strategy Segmentation Client-facing
Deliverable (redacted)
№ 06 / Data Analysis

Network analysis of the Deezer music platform.

Detected user communities with the Louvain algorithm and evaluated modularity for cohesion. Quantified preference diversity across stratified clusters.

NetworkX Louvain
Write-up
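Modularity, the score used above to judge community cohesion, compares the share of edges inside each community with the share expected from node degrees alone. The sketch below computes it for a toy graph with the standard library; it does not implement Louvain itself, and the graph and community labels are made up for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
```

Splitting the toy graph at the bridge scores higher than lumping every node together, which is the kind of gain Louvain searches for as it moves nodes between communities.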
№ 07 / Marketing

Marketing plan: from consumer perception to positioning.

Course 70-381. Translated consumer-perception research into segmentation and positioning direction, and recommended messaging built around authenticity as the primary value driver.

Research Positioning Branding
Plan summary

Courses at CMU-Q that shaped how I think.

Not a transcript, just the ones that keep showing up in my work. A mix of computer science, machine learning, and the Tepper and Dietrich classes that made the technical material land differently.

67-364 Practical Data Science: End-to-end data science workflow. Home of the dengue forecasting project: time-aware cross-validation, feature engineering, and model comparison.
15-288 Machine Learning in a Nutshell: Hybrid pipelines, model trade-offs, and the honest mess of real-world data. Home of the Reddit mental-health project.
15-122 Principles of Imperative Computation: Contracts, invariants, and the discipline of reasoning about code before writing it. C and memory, up close.
67-262 Database Design and Development: Normalization to BCNF, trigger design, and the loyalty-app project that became Work/№04.
67-272 Application Design and Development: Full-stack web development in Ruby on Rails. MVC architecture, authentication and authorization, forms and validations.
17-313 Foundations of Software Engineering: What good engineering practice looks like at scale: testing, reviews, and the culture around the code.
70-437 Strategic Management and Innovation: Tepper. Client-facing capstone where strategy stops being a slide deck and starts being a decision.
70-381 Marketing: Segmentation, positioning, and how consumer perception data becomes a defensible plan.
15-282 AI in Medicine: Where ML meets clinical reality: evaluation, ethics, and the gap between a model's AUC and a doctor's decision.
70-311 Organizational Behavior: The best "soft" class I took. Still the reason my team projects don't fall apart in week three.
05 Curriculum Vitae

The short version.

↓ Download PDF
Experience & Education
Fall '25

iX Lab, Information Systems Department

Carnegie Mellon University in Qatar

Helped establish the iX Lab, a research space now home to a growing body of IS faculty projects. Early-stage work focused on organizing the physical and operational setup so the lab could become what it is today.

Doha, Qatar
Summer '25

Open Source Contributor, Apache Lucene

Mentored by Amazon engineers

Developed a SharedMergeScheduler to enable multi-tenant merge coordination across IndexWriters. Refactored task logic for global executor sharing and graceful shutdown handling, reviewed via GitHub PRs against Lucene's core architecture.

Remote
Dec '25

Global Learning Trip, Morocco

CMU-Q field program

Two weeks across Casablanca, Marrakech, and Essaouira. Visits to five women's cooperatives, exposure to Amazigh craft and culture, and learning that ranged from the Atlas Mountains to the coast. The kind of trip that quietly changes how you read a room.

Morocco
Fall '24

Inventory Assistant, Information Systems Department

Carnegie Mellon University in Qatar

Part-time on-campus role. Led a full rebuild of the Information Systems department's physical inventory, from a disorganized state to a fully catalogued, photo-indexed, and labeled system. The result is an inventory the department now actually uses.

Doha, Qatar
Summer '24

Exchange Semester, CMU Pittsburgh

Carnegie Mellon University (main campus)

Cross-campus exchange from CMU-Q. Coursework across machine learning, marketing, and software engineering. A lot of walking Forbes Avenue.

Pittsburgh, USA
Graduating '26

B.S. General Studies, Minor in Business Administration

Carnegie Mellon University in Qatar

Interdisciplinary track combining computer science and ML coursework (Imperative Computation, Machine Learning, Databases, Software Engineering, AI in Medicine, Practical Data Science) with Tepper business courses (Strategic Management, Marketing, Operations, Organizational Behavior).

Doha, Qatar

Notes, essays,
and the occasional rant.

All posts →
07 Contact

Let's talk.
mhk2@andrew.cmu.edu

Email: mhk2@andrew.cmu.edu
LinkedIn: mohammad-nasimul-haque-khan
GitHub: github.com/mhk2-cmu
Location: Doha, Qatar · open to relocation