
Muhammad Tauha Kashif

NUST · 2026 · 412365
Email
mkashif.bese22seecs@seecs.edu.pk
Phone
923334727249
LinkedIn
https://www.linkedin.com/in/muhammad-tauha-
GitHub

Academic

Program
CGPA
2.64
Year
2026
Education
Bachelor's in Software Engineering, School of Electrical Engineering and Computer Science (SEECS), Islamabad (2022)
Address
House 25, Block 16, Sector B-I, Township, Lahore, Pakistan
DOB

Career

Current role
Target role
Skills
PROFESSIONAL PROFILE
Fresh Software Engineering graduate passionate about data engineering, with hands-on experience in AWS and GCP pipelines, real-time processing, and dashboarding. Focused on developing efficient data processing systems that transform raw data into actionable insights. Experienced in batch data processing, cloud infrastructure optimization, and building analytics dashboards that drive business decisions.

EDUCATION
Bachelor's in Software Engineering, School of Electrical Engineering and Computer Science (SEECS), Islamabad (2022)

INTERNSHIP EXPERIENCE
Software Productivity Strategists (SPS) Inc. (01-Jul-2025 - 22-Sep-2025)
DevOps/DataOps Intern
• Shipped CI/CD for containerized data services with Jenkins (multibranch) and reproducible Docker builds; gated quality via pre-commit (Black, Ruff).
• Instrumented ETL containers and hosts with Prometheus/Grafana (cAdvisor, Node Exporter, Alertmanager); built scrape configs and dashboards for latency, errors, and capacity.
• Standardized project scaffolding (.env, .dockerignore, Makefile, READMEs) so new services go from repo to deploy with fewer steps and cleaner diffs.

Buildables (22-Jul-2025 - 18-Oct-2025)
Data Engineering Intern
• Built production-grade ETL pipelines with hash-based change detection (MD5) and PostgreSQL upserts, processing incremental loads while maintaining full audit trails and execution metadata for data lineage tracking.
• Designed a star-schema data warehouse implementing SCD Type-2 for historical tracking; wrote complex analytical queries using CTEs, window functions, and joins to derive business insights from e-commerce transaction data.
• Optimized large-dataset processing by benchmarking Pandas vs Dask vs Polars on 2M+ records, achieving 10-15x performance gains through lazy evaluation and Parquet columnar storage, reducing file sizes by 75%.
• Developed a modular data quality framework with custom cleaners for standardizing inconsistent formats (dates, currencies, names), achieving 99.7% parse success across messy real-world datasets.
• Containerized the entire data stack using Docker Compose with isolated PostgreSQL instances, automated schema initialization, and health checks, ensuring reproducible deployments across environments.

FINAL YEAR PROJECT
AI Development Environment Troubleshooting Copilot
- Autonomous AI Agent System: Developed an intelligent troubleshooting copilot that automates diagnosis and resolution of development environment configuration issues (Docker, package managers, CLI toolchains) using LLM-powered workflow orchestration.
- System Profiling & Context Extraction: Built modular CLI utilities for capturing structured error traces and comprehensive system snapshots (hardware, processes, services, network, installed packages) to provide rich diagnostic context.
- Hybrid Web Architecture: Engineered a full-stack solution with a React/TypeScript frontend and FastAPI backend, featuring real-time WebSocket event streaming, a chat-based interface, and agent workflow visualization.
- RAG-Enhanced Troubleshooting Pipeline: Implemented a vector-based semantic retrieval system using ChromaDB with e5-base-v2 embeddings for context-aware error diagnosis, combined with LangGraph-based multi-stage workflow orchestration (initialization → context gathering → step generation → error resolution).
- Structured Data Pipeline: Designed a JSONL-based training data format for error normalization and context requirement detection across multiple domains (Python, Node.js, Docker, Git), enabling future fine-tuning and knowledge base expansion.
- Production-Ready Features: Integrated error recovery mechanisms, command safety verification, diff-based state

AI enrichment

Fresh Software Engineering graduate passionate about data engineering, with hands-on experience in AWS and GCP pipelines, real-time processing, and dashboarding. Focused on developing efficient data processing systems that transform raw data into actionable insights. Experienced in batch data processing, cloud infrastructure optimization, and building analytics dashboards that drive business decisions.
Status: ai_done
Provenance
Source file:
Created: 1777448793