Hamza Zaman

Data Analyst ↔ Data Scientist — SQL • Power BI • Python • ML/NLP/LLMs

Snapshot

Work authorisation

Eligible to work in the UK (open to relocation)

Location

London, United Kingdom

Email

hamzazaman04@gmail.com

LinkedIn

linkedin.com/in/hamza-zaman-data

GitHub

github.com/hamza-zaman

Tech stack for commercial work

Analyst Core

SQL (T-SQL, window funcs, CTEs)
Power BI (DAX, Power Query/M)
Excel (Power Query, Pivot, VBA basics)
Python for analysis (pandas, NumPy)

DS / ML

scikit-learn, XGBoost/LightGBM
Feature engineering, cross-validation, ROC/AUC
Time-series & forecasting
Experiment tracking (MLflow)

Modern AI / NLP

LLMs (open-source: Llama/Mistral)
RAG pipelines (LangChain/LlamaIndex)
Vector DBs (FAISS / pgvector / Chroma)
Prompt engineering, eval & guardrails

Data Eng & Ops

APIs (REST/GraphQL), ETL in Python
Azure SQL, star schemas
Git, CI/CD (GitHub Actions), Docker (basic)
Azure ML / SageMaker (exposure)

Soft skills that ship value

Stakeholder communication & expectation setting
KPI design, experimentation, and measurement
Data storytelling for non-technical audiences
Prioritisation & delivery under time constraints
Data quality, governance, and GDPR awareness
Coaching & enabling self-serve analytics

Outcomes

Reporting time: 3h → 15m (automation, Power BI + Python ETL)

Decisions 40% faster (API-powered dashboards: monday.com, GA, Hootsuite, Cvent)

£50k+ savings (spend/variance insights in Power BI)

Revenue +£10k/week (GA Trainline analysis, process fixes)

95% availability during COVID (demand forecasting)

Selected projects

Car Insurance Claim Prediction

Python, XGBoost

Gradient boosting model to estimate claim risk for personal auto policies.
Feature engineering improved premium accuracy and profitability.

London Fire Brigade Incident Analytics

Python, scikit-learn

Classification & clustering to flag false alarms; ~20% cost premium exposed.
Actions cut avoidable call-outs by ~33% (deployment planning).

Osteoporosis Fracture-Risk Prediction

Python, scikit-learn

Benchmarked KNN, Random Forest, Logistic Regression, and SVM models; KNN (k=7) achieved the highest accuracy at 88% on the balanced dataset.
Top predictive factors identified from Random Forest feature importance were BMD and Age.

Master’s Thesis — NLP Lip-Reading

TensorFlow, Seq2Seq (GRU-attention)

Optimised on 45k LRS2 sentences (TPU); phoneme-viseme features.
~3% WER and 0.92 BLEU (~+15 pp vs baseline).
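
For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between the reference and predicted sentences, divided by the number of reference words. The snippet below is a minimal illustrative sketch of that standard definition, not the thesis code; the function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word out of six reference words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
```

Because insertions count as errors, WER can exceed 1.0 when the prediction is much longer than the reference.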
