Project 1

Movie recommendation project

This project builds a movie recommendation system that helps users discover films similar to ones they already like. The core idea is to suggest movies based on similarity rather than popularity, making recommendations more personalized and relevant. The system uses content-based filtering: each movie is described by features such as genres, keywords, cast, and overview, these features are converted into vectors, and cosine similarity between the vectors determines which movies are closest to one another.
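
As a rough illustration of the vectorization and cosine similarity step, here is a minimal sketch using only the Python standard library. The movie titles, the tag strings, and the function names vectorize, cosine_similarity, and recommend are all made up for this example; the actual project may well have used a library vectorizer instead of the simple word counts shown here.

```python
import math
import re
from collections import Counter

# Hypothetical toy catalogue: each movie's genres, keywords, cast, and
# overview are merged into one "tags" string. All titles and tags are made up.
MOVIES = {
    "Space Quest": "sci-fi adventure space crew alien planet",
    "Star Voyage": "sci-fi space crew alien war fleet",
    "Love in Paris": "romance comedy paris love wedding",
    "Galaxy Raiders": "sci-fi adventure space pirate planet",
}


def vectorize(text):
    """Turn text into a bag-of-words vector (token -> count)."""
    return Counter(re.findall(r"[a-z0-9-]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(title, top_n=2):
    """Return the top_n movies most similar to the given title."""
    vectors = {name: vectorize(tags) for name, tags in MOVIES.items()}
    target = vectors[title]
    scored = [
        (cosine_similarity(target, vec), name)
        for name, vec in vectors.items()
        if name != title
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_n]]


print(recommend("Space Quest"))  # ['Galaxy Raiders', 'Star Voyage']
```

Because the score depends only on shared tags, a movie with no overlapping features (here, the romance title) scores zero regardless of how popular it might be.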

The inspiration behind this project came from the overwhelming number of movies available across streaming platforms, which often makes it difficult for users to decide what to watch next. Instead of relying on ratings alone, this system helps users find movies that align closely with their preferences.

I built this project to understand how recommendation engines work in real-world applications such as Netflix and Amazon Prime. It solves the practical problem of content discovery by reducing decision fatigue and improving user experience through intelligent, data-driven recommendations.


Project 2

Email/SMS spam classification project

This project focuses on building a machine learning–based system that automatically classifies emails and SMS messages as either spam or legitimate (ham). The goal is to identify unwanted, promotional, or potentially harmful messages and filter them before they reach the user.

The classifier works by analyzing the text content of messages and converting them into numerical features using text vectorization techniques. A supervised machine learning model is then trained to recognize patterns commonly found in spam messages, such as specific words, frequency of terms, and sentence structure.
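
The description above does not name the model, so the following is only a sketch under an assumption: it uses a multinomial Naive Bayes classifier with add-one smoothing, written with the standard library, as one common choice for this kind of task. The training messages and the function names tokenize, train, and predict are invented for illustration.

```python
import math
import re
from collections import Counter

# Made-up training messages; a real project would use a labelled dataset.
TRAIN = [
    ("win a free prize now", "spam"),
    ("free cash offer click now", "spam"),
    ("claim your free reward", "spam"),
    ("are we meeting for lunch today", "ham"),
    ("please send the report by monday", "ham"),
    ("see you at the meeting tomorrow", "ham"),
]


def tokenize(text):
    """Lowercase the message and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count words per label and messages per label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab


def predict(text, model):
    """Pick the label with the highest log-probability for the message."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log((counts[token] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAIN)
print(predict("free prize click now", model))        # spam
print(predict("meeting tomorrow for lunch", model))  # ham
```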

The inspiration behind this project comes from the increasing volume of spam and phishing messages in everyday communication. Manual filtering is inefficient and unreliable, making automation a practical necessity.

I built this project to gain hands-on experience in natural language processing (NLP) and text classification while solving a real-world problem related to digital safety and communication efficiency. The final system demonstrates how machine learning can be applied to improve message filtering and reduce noise in communication channels.


Project 3

Car price prediction project

I built a complete end-to-end machine learning application that predicts used car prices from real-world market data, taking the project from raw, messy data through to a deployed, interactive web application. The foundation was a real Quikr used car dataset, which required significant cleaning and preprocessing before any modelling began: handling missing values, removing outliers, and standardising inconsistent entries for kilometres driven and price. Using pandas and NumPy throughout, I performed exploratory data analysis to understand the distribution of key variables such as car age, fuel type, brand, and usage, and to identify the features most predictive of resale price.
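
To give a feel for the kind of standardising involved, here is a small sketch in plain Python rather than pandas. The rows, the column names, and the helpers parse_number and clean_rows are hypothetical and only illustrate the idea of turning free-text entries into numbers and dropping rows that cannot be repaired.

```python
import re

# Hypothetical raw rows; the values are illustrative, not taken from the dataset.
RAW_ROWS = [
    {"name": "Maruti Suzuki Swift", "kms_driven": "45,000 kms", "price": "2,50,000"},
    {"name": "Hyundai i20", "kms_driven": "Petrol", "price": "3,10,000"},
    {"name": "Honda City", "kms_driven": "30,000 kms", "price": "Ask For Price"},
]


def parse_number(raw):
    """Return the integer embedded in a string like '45,000 kms', or None."""
    digits = re.sub(r"[^\d]", "", raw or "")
    return int(digits) if digits else None


def clean_rows(rows):
    """Keep only rows whose kilometres and price can be read as numbers."""
    cleaned = []
    for row in rows:
        kms = parse_number(row["kms_driven"])
        price = parse_number(row["price"])
        if kms is None or price is None:
            continue  # drop rows with missing or non-numeric entries
        cleaned.append({**row, "kms_driven": kms, "price": price})
    return cleaned


print(clean_rows(RAW_ROWS))
# [{'name': 'Maruti Suzuki Swift', 'kms_driven': 45000, 'price': 250000}]
```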

Once the data was clean and well understood, I engineered the relevant features and trained a Linear Regression model using scikit-learn, evaluating its performance and iterating on the feature set to improve prediction accuracy. The trained model was then serialised using pickle, making it portable and ready for integration into a production environment without needing to retrain on every request.
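
The train-then-serialise flow can be sketched as follows. To stay dependency-free, this example fits a one-feature least-squares line by hand instead of using scikit-learn's LinearRegression, and the ages, prices, and function names fit_line and predict are invented for illustration. The pickle round trip is the part that mirrors the project: the fitted parameters are saved once and reloaded without retraining.

```python
import pickle

# Made-up training data: car age in years and price in arbitrary units.
AGES = [1, 2, 3, 4, 5]
PRICES = [900, 800, 700, 600, 500]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {"slope": slope, "intercept": intercept}


def predict(model, x):
    """Apply the fitted line to a new input."""
    return model["intercept"] + model["slope"] * x


model = fit_line(AGES, PRICES)

# Serialise the trained model to bytes, then restore it as a server would at
# startup. In practice the bytes would be written to a file with pickle.dump.
blob = pickle.dumps(model)
restored = pickle.loads(blob)

print(predict(restored, 6))  # 400.0
```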

The deployment layer was built using Flask, a lightweight Python web framework, where I designed two routes — a home page that dynamically populates dropdown menus for car company, model, year, fuel type, and kilometres driven from the live dataset, and a prediction route that takes the user's form input, cleans and formats it, passes it through the loaded model, and returns a real-time price estimate. The front end was built with HTML and CSS using Jinja2 templating to retain selected values after submission and display the predicted price cleanly without resetting the form.
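
The logic behind the two routes can be sketched without the web framework. The functions below are plain Python stand-ins, not Flask code: home_context, handle_predict, the field names, and the sample rows are all hypothetical. In the real application the first result would be passed to the template for the dropdowns and the second would be rendered back into the same form.

```python
# Hypothetical cleaned rows standing in for the dataset loaded at startup.
CARS = [
    {"company": "Maruti", "model": "Maruti Swift", "year": 2015, "fuel_type": "Petrol"},
    {"company": "Hyundai", "model": "Hyundai i20", "year": 2017, "fuel_type": "Diesel"},
    {"company": "Maruti", "model": "Maruti Alto", "year": 2012, "fuel_type": "Petrol"},
]


def home_context(rows):
    """Build the dropdown options for the home page from the dataset."""
    return {
        "companies": sorted({r["company"] for r in rows}),
        "models": sorted({r["model"] for r in rows}),
        "years": sorted({r["year"] for r in rows}, reverse=True),
        "fuel_types": sorted({r["fuel_type"] for r in rows}),
    }


def handle_predict(form, predict_fn):
    """Clean the submitted form values and pass them to the model."""
    features = {
        "company": form["company"].strip(),
        "model": form["model"].strip(),
        "year": int(form["year"]),
        "fuel_type": form["fuel_type"].strip(),
        "kms_driven": int(form["kms_driven"].replace(",", "")),
    }
    price = predict_fn(features)
    # Returning the selected values lets the template keep the form filled in.
    return {"selected": features, "prediction": round(price, 2)}


def stub_model(features):
    """Placeholder for the loaded model; the formula is arbitrary."""
    return 500000 - 2 * features["kms_driven"]


form = {"company": " Maruti ", "model": "Maruti Swift", "year": "2015",
        "fuel_type": "Petrol", "kms_driven": "45,000"}
result = handle_predict(form, stub_model)
print(home_context(CARS)["companies"])  # ['Hyundai', 'Maruti']
print(result["prediction"])             # 410000
```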

The project demonstrates the kind of end-to-end ownership that separates practical data science from notebook-only work — covering data engineering, statistical modelling, software development, and web deployment within a single cohesive pipeline.

Tech stack: Python · pandas · NumPy · scikit-learn · Flask · pickle · Jinja2 · HTML/CSS


Project 4

Currency converter chatbot

This project focuses on building a currency converter chatbot that can perform real-time currency conversions while also supporting basic conversational interactions. The chatbot allows users to ask conversion-related questions in natural language, making the experience simple and user-friendly.

The system is built using Google Dialogflow for natural language understanding, which helps identify user intent and extract key parameters such as source currency, target currency, and amount. Once the intent is detected, the backend processes the request and fetches live exchange rates using a currency conversion API.
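
A backend handler for this flow might look roughly like the sketch below. It assumes a Dialogflow ES style webhook, where the extracted parameters arrive under queryResult.parameters and the reply text is returned as fulfillmentText. The parameter names amount, source_currency, and target_currency are hypothetical (they depend on how the agent is configured), and get_rate uses fixed, made-up rates in place of the live currency API call so that the example runs offline.

```python
# Made-up rates used only so the sketch runs without network access.
DEMO_RATES = {("USD", "INR"): 80.0, ("EUR", "USD"): 1.1}


def get_rate(source, target):
    """Stand-in for the call to a live exchange-rate API."""
    return DEMO_RATES[(source, target)]


def handle_webhook(request_json, rate_lookup=get_rate):
    """Read the extracted parameters and build the chatbot's reply."""
    params = request_json["queryResult"]["parameters"]
    amount = float(params["amount"])
    source = params["source_currency"].upper()
    target = params["target_currency"].upper()
    converted = amount * rate_lookup(source, target)
    return {
        "fulfillmentText": f"{amount:.2f} {source} is {converted:.2f} {target}"
    }


request_json = {
    "queryResult": {
        "parameters": {
            "amount": 10,
            "source_currency": "usd",
            "target_currency": "inr",
        }
    }
}
print(handle_webhook(request_json)["fulfillmentText"])
# 10.00 USD is 800.00 INR
```

Passing the rate lookup in as an argument keeps the conversion logic testable without calling the external service.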

The inspiration behind this project comes from the increasing adoption of chat-based interfaces in modern applications, where users prefer conversational interactions over traditional input forms. Combining a chatbot with live financial data makes currency conversion faster and more accessible.

I built this project to gain hands-on experience in integrating conversational AI with external APIs and backend logic. It solves the real-world problem of quick currency conversion while demonstrating how chatbots can be extended beyond conversation to deliver practical, data-driven functionality.


Project 5

Retail customer behavior and shopping trends project

A company-level, end-to-end data analytics project using SQL, Python, and Power BI, delivered with a project report and a GitHub repository.

The goal of this project is to simulate a corporate-grade end-to-end data analytics workflow, demonstrating the ability to translate raw data into strategic business intelligence through the following stages:

Data Preparation, Modeling & Exploratory Data Analysis (Python): Clean and transform the raw dataset for analysis.

Data Analysis (SQL): Simulate business transactions and run queries to extract insights on customer segments, loyalty, and purchase drivers.

Visualization & Insights (Power BI): Build an interactive dashboard that highlights key patterns and trends, enabling stakeholders to make data-driven decisions.

Report and Presentation: Write a clear project report summarizing the key findings and business recommendations, and prepare a presentation that visually communicates insights and actionable recommendations to stakeholders.
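
The SQL analysis stage in the list above can be illustrated with a small segment query. This sketch runs the SQL through Python's built-in sqlite3 module against an in-memory table. The table name purchases, its columns, and the sample rows are all hypothetical and are not taken from the project's dataset or database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases ("
    "customer_id INTEGER, age_group TEXT, subscribed INTEGER, amount REAL)"
)
# Made-up sample rows for illustration only.
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        (1, "18-30", 1, 60.0),
        (2, "18-30", 0, 40.0),
        (3, "31-50", 1, 90.0),
        (4, "31-50", 1, 70.0),
        (5, "51+", 0, 30.0),
    ],
)

# Customers, average spend, and subscriber count per age segment.
QUERY = """
SELECT age_group,
       COUNT(*)        AS customers,
       AVG(amount)     AS avg_spend,
       SUM(subscribed) AS subscribers
FROM purchases
GROUP BY age_group
ORDER BY avg_spend DESC
"""

segments = conn.execute(QUERY).fetchall()
for row in segments:
    print(row)
# ('31-50', 2, 80.0, 2)
# ('18-30', 2, 50.0, 1)
# ('51+', 1, 30.0, 0)
```

A result set like this is the kind of table that would then feed a dashboard visual comparing spend and loyalty across segments.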