
🧠 Amazon Machine Learning Hackathon 2025: Product Price Prediction 🚀 Team AVIS. Rank: 1.6K / 82K registrations. Final SMAPE score: 51.4. The challenge involved predicting product prices from semi-structured product catalog data.


Venkat-023/Amazon_MachineLearning-Hackathon


Amazon ML Hackathon 2025 | Team AVIS | Product Price Prediction | Rank 1693 | Total Registered: 82,790 | SMAPE 51.4

Amazon Machine Learning Hackathon 2025: Product Price Prediction

This repository contains Team AVIS’s solution for the Unstop ML Hackathon 2025, where the task was to predict product prices from structured and unstructured catalog data.

🏆 Team Name: AVIS

📈 Final SMAPE: 51.4

🥇 Leaderboard Rank: #1693

⚙️ Frameworks: LightGBM + Sentence Transformers (MiniLM-L6-v2)

🧠 Hardware: GPU-accelerated training on Google Colab

Project Overview

The goal was to build a regression model that accurately predicts product prices using both structured features (brand, quantity, unit) and unstructured text (titles, bullet points, and product descriptions).
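As a rough illustration of how the quantity and unit features mentioned above could be put on a common scale, here is a minimal sketch. The regular expression, the `UNIT_ALIASES` table, and the choice of grams, millilitres, and counts as canonical units are all assumptions made for this example; the README does not describe the team's actual parsing rules.

```python
import re

# Hypothetical alias table: unit as written -> (canonical unit, multiplier).
# These entries are illustrative assumptions, not the team's real mapping.
UNIT_ALIASES = {
    "g": ("g", 1.0),
    "kg": ("g", 1000.0),
    "oz": ("g", 28.3495),
    "lb": ("g", 453.592),
    "ml": ("ml", 1.0),
    "l": ("ml", 1000.0),
    "ct": ("count", 1.0),
    "count": ("count", 1.0),
}

# A number (optionally with decimals) followed by a run of letters.
QUANTITY_RE = re.compile(r"(\d+(?:\.\d+)?)\s*([a-zA-Z]+)")


def normalize_quantity(text):
    """Return (value, canonical_unit) for the first recognized quantity, else None."""
    for match in QUANTITY_RE.finditer(text):
        value = float(match.group(1))
        unit = match.group(2).lower()
        if unit in UNIT_ALIASES:
            canonical, factor = UNIT_ALIASES[unit]
            return value * factor, canonical
    return None


print(normalize_quantity("Organic Green Tea, 2 kg"))  # (2000.0, 'g')
print(normalize_quantity("Vitamin C tablets, 60 ct"))  # (60.0, 'count')
print(normalize_quantity("Gift card"))  # None
```

Mapping every weight to grams and every volume to millilitres gives the regressor one numeric column per physical dimension instead of many incomparable unit strings.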

| Component | Description |
| --- | --- |
| Text Encoder | SentenceTransformer (all-MiniLM-L6-v2) |
| Model | LightGBM (GPU, regression_l1 objective) |
| Metric | SMAPE (Symmetric Mean Absolute Percentage Error) |
| Optimization | Early stopping, feature scaling, lemmatization, unit normalization |

Model Architecture

Text Cleaning: remove emojis, punctuation, and stopwords

Embedding Generation: use SentenceTransformer (MiniLM-L6-v2)

Feature Fusion: combine embeddings + categorical + numeric features

Training: GPU-based LightGBM regressor

Evaluation: SMAPE metric
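The first and last steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the team's code: the stopword list is a tiny stand-in for NLTK's English corpus, dropping every non-ASCII character is a cruder substitute for the `emoji` package, and the SMAPE variant (mean of the two absolute values in the denominator, reported in percent) is the common definition rather than one confirmed by the README.

```python
import string

# Tiny illustrative stopword list (assumption); NLTK's corpus is far larger.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "with", "in", "on", "to"}

# Translation table that turns every ASCII punctuation mark into a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_text(text):
    """Lowercase, then drop non-ASCII symbols (emojis included), punctuation, stopwords."""
    ascii_only = text.encode("ascii", "ignore").decode("ascii").lower()
    tokens = ascii_only.translate(PUNCT_TO_SPACE).split()
    return " ".join(tok for tok in tokens if tok not in STOPWORDS)


def smape(y_true, y_pred):
    """SMAPE in percent, ranging from 0 (perfect) to 200 (worst case)."""
    total = 0.0
    for true, pred in zip(y_true, y_pred):
        denom = (abs(true) + abs(pred)) / 2
        # Treat the 0/0 case as zero error instead of dividing by zero.
        total += abs(pred - true) / denom if denom else 0.0
    return 100.0 * total / len(y_true)


print(clean_text("The BEST Coffee ☕, 12-Pack!"))  # best coffee 12 pack
print(smape([50.0], [50.0]))  # 0.0
print(smape([100.0], [0.0]))  # 200.0
```

On this scale, a score near 50 means the typical prediction differs from the true price by about half of their average.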

📊 Results

| Metric | Score |
| --- | --- |
| Validation SMAPE | 47.43 |
| Public Leaderboard SMAPE | 51.4 |
| Final Rank | #1693 / 82,790 |

🧩 Tech Stack

🐍 Python 3.12

💡 LightGBM (GPU)

🤖 SentenceTransformers (MiniLM-L6-v2)

🧹 NLTK for text preprocessing

📦 scikit-learn, pandas, numpy, joblib

🧪 How to Run

1️⃣ Install dependencies

!pip install -q lightgbm sentence-transformers emoji nltk

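2️⃣ Run the training notebook (open it in Colab or Jupyter, or execute it from the command line)

jupyter nbconvert --to notebook --execute amazon_price_prediction.ipynb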

3️⃣ Predict on new data

python inference_script.py --input test.csv --output predicted_prices.csv
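The repository's `inference_script.py` is not reproduced here. As a hedged illustration of the command-line shape used in step 3, a script accepting `--input` and `--output` could be organized as below. `predict_price` is a placeholder, and the `sample_id` and `price` column names are assumptions; the real script presumably loads the trained model and uses the competition's actual schema.

```python
import argparse
import csv


def predict_price(row):
    """Placeholder predictor (assumption); a real script would call the trained model."""
    return 0.0


def main(argv=None):
    parser = argparse.ArgumentParser(description="Predict prices for a CSV of products.")
    parser.add_argument("--input", required=True, help="CSV file with product rows")
    parser.add_argument("--output", required=True, help="CSV file to write predictions to")
    args = parser.parse_args(argv)

    with open(args.input, newline="", encoding="utf-8") as src, \
            open(args.output, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.writer(dst)
        writer.writerow(["sample_id", "price"])  # assumed output columns
        for row in reader:
            # Carry the row identifier through so predictions can be matched up.
            writer.writerow([row.get("sample_id", ""), predict_price(row)])
```

Calling `main()` under an `if __name__ == "__main__":` guard at the bottom of the file makes the sketch runnable with the command shown in step 3.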

Highlights

✅ Preprocessed 95K+ records combining structured and unstructured data

✅ Generated 384-dimensional text embeddings using MiniLM

✅ Optimized LightGBM with GPU acceleration

✅ Achieved SMAPE 51.4, ranking #1693 of 82,790 registrations (roughly the top 2%)
