This project builds regression models to predict diamond prices. Accurate price estimates help sellers set competitive prices and help buyers make informed decisions. A diamond’s price depends on factors such as carat weight, cut quality, color grade, clarity, and physical dimensions. The project uses a Kaggle dataset that records these characteristics for 53,909 diamonds.
The analysis compares three regression models, addresses multicollinearity among the predictors, and guards against overfitting. Visualization, exploratory data analysis, model diagnostics, and a train-test split are used to check the models’ reliability. The selected model achieves an R-squared of 0.9848.
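
As a rough illustration of the workflow summarized above, the following minimal sketch shows a multicollinearity check using variance inflation factors (VIF), a train-test split, and an R-squared computed on held-out data. It runs on synthetic data, not the Kaggle dataset, and every name in it (`fit_ols`, `vif`, `carat`, `length_x`, `depth`) is a hypothetical stand-in rather than the project's actual code. VIF is one common diagnostic for multicollinearity; whether the project uses it is an assumption.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

def predict_ols(coef, X):
    """Predictions from coefficients returned by fit_ols."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ coef

def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on all the other columns."""
    factors = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        coef = fit_ols(others, X[:, j])
        r2_j = r_squared(X[:, j], predict_ols(coef, others))
        factors.append(1.0 / (1.0 - r2_j))
    return np.array(factors)

# Synthetic stand-in data (assumption: NOT the real diamonds dataset).
rng = np.random.default_rng(0)
n = 1000
carat = rng.uniform(0.2, 2.0, n)
length_x = 6.5 * np.cbrt(carat) + rng.normal(0, 0.05, n)  # closely tied to carat
depth = rng.normal(61.7, 1.4, n)                          # unrelated to carat
X = np.column_stack([carat, length_x, depth])
price = 4000.0 * carat + rng.normal(0, 500.0, n)

# Multicollinearity check: large VIFs flag predictors that overlap.
vifs = vif(X)

# 80/20 train-test split on shuffled row indices.
idx = rng.permutation(n)
n_train = int(0.8 * n)
train_idx, test_idx = idx[:n_train], idx[n_train:]

# Fit on the training rows only, then score on the held-out rows.
coef = fit_ols(X[train_idx], price[train_idx])
test_r2 = r_squared(price[test_idx], predict_ols(coef, X[test_idx]))

print("VIF (carat, length_x, depth):", np.round(vifs, 2))
print("Held-out R-squared:", round(float(test_r2), 4))
```

In this sketch the carat and length columns receive large VIFs because one is nearly a function of the other, while the depth column stays close to 1. Scoring R-squared on rows the model never saw during fitting is what lets the split reveal overfitting.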