Bike Sharing

A machine learning analysis of bike-sharing demand patterns using the UCI Bike Sharing Dataset, with Gradient Boosting reaching an R² of about 0.9.

The Challenge

Bike sharing systems struggle with inventory imbalances. Stations run empty during peak demand while others have idle bikes. Operators need demand forecasting to optimize bike distribution.

The Solution

Analyzed the hourly rental records in the UCI Bike Sharing Dataset and compared four regression models. Gradient Boosting Regressor performed best (R² ≈ 0.9), likely because it captures non-linear interactions between features.
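A comparison like this can be sketched in a few lines of scikit-learn. The snippet below is an illustration, not the project's notebook code: it uses synthetic data in place of the UCI file, and the helper names (make_synthetic_hourly_data, compare_models), the feature set, and the default hyperparameters are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor


def make_synthetic_hourly_data(n_rows=2000, seed=0):
    """Hypothetical stand-in for the real dataset: demand peaks near 17:00."""
    rng = np.random.default_rng(seed)
    hour = rng.integers(0, 24, size=n_rows)
    temp = rng.uniform(0.0, 1.0, size=n_rows)
    workingday = rng.integers(0, 2, size=n_rows)
    peak = np.exp(-((hour - 17) ** 2) / 8.0)
    demand = 400 * peak * (0.5 + temp) + 40 * workingday
    demand = demand + rng.normal(0, 15, size=n_rows)
    X = np.column_stack([hour, temp, workingday])
    return X, demand


def compare_models(X, y, seed=0):
    """Fit four regressors on one split and return their test-set R² scores."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    models = {
        "Ridge": Ridge(),
        "Random Forest": RandomForestRegressor(random_state=seed),
        "Gradient Boosting": GradientBoostingRegressor(random_state=seed),
        "KNN": KNeighborsRegressor(),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = r2_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    X, y = make_synthetic_hourly_data()
    for name, score in compare_models(X, y).items():
        print(f"{name}: R² = {score:.3f}")
```

Scores on synthetic data say nothing about the real dataset; the point is only the shape of the comparison, with every model evaluated on the same held-out split.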

Key Features

Exploratory analysis revealing peak demand at 4-5 PM (~450 rentals per hour)
Feature engineering from temporal, weather, and user-type variables
Model comparison: Ridge, Random Forest, Gradient Boosting, KNN
Gradient Boosting identified as best performer with R² ≈ 0.9
Hour of day and temperature identified as dominant predictors
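The hourly peak and the predictor ranking listed above could be obtained roughly as follows. This is a minimal sketch on synthetic data: the column names hr, temp, hum, and cnt follow the UCI hourly file, but the generated values, the helper names, and the use of impurity-based feature_importances_ are assumptions, since the page does not say how the dominant predictors were determined.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor


def make_synthetic_rentals(n_rows=4800, seed=0):
    """Hypothetical data: rentals peak at 17:00 and grow with temperature."""
    rng = np.random.default_rng(seed)
    hr = np.arange(n_rows) % 24  # every hour appears equally often
    temp = rng.uniform(0.0, 1.0, size=n_rows)
    hum = rng.uniform(0.0, 1.0, size=n_rows)
    peak = np.exp(-((hr - 17) ** 2) / 4.0)
    cnt = 400 * peak * (0.5 + temp) - 10 * hum
    cnt = cnt + rng.normal(0, 10, size=n_rows)
    return pd.DataFrame({"hr": hr, "temp": temp, "hum": hum, "cnt": cnt})


def hourly_profile(df):
    """Mean rentals for each hour of the day."""
    return df.groupby("hr")["cnt"].mean()


def rank_features(df, features=("hr", "temp", "hum"), seed=0):
    """Impurity-based importances from a fitted Gradient Boosting model."""
    cols = list(features)
    model = GradientBoostingRegressor(random_state=seed)
    model.fit(df[cols], df["cnt"])
    importances = pd.Series(model.feature_importances_, index=cols)
    return importances.sort_values(ascending=False)


if __name__ == "__main__":
    df = make_synthetic_rentals()
    print("Peak hour:", hourly_profile(df).idxmax())
    print(rank_features(df))
```

Impurity-based importances can overstate high-cardinality features, so permutation importance (sklearn.inspection.permutation_importance) is a common cross-check.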

Tech Stack

Python

Jupyter Notebook for iterative analysis and visualization

Scikit-learn

Model training with hyperparameter tuning; Gradient Boosting achieved best results
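One common way to tune a Gradient Boosting model in scikit-learn is a cross-validated grid search. The sketch below assumes that approach; the page does not state which search strategy or parameter ranges were used, so the grid, the synthetic data, and the helper names are illustrative only.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV


def make_synthetic_data(n_rows=600, seed=0):
    """Hypothetical stand-in data with a demand peak near 17:00."""
    rng = np.random.default_rng(seed)
    hour = rng.integers(0, 24, size=n_rows)
    temp = rng.uniform(0.0, 1.0, size=n_rows)
    peak = np.exp(-((hour - 17) ** 2) / 8.0)
    demand = 400 * peak * (0.5 + temp) + rng.normal(0, 15, size=n_rows)
    return np.column_stack([hour, temp]), demand


def tune_gradient_boosting(X, y, seed=0):
    """Search a small, assumed grid using 3-fold cross-validated R²."""
    param_grid = {"n_estimators": [50, 100], "max_depth": [2, 3]}
    search = GridSearchCV(
        GradientBoostingRegressor(random_state=seed),
        param_grid,
        scoring="r2",
        cv=3,
    )
    search.fit(X, y)
    return search


if __name__ == "__main__":
    X, y = make_synthetic_data()
    search = tune_gradient_boosting(X, y)
    print(search.best_params_, round(search.best_score_, 3))
```

For hourly data in time order, a time-aware splitter such as TimeSeriesSplit may be a safer choice than the default folds.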

Pandas

Data manipulation with one-hot encoding for categorical variables
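One-hot encoding with Pandas can look like the sketch below. The season and weathersit columns exist in the UCI hourly file, but which columns the project actually encoded is an assumption here, and the four-row frame is made-up sample data.

```python
import pandas as pd


def one_hot_encode(df, categorical_columns):
    """Replace each categorical column with one 0/1 indicator column per value."""
    return pd.get_dummies(df, columns=list(categorical_columns), dtype=int)


# Made-up sample rows using column names from the UCI hourly file.
sample = pd.DataFrame(
    {
        "season": [1, 2, 3, 4],
        "weathersit": [1, 2, 1, 3],
        "temp": [0.24, 0.52, 0.76, 0.34],
        "cnt": [16, 120, 310, 45],
    }
)

encoded = one_hot_encode(sample, ["season", "weathersit"])

if __name__ == "__main__":
    print(encoded.columns.tolist())
```

For linear models such as Ridge, passing drop_first=True to get_dummies removes one redundant indicator per category; tree-based models generally do not need it.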
