This repository contains code for a movie recommendation system using collaborative filtering and content-based filtering methods.
MyMovieRecommendation/
│
├── data/
│ ├── ml-25m/
│ │ ├── movies.csv
│ │ ├── ratings.csv
│ │ └── ... (other data files)
│
├── notebooks/
│ ├── MovieLensRecommendation.ipynb
│ └── ...
│
├── src/
│ ├── models.py
│ ├── dataset.py
│ ├── train.py
│ ├── test.py
│ └── utils.py
│
├── requirements.txt
│
└── README.md
This project combines collaborative filtering and content-based filtering to provide personalized movie recommendations. Collaborative filtering leverages user preferences and the similarities between users, while content-based filtering focuses on the attributes of the movies themselves. The goal is to recommend movies accurately by drawing on both signals.
To run the code in this project, you need Python and several libraries. You can install the required libraries using the following command:
pip install -r requirements.txt
To use the code, follow these steps (the scripts live in src/, so run the commands from that directory):
- Run the training script to train the recommendation model:
python train.py
- After training, evaluate the model using the test script:
python test.py
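The project uses the MovieLens 25M dataset, which contains about 25 million movie ratings along with movie metadata such as titles and genres. The data is distributed as CSV files (including movies.csv and ratings.csv) and is preprocessed before training to ensure data quality. You can download the dataset from the GroupLens website.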
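As a rough illustration of one preprocessing step, the sketch below turns `movies.csv` rows into multi-hot genre vectors using only the standard library. It assumes the standard MovieLens layout (a `movieId,title,genres` header, with genres separated by `|`). The function names `load_movies` and `encode_genres` are hypothetical and are not taken from `src/dataset.py`.

```python
import csv
import io


def load_movies(csv_file):
    """Read movies.csv-style rows into {movieId: [genre, ...]}.

    Assumes the MovieLens header `movieId,title,genres` with
    pipe-separated genres; `csv_file` is any open text file object.
    """
    movies = {}
    for row in csv.DictReader(csv_file):
        genres = row["genres"].split("|")
        # MovieLens uses this placeholder for movies without genre labels.
        if genres == ["(no genres listed)"]:
            genres = []
        movies[int(row["movieId"])] = genres
    return movies


def encode_genres(movies):
    """Turn genre lists into multi-hot vectors over a sorted vocabulary."""
    vocab = sorted({g for genres in movies.values() for g in genres})
    index = {g: i for i, g in enumerate(vocab)}
    vectors = {}
    for movie_id, genres in movies.items():
        vec = [0] * len(vocab)
        for g in genres:
            vec[index[g]] = 1
        vectors[movie_id] = vec
    return vocab, vectors


# Tiny in-memory sample standing in for data/ml-25m/movies.csv.
sample = io.StringIO(
    "movieId,title,genres\n"
    "1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy\n"
    "2,Jumanji (1995),Adventure|Children|Fantasy\n"
)
vocab, vectors = encode_genres(load_movies(sample))
```

For the real file, the same functions could be called with `open("data/ml-25m/movies.csv", newline="", encoding="utf-8")` in place of the in-memory sample.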
The recommendation model is built using PyTorch and consists of collaborative filtering and content-based filtering components. The model architecture includes user and movie embeddings, neural networks, and the integration of movie attributes such as genres.
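The architecture described above could look roughly like the following PyTorch sketch. It is not the code in `src/models.py`: the class name `HybridRecommender`, the layer sizes, and the choice to concatenate the three feature vectors before a small MLP are all assumptions made for illustration.

```python
import torch
from torch import nn


class HybridRecommender(nn.Module):
    """Hypothetical hybrid model: user/movie ID embeddings plus a genre vector."""

    def __init__(self, n_users, n_movies, n_genres, emb_dim=32, hidden=64):
        super().__init__()
        # Collaborative part: learned embeddings for user and movie IDs.
        self.user_emb = nn.Embedding(n_users, emb_dim)
        self.movie_emb = nn.Embedding(n_movies, emb_dim)
        # Content part: project the multi-hot genre vector to the same size.
        self.genre_proj = nn.Linear(n_genres, emb_dim)
        # Small MLP that maps the combined features to one predicted rating.
        self.mlp = nn.Sequential(
            nn.Linear(3 * emb_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, user_ids, movie_ids, genre_vecs):
        features = torch.cat(
            [
                self.user_emb(user_ids),
                self.movie_emb(movie_ids),
                self.genre_proj(genre_vecs),
            ],
            dim=-1,
        )
        return self.mlp(features).squeeze(-1)


# Toy sizes and inputs, chosen only to show the expected tensor shapes.
model = HybridRecommender(n_users=10, n_movies=20, n_genres=5)
users = torch.tensor([0, 3, 7])
movies = torch.tensor([1, 4, 19])
genres = torch.rand(3, 5).round()  # random multi-hot genre vectors
predictions = model(users, movies, genres)
```

Each call returns one predicted rating per (user, movie) pair in the batch.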
During training, the model learns to predict user ratings for movies. The hyperparameters, loss functions, and optimizers are set to optimize the model's performance.
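A minimal sketch of such a training loop is shown below. The README does not state which loss or optimizer `src/train.py` uses, so mean squared error and Adam are assumptions here, as are the stand-in model `TinyRatingModel`, the helper `train`, and the synthetic data.

```python
import torch
from torch import nn

torch.manual_seed(0)


class TinyRatingModel(nn.Module):
    """Stand-in model (dot product of embeddings), used only for this sketch."""

    def __init__(self, n_users, n_movies, dim=8):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)
        self.movie_emb = nn.Embedding(n_movies, dim)

    def forward(self, user_ids, movie_ids):
        return (self.user_emb(user_ids) * self.movie_emb(movie_ids)).sum(dim=-1)


def train(model, user_ids, movie_ids, ratings, epochs=50, lr=0.05):
    """Full-batch training; returns the loss recorded at each epoch."""
    loss_fn = nn.MSELoss()  # assumed loss for rating regression
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # assumed optimizer
    history = []
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(user_ids, movie_ids), ratings)
        loss.backward()
        optimizer.step()
        history.append(loss.item())
    return history


# Synthetic ratings in the MovieLens range [0.5, 5.0], for illustration only.
user_ids = torch.randint(0, 5, (64,))
movie_ids = torch.randint(0, 6, (64,))
ratings = torch.rand(64) * 4.5 + 0.5
history = train(TinyRatingModel(5, 6), user_ids, movie_ids, ratings)
```

The loop is full-batch for brevity. Training on the full 25M ratings would more plausibly iterate over mini-batches, for example with `torch.utils.data.DataLoader`.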
The model's performance is evaluated using several metrics, including:
- Test RMSE: Root Mean Squared Error between predicted and true ratings
- Test precision: The fraction of items predicted as positive that are actually positive
- Test recall: The fraction of actual positive items that the model identifies (the true positive rate)
- Test F1 score: The harmonic mean of precision and recall
These evaluation metrics provide insights into the model's accuracy and effectiveness in recommending movies to users.
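The metrics listed above can be computed as in the sketch below. Precision, recall, and F1 need a rule for turning ratings into positive and negative labels. The README does not say which rule `src/test.py` applies, so the `threshold` parameter (and its default of 3.5) is an assumption, as is the function name `evaluate`.

```python
import math


def evaluate(y_true, y_pred, threshold=3.5):
    """Return RMSE, precision, recall, and F1 for predicted ratings.

    A rating counts as "positive" when it is >= threshold; this
    binarization rule is an assumption, not taken from the project.
    """
    pairs = list(zip(y_true, y_pred))
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))

    # Confusion counts after thresholding both true and predicted ratings.
    tp = sum(1 for t, p in pairs if t >= threshold and p >= threshold)
    fp = sum(1 for t, p in pairs if t < threshold and p >= threshold)
    fn = sum(1 for t, p in pairs if t >= threshold and p < threshold)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"rmse": rmse, "precision": precision, "recall": recall, "f1": f1}


# Four made-up ratings: one true positive, one false positive,
# one false negative, and one true negative.
metrics = evaluate(y_true=[4.0, 2.0, 5.0, 3.0], y_pred=[4.5, 4.0, 3.0, 2.5])
```

In this toy example precision, recall, and F1 all come out to 0.5, because exactly one of the two predicted positives is correct and one of the two actual positives is found.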
Sample evaluation results on the test set:
- Test RMSE: 0.274
- Test precision: 62.306%
- Test recall: 100.000%
- Test F1 score: 76.776%
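As a consistency check on these figures, the F1 score is the harmonic mean of precision and recall, so it can be recomputed from the two reported values (written here as fractions, with P for precision and R for recall):

```latex
F_1 = \frac{2 P R}{P + R}
    = \frac{2 \times 0.62306 \times 1.00000}{0.62306 + 1.00000}
    = \frac{1.24612}{1.62306}
    \approx 0.76776
```

This matches the reported F1 score of 76.776%.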
Two plots accompany these results: the first shows the distribution of true ratings, and the second shows the distribution of prediction errors.
If you'd like to contribute to this project, you're welcome to do so! You can help by submitting issues, making pull requests, and following coding standards and guidelines detailed in the project's documentation.
This project is licensed under the MIT License. You can find the full license details in the LICENSE file.