
Dalageo/GANScapeGenerator


Landscape-Gif

Generating Landscapes Using DCGAN and StyleGAN3 🏞️

This project examines generative image models, focusing on DCGAN and StyleGAN3. It began by training a DCGAN to generate landscape images, but a decent DCGAN turned out to require a significantly larger and more diverse dataset, a carefully designed architecture, techniques for stabilizing training between the generator and discriminator, and sufficient computational resources. As a result, the project did not pursue the most powerful DCGAN architecture. Instead, it explored a pretrained variant of StyleGAN3, originally provided by NVIDIA and further fine-tuned on landscape images by Justin Pinkney.

More specifically, the model used is StyleGAN3-t LHQ 256, a StyleGAN3-t model further trained on 15 million images of various landscapes at a resolution of 256x256. Examples of initial outputs from this model are shown in the following stack of three images:

LHQExample

This model was then fine-tuned for a further 50 epochs on the Landscape Pictures dataset, and the same seeds were used to generate new landscape images after this additional training; they are displayed below. Comparing the two stacks of images suggests that the new dataset introduced greater variation in color and lighting, more detail, and possibly more geographical diversity: the landscapes in the second stack feature richer and more complex environments, ranging from detailed mountain terrain to lush, vibrant valleys.

FineTunedExample
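
Reusing seeds makes the two stacks comparable because each seed maps deterministically to one latent vector. The sketch below illustrates that mapping, following the per-seed NumPy `RandomState` convention used by the official StyleGAN3 `gen_images.py` script. The latent size of 512 (the StyleGAN3 default) and the helper name `latent_from_seed` are assumptions made for illustration, not details taken from this repository.

```python
import numpy as np

Z_DIM = 512  # assumed: default latent size of StyleGAN3 generators


def latent_from_seed(seed: int, z_dim: int = Z_DIM) -> np.ndarray:
    """Return the (1, z_dim) latent vector that a given seed maps to."""
    return np.random.RandomState(seed).randn(1, z_dim)


# The same seed always yields the same latent, so feeding it to the
# original and to the fine-tuned generator isolates the effect of training.
z_a = latent_from_seed(7)
z_b = latent_from_seed(7)
print(np.array_equal(z_a, z_b))  # True
```

Because the latent is identical for both models, any visual difference between the paired images comes from the additional training rather than from random sampling.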

Additional generated images are available in a GIF at the top of this repository, with the top three images generated by the StyleGAN3-t LHQ 256 and the bottom three by the further trained model. Below is a GIF of images generated by the provided DCGAN architecture:

DCGANExample

These images demonstrate the architecture's ability to generate landscape-like visuals, but with noticeable limitations such as lower resolution, simplified color schemes, and less realistic textures compared to the more advanced StyleGAN3 outputs. This difference highlights the benefits of using pretrained models, especially when training resources are limited.

Dataset Description

The Landscape Pictures dataset, a collection of natural landscape photos from Flickr, was used to train both the DCGAN and the StyleGAN3-t LHQ 256 model. It consists of 4,300 images covering a variety of landscape types. The categories, with the number of pictures and a brief description of each, are listed in the table below:

| Landscape Category | Number of Pictures | Description |
| --- | --- | --- |
| landscapes | 900 | General landscape pictures |
| landscapes_mountain | 900 | Pictures featuring mountain landscapes |
| landscapes_desert | 100 | Pictures of desert landscapes |
| landscapes_sea | 500 | Sea views and coastal landscapes |
| landscapes_beach | 500 | Beach scenes |
| landscapes_island | 500 | Pictures of island settings |
| landscapes_japan | 900 | Landscapes located in Japan |
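
The category sizes in the table can be cross-checked against the stated total of 4,300 images. The sketch below does that arithmetic and also shows one way to count files on disk. It assumes a hypothetical layout with one subfolder per category and a hypothetical set of image extensions; the downloaded dataset may be organized differently.

```python
from collections import Counter
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed file types

# Category sizes exactly as listed in the table.
EXPECTED_COUNTS = {
    "landscapes": 900,
    "landscapes_mountain": 900,
    "landscapes_desert": 100,
    "landscapes_sea": 500,
    "landscapes_beach": 500,
    "landscapes_island": 500,
    "landscapes_japan": 900,
}


def count_images_per_category(root: Path) -> Counter:
    """Count image files in each immediate subfolder of root (assumed layout)."""
    counts = Counter()
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[folder.name] = sum(
            1
            for f in folder.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


print(sum(EXPECTED_COUNTS.values()))  # 4300
```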

To match the capabilities of my system and of the models, the landscape images, which originally varied in resolution, were all resized to 256x256 pixels.
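
The resizing step described above could look like the following sketch. It uses Pillow and center-crops each image to a square before scaling, so that landscapes are not stretched. Whether the original preprocessing cropped or stretched the images is not stated here, so the cropping strategy, the LANCZOS filter, and the function names are all assumptions.

```python
from pathlib import Path


def center_crop_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return the (left, upper, right, lower) box of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


def resize_to_square(src: Path, dst: Path, size: int = 256) -> None:
    """Center-crop an image to a square and save it at size x size pixels."""
    # Pillow is imported lazily so center_crop_box has no third-party dependency.
    from PIL import Image

    with Image.open(src) as img:
        img = img.convert("RGB")
        img = img.crop(center_crop_box(*img.size))
        img = img.resize((size, size), Image.LANCZOS)
        img.save(dst)


print(center_crop_box(1024, 768))  # (128, 0, 896, 768)
```

Calling `resize_to_square(Path("in.jpg"), Path("out.jpg"))` on each file (the paths are placeholders) would produce a uniform 256x256 dataset.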

Setup Instructions

Local Environment Setup

  1. Clone the repository:
    git clone https://github.com/Dalageo/GANScapeGenerator.git
    
  2. Navigate to the cloned directory:
    cd GANScapeGenerator
    

For DCGAN:

  1. Open GANScapeGenerator_DCGAN.ipynb in your preferred Jupyter-compatible environment (e.g., Jupyter Notebook, VS Code, or PyCharm).

  2. Update the dataset, model, and output directory paths to match your local environment.

  3. Run the cells sequentially to reproduce the results.

For StyleGAN3:

  1. Visit the StyleGAN3 repository and follow the installation instructions.

  2. Open GANScapeGenerator_StyleGAN3.py in your preferred Python-compatible environment (e.g., VS Code or PyCharm).

  3. Select a pretrained StyleGAN3 model suitable for your dataset (e.g., from NVIDIA's NGC catalog or the fine-tuned StyleGAN3 models on Hugging Face).

  4. Update the dataset, model, and output directory paths.

  5. Run the cells sequentially to reproduce the results.
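
For orientation, the sketch below assembles a fine-tuning command for the `train.py` script of the official StyleGAN3 repository. It is not confirmed that GANScapeGenerator_StyleGAN3.py calls `train.py` this way, so treat it as an illustration only. The flags are real `train.py` options, but every value (file names, batch size, gamma) is a placeholder assumption. Since `train.py` measures training length in kimg (thousands of images) rather than epochs, the helper converts the 50 epochs over 4,300 images mentioned in this README.

```python
import shlex

DATASET_SIZE = 4300  # images in the Landscape Pictures dataset
EPOCHS = 50          # fine-tuning length stated in this README


def epochs_to_kimg(epochs: int, dataset_size: int) -> int:
    """Convert epochs to kimg (thousands of real images shown during training)."""
    return round(epochs * dataset_size / 1000)


def build_train_command(data_zip: str, resume_pkl: str, outdir: str) -> list[str]:
    """Assemble a train.py invocation; all argument values are placeholders."""
    return [
        "python", "train.py",
        f"--outdir={outdir}",
        "--cfg=stylegan3-t",
        f"--data={data_zip}",
        "--gpus=1",
        "--batch=16",  # assumed; depends on available GPU memory
        "--gamma=2",   # assumed R1 regularization weight; needs tuning
        f"--kimg={epochs_to_kimg(EPOCHS, DATASET_SIZE)}",
        f"--resume={resume_pkl}",
    ]


# Hypothetical file names, for illustration only.
cmd = build_train_command("landscapes-256x256.zip", "lhq-256.pkl", "training-runs")
print(shlex.join(cmd))
```

Building the command as a list keeps it easy to pass to `subprocess.run` from within the StyleGAN3 checkout once the placeholder values have been replaced.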

To train the models on GPU, you will need to activate GPU support based on your operating system and install the required dependencies. You can follow this guide provided by PyTorch for detailed instructions.
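
Once the dependencies are installed, a quick check like the sketch below can confirm whether PyTorch sees a CUDA GPU. The helper name `pick_device` is hypothetical; it falls back to the CPU when PyTorch is missing or no GPU is detected.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed in this environment
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Training would run on: {device}")
```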

Acknowledgments

Firstly, I would like to thank Alec Radford, Luke Metz, and Soumith Chintala for introducing DCGAN in their 2015 paper, "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks".

Additionally, special thanks to NVIDIA for providing pretrained StyleGAN3 models for educational and research purposes, and to Justin Pinkney for making available a StyleGAN3 variant pretrained on the LHQ dataset.

License

The provided fine-tuned StyleGAN3 model is licensed under the Nvidia Source Code License, the dataset under CC0 1.0 Universal, and the accompanying documentation under the AGPL-3.0 license. The AGPL-3.0 license was chosen to promote open collaboration, ensure transparency, and allow others to freely use, modify, and contribute to the work.

Any modifications or improvements must also be shared under the same license, with appropriate acknowledgment.

