This repository contains implementations of various experiments related to training a Deep Convolutional GAN (DCGAN) and evaluating it on different tasks, including image generation, latent space traversal, conditional generation, Wasserstein GAN (WGAN) training, and classifier training.
- Training a DCGAN with a chosen architecture using the standard GAN loss, along with plotting the loss curves for both the Generator and Discriminator (a training-loop sketch follows this list).
- Generating images: Plot a 10x10 grid of generated images after training the GAN.
- Varying training iterations: Vary how many times the Generator and the Discriminator are each updated per training iteration and observe the effects on image quality and loss convergence.
- FID computation: Compute the Fréchet Inception Distance (FID) by sampling 1000 data points from both the true and generated data distributions.
- Latent space traversal: Implement latent space traversal using linear and non-linear interpolation methods and visualize the results.
- Conditional generation: Implement a conditional GAN (c-GAN) to generate images by conditioning on class labels.
- Wasserstein GAN (WGAN): Modify the loss and Critic network to optimize the Wasserstein metric, and evaluate the model by plotting generated images and recomputing FID.
- Decoder network: Implement a decoder network to reconstruct the latent variables used by the GAN generator and compute a reconstruction loss. Train this decoder along with the regular GAN losses.
- MLP classifier: Use the decoder output as input to train a multi-layer perceptron (MLP) classifier and compute classification metrics (accuracy and F1 score).
- ResNet-based classifier: Fine-tune a pre-trained ResNet model (ResNet-32 or ResNet-50) and compare its classification results to those of the MLP classifier trained on decoded latents.
- ResNet on 20-class subset: Train a ResNet-based classifier on a 20-class subset of the data and report classification metrics.
- Data augmentation with c-GAN: Use the conditional GAN to generate 100 additional images per class and retrain the classifier with the augmented data. Compare results with the previous classifier.
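A minimal sketch of one training step with the standard (non-saturating) GAN loss, assuming a DCGAN-style generator `G`, a discriminator `D` that outputs one logit per image, and pre-built optimizers; this is an illustration, not the repo's exact code:

```python
import torch
import torch.nn.functional as F

def gan_step(G, D, real, opt_g, opt_d, z_dim=100):
    b, device = real.size(0), real.device

    # --- Discriminator update: push real images toward 1, fakes toward 0 ---
    z = torch.randn(b, z_dim, 1, 1, device=device)   # conv-generator latent (assumed shape)
    fake = G(z).detach()                             # detach so G gets no gradient here
    d_loss = (F.binary_cross_entropy_with_logits(D(real), torch.ones(b, 1, device=device))
              + F.binary_cross_entropy_with_logits(D(fake), torch.zeros(b, 1, device=device)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # --- Generator update: make D classify fresh fakes as real ---
    z = torch.randn(b, z_dim, 1, 1, device=device)
    g_loss = F.binary_cross_entropy_with_logits(D(G(z)), torch.ones(b, 1, device=device))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

    return d_loss.item(), g_loss.item()              # logged for the loss curves
```

For the WGAN experiment, the same loop applies with the BCE terms replaced by the difference of the Critic's mean outputs, plus weight clipping or a gradient penalty to enforce the Lipschitz constraint.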
This project implements various experiments related to training and evaluating a Variational Autoencoder (VAE) and its variants. The experiments focus on training a vanilla VAE, implementing conditional likelihood, evaluating performance on image generation and reconstruction, exploring latent space, and applying classifiers on latent vectors.
- Training a Vanilla VAE
- CNN-based Classifier
- Posterior Inference and MLP Classifier
- Beta-VAE
- Latent Space Interpolation
- Adversarial Autoencoder
- VQ-VAE with Discrete Latent Space
- Gaussian Mixture Model (GMM) on Latent Space
Implement a vanilla VAE with an MSE (Gaussian) conditional likelihood. The model is trained, and results are evaluated while varying the number of samples of z drawn during training as input to the decoder; a minimal loss sketch follows the output list. The following outputs are reported:
- 10x10 grids of reconstructions and generations
- Loss curves for likelihood, KL divergence, and combined terms
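A minimal sketch of the training objective, assuming an `encoder` that returns the posterior mean and log-variance and a `decoder` that maps z back to image space; `beta=1` recovers the vanilla VAE, and other values give the beta-VAE used below:

```python
import torch
import torch.nn.functional as F

def vae_loss(encoder, decoder, x, n_samples=1, beta=1.0):
    mu, logvar = encoder(x)                      # parameters of q(z|x)
    recon = 0.0
    for _ in range(n_samples):                   # multiple z samples per input image
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # reparameterization
        recon = recon + F.mse_loss(decoder(z), x, reduction="sum")
    recon = recon / n_samples                    # Monte Carlo estimate of the likelihood term
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())  # closed-form KL to N(0, I)
    return recon + beta * kl, recon, kl          # combined, likelihood, and KL curves
```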
A CNN-based classifier is built using the training images from the dataset. The classifier is evaluated on the test set, and classification accuracy is reported.
Using the VAE trained in the first step, posterior inference is performed on all images. The latent vectors obtained from this inference are then used to train a multi-layer perceptron (MLP) classifier. Classification performance is compared between the original CNN classifier and the MLP classifier based on latent vectors.
Implement a beta-VAE with four different values of the hyperparameter beta. The following results are observed and documented:
- 10x10 grids of generated and reconstructed images for each beta value
- Observations on how varying beta affects the results
For the VAE trained with the optimal beta, posterior inference is performed on pairs of images. Linear interpolation is done in latent space between the corresponding latent vectors, and 10 interpolated points are visualized for each of 10 different image pairs.
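A minimal interpolation sketch, assuming the same `encoder`/`decoder` interface as above and using the posterior mean as each image's latent:

```python
import torch

@torch.no_grad()
def interpolate(encoder, decoder, x1, x2, steps=10):
    z1, _ = encoder(x1.unsqueeze(0))             # posterior mean of the first image
    z2, _ = encoder(x2.unsqueeze(0))             # posterior mean of the second image
    alphas = torch.linspace(0.0, 1.0, steps)
    zs = torch.cat([(1 - a) * z1 + a * z2 for a in alphas])  # 10 points on the line
    return decoder(zs)                           # one row of the interpolation figure
```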
An adversarial autoencoder is implemented with MSE loss. The following outputs are evaluated:
- Generated Images: visual representation of images generated by the adversarial autoencoder.
A Vector Quantized VAE (VQ-VAE) is implemented with a discrete latent space. After training, posterior inference is performed on all images, and the latent vectors are used to build a classifier. The classifier's performance (accuracy) is computed.
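A minimal sketch of the quantization step at the heart of the VQ-VAE, with an assumed `(K, D)` codebook tensor; the straight-through trick lets gradients flow through the non-differentiable nearest-neighbor lookup:

```python
import torch

def quantize(z_e, codebook):                     # z_e: (B, D) encoder outputs, codebook: (K, D)
    dists = torch.cdist(z_e, codebook)           # (B, K) distances to every code
    idx = dists.argmin(dim=1)                    # nearest code index = the discrete latent
    z_q = codebook[idx]                          # (B, D) quantized vectors
    z_q = z_e + (z_q - z_e).detach()             # straight-through: copy gradients to z_e
    return z_q, idx                              # idx is what the classifier is trained on
```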
A Gaussian Mixture Model (GMM) is fit on the latent vectors obtained via posterior inference from the VAE. New latent vectors are sampled from the GMM, passed through the decoder, and a 10x10 grid of generated images is plotted.
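A minimal sketch using scikit-learn, assuming a CPU tensor `latents` of shape `(N, D)` from posterior inference and the trained `decoder`; the component count is an arbitrary choice here:

```python
import torch
from sklearn.mixture import GaussianMixture

gmm = GaussianMixture(n_components=10).fit(latents.numpy())  # component count is an assumption
z_new, _ = gmm.sample(100)                                   # 100 latents -> 10x10 grid
with torch.no_grad():
    images = decoder(torch.as_tensor(z_new, dtype=torch.float32))
```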
This project implements various experiments related to training and evaluating Denoising Diffusion Probabilistic Models (DDPMs) and related methods such as DDIM sampling. The project involves training DDPMs on both the butterfly dataset and the latent space of a VQ-VAE model, comparing different sampling procedures, and implementing advanced techniques like classifier-guided diffusion and score-based models.
- Training DDPM on the Butterfly Dataset
- Generated Images Visualization
- FID Computation
- Training DDPM on VQ-VAE Latent Space
- Conditional Generation using Classifier-Guided Diffusion
- Noise Conditional Score Network
- DDIM Sampler Comparison
- DDIM Inversion Method
- ResNet-50 on Animal Dataset
- Distillation to MLP
Train a Denoising Diffusion Probabilistic Model (DDPM) on the butterfly dataset. The following outputs are reported:
- U-Net Training Loss Curves: plot of the U-Net loss during DDPM training.
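A minimal sketch of the loss being plotted, assuming a `unet(x_t, t)` that predicts the added noise and a linear noise schedule; names and schedule values are illustrative, not the repo's exact code:

```python
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)             # linear noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)     # cumulative signal fraction

def ddpm_loss(unet, x0):
    b = x0.size(0)
    t = torch.randint(0, T, (b,), device=x0.device)   # random timestep per image
    eps = torch.randn_like(x0)
    ab = alpha_bar.to(x0.device)[t].view(b, 1, 1, 1)
    x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * eps      # forward diffusion q(x_t | x_0)
    return F.mse_loss(unet(x_t, t), eps)              # noise-prediction MSE (the plotted curve)
```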
After training the DDPM on the butterfly dataset, generate images using the model and visualize the results by plotting a 10x10 grid of the generated images.
- Generated Images: 10x10 grid of images generated by the trained DDPM.
Compute the Fréchet Inception Distance (FID) by sampling 1000 data points from both the true and generated data distributions. This provides a measure of the quality of generated images.
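A minimal sketch using the `torchmetrics` implementation, assuming `real_images` and `fake_images` are `uint8` tensors of shape `(1000, 3, H, W)`:

```python
from torchmetrics.image.fid import FrechetInceptionDistance

fid = FrechetInceptionDistance(feature=2048)   # Inception-v3 pooled features
fid.update(real_images, real=True)             # 1000 samples from the dataset
fid.update(fake_images, real=False)            # 1000 samples from the model
print(fid.compute().item())
```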
Repeat the previous experiments by training the DDPM on the latent space of the VQ-VAE trained in the previous assignment. Compute the FID score and visualize the generated images.
- FID Score: FID score computed for generated images from the DDPM trained on VQ-VAE latents.
- Generated Images: 10x10 grid of images generated by the DDPM trained on latent vectors.
Implement conditional generation using classifier-guided diffusion. This method guides the diffusion process based on class labels and is used for conditional image generation.
- Conditional Generation: Generated images conditioned on class labels.
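A minimal sketch of the guidance step, assuming a classifier trained on noisy images and the usual reverse-process `mean`/`variance` from the DDPM; following Dhariwal and Nichol, the mean is shifted by the gradient of the class log-probability:

```python
import torch
import torch.nn.functional as F

def guided_mean(classifier, x_t, t, y, mean, variance, guidance_scale=1.0):
    with torch.enable_grad():
        x = x_t.detach().requires_grad_(True)
        log_prob = F.log_softmax(classifier(x, t), dim=-1)
        selected = log_prob[torch.arange(len(y)), y].sum()
        grad = torch.autograd.grad(selected, x)[0]    # gradient of log p(y | x_t)
    return mean + guidance_scale * variance * grad    # shifted reverse-process mean
```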
Implement a Noise Conditional Score Network (NCSN) and repeat the above experiments. Compare the sampling procedures (speed and image quality) between the NCSN and DDPM.
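For reference, a minimal sketch of annealed Langevin sampling (the NCSN sampling procedure of Song and Ermon), assuming a `score_net(x, sigma_idx)` interface; the step-size constant is an arbitrary choice:

```python
import torch

@torch.no_grad()
def annealed_langevin(score_net, shape, sigmas, steps_per_sigma=100, eps=2e-5):
    x = torch.rand(shape)                          # start from uniform noise
    for i, sigma in enumerate(sigmas):             # anneal from largest to smallest sigma
        step = eps * (sigma / sigmas[-1]) ** 2     # step size rescaled per noise level
        for _ in range(steps_per_sigma):
            noise = torch.randn_like(x)
            x = x + 0.5 * step * score_net(x, i) + step ** 0.5 * noise
    return x
```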
Implement a DDIM inversion method to get the latent vectors for a pair of real images. Perform linear interpolation in the latent space between these vectors and generate images corresponding to the interpolated latents.
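A minimal sketch of the inversion, assuming the trained `unet` and the cumulative schedule `alpha_bar` from the DDPM sketch above; real implementations usually step through a strided subsequence of timesteps rather than all of them:

```python
import torch

@torch.no_grad()
def ddim_invert(unet, x0, alpha_bar):
    x = x0
    T = len(alpha_bar)
    for t in range(T - 1):                        # run the deterministic update forward in time
        ab_t, ab_next = alpha_bar[t], alpha_bar[t + 1]
        eps = unet(x, torch.full((x.size(0),), t, device=x.device))
        x0_pred = (x - (1 - ab_t).sqrt() * eps) / ab_t.sqrt()   # implied clean image
        x = ab_next.sqrt() * x0_pred + (1 - ab_next).sqrt() * eps
    return x       # approximate latent x_T; interpolate two of these, then sample back
```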
Train a ResNet-50 model on the Animal dataset and measure the accuracy on the test dataset.
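A minimal fine-tuning sketch with torchvision; `num_classes` is a placeholder for the Animal dataset's actual class count:

```python
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

num_classes = 10  # placeholder; set to the Animal dataset's class count
model = resnet50(weights=ResNet50_Weights.DEFAULT)        # ImageNet pre-trained backbone
model.fc = nn.Linear(model.fc.in_features, num_classes)   # new classification head
# Fine-tune end-to-end (or freeze the backbone) with cross-entropy as usual.
```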
Distill the ResNet-50 model into a smaller MLP using a KL divergence loss over the logits. Measure the test accuracy of the distilled MLP.
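A minimal sketch of the distillation loss, following the standard temperature-scaled formulation of Hinton et al.; the temperature value is an arbitrary choice:

```python
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, temperature=4.0):
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)       # softened targets
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    # T^2 keeps gradient magnitudes comparable across temperatures
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2
```

In practice this KD term is often mixed with a plain cross-entropy term on the ground-truth labels.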