VQ-VAE for Tiny ImageNet (ImageNet-200)
This repository contains a Vector Quantized Variational Autoencoder (VQ-VAE) trained on the Tiny ImageNet-200 dataset using PyTorch. It is part of an image augmentation and representation learning pipeline for generative modeling and unsupervised learning tasks.
🧠 Model Details
- Model Type: Vector Quantized Variational Autoencoder (VQ-VAE)
- Dataset: Tiny ImageNet (ImageNet-200)
- Epochs: 35
- Latent Space: Discrete codebook (vector quantization)
- Input Size: 64×64 RGB
- Loss Function: Mean Squared Error (MSE) + VQ commitment loss
- Final Training Loss: ~0.0292
- FID Score: ~102.87
- Architecture: 3-layer CNN encoder and decoder with a vector-quantization bottleneck (see the sketch below)
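The quantization bottleneck snaps each encoder output vector to its nearest codebook entry and adds a codebook and commitment term on top of the MSE reconstruction loss. Below is a minimal sketch of such a layer in the style of van den Oord et al. (2017); the `VectorQuantizer` name and the `num_codes`, `code_dim`, and `beta` values are illustrative assumptions, not the trained configuration in `models.vqvae.model`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Nearest-neighbour codebook lookup with straight-through gradients (illustrative sketch)."""

    def __init__(self, num_codes=512, code_dim=64, beta=0.25):  # assumed hyperparameters
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta  # commitment-loss weight

    def forward(self, z):  # z: (B, C, H, W) encoder output
        b, c, h, w = z.shape
        flat = z.permute(0, 2, 3, 1).reshape(-1, c)  # (B*H*W, C)
        # Squared L2 distance from each latent vector to every codebook entry.
        dist = (flat.pow(2).sum(1, keepdim=True)
                - 2 * flat @ self.codebook.weight.t()
                + self.codebook.weight.pow(2).sum(1))
        idx = dist.argmin(1)  # index of the nearest code
        q = self.codebook(idx).view(b, h, w, c).permute(0, 3, 1, 2)
        # Codebook term pulls codes toward encodings; commitment term does the reverse.
        vq_loss = F.mse_loss(q, z.detach()) + self.beta * F.mse_loss(z, q.detach())
        q = z + (q - z).detach()  # straight-through estimator for the encoder gradient
        return q, vq_loss, idx.view(b, h, w)
```

The full training objective is then the reconstruction MSE plus this `vq_loss`, matching the loss function listed above.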
📦 Files
- generator.pt — Trained VQ-VAE model weights
- loss_curve.png — Plot of training loss across 35 epochs
- fid_score.json — FID evaluation result on 1000 generated samples (see the reproduction sketch below)
- fid_real/ — 1000 real Tiny ImageNet samples used for FID
- fid_fake/ — 1000 VQ-VAE reconstructions used for FID
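The reported FID can be sanity-checked directly from the two sample folders. A minimal sketch using torchmetrics is below; it assumes torchmetrics is installed with its image extras (which pull in torch-fidelity) and that the samples are stored as PNG files. Adjust the glob if the format differs.

```python
from pathlib import Path

import numpy as np
import torch
from PIL import Image
from torchmetrics.image.fid import FrechetInceptionDistance

def load_dir(path):
    """Load every PNG in a folder as a uint8 (N, 3, H, W) tensor."""
    imgs = [Image.open(p).convert("RGB") for p in sorted(Path(path).glob("*.png"))]
    return torch.stack([torch.from_numpy(np.array(im)) for im in imgs]).permute(0, 3, 1, 2)

fid = FrechetInceptionDistance(feature=2048)  # standard Inception pool3 features
fid.update(load_dir("fid_real"), real=True)
fid.update(load_dir("fid_fake"), real=False)
print(float(fid.compute()))  # should land near the 102.87 in fid_score.json
```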
🔧 Usage
```python
import torch

from models.vqvae.model import VQVAE

# Load the trained weights on CPU; call model.to("cuda") afterwards if a GPU is available.
model = VQVAE()
model.load_state_dict(torch.load("generator.pt", map_location="cpu"))
model.eval()
```
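With the weights loaded, reconstructions come from a plain forward pass. The snippet below assumes the forward call accepts a (B, 3, 64, 64) float batch and returns the reconstruction first (possibly alongside the VQ loss); check `models.vqvae.model` for the actual return signature.

```python
# Stand-in batch of 64x64 RGB images in [0, 1]; replace with real data.
x = torch.rand(4, 3, 64, 64)
with torch.no_grad():
    out = model(x)
# Assumption: forward returns (reconstruction, vq_loss) or just the reconstruction.
recon = out[0] if isinstance(out, tuple) else out
print(recon.shape)  # expected: torch.Size([4, 3, 64, 64])
```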
Evaluation results
- FID on Tiny ImageNet (ImageNet-200): 102.87 (self-reported)