Reputation: 43
I want to fine-tune a generative diffusion model (DDPM), let's say one trained on ImageNet (NOT Stable Diffusion, which is text2img), on other data such as CelebA or CIFAR-10. I wonder two things:

1. Should I start from a class-conditional or an unconditional pre-trained model, and can one kind be fine-tuned into the other?
2. Does the image size the model was trained on matter, i.e., can a model pre-trained at one resolution be fine-tuned on images of a different size?
So far I have found some models from OpenAI trained on ImageNet in both ways, but I haven't tried that yet. Some theoretical input would be greatly appreciated.
Upvotes: 0
Views: 661
Reputation: 31
To answer your first question: training as conditional or unconditional depends on your use case. If the task you want to achieve is generation specific to classes, use a conditional model; if the task is just to generate samples without any conditioning input, use an unconditional one.
Unconditional pre-trained models can be fine-tuned to work as conditional ones, either with classifier guidance or with a classifier-free approach. I'm not too sure about taking a conditional model and fine-tuning it into an unconditional one, but it should mostly be possible.
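As a rough illustration of the classifier-free route, here is a minimal sketch in PyTorch. The toy `TinyEpsModel` and the `guided_eps` helper are made up for this answer (they don't come from any particular codebase): a learned "null" class index provides the unconditional mode, and the guidance function blends the conditional and unconditional predictions at sampling time.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEpsModel(nn.Module):
    """Toy noise-predictor standing in for a real diffusion UNet."""
    def __init__(self, channels=64, num_classes=10, emb_dim=128):
        super().__init__()
        self.time_emb = nn.Sequential(
            nn.Linear(1, emb_dim), nn.SiLU(), nn.Linear(emb_dim, emb_dim))
        # Added for conditional fine-tuning: index `num_classes` is a learned
        # "null" label, so the same network also gives unconditional outputs.
        self.class_emb = nn.Embedding(num_classes + 1, emb_dim)
        self.in_conv = nn.Conv2d(3, channels, 3, padding=1)
        self.emb_proj = nn.Linear(emb_dim, channels)
        self.out_conv = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, x, t, y):
        emb = self.time_emb(t[:, None].float()) + self.class_emb(y)
        h = self.in_conv(x) + self.emb_proj(emb)[:, :, None, None]
        return self.out_conv(F.silu(h))

def guided_eps(model, x, t, y, null_y, w=3.0):
    # Classifier-free guidance: push the conditional prediction away
    # from the unconditional one by the guidance weight w.
    eps_cond = model(x, t, y)
    eps_uncond = model(x, t, null_y)
    return eps_uncond + w * (eps_cond - eps_uncond)

model = TinyEpsModel()
x = torch.randn(4, 3, 32, 32)                     # noisy images
t = torch.randint(0, 1000, (4,))                  # diffusion timesteps
y = torch.randint(0, 10, (4,))                    # class labels
null_y = torch.full((4,), 10, dtype=torch.long)   # the null-class index
print(guided_eps(model, x, t, y, null_y).shape)   # torch.Size([4, 3, 32, 32])
```

During fine-tuning you would replace the real label with the null index some fraction of the time (around 10% is common), so the same network learns both the conditional and the unconditional prediction.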
For your second question: convolutions can handle inputs of varying spatial size, as long as the depth (number of channels) stays fixed. But a model trained on smaller images and then used on bigger ones may lack performance, since it never learned to capture detail efficiently at the larger scale. Both the convolutions and the attention blocks used in diffusion models are size-agnostic.
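Here is a minimal sketch of why that holds, again with a made-up toy module (`ConvAttnBlock`): none of the weight shapes in a conv + self-attention block depend on the spatial size, only on the channel count, so the same weights run unchanged at 32x32 and 64x64.

```python
import torch
import torch.nn as nn

class ConvAttnBlock(nn.Module):
    """Conv + self-attention, as in a diffusion UNet: weight shapes
    depend only on the channel count, not on the spatial size."""
    def __init__(self, channels=64, heads=4):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.norm = nn.GroupNorm(8, channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):
        h = self.conv(x)
        b, c, hh, ww = h.shape
        seq = self.norm(h).flatten(2).transpose(1, 2)  # (B, H*W, C)
        a, _ = self.attn(seq, seq, seq)                # attends over pixels
        return h + a.transpose(1, 2).reshape(b, c, hh, ww)

block = ConvAttnBlock()
for size in (32, 64):                    # e.g. CIFAR-10-sized vs. larger
    x = torch.randn(1, 64, size, size)
    print(size, block(x).shape)          # the same weights handle both
```

Keep in mind that self-attention cost grows with the square of the number of pixels, so even though the shapes work out, running a model trained at a small resolution on much larger images gets expensive as well as less accurate.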
Note, though, that the pre-trained weights provided for DDPM in this link include an unconditional model only for 256x256 images; all the others are conditional, so check that too.
Upvotes: 1