Feed forward VQGAN+CLIP
Generates images from text. The model takes a text prompt as input and outputs a VQGAN latent code, which is then decoded into an RGB image. Its training objective minimizes the distance between the CLIP features of the generated image and the CLIP features of the input text.
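To make the objective concrete, here is a minimal sketch of a distance between image and text features. The description does not name the metric, so cosine distance is an assumption here, and the three-dimensional vectors are toy stand-ins for real CLIP embeddings; the function and variable names are illustrative only.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Toy stand-ins for CLIP embeddings (assumed values, not real model output).
text_features = [1.0, 0.0, 0.0]
good_image_features = [2.0, 0.0, 0.0]  # points the same way as the text
bad_image_features = [0.0, 3.0, 0.0]   # orthogonal to the text

# A lower loss means the image features are closer to the text features.
loss_good = cosine_distance(good_image_features, text_features)
loss_bad = cosine_distance(bad_image_features, text_features)
print(loss_good, loss_bad)
```

In a real setup the feature vectors would come from CLIP's image and text encoders, and the loss would be backpropagated into the network that predicts the VQGAN latent code.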
Pricing model:
Price unknown (product not yet launched)