**Motion graphics are cool, animated movements of figures, logos, illustrations, and so on.** That is a pretty vague definition, isn't it? I don't think there actually is a clear one. Motion graphics are often used for stylish, atmospheric videos, such as artist music videos and corporate commercials. If you search for "motion graphics" on YouTube, you will find plenty of cool examples, so please take a look.
Most motion graphics are created by hand, which is very time-consuming and difficult work. So recently I wondered whether they could be generated with AI. Wouldn't it be fun to train an AI to generate motion graphics?
Even if we want to generate them, a dataset for training is required. However, as far as I could find, **no motion graphics dataset exists.** So I had to create my own. Building a large dataset takes a huge amount of time, so this time I simply created motion graphics of 6 types of particles. I used Adobe After Effects to create the dataset.
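To train on the rendered clips, they have to be loaded as tensors. Below is a minimal sketch of how that could be done; the directory layout, file format, 32-frame clip length, and 128x128 resolution are assumptions for illustration, not the actual pipeline.

```python
import glob
import numpy as np
import torch
import imageio.v2 as imageio
from torch.utils.data import Dataset

class MotionGraphicsDataset(Dataset):
    """Loads short rendered clips as (C, T, H, W) tensors.

    The directory layout, mp4 format, clip length, and resolution are
    illustrative assumptions; clips are assumed to be rendered at 128x128.
    """
    def __init__(self, root="data/particles", frames=32):
        self.paths = sorted(glob.glob(f"{root}/*.mp4"))
        self.frames = frames

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        reader = imageio.get_reader(self.paths[idx])
        # Take the first `frames` frames of the clip
        clip = np.stack([frame for _, frame in zip(range(self.frames), reader)])
        reader.close()
        # (T, H, W, C) -> (C, T, H, W), scaled to [0, 1]
        return torch.from_numpy(clip).float().permute(3, 0, 1, 2) / 255.0
```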
Now that we have a dataset (setting aside whether something this small can really be called one), the next thing to think about is the model. Among the several families of generative models, the current mainstream choices are GANs and VAEs. GANs are difficult to train, and with a dataset as small as this one there is a high chance training would not go well, so I adopted a VAE. A VAE is a derivative of the autoencoder (AE): by compressing and reconstructing the training data it learns the distribution of the latent variables, and after training you can generate samples by feeding noise z to the decoder. For AE and VAE, the article "Variational Autoencoder Thorough Explanation" in the references is helpful.

Since this time we are generating video, I built the VAE as a U-Net-style model with 3D convolutions, which learn the spatiotemporal information of the clips. The "transpose conv" in the decoder part is a transposed convolution, which is explained clearly with a video in the article [here](https://qiita.com/kenmatsu4/items/b029d697e9995d93aa24 "Transpose convolution").
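Below is a minimal PyTorch sketch of such a VAE with 3D convolutions in the encoder and transposed 3D convolutions in the decoder. The channel widths, latent size, and depth are assumptions, and the U-Net skip connections are omitted for brevity; it only illustrates the overall structure.

```python
import torch
import torch.nn as nn

class VideoVAE(nn.Module):
    """Minimal 3D-convolutional VAE for (C, T, H, W) video clips.

    Channel widths, latent size, and depth are illustrative assumptions;
    the U-Net skip connections mentioned in the article are omitted.
    """
    def __init__(self, in_ch=3, latent_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(in_ch, 32, 4, stride=2, padding=1),   # 32x128x128 -> 16x64x64
            nn.ReLU(inplace=True),
            nn.Conv3d(32, 64, 4, stride=2, padding=1),      # -> 8x32x32
            nn.ReLU(inplace=True),
            nn.Conv3d(64, 128, 4, stride=2, padding=1),     # -> 4x16x16
            nn.ReLU(inplace=True),
        )
        flat = 128 * 4 * 16 * 16
        self.fc_mu = nn.Linear(flat, latent_dim)
        self.fc_logvar = nn.Linear(flat, latent_dim)
        self.fc_dec = nn.Linear(latent_dim, flat)
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(128, 64, 4, stride=2, padding=1),   # -> 8x32x32
            nn.ReLU(inplace=True),
            nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1),    # -> 16x64x64
            nn.ReLU(inplace=True),
            nn.ConvTranspose3d(32, in_ch, 4, stride=2, padding=1), # -> 32x128x128
            nn.Sigmoid(),
        )

    def encode(self, x):
        h = self.encoder(x).flatten(1)
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # Sample z = mu + sigma * eps with eps ~ N(0, I)
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def decode(self, z):
        h = self.fc_dec(z).view(-1, 128, 4, 16, 16)
        return self.decoder(h)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar
```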
Since the dataset is small, training finished quickly. I judged that generation was possible at around 200 epochs, so I stopped there. It took about an hour on a GTX 1080 Ti.
Experiment details
| Video size | Epochs | Framework | GPU |
|---|---|---|---|
| 128x128x32 | 200 | PyTorch | GTX 1080 Ti |
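For reference, here is a minimal sketch of a training loop with the standard VAE objective (reconstruction loss plus KL divergence to a standard normal), assuming the model and dataset sketches above. The optimizer, batch size, and loss weighting are assumptions.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

# Assumes the MotionGraphicsDataset and VideoVAE sketches above.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = VideoVAE().to(device)
loader = DataLoader(MotionGraphicsDataset(), batch_size=2, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(200):  # the article stops at roughly 200 epochs
    for x in loader:
        x = x.to(device)
        recon, mu, logvar = model(x)
        # Reconstruction term + KL divergence to N(0, I), averaged per sample
        recon_loss = F.mse_loss(recon, x, reduction="sum") / x.size(0)
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / x.size(0)
        loss = recon_loss + kl
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```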
Motion graphics that look like the training data were generated properly.
**Morphing**

By interpolating the latent vector z given to the decoder between two motion graphics, it is possible to generate motion graphics in between them. This is the video version of the smoothly changing interpolations often shown for GANs.
Intermediate motion graphics are generated by interpolating the latent variable z between two motion graphics as in the following equation.
```math
z = t z_1 + (1 - t) z_2, \qquad 0 < t < 1
```
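As a sketch, the interpolation could look like the following, assuming the VideoVAE and dataset sketches above; here the encoded means of two training clips are used as z_1 and z_2.

```python
import torch

# Latent-space morphing between two clips (sketch; assumes model, device,
# and MotionGraphicsDataset from the earlier snippets).
model.eval()
dataset = MotionGraphicsDataset()
clip1, clip2 = dataset[0], dataset[1]
with torch.no_grad():
    mu1, _ = model.encode(clip1.unsqueeze(0).to(device))
    mu2, _ = model.encode(clip2.unsqueeze(0).to(device))
    morphs = []
    for t in torch.linspace(0.0, 1.0, steps=8):
        z = t * mu1 + (1 - t) * mu2   # z = t*z1 + (1-t)*z2, 0 < t < 1
        morphs.append(model.decode(z).cpu())
```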
I was able to generate motion graphics using deep learning. This time there was little diversity in the generated results because the dataset was quite small. If a large motion graphics dataset existed, I would love to experiment with it. (Can someone make one?) Still, it is interesting to be able to generate intermediate results with morphing, and this experiment showed that morphing works even with a small dataset.
Variational Autoencoder Thorough Explanation
[Deconvolution in Neural Networks](https://qiita.com/kenmatsu4/items/b029d697e9995d93aa24 "Transpose convolution")