[PYTHON] [GPT-2] I tried to build a fake President Trump bot by fine-tuning GPT-2, the model described as "too dangerous", on President Trump's tweets

Background

GPT-2, the language model published by OpenAI, has attracted attention because it can automatically generate natural-sounding text.

According to a recent Cornell University survey, 70% of people who read GPT-2-generated text mistook it for a New York Times article.

The full-size model, with 1,558 million (1558M) parameters, has now been released.

In practice, however, it is still unclear how this AI can be put to effective use. I therefore developed and released "Mockers", a web application that lets anyone try GPT-2 easily online. I hope it provides an opportunity to think about how GPT-2 should be used.

If you want to get a feel for what GPT-2 can do, try the Mockers generation tool. https://mockers.io/generator

For instructions on how to use it (in English), see the following pages. https://doc.mockers.io/archives/1966/ https://doc.mockers.io/archives/1987/

Purpose of this article

This article shares the results of an experiment in fine-tuning with Mockers.

Fine-tuning takes a model that has already been trained, trains it further on additional data at low cost, and produces a new model. The resulting model learns the context and style of the text it is given and generates sentences in that style. Mockers is more than a way to try GPT-2: it also supports fine-tuning and automatic posting.
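To make the "additional data" part concrete, here is a minimal sketch of how a list of tweets might be turned into a single training text before fine-tuning. How Mockers actually preprocesses tweets is not described in this article, so everything here is an assumption for illustration: the helper names `clean_tweet` and `build_corpus` are hypothetical, and separating tweets with GPT-2's `<|endoftext|>` marker is just one common convention.

```python
import re

# GPT-2's own end-of-document marker. Using it as the separator between
# tweets is an assumption of this sketch, not something Mockers documents.
END_OF_TEXT = "<|endoftext|>"

URL_PATTERN = re.compile(r"https?://\S+")


def clean_tweet(text):
    """Strip URLs and collapse whitespace so only the wording remains."""
    text = URL_PATTERN.sub("", text)
    return " ".join(text.split())


def build_corpus(tweets):
    """Join cleaned tweets into one training text, skipping retweets and empties."""
    cleaned = []
    for tweet in tweets:
        if tweet.startswith("RT @"):
            continue  # a retweet is someone else's wording, not the target's
        body = clean_tweet(tweet)
        if body:
            cleaned.append(body)
    return ("\n" + END_OF_TEXT + "\n").join(cleaned)


if __name__ == "__main__":
    sample = [
        "Great meeting today!   https://example.com/abc",
        "RT @someone: not the account's own words",
        "https://example.com/only-a-link",
        "We will win.",
    ]
    print(build_corpus(sample))
```

The resulting text file is the kind of input a fine-tuning script would train on. Dropping retweets and bare links keeps the data focused on the account's own style.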

Use Case

This mechanism makes use cases such as the following possible.

――You can build a media site that imitates an existing curation site without infringing its copyright, and feeds off it to pick up spillover page views.

――You can build a bot that continuously impersonates a Twitter account.

What I want to try

In this article, as a demo, I experiment with fine-tuning GPT-2. I used Mockers to fine-tune a model on President Trump's tweets and create a fake President Trump bot.

You can see the latest mock tweets of President Trump at any time here. https://mockers.io/timeline

Procedure

Access the following page. https://mockers.io/login

You need to log in to use fine-tuning. Sign up, or log in with your Google account.

After you log in successfully, you are prompted to create a model right away, so press "Go to creation screen".

When the new model dialog appears, enter any "model name" you like and set the "model type" to "custom model (Twitter)". This lets you generate a model fine-tuned on a Twitter account. Enter the Twitter account you want to imitate in "Target account to mock (input)".

Currently, it takes up to 2 hours to generate a model. Once the model has been generated, text is generated automatically at regular intervals, and you can also register an account that tweets the generated text. To do this, you need to apply for Twitter API access in advance. See the article below for how to apply. https://qiita.com/kngsym2018/items/2524d21455aac111cdee

The Twitter account entered in "Synchronize (input) target account" acts as the trigger: when it tweets, text related to that tweet is generated.

In this way, "impersonation (Mock)" is realized.
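The trigger-and-respond flow described above can be sketched roughly as follows. This is not the actual Mockers implementation, which is not shown in this article. It only illustrates the idea of using each new tweet from the synchronized account as the prompt for generation. The function names, the tweet dictionaries with `id` and `text` keys, and the `generate` callable (a stand-in for a fine-tuned GPT-2 model) are all hypothetical.

```python
def find_new_tweets(timeline, last_seen_id):
    """Return tweets newer than last_seen_id, oldest first."""
    fresh = [t for t in timeline if t["id"] > last_seen_id]
    return sorted(fresh, key=lambda t: t["id"])


def respond_to_timeline(timeline, last_seen_id, generate):
    """Generate one mock tweet per new trigger tweet.

    `generate` is any callable that maps a prompt string to generated text,
    for example a wrapper around a fine-tuned GPT-2 model (hypothetical here).
    Returns the mock tweets and the updated last_seen_id.
    """
    mocks = []
    for tweet in find_new_tweets(timeline, last_seen_id):
        # Conditioning on the trigger text keeps the output on the same topic.
        mocks.append(generate(tweet["text"]))
        last_seen_id = tweet["id"]
    return mocks, last_seen_id


if __name__ == "__main__":
    def fake_generate(prompt):
        # Stand-in for the model: echoes the prompt so the flow is visible.
        return "[mock reply about] " + prompt

    timeline = [
        {"id": 3, "text": "Trade talks are going well."},
        {"id": 1, "text": "Old tweet, already handled."},
        {"id": 2, "text": "Big rally tonight."},
    ]
    mocks, last_seen = respond_to_timeline(timeline, last_seen_id=1, generate=fake_generate)
    print(mocks)
    print(last_seen)
```

Passing the generator in as a parameter keeps the polling logic independent of the model, so the same loop can be exercised with a stub before a real model is plugged in.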

Experimental results

The following are tweets actually posted by Fake Trump. They are not always lines he would be likely to say, but sometimes the bot says something that fits his position, and even when it does not, you can see that the text is generated around topics related to what he has said in the past.

University of Alberta Human Kinetics (SHK) University is the premier facility for conducting high energy, high impact research in science, physical sciences and engineering.

The Democratic candidate was not even a Democrat, and the Republican candidate was a strong and capable conservative. Both of them were the people I strongly supported. Our country is excellent for both.

Since 2015, the first major US military operations to tackle IS militants in Syria and Iraq have deployed at least 2,500 additional ground forces, but the group's most extreme.

Challenges

――GPT-2 lets you control the output length in tokens (roughly, words), but not in characters, so the output cannot be tuned for media with strict character limits such as Twitter. As a result, text that exceeds Twitter's 280-character limit has to be cut off forcibly.

――Fine-tuning models of 774M parameters and above requires more GPU memory than the GPUs available to the general public provide. It ran out of memory both on my own "GeForce GTX 1080 Ti" and on the "Tesla V100" of an AWS P3 instance. (Ordinary inference works.)
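For the first challenge above, the forced cut can at least be made at a sensible place. The following is a minimal sketch, not what Mockers actually does: the hypothetical helper `fit_to_tweet` prefers to end at the last complete sentence that fits and otherwise drops the trailing partial word. It counts characters with Python's `len`, which is an assumption and may differ from how Twitter itself counts some characters and URLs.

```python
TWEET_LIMIT = 280  # Twitter's character limit mentioned above


def fit_to_tweet(text, limit=TWEET_LIMIT):
    """Shorten text to at most `limit` characters without cutting a word in half.

    Prefers to end at the last sentence-ending punctuation mark that fits;
    otherwise falls back to the last whole word.
    """
    text = " ".join(text.split())
    if len(text) <= limit:
        return text
    head = text[:limit]
    # Look one character past the limit so a sentence that ends exactly
    # at the limit (followed by a space) is still detected.
    window = text[:limit + 1]
    sentence_end = max(window.rfind(". "), window.rfind("! "), window.rfind("? "))
    if sentence_end != -1:
        return head[:sentence_end + 1]
    # No full sentence fits: drop the trailing partial word instead.
    if text[limit] != " " and " " in head:
        head = head[:head.rfind(" ")]
    return head.rstrip()


if __name__ == "__main__":
    long_text = "This is the first sentence. " + "word " * 100
    shortened = fit_to_tweet(long_text)
    print(len(shortened), shortened)
```

A single unbroken run of more than 280 characters is still cut hard at the limit, since there is no better place to stop.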

In conclusion

The accuracy of text generation based on large-scale unsupervised learning is expected to keep improving, while the technology moves into the phase of real-world use. I hope this article and Mockers can contribute to the development of natural language AI and its adoption in society.

P.S. Don't forget Hillary.

It's an argument I heard from Senate Democrats before the election, and Democrats said this could happen if this was a Republican, Republican Republican, or other party.

https://mockers.io