[PYTHON] After BERT, I tried to get XLNet running in about 30 minutes and hit a wall.

Purpose

After BERT, I wanted to get XLNet running as well, so I tried an example that looked like it would run right away.

Selected example

I couldn't find a good tutorial site, so I tried running the code from the following GitHub repository. https://github.com/zihangdai/xlnet

I ran the following script, which solves a task called RACE (a reading comprehension dataset).

run_race.py

Description of RACE

The following is one RACE problem. The task is to read the article and answer the multiple-choice questions.

{"answers": ["C", "D", "A", "A"], "options": [["take care of the whole group", "make sure that everybody finishes homework", "make sure that nobody chats in class", "collect all the homework and hand it in to teachers"], ["chat with each other", "listen to the teacher", "make friends", "communicate"], ["get benefits from", "are tired of", "cannot get used to", "hate"], ["Three.", "Four.", "Two.", "Five or six."]], "questions": ["A discipline leader is supposed to _ .", "The new way of learning is said to give students more chances to _ .", "We can see from the story that some students _ this new way of learning.", "How many leaders are there in one group?"], "article": "Take a class at Dulangkou School, and you'll see lots of things different from other schools, You can see the desks are not in rows and students sit in groups. They put their desks together so they're facing each other. How can they see the blackboard? There are three blackboards on the three walls of the classroom!\nThe school calls the new way of learning \"Tuantuanzuo\", meaning sitting in groups. Wei Liying, a Junior 3 teacher, said it was to give students more chances to communicate.\nEach group has five or six students, according to Wei, and they play different roles .There is a team leader who takes care of the whole group. There is a \"study leader\"who makes sure that everyone finishes their homework. And there is a discipline leader who makes sure that nobody chats in class.\nWang Lin is a team leader. The 15-year-old said that having to deal with so many things was tiring.\n\"I just looked after my own business before,\"said Wang. \"But now I have to think about my five group members.\"\nBut Wang has got used to it and can see the benefits now.\n\"I used to speak too little. But being a team leader means you have to talk a lot. You could even call me an excellent speaker today.\"\nZhang Qi, 16, was weak in English. She used to get about 70 in English tests. But in a recent test, Zhang got a grade of more than 80.\n\"I rarely asked others when I had problems with my English tests. But now I can ask the team leader or study leader. They are really helpful.\"", "id": "middle1.txt"}
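To make the record layout concrete, here is a minimal Python sketch that walks one RACE record and prints each question with its options and the correct answer. The field names (answers, options, questions, article, id) come from the sample above. The shortened sample record and the helper name iter_questions are only for illustration and are not part of the XLNet code.

```python
import json

# Shortened stand-in for one RACE file. The field names match the sample above,
# but only two questions are kept and the article text is truncated.
SAMPLE = """
{"answers": ["C", "A"],
 "options": [["take care of the whole group",
              "make sure that everybody finishes homework",
              "make sure that nobody chats in class",
              "collect all the homework and hand it in to teachers"],
             ["Three.", "Four.", "Two.", "Five or six."]],
 "questions": ["A discipline leader is supposed to _ .",
               "How many leaders are there in one group?"],
 "article": "Take a class at Dulangkou School ...",
 "id": "middle1.txt"}
"""


def iter_questions(record):
    """Yield (question, options, correct_option_text) for every question in a record."""
    for question, options, answer in zip(
        record["questions"], record["options"], record["answers"]
    ):
        # Answers are letters "A".."D"; convert the letter to a list index.
        correct = options[ord(answer) - ord("A")]
        yield question, options, correct


record = json.loads(SAMPLE)
for question, options, correct in iter_questions(record):
    print(question)
    for letter, option in zip("ABCD", options):
        print(f"  {letter}. {option}")
    print(f"  answer: {correct}")
```

Each question shares the same article, so one file yields several four-way multiple-choice examples.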

I think the RACE data can be downloaded from https://www.cs.cmu.edu/~glai1/data/race/

Running run_race.py

There is no usage documentation, so I ran it with the following options. I don't understand what they mean at all. I simply added options until no errors occurred, and kept the set as small as I could.

python run_race.py --do_eval=True --model_dir="./xlnet_cased_L-12_H-768_A-12" --spiece_model_file="./xlnet_cased_L-12_H-768_A-12/spiece.model" --data_dir="./data" --model_config_path="./xlnet_cased_L-12_H-768_A-12/xlnet_config.json" --output_dir="./output" --eval_batch_size=1 --train_eval=False --uncased=False
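The same invocation can also be assembled as an argument list, which avoids shell-specific line continuation and quoting. This is only a sketch: the flags and paths are the ones shown above, while the helper name build_eval_command and its parameters are hypothetical, and the call that would actually launch the script is left commented out.

```python
import sys


def build_eval_command(model_dir, data_dir, output_dir, eval_batch_size=1):
    """Return the run_race.py evaluation command as a list of arguments.

    Only flags that appear in the command above are used.
    """
    return [
        sys.executable, "run_race.py",
        "--do_eval=True",
        f"--model_dir={model_dir}",
        f"--spiece_model_file={model_dir}/spiece.model",
        f"--data_dir={data_dir}",
        f"--model_config_path={model_dir}/xlnet_config.json",
        f"--output_dir={output_dir}",
        f"--eval_batch_size={eval_batch_size}",
        "--train_eval=False",
        "--uncased=False",
    ]


command = build_eval_command("./xlnet_cased_L-12_H-768_A-12", "./data", "./output")
print(" ".join(command))

# To actually run it from the xlnet repository directory:
# import subprocess
# subprocess.run(command, check=True)
```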

Below is some information that may be useful.

(1) I got an out-of-memory error, so I worked around it by using the minimum batch size.

--eval_batch_size=1 

(2) The TensorFlow version I used is as follows. I suspect 2.0.0 and later would cause errors.

tensorflow 1.15.3
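Whether the script really fails on TensorFlow 2.x is a guess in this article, not something verified here. A small guard like the following can at least report which major version is installed before starting a long run. The helper name is_tf1 is hypothetical; tf.__version__ is the standard version attribute.

```python
def is_tf1(version):
    """Return True when a version string such as '1.15.3' has major version 1."""
    major = int(version.split(".")[0])
    return major == 1


try:
    import tensorflow as tf
except ImportError:
    print("TensorFlow is not installed")
else:
    if is_tf1(tf.__version__):
        print(f"TensorFlow {tf.__version__}: same major version as in this article (1.15.3)")
    else:
        print(f"TensorFlow {tf.__version__}: not 1.x, so the results here may not carry over")
```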

Execution result

The result is as follows. With four choices, random guessing already gives about 0.25, so an accuracy of 0.22 means it is not working properly at all! (I had reduced the data to a small number of questions by deleting most of them.) ⇒ Failure

eval_accuracy = 0.22
I0806 18:16:47.620249 10828 evaluation.py:167] Evaluation [45/50]
INFO:tensorflow:Evaluation [50/50]
I0806 18:17:51.467021 10828 evaluation.py:167] Evaluation [50/50]
INFO:tensorflow:Finished evaluation at 2020-08-06-18:17:52
I0806 18:17:52.105508 10828 evaluation.py:275] Finished evaluation at 2020-08-06-18:17:52
INFO:tensorflow:Saving dict for global step 0: eval_accuracy = 0.22, eval_loss = 1.3860148, global_step = 0, loss = 1.3860148
I0806 18:17:52.121129 10828 estimator.py:2049] Saving dict for global step 0: eval_accuracy = 0.22, eval_loss = 1.3860148, global_step = 0, loss = 1.3860148
INFO:tensorflow:================================================================================
I0806 18:17:53.214434 10828 run_race.py:546] ================================================================================
INFO:tensorflow:Eval | eval_accuracy 0.2199999988079071 | eval_loss 1.3860148191452026 | loss 1.3860148191452026 | global_step 0 |
I0806 18:17:53.229773 10828 run_race.py:550] Eval | eval_accuracy 0.2199999988079071 | eval_loss 1.3860148191452026 | loss 1.3860148191452026 | global_step 0 |
INFO:tensorflow:================================================================================
I0806 18:17:53.229773 10828 run_race.py:551] ================================================================================

C:\_qiita\___xlnet\xlnet-master>
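As a quick sanity check on those numbers, here is a rough sketch. It assumes that eval_loss is a mean softmax cross-entropy in nats, and that the 50 evaluation steps at batch size 1 correspond to 50 questions; neither is confirmed by the log itself. Under those assumptions, a model that assigns equal probability to all four options would have a loss of ln 4, and the reported accuracy can be compared with the sampling noise around 0.25.

```python
import math

NUM_CHOICES = 4        # four options per RACE question
EVAL_ACCURACY = 0.22   # from the log above
EVAL_LOSS = 1.3860148  # from the log above
NUM_EXAMPLES = 50      # assumption: 50 eval steps x eval_batch_size=1


def uniform_cross_entropy(num_choices):
    """Cross-entropy (in nats) of a uniform guess over num_choices options."""
    return math.log(num_choices)


def chance_std_error(num_choices, num_examples):
    """Standard error of the accuracy of random guessing on num_examples questions."""
    p = 1.0 / num_choices
    return math.sqrt(p * (1.0 - p) / num_examples)


loss_gap = abs(EVAL_LOSS - uniform_cross_entropy(NUM_CHOICES))
accuracy_gap = abs(EVAL_ACCURACY - 1.0 / NUM_CHOICES)
print(f"ln({NUM_CHOICES}) = {uniform_cross_entropy(NUM_CHOICES):.4f}, loss gap = {loss_gap:.4f}")
print(f"accuracy gap = {accuracy_gap:.2f}, "
      f"chance std error = {chance_std_error(NUM_CHOICES, NUM_EXAMPLES):.3f}")
```

If the assumptions hold, the loss sits almost exactly at ln 4 (about 1.386) and the accuracy is within one standard error of 0.25, which is what near-uniform predictions would produce. The log also shows global_step = 0, which would be consistent with evaluating a checkpoint that was never fine-tuned on RACE, though that is an interpretation rather than something the log states.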

Summary

I tried running XLNet, but run_race.py from the repository above did not give any decent results. Since many people rely on that repository, the problem is almost certainly in how I used it.

Useful information

(0) XLNet may not be easy to get working.

(1) Out-of-memory errors can be avoided with --eval_batch_size=1.

(2) With the GitHub code above, TensorFlow 2.0.0 and later will probably cause errors (?).

That is about it. If you have any comments, please let me know.
