[PYTHON] [PyTorch] Introduction to Japanese document classification using BERT

Introduction

In this article, we walk through fine-tuning a pre-trained BERT model on the task of classifying news articles by category. The same procedure can be applied to any Japanese document classification task simply by changing the input data.

Feel free to comment if you notice any mistakes or better ways.

Advance preparation

Google Colaboratory is used for the implementation. For details on how to set up and use Google Colaboratory, see [this article](https://cpp-fu learning.com/python_colaboratory/). No special settings are required: if you have a Google account, you can use it, including the GPU, immediately and for free. **If you want to reproduce the results on a GPU, change the hardware accelerator to "GPU" in advance via "Runtime" -> "Change runtime type" and save.**
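
As a quick sanity check of the runtime (a minimal sketch; it only uses PyTorch, which is preinstalled on Colab), you can confirm that the GPU is actually visible:

#Check that the GPU runtime is enabled
import torch

print(torch.cuda.is_available())  # True if the GPU runtime is active
if torch.cuda.is_available():
  print(torch.cuda.get_device_name(0))  # Name of the assigned GPU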

Document classification by BERT

We will use the publicly available livedoor news corpus. In this corpus, each news article is assigned to one of nine genre categories. We will implement a BERT document classification model for the task of predicting the category of each article.

Data reading

First, download the target data and format it. For the processing of this part, I referred to this article.

#Download livedoor news corpus
!wget https://www.rondhuit.com/download/ldcc-20140209.tar.gz
!tar zxvf ldcc-20140209.tar.gz
#Create file for storing formatting results
!echo -e "filename\tarticle"$(for category in $(basename -a `find ./text -type d` | grep -v text | sort); do echo -n "\t"; echo -n $category; done) > ./text/livedoor.tsv
#Store by category
!for filename in `basename -a ./text/dokujo-tsushin/dokujo-tsushin-*`; do echo -n "$filename"; echo -ne "\t"; echo -n `sed -e '1,3d' ./text/dokujo-tsushin/$filename`; echo -e "\t1\t0\t0\t0\t0\t0\t0\t0\t0"; done >> ./text/livedoor.tsv
!for filename in `basename -a ./text/it-life-hack/it-life-hack-*`; do echo -n "$filename"; echo -ne "\t"; echo -n `sed -e '1,3d' ./text/it-life-hack/$filename`; echo -e "\t0\t1\t0\t0\t0\t0\t0\t0\t0"; done >> ./text/livedoor.tsv
!for filename in `basename -a ./text/kaden-channel/kaden-channel-*`; do echo -n "$filename"; echo -ne "\t"; echo -n `sed -e '1,3d' ./text/kaden-channel/$filename`; echo -e "\t0\t0\t1\t0\t0\t0\t0\t0\t0"; done >> ./text/livedoor.tsv
!for filename in `basename -a ./text/livedoor-homme/livedoor-homme-*`; do echo -n "$filename"; echo -ne "\t"; echo -n `sed -e '1,3d' ./text/livedoor-homme/$filename`; echo -e "\t0\t0\t0\t1\t0\t0\t0\t0\t0"; done >> ./text/livedoor.tsv
!for filename in `basename -a ./text/movie-enter/movie-enter-*`; do echo -n "$filename"; echo -ne "\t"; echo -n `sed -e '1,3d' ./text/movie-enter/$filename`; echo -e "\t0\t0\t0\t0\t1\t0\t0\t0\t0"; done >> ./text/livedoor.tsv
!for filename in `basename -a ./text/peachy/peachy-*`; do echo -n "$filename"; echo -ne "\t"; echo -n `sed -e '1,3d' ./text/peachy/$filename`; echo -e "\t0\t0\t0\t0\t0\t1\t0\t0\t0"; done >> ./text/livedoor.tsv
!for filename in `basename -a ./text/smax/smax-*`; do echo -n "$filename"; echo -ne "\t"; echo -n `sed -e '1,3d' ./text/smax/$filename`; echo -e "\t0\t0\t0\t0\t0\t0\t1\t0\t0"; done >> ./text/livedoor.tsv
!for filename in `basename -a ./text/sports-watch/sports-watch-*`; do echo -n "$filename"; echo -ne "\t"; echo -n `sed -e '1,3d' ./text/sports-watch/$filename`; echo -e "\t0\t0\t0\t0\t0\t0\t0\t1\t0"; done >> ./text/livedoor.tsv
!for filename in `basename -a ./text/topic-news/topic-news-*`; do echo -n "$filename"; echo -ne "\t"; echo -n `sed -e '1,3d' ./text/topic-news/$filename`; echo -e "\t0\t0\t0\t0\t0\t0\t0\t0\t1"; done >> ./text/livedoor.tsv

When you execute the above commands, a table should be created as `./text/livedoor.tsv`, in which each row contains the file name, the article body, and nine 0/1 flags indicating which category the article belongs to.
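
As an aside, the same table can also be assembled in Python instead of shell one-liners; the following is a rough pandas sketch, assuming the archive has been extracted to ./text, that reproduces the same filename / article / one-hot layout.

#Sketch: build the same table with pandas instead of the shell loops above
import os
import pandas as pd

categories = sorted(d for d in os.listdir('./text') if os.path.isdir(f'./text/{d}'))
rows = []
for category in categories:
  for filename in sorted(os.listdir(f'./text/{category}')):
    if not filename.startswith(category):
      continue  #Skip LICENSE.txt and similar files
    with open(f'./text/{category}/{filename}') as f:
      lines = f.read().splitlines()
    article = ' '.join(lines[3:])  #Drop the URL, date and title lines, as in the sed command
    rows.append([filename, article] + [1 if c == category else 0 for c in categories])
pd.DataFrame(rows, columns=['filename', 'article'] + categories).to_csv('./text/livedoor.tsv', sep='\t', index=False)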

#Verification
!head -10 ./text/livedoor.tsv

output



filename	article	dokujo-tsushin	it-life-hack	kaden-channel	livedoor-homme	movie-enter	peachy	smax	sports-watch	topic-news
dokujo-tsushin-4778030.txt	June, which will soon be called June Bride. I think there are many single women who are in a state of "celebration poverty" ...	1	0	0	0	0	0	0	0	0
dokujo-tsushin-4778031.txt	Before the spread of mobile phones, ordinary telephones were the usual tool for contacting lovers. ...	1	0	0	0	0	0	0	0	0
(output truncated)

Next, read it into a data frame and split it into training, validation, and test data.

import pandas as pd
from sklearn.model_selection import train_test_split
from tabulate import tabulate

#Data reading
df = pd.read_csv('./text/livedoor.tsv', sep='\t')

#Data split
categories = ['dokujo-tsushin', 'it-life-hack', 'kaden-channel', 'livedoor-homme', 'movie-enter', 'peachy', 'smax', 'sports-watch', 'topic-news']
train, valid_test = train_test_split(df, test_size=0.2, shuffle=True, random_state=123, stratify=df[categories])
valid, test = train_test_split(valid_test, test_size=0.5, shuffle=True, random_state=123, stratify=valid_test[categories])
train.reset_index(drop=True, inplace=True)
valid.reset_index(drop=True, inplace=True)
test.reset_index(drop=True, inplace=True)

#Confirmation of the number of cases
table = [['train'] + [train[category].sum() for category in categories],
         ['valid'] + [valid[category].sum() for category in categories],
         ['test'] + [test[category].sum() for category in categories]]
headers = ['data'] + categories
print(tabulate(table, headers, tablefmt='grid'))

output


+--------+------------------+----------------+-----------------+------------------+---------------+----------+--------+----------------+--------------+
| data   |   dokujo-tsushin |   it-life-hack |   kaden-channel |   livedoor-homme |   movie-enter |   peachy |   smax |   sports-watch |   topic-news |
+========+==================+================+=================+==================+===============+==========+========+================+==============+
| train  |              696 |            696 |             691 |              409 |           696 |      673 |    696 |            720 |          616 |
+--------+------------------+----------------+-----------------+------------------+---------------+----------+--------+----------------+--------------+
| valid  |               87 |             87 |              87 |               51 |            87 |       84 |     87 |             90 |           77 |
+--------+------------------+----------------+-----------------+------------------+---------------+----------+--------+----------------+--------------+
| test   |               87 |             87 |              86 |               51 |            87 |       85 |     87 |             90 |           77 |
+--------+------------------+----------------+-----------------+------------------+---------------+----------+--------+----------------+--------------+

Preparing for training

Install the transformers library to use the BERT model. Through transformers, many pre-trained models besides BERT can be used with just a few lines of code.

!pip install transformers

In addition, install MeCab for morphological analysis. It is called internally by the transformers tokenizer.

!apt install mecab libmecab-dev mecab-ipadic-utf8
!pip install mecab-python3

Import the libraries needed to train and evaluate your model.

import numpy as np
import transformers
from transformers import BertJapaneseTokenizer, BertModel
import torch
from torch.utils.data import Dataset, DataLoader
from torch import optim
from torch import cuda
import time
from matplotlib import pyplot as plt

Next, specify the Japanese pre-trained BERT model to use. At the time of writing, transformers provides four models published by Tohoku University's Inui-Suzuki laboratory. Here, I will try bert-base-japanese-whole-word-masking.

#Specifying a pre-trained model
pretrained = 'cl-tohoku/bert-base-japanese-whole-word-masking'

Next, shape the data into a form that can be fed into the model. First, define a `Dataset` class, commonly used in PyTorch, that holds the feature vectors and label vectors together. By passing a `tokenizer` to this class, the input text can be morphologically analyzed, padded to the specified maximum sequence length, and converted into word IDs. The `tokenizer` itself, in which all of the BERT-specific processing is implemented, will be obtained later through transformers, so all the class needs to do is pass the text to the `tokenizer` and receive the result.

#Dataset definition
class CreateDataset(Dataset):
  def __init__(self, X, y, tokenizer, max_len):
    self.X = X
    self.y = y
    self.tokenizer = tokenizer
    self.max_len = max_len

  def __len__(self):  # len(Dataset)Specify the value to be returned with
    return len(self.y)

  def __getitem__(self, index):  # Dataset[index]Specify the value to be returned with
    text = self.X[index]
    inputs = self.tokenizer.encode_plus(
      text,
      add_special_tokens=True,
      max_length=self.max_len,
      pad_to_max_length=True
    )
    ids = inputs['input_ids']
    mask = inputs['attention_mask']

    return {
      'ids': torch.LongTensor(ids),
      'mask': torch.LongTensor(mask),
      'labels': torch.Tensor(self.y[index])
    }

Create the `Dataset` objects using the class above. The argument `MAX_LEN` represents the maximum sequence length: longer sentences are truncated and shorter ones are padded to this length. BERT itself accepts up to 512 tokens, but 128 is used here due to memory constraints.

#Specifying the maximum series length
MAX_LEN = 128

#Get tokenizer
tokenizer = BertJapaneseTokenizer.from_pretrained(pretrained)

#Creating a Dataset
dataset_train = CreateDataset(train['article'], train[categories].values, tokenizer, MAX_LEN)
dataset_valid = CreateDataset(valid['article'], valid[categories].values, tokenizer, MAX_LEN)
dataset_test = CreateDataset(test['article'], test[categories].values, tokenizer, MAX_LEN)

for var in dataset_train[0]:
  print(f'{var}: {dataset_train[0][var]}')

output


ids: tensor([    2,  5563,  3826,     7,     9,     6,  5233,  2110,    10,  4621,
           49,  1197,    64,    14, 10266,     7,  3441,  1876,    26,    62,
            8,    70,   825,     6,  9749,    70,  3826,     7,  1876,    15,
           16,  7719,  1549,  4621,    11,  1800,    15,    16,  6629,    45,
           28,   392,     8,  5880,     7,  1800,    34,  1559,    14,    31,
          947,     6,  8806,    16,  6629,    13,  1755,  3002,  4621,    11,
         1942,     7,  9626,   392,   124,     7,   139,     8, 25035,  4021,
          489,  7446,   143, 16430, 13901,  1993,    49,  8365,  2496, 12084,
           40,  5880,  1800,  9749,  1876,    15,    16,  7719,  1549,  4621,
           14,     6,  5563,  3826,     5,  4314,  5233,  2110,    10,   120,
         4118,     7,  1876,    26,    20,    16,    33,   344,     9,     6,
        10843,   329, 11426,    11,  1943,    10,    72,     7, 10485,     7,
         1876,    26,    62, 26813,  7004,    11, 20718,     3])
mask: tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1])
labels: tensor([0., 1., 0., 0., 0., 0., 0., 0., 0.])

The information of the first sample is output. You can see that the input string has been converted into an ID sequence as `ids`. During the conversion, BERT's special delimiter tokens [CLS] and [SEP] are inserted at the beginning and end of the original sentence, so their IDs `2` and `3` are also included in the sequence. The correct label is held in one-hot format as `labels`. We also keep a `mask` that marks the padding positions so that it can be passed to the model together with `ids` during training.
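
To see which tokens these IDs correspond to, they can be converted back with the tokenizer (a small illustrative check, not part of the training pipeline):

#Check the tokens behind the IDs of the first sample
sample_ids = dataset_train[0]['ids'].tolist()
print(tokenizer.convert_ids_to_tokens(sample_ids))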

Next, define the network. With transformers, the entire BERT part can be expressed by `BertModel`. To handle the classification task, all that remains is to define a dropout layer that receives the BERT output vector and a fully connected layer on top of it.

#Definition of BERT classification model
class BERTClass(torch.nn.Module):
  def __init__(self, pretrained, drop_rate, output_size):
    super().__init__()
    self.bert = BertModel.from_pretrained(pretrained)
    self.drop = torch.nn.Dropout(drop_rate)
    self.fc = torch.nn.Linear(768, output_size)  #Specify 768 dimensions according to the output of BERT
    
  def forward(self, ids, mask):
    #The second return value is the pooled [CLS] output (newer transformers versions may require return_dict=False here)
    _, out = self.bert(ids, attention_mask=mask)
    out = self.fc(self.drop(out))
    return out

Training the BERT classification model

Now that the `Dataset` and the network are ready, it's time to write the usual training loop. Here, the whole flow is defined as a `train_model` function. For the meaning of each component, please refer to the explanation in the article [Language Processing 100 Knock 2020] Chapter 8: Neural Net, which follows the same flow.

def calculate_loss_and_accuracy(model, loader, device, criterion=None):
  """Calculate loss / correct answer rate"""
  model.eval()
  loss = 0.0
  total = 0
  correct = 0
  with torch.no_grad():
    for data in loader:
      #Device specification
      ids = data['ids'].to(device)
      mask = data['mask'].to(device)
      labels = data['labels'].to(device)

      #Forward propagation
      outputs = model.forward(ids, mask)

      #Loss calculation
      if criterion is not None:
        loss += criterion(outputs, labels).item()

      #Accuracy calculation
      pred = torch.argmax(outputs, dim=-1).cpu().numpy()  #Array of predicted labels for the batch
      labels = torch.argmax(labels, dim=-1).cpu().numpy()  #Array of correct labels for the batch
      total += len(labels)
      correct += (pred == labels).sum().item()
      
  return loss / len(loader), correct / total
  

def train_model(dataset_train, dataset_valid, batch_size, model, criterion, optimizer, num_epochs, device=None):
  """Executes model training and returns a log of loss / correct answer rate"""
  #Device specification
  model.to(device)

  #Creating a dataloader
  dataloader_train = DataLoader(dataset_train, batch_size=batch_size, shuffle=True)
  dataloader_valid = DataLoader(dataset_valid, batch_size=len(dataset_valid), shuffle=False)

  #Learning
  log_train = []
  log_valid = []
  for epoch in range(num_epochs):
    #Record start time
    s_time = time.time()

    #Set to training mode
    model.train()
    for data in dataloader_train:
      #Device specification
      ids = data['ids'].to(device)
      mask = data['mask'].to(device)
      labels = data['labels'].to(device)

      #Initialize gradient to zero
      optimizer.zero_grad()

      #Forward propagation+Backpropagation of error+Weight update
      outputs = model.forward(ids, mask)
      loss = criterion(outputs, labels)
      loss.backward()
      optimizer.step()
      
    #Calculation of loss and correct answer rate
    loss_train, acc_train = calculate_loss_and_accuracy(model, dataloader_train, device, criterion=criterion)
    loss_valid, acc_valid = calculate_loss_and_accuracy(model, dataloader_valid, device, criterion=criterion)
    log_train.append([loss_train, acc_train])
    log_valid.append([loss_valid, acc_valid])

    #Save checkpoint
    torch.save({'epoch': epoch, 'model_state_dict': model.state_dict(), 'optimizer_state_dict': optimizer.state_dict()}, f'checkpoint{epoch + 1}.pt')

    #Record end time
    e_time = time.time()

    #Output log
    print(f'epoch: {epoch + 1}, loss_train: {loss_train:.4f}, accuracy_train: {acc_train:.4f}, loss_valid: {loss_valid:.4f}, accuracy_valid: {acc_valid:.4f}, {(e_time - s_time):.4f}sec') 

  return {'train': log_train, 'valid': log_valid}

Set the parameters and perform fine tuning.

#Parameter setting
DROP_RATE = 0.4
OUTPUT_SIZE = 9
BATCH_SIZE = 16
NUM_EPOCHS = 4
LEARNING_RATE = 2e-5

#Model definition
model = BERTClass(pretrained, DROP_RATE, OUTPUT_SIZE)

#Definition of loss function
criterion = torch.nn.BCEWithLogitsLoss()

#Optimizer definition
optimizer = torch.optim.AdamW(params=model.parameters(), lr=LEARNING_RATE)

#Device specification
device = 'cuda' if cuda.is_available() else 'cpu'

#Model learning
log = train_model(dataset_train, dataset_valid, BATCH_SIZE, model, criterion, optimizer, NUM_EPOCHS, device=device)

output


epoch: 1, loss_train: 0.0976, accuracy_train: 0.8978, loss_valid: 0.1122, accuracy_valid: 0.8575, 405.6795sec
epoch: 2, loss_train: 0.0468, accuracy_train: 0.9622, loss_valid: 0.0802, accuracy_valid: 0.8942, 405.0562sec
epoch: 3, loss_train: 0.0264, accuracy_train: 0.9822, loss_valid: 0.0688, accuracy_valid: 0.9077, 407.3759sec
epoch: 4, loss_train: 0.0164, accuracy_train: 0.9907, loss_valid: 0.0708, accuracy_valid: 0.9050, 407.4937sec

Check the result.

#Log visualization
x_axis = [x for x in range(1, len(log['train']) + 1)]
fig, ax = plt.subplots(1, 2, figsize=(15, 5))
ax[0].plot(x_axis, np.array(log['train']).T[0], label='train')
ax[0].plot(x_axis, np.array(log['valid']).T[0], label='valid')
ax[0].set_xlabel('epoch')
ax[0].set_ylabel('loss')
ax[0].legend()
ax[1].plot(x_axis, np.array(log['train']).T[1], label='train')
ax[1].plot(x_axis, np.array(log['valid']).T[1], label='valid')
ax[1].set_xlabel('epoch')
ax[1].set_ylabel('accuracy')
ax[1].legend()
plt.show()

[Figure bert-ja.png: loss (left) and accuracy (right) per epoch for the training and validation data]

#Accuracy calculation
dataloader_train = DataLoader(dataset_train, batch_size=1, shuffle=False)
dataloader_valid = DataLoader(dataset_valid, batch_size=1, shuffle=False)
dataloader_test = DataLoader(dataset_test, batch_size=1, shuffle=False)

print(f'Accuracy (training data): {calculate_loss_and_accuracy(model, dataloader_train, device)[1]:.3f}')
print(f'Accuracy (validation data): {calculate_loss_and_accuracy(model, dataloader_valid, device)[1]:.3f}')
print(f'Accuracy (test data): {calculate_loss_and_accuracy(model, dataloader_test, device)[1]:.3f}')

output


Accuracy (training data): 0.991
Accuracy (validation data): 0.905
Accuracy (test data): 0.904

The accuracy on the test data was about 90%.
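
As a rough sketch of how the fine-tuned model might be applied to a new article (the helper function and the sample sentence below are illustrative, not part of the original code), you can take the argmax of the output logits and map it back to a category name:

#Sketch: predict the category of a new article with the fine-tuned model
def predict_category(text, tokenizer, model, max_len=MAX_LEN, device=device):
  model.eval()
  inputs = tokenizer.encode_plus(text, add_special_tokens=True, max_length=max_len, pad_to_max_length=True)
  ids = torch.LongTensor(inputs['input_ids']).unsqueeze(0).to(device)
  mask = torch.LongTensor(inputs['attention_mask']).unsqueeze(0).to(device)
  with torch.no_grad():
    outputs = model(ids, mask)
  return categories[torch.argmax(outputs, dim=-1).item()]

print(predict_category('最新のスマートフォンが発表された。', tokenizer, model))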

In practice, hyperparameters such as the learning rate and whether to freeze the weights of individual BERT layers would usually be tuned while monitoring the accuracy on the validation data.
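
For example, freezing the embedding layer and the lower encoder layers so that only the upper layers and the classification head are updated could look like the following sketch (the attribute names follow the transformers BertModel structure; the number of frozen layers is an arbitrary example):

#Sketch: freeze the embeddings and the first 8 of the 12 encoder layers
for param in model.bert.embeddings.parameters():
  param.requires_grad = False
for layer in model.bert.encoder.layer[:8]:
  for param in layer.parameters():
    param.requires_grad = False

#Pass only the remaining trainable parameters to the optimizer
optimizer = torch.optim.AdamW(params=filter(lambda p: p.requires_grad, model.parameters()), lr=LEARNING_RATE)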

Conclusion

This time the hyperparameters were fixed, yet the accuracy was relatively high, which shows the strength of pre-training. Thanks to the transformers library, the implementation only requires the parts you would prepare for a conventional neural network anyway. Please try it with your own data.

References

transformers BERT (official documentation)
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Devlin, J. et al. (2018) (original paper)
[Language Processing 100 Knock 2020] Summary of answer examples in Python
