[PYTHON] A web scraping beginner made a command to get the title of the next Friday Road Show movie

1. Overview

Checking the TV listings every week to see which movie is on Friday Road Show is a hassle. So I wondered whether I could get the movie title from a terminal on my PC with a single command, using Python web scraping.

2. Goal

- Display the title of the movie to be broadcast on the coming Friday with a single command in the terminal.
- To do so, scrape the Friday Road Show lineup page (https://kinro.jointv.jp/lineup) using Python's Beautiful Soup.

First, let's look at the structure of the page to be scraped.

Friday Road Show website


...
<li>
  <div class="photo">
    <a href='/lineup/20170414'>
      <img src="https://dtg3yjoeemd2c.cloudfront.net/pic/lineup/20170414/photo01_p62bphcy8m.jpg " alt="Detective Conan: A Pure Black Nightmare" />
    </a>
  </div>
...
</li>

<li>
  <div class="photo">
    <a href='/lineup/20170421'>
      <img src="https://dtg3yjoeemd2c.cloudfront.net/pic/lineup/20170421/photo01_uyxdjywd.jpg " alt="Cinderella" />
    </a>
  </div>
...
</li>

<li>
  <div class="photo">
    <a href='/lineup/20170428'>
      <img src="https://dtg3yjoeemd2c.cloudfront.net/pic/lineup/20170428/photo01_9txwertpu3.jpg " alt="Wild Speed Sky Mission" />
    </a>
  </div>
...
</li>
...

3. Code

kinro.py


#coding:utf-8

import urllib.request
import datetime
from bs4 import BeautifulSoup


def func():
    html = urllib.request.urlopen("https://kinro.jointv.jp/lineup")
    soup = BeautifulSoup(html, "lxml")
    today = datetime.date.today()
    nextFriday = today + datetime.timedelta(days = (4 - today.weekday()) % 7)
    strnextFriday = nextFriday.strftime("%Y%m%d")
    a = soup.find_all("a", href = "/lineup/" + strnextFriday)
    tmp = a[0].find("img")
    title = tmp.attrs['alt']
    print(title)

if __name__ == '__main__':
    func()

Open a terminal and, in the same directory as this code, run

$ python kinro.py

If a movie title is printed, as below, it worked.

Detective Conan: A Pure Black Nightmare  # when run in the week up to and including April 14, 2017
Cinderella  # when run from April 15 to 21, 2017

Of course, if you add the following to .bashrc,

alias kinro='python ~/my_dir/kinro.py'  # adjust the directory to your environment

you can get the title of the next Friday Road Show movie from any directory with the single command $ kinro.

4. Code description

4.1. Loading a web page

The first two lines of the function body fetch the lineup page and parse it with Beautiful Soup.

kinro.py(part)


html = urllib.request.urlopen("https://kinro.jointv.jp/lineup")
soup = BeautifulSoup(html, "lxml")
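
The two lines above leave the HTTP response open and wait indefinitely if the server does not answer. As a minimal sketch of one way to tighten this, the hypothetical helper `fetch_html` below wraps `urlopen` in a `with` block and sets a timeout. The function name and the 10-second default are my own assumptions, not part of the original script.

```python
import urllib.request


def fetch_html(url, timeout=10):
    """Download `url` and return the raw response body as bytes."""
    # The with-block closes the response even if read() raises.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


# Usage (needs network access); the returned bytes can be passed
# to BeautifulSoup in place of the open response object:
# html = fetch_html("https://kinro.jointv.jp/lineup")
```

On a timeout or an HTTP error, `urlopen` raises an exception, so the script stops with a traceback rather than hanging.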

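4.2. Getting the date of the coming Friday

Lines 3-5. I get today's date and calculate the number of days from today to the coming Friday. date.weekday() returns 0 for Monday through 6 for Sunday, so Friday is 4, and (4 - today.weekday()) % 7 is the number of days until the coming Friday (0 if today is already Friday). The result is formatted as YYYYMMDD to match the links on the lineup page.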

kinro.py(part)


today = datetime.date.today()
nextFriday = today + datetime.timedelta(days = (4 - today.weekday()) % 7)
strnextFriday = nextFriday.strftime("%Y%m%d")
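
To check the date arithmetic without waiting for a particular weekday, the same formula can be put in a function that takes the date as an argument. This is only a sketch: the name `coming_friday` and the constant `FRIDAY` are made up for illustration, and the sample dates are chosen to match the lineup page shown earlier.

```python
import datetime

FRIDAY = 4  # date.weekday() numbers Monday as 0, so Friday is 4


def coming_friday(today):
    """Return the nearest Friday on or after `today`."""
    days_ahead = (FRIDAY - today.weekday()) % 7
    return today + datetime.timedelta(days=days_ahead)


# 2017-04-14 was a Friday, so the result is the same day.
print(coming_friday(datetime.date(2017, 4, 14)).strftime("%Y%m%d"))  # 20170414
# 2017-04-15 was a Saturday: (4 - 5) % 7 == 6, so six days later.
print(coming_friday(datetime.date(2017, 4, 15)).strftime("%Y%m%d"))  # 20170421
```

Note that on a Friday the formula returns that same day, even if the broadcast has already finished.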

4.3. Getting and printing the movie title

Lines 6-9.

kinro.py(part)


a = soup.find_all("a", href = "/lineup/" + strnextFriday)
tmp = a[0].find("img")
title = tmp.attrs['alt']
print(title)

Line 6 extracts the link for that date from the page:

Friday Road Show website


<a href='/lineup/20170414'>
  <img src="https://dtg3yjoeemd2c.cloudfront.net/pic/lineup/20170414/photo01_p62bphcy8m.jpg " alt="Detective Conan: A Pure Black Nightmare" />
</a>

Line 7 then takes the img element inside it:

Friday Road Show website


<img src="https://dtg3yjoeemd2c.cloudfront.net/pic/lineup/20170414/photo01_p62bphcy8m.jpg " alt="Detective Conan: A Pure Black Nightmare" />

Finally, line 8 takes only the value of its alt attribute:

Detective Conan: A Pure Black Nightmare

Line 9 prints it.
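
One case the script does not cover is a lineup page with no entry for the computed date: `find_all` then returns an empty list and `a[0]` raises an IndexError. The following is a minimal sketch of a more forgiving lookup, run against a trimmed copy of the markup shown earlier so that it works offline. The names `find_title` and `SAMPLE_HTML` are hypothetical, the image paths are shortened, and I use Python's built-in "html.parser" instead of "lxml" only to avoid the extra dependency.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the lineup markup shown earlier (image paths shortened).
SAMPLE_HTML = """
<ul>
  <li><div class="photo"><a href='/lineup/20170414'>
    <img src="photo01.jpg" alt="Detective Conan: A Pure Black Nightmare" />
  </a></div></li>
  <li><div class="photo"><a href='/lineup/20170421'>
    <img src="photo01.jpg" alt="Cinderella" />
  </a></div></li>
</ul>
"""


def find_title(html, yyyymmdd):
    """Return the alt text for the given broadcast date, or None if absent."""
    # "html.parser" ships with Python; the script in this article uses "lxml".
    soup = BeautifulSoup(html, "html.parser")
    link = soup.find("a", href="/lineup/" + yyyymmdd)
    if link is None:
        return None  # no entry for that date on the page
    img = link.find("img")
    if img is None:
        return None  # link exists but has no image inside
    return img.get("alt")


print(find_title(SAMPLE_HTML, "20170421"))  # Cinderella
print(find_title(SAMPLE_HTML, "20170505"))  # None
```

Returning None lets the caller decide what to print when there is no broadcast listed for that Friday.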

