I'm **Shun**, and I'm studying programming. I recently became interested in Python, so I read "**Python That You Can Understand Fluently**". This book teaches the basic syntax of Python and how to do web scraping.
[**Python That You Can Understand Fluently** (book)](https://www.amazon.co.jp/%E3%82%B9%E3%83%A9%E3%82%B9%E3%83%A9%E3%82%8F%E3%81%8B%E3%82%8BPython-%E5%B2%A9%E5%B4%8E-%E5%9C%AD/dp/4798151092/ref=asc_df_4798151092/?tag=jpgo-22&linkCode=df0&hvadid=295686767484&hvpos=1o1&hvnetw=g&hvrand=17010285472902510266&hvpone=&hvptwo=&hvqmt=&hvdev=c&hvdvcmdl=&hvlocint=&hvlocphy=1009343&hvtargid=pla-526272651553&psc=1&th=1&psc=1/)
Simply put, web scraping is a technique for extracting the information you want from a website.
Now that I've learned web scraping, I'll try it out. The site I'm scraping this time is BanG Dream's official site (https://bang-dream.com/).
Why did I choose this site? I wanted one of its images.
I created a folder called Qiita in VS Code, and I will save my work in it. Next, open a command prompt and run the following commands to start the installation.
> pip install requests --user
> pip install beautifulsoup4 --user
Once the installation is complete, start the Python interpreter in a terminal and check that the installation succeeded.
>>> import requests
>>>
>>> from bs4 import BeautifulSoup
>>>
If no message is displayed at this point, the installation succeeded. If you get the following error message instead, the installation failed. In that case, check that your computer is connected to the Internet, then run the pip command again.
>>> import requests
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'requests'
>>>
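If you would rather not read tracebacks, the same check can be done from a small script. This is only a minimal sketch using the standard library; the helper name `is_installed` is something I made up for illustration and is not part of requests or bs4.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(module_name) is not None


# Check the two modules this article imports.
for name in ("requests", "bs4"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either line says `missing`, run the matching pip command again.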
I saved the following code in the Qiita folder as Qiita01.py. An explanation of each line follows the code.
Qiita01.py
import requests
from bs4 import BeautifulSoup
result = requests.get("https://bang-dream.com/")
soup = BeautifulSoup(result.text, "html.parser")
img = soup.find_all('img')
print(img)
`import requests`
Imports the requests library so the script can use it.

`from bs4 import BeautifulSoup`
Imports the BeautifulSoup class from the external library bs4.

`result = requests.get("https://bang-dream.com/")`
Put the URL you want to scrape here.

`soup = BeautifulSoup(result.text, "html.parser")`
Pass BeautifulSoup the string to be parsed and the parser that will actually parse it.

`img = soup.find_all('img')`
Pass the string `'img'` to the find_all method to search for img tags.
|Method|Description|
|:--------|------|
| find_all() |Searches for the tag given as its argument and returns a list containing all matches|
`print(img)`
Prints the result.
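To see what `find_all()` returns without accessing a real site, you can try it on a small HTML string. This is just a sketch: the HTML below is made up for illustration and has nothing to do with the actual markup of bang-dream.com. It also shows `find()`, which returns only the first match.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page.
html = """
<html><body>
  <img src="/images/a.png" alt="first">
  <p>Some text</p>
  <img src="/images/b.png" alt="second">
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like object with every matching tag.
all_imgs = soup.find_all("img")
# find returns only the first matching tag (or None if there is none).
first_img = soup.find("img")

print(len(all_imgs))
print(first_img["alt"])
```

Because `find_all()` gives back a list, you can loop over it with `for` or check its length with `len()`.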
### Output result

In the terminal, you will see output like this. Open the link underlined in red. If the image appears, the scraping was successful.
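Instead of picking a link out of the printed tags by eye, you could pull out only the `src` attribute of each img tag. The following is a rough sketch, not the code I actually ran: the function name `image_urls` and the sample HTML are assumptions for illustration. It uses `urljoin` from the standard library so that relative paths become full URLs.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def image_urls(html, base_url):
    """Return the absolute URL of every <img> tag that has a src attribute."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for tag in soup.find_all("img"):
        src = tag.get("src")  # None when the tag has no src attribute
        if src:
            # Turn relative paths such as /assets/logo.png into full URLs.
            urls.append(urljoin(base_url, src))
    return urls


# Hypothetical HTML; in the real script this would be result.text.
sample_html = (
    '<img src="/assets/logo.png">'
    '<img alt="no source">'
    '<img src="https://example.com/pic.jpg">'
)

for url in image_urls(sample_html, "https://bang-dream.com/"):
    print(url)
```

In Qiita01.py, you would pass `result.text` and the URL given to `requests.get()` in place of the sample values.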

## Impressions
Some people may wonder why I wrote such a rudimentary article. The answer is simple: this is as much as I can write about so far. I want to deepen my understanding of Python further.