python beautifulsoup requests glob find_all

Sample code 1 (specify a URL)

import requests
from bs4 import BeautifulSoup

url = 'https://xxx'
r = requests.get(url)
r.raise_for_status()  # Stop here if the server returned an error status (4xx/5xx)

soup = BeautifulSoup(r.text, 'html.parser')

# Display the text of each p tag
tag_p = soup.find_all('p')
for p in tag_p:
    print(p.text)

# --- Examples of the find_all method (the find method takes the same arguments) ---
# Specify an attribute (id)
ids = soup.find_all(id='sample')

# Specify an attribute (class); the keyword is class_ because class is reserved in Python
clss = soup.find_all(class_='sample')

# Specify both a tag name and an attribute
divs = soup.find_all('div', class_='sample')

# Specify several tag names; a tag matching any of them is returned
tags = soup.find_all(['a', 'b', 'c'])
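
To see what these calls return without fetching a page, here is a minimal sketch that runs the same kinds of searches against a small inline HTML string. The HTML and the variable names are made up for illustration and are not part of the original sample.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for illustration (not from the original sample)
html = """
<div id="main" class="sample"><p>first</p></div>
<div class="sample other"><p>second</p><a href="/x">link</a></div>
<span class="sample">third</span>
"""

soup = BeautifulSoup(html, 'html.parser')

# find_all returns a list-like ResultSet containing every match
divs = soup.find_all('div', class_='sample')
div_count = len(divs)

# find returns only the first match, or None when nothing matches
first_p = soup.find('p')
first_p_text = first_p.text
missing = soup.find('table')

# class_ matches a tag if any one of its classes equals the given value
sample_names = [tag.name for tag in soup.find_all(class_='sample')]

# A list of tag names matches any of them, in document order
mixed_names = [tag.name for tag in soup.find_all(['p', 'a'])]

print(div_count, first_p_text, missing, sample_names, mixed_names)
```

Because find can return None, it is safer to check the result before reading .text from it.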

Sample code 2 (specify a file)


from glob import glob
from bs4 import BeautifulSoup

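# Target the .htm files in the current working directory
# (this pattern does not match .html; use '*.html' or '*.htm*' for those)
files = glob('*.htm')

for file in files:
    # Open with a with statement so the file is closed after reading
    with open(file, 'r', encoding='utf-8') as f:
        soup = BeautifulSoup(f.read(), 'html.parser')

    # Display the text of each p tag
    tag_p = soup.find_all('p')
    for p in tag_p:
        print(p.text)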
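
If the text is needed later rather than printed right away, the same loop can be wrapped in a function that returns the results per file. The following is only a sketch under some assumptions: the helper name collect_p_texts and its parameters are hypothetical, and it uses pathlib from the standard library instead of glob.

```python
from pathlib import Path
from bs4 import BeautifulSoup


def collect_p_texts(directory, pattern='*.htm'):
    """Hypothetical helper: map each matching file name to the text of its p tags."""
    results = {}
    # Path.glob yields the files in `directory` whose names match `pattern`
    for path in sorted(Path(directory).glob(pattern)):
        html = path.read_text(encoding='utf-8')
        soup = BeautifulSoup(html, 'html.parser')
        # get_text(strip=True) trims the whitespace around each paragraph's text
        results[path.name] = [p.get_text(strip=True) for p in soup.find_all('p')]
    return results


if __name__ == '__main__':
    # Print the paragraphs found in each .htm file of the current directory
    for name, texts in collect_p_texts('.').items():
        print(name, texts)
```

Passing pattern='*.htm*' would pick up both .htm and .html files.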
