[PYTHON] Beautiful Soup memo

Documentation: http://www.crummy.com/software/BeautifulSoup/bs4/doc/

BeautifulSoup


from bs4 import BeautifulSoup

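# raw is the HTML of the downloaded web page
# Naming the parser explicitly avoids the "no parser was explicitly specified" warning
soup = BeautifulSoup(raw, 'html.parser')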

# find_all: get every matching tag as a list (findAll is the older name for the same method)
# Below: get every ul whose class is image-items
ul_items = soup.find_all('ul', class_='image-items')

# find: get only the first matching tag
# item is one element of ul_items (here, the first one)
item = ul_items[0]
a = item.find('a')
# To look up by id, pass it as a keyword argument
sample = soup.find(id='template-embed-sample')

# Get an attribute value
# Get the link destination (href) of the a tag
link = a.attrs['href']
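
The calls above can be put together into one small script. This is only a sketch: the HTML string is a made-up stand-in for a downloaded page, and the href values are hypothetical. It also uses a.get('href') instead of a.attrs['href'], which returns None rather than raising KeyError when the attribute is missing.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page (not from the original memo).
raw = """
<ul class="image-items">
  <li><a href="/items/1">first</a></li>
  <li><a href="/items/2">second</a></li>
  <li><a>no link here</a></li>
</ul>
<div id="template-embed-sample">sample</div>
"""

soup = BeautifulSoup(raw, "html.parser")

links = []
for ul in soup.find_all("ul", class_="image-items"):
    for a in ul.find_all("a"):
        # .get() returns None when href is absent; a["href"] would raise KeyError
        href = a.get("href")
        if href is not None:
            links.append(href)

# Look up a single element by its id attribute
sample = soup.find(id="template-embed-sample")

print(links)
print(sample.text)
```

If every a tag is guaranteed to have an href, a.attrs['href'] (or the shorter a['href']) works just as well.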

The object returned by find is a Tag that also holds its child elements, so you can call find on it again to search inside it. For example, given the following HTML:

<div><span>hogehoge</span></div>

you can get hogehoge like this:

div = soup.find('div')
span = div.find('span')  # find the span inside the div
print(span.text)
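
One thing to watch with chained find calls is that find returns None when nothing matches, so the next call in the chain fails with AttributeError. Below is a minimal sketch of the same lookup with that case guarded, plus the equivalent CSS selector via select_one. The variable names are only for illustration.

```python
from bs4 import BeautifulSoup

html = "<div><span>hogehoge</span></div>"
soup = BeautifulSoup(html, "html.parser")

# Step-by-step lookup, guarding against find() returning None
div = soup.find("div")
span = div.find("span") if div is not None else None
text = span.get_text() if span is not None else ""
print(text)

# The same lookup written as a CSS selector
selected = soup.select_one("div > span")
print(selected.get_text())

# A tag that does not exist yields None rather than raising an error
missing = soup.find("p")
print(missing)
```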
