[PYTHON] [Memo] How to use BeautifulSoup4 (1) Display html

A memo on scraping with Beautiful Soup in a Jupyter Notebook.

In [1] Import Beautiful Soup

In[1]


from bs4 import BeautifulSoup

In [2] Store the html of the article you want to scrape in the variable kiji

In[2]


kiji = """<html>
        <head>
           <title>I posted it on Qiita</title>
        </head>
        <body>
           <p class="title">
              <b>Challenge Qiita for output.</b>
           </p>
           <p class="article">
              <b>I will do my best to write an article.</b>
           </p>
        </body>
   </html>"""

Write the HTML you want to store between """ and """.

In [3] Load the html stored in the variable kiji into BeautifulSoup.

In[3]


soup = BeautifulSoup(kiji,"html.parser")

Write BeautifulSoup(variable containing the stored HTML, "parser you want to use"). This time it is (kiji, "html.parser"). Be careful not to forget the "" around the parser name, and not to misspell it, for example as htmlparser.
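As a small sketch of the warning above, the cell below parses a one-line HTML string with the correct parser name and then with the misspelled one. The variable `sample` is a hypothetical stand-in for kiji, and catching `FeatureNotFound` is just one way to see what goes wrong. The exact error message may vary by version.

```python
from bs4 import BeautifulSoup, FeatureNotFound

sample = "<p>Hello</p>"  # hypothetical stand-in for kiji

# Correct: the parser name is passed as the string "html.parser".
soup = BeautifulSoup(sample, "html.parser")
print(soup.p)

# Misspelled parser name: Beautiful Soup cannot find a matching parser.
try:
    BeautifulSoup(sample, "htmlparser")
    misspelled_ok = True
except FeatureNotFound as err:
    misspelled_ok = False
    print("FeatureNotFound:", err)
```

Leaving the quotes off entirely (html.parser without "") fails even earlier, because Python then looks for a variable named html.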

In [4] Use prettify on soup to make the HTML easier to read.

In[4]


print(soup.prettify())

By using prettify(), the HTML is indented by nesting level, which makes it easy to read.
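To see what prettify() changes, it may help to compare it with the plain string form of the same soup. This is a minimal sketch. The variable `sample` is a hypothetical one-line document, not the kiji from this memo.

```python
from bs4 import BeautifulSoup

sample = "<p><b>Hello</b></p>"  # hypothetical one-line HTML
soup = BeautifulSoup(sample, "html.parser")

compact = str(soup)       # the markup as a single line
pretty = soup.prettify()  # one tag or text node per line, indented by depth

print(compact)
print(pretty)
```

The compact form stays on one line, while the prettified form is spread over several indented lines.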

In [4] Output result

In[4]


<html>
 <head>
  <title>
   I posted it on Qiita
  </title>
 </head>
 <body>
  <p class="title">
   <b>
    Challenge Qiita for output.
   </b>
  </p>
  <p class="article">
   <b>
    I will do my best to write an article.
   </b>
  </p>
 </body>
</html>

In [5] Display the title

In[5]


print(soup.html.head.title)

In [5] Output result


<title>I posted it on Qiita</title>
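The result above is the whole title tag. As a hedged aside, the sketch below shows two related forms that Beautiful Soup supports: the shortcut soup.title, and .string to get only the text inside the tag. The variable `sample` is a shortened, hypothetical version of kiji.

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical version of kiji
sample = """<html>
        <head>
           <title>I posted it on Qiita</title>
        </head>
   </html>"""
soup = BeautifulSoup(sample, "html.parser")

full_path = soup.html.head.title  # step through the tags one by one
shortcut = soup.title             # jumps to the first <title> tag
title_text = soup.title.string    # only the text inside the tag

print(full_path)
print(shortcut)
print(title_text)
```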
