This time, we will have a Raspberry Pi read aloud a stock price obtained by web scraping, using OpenJtalk.
・ Python 3 and OpenJtalk are available on the Raspberry Pi (installing OpenJtalk is explained in this article, so please see it if you haven't set it up yet!)
・ Raspberry Pi 3 Model B
・ OS: Raspbian
・ Python 3.7
The code is based on the following articles.
・ Web scraping with Python3: https://qiita.com/Senple/items/724e36fc1f66f5b14231
・ Get stock price: https://qiita.com/Azunyan1111/items/9b3d16428d2bcc7c9406
Note that web scraping can run afoul of the law if you do it carelessly, so if you are new to web scraping, I recommend reading this article first.
Let's write the code.
This Python 3 code gets the stock price from the Nikkei newspaper site and prints it. For what each part of the code means, please refer to this article.
webscraping_test.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import urllib3  # this is installed by default
from bs4 import BeautifulSoup
import certifi

# URL of the page to scrape
url = "https://www.nikkei.com/markets/kabu/"

# Fetch the page; the response body is the HTML, e.g.
# <html><head><title>Economic, stock, business and political news:Nikkei electronic version</title></head><body....
http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED',
    ca_certs=certifi.where())
r = http.request('GET', url)

# Parse the HTML with Beautiful Soup
soup = BeautifulSoup(r.data, "html.parser")

# Extract all span elements; find_all returns them as a list, e.g.
# [<span class="m-wficon triDown"></span>, <span class="l-h...
# (a span wraps inline text; unlike p or div, it does not introduce a line break)
spans = soup.find_all("span")

stockprice = None  # stays None if no matching span is found
for tag in spans:
    # tag.get("class") returns the tag's classes as a list, or None if no class is set
    classes = tag.get("class") or []
    if "mkc-stock_prices" in classes:
        stockprice = tag.string
        break

print(stockprice)
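A note on matching class names: in Python, `in` applied to a string is a substring test, while `in` applied to a list checks for an exact element. Since tag.get("class") in Beautiful Soup returns a list of class names (or None when no class is set), the difference matters. The following is only a minimal sketch to illustrate it; the helper name has_class is my own and is not part of Beautiful Soup.

```python
def has_class(classes, wanted):
    """Return True if `wanted` is exactly one of the names in `classes`.

    `classes` stands in for the value of tag.get("class"):
    a list of class names, or None when the tag has no class attribute.
    """
    return wanted in (classes or [])


# Substring test on a string: the short name "m" is found inside the longer one
substring_hit = "m" in "mkc-stock_prices"

# Membership test on a list: only an exact class name matches
exact_hit = has_class(["mkc-stock_prices", "bold"], "mkc-stock_prices")
short_name_hit = has_class(["m"], "mkc-stock_prices")
no_class_hit = has_class(None, "mkc-stock_prices")

print(substring_hit, exact_hit, short_name_hit, no_class_hit)
```

Comparing against the list avoids picking up an unrelated span whose class name merely happens to be part of the target name.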
To read the scraped information aloud right away, you need to run OpenJtalk from Python. ([Install OpenJtalk from here](https://qiita.com/coffiego/items/4fc3b0be78fcded3eef0)) There was an article on running OpenJtalk from Python, so I will use its code. Copy and paste this code into a file in the same directory as before.
jtalk.py
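#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import subprocess
from datetime import datetime


def jtalk(t):
    """Synthesize the text t with open_jtalk and play it with aplay."""
    open_jtalk = ['open_jtalk']
    mech = ['-x', '/var/lib/mecab/dic/open-jtalk/naist-jdic']  # dictionary directory
    htsvoice = ['-m', '/usr/share/hts-voice/mei/mei_normal.htsvoice']  # voice file
    speed = ['-r', '1.0']  # speech rate
    outwav = ['-ow', 'open_jtalk.wav']  # output WAV file
    cmd = open_jtalk + mech + htsvoice + speed + outwav
    # Pass the text to open_jtalk on standard input and wait for it to finish
    c = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    c.stdin.write(t.encode())
    c.stdin.close()
    c.wait()
    # Play the generated WAV file (-q suppresses aplay's messages)
    aplay = ['aplay', '-q', 'open_jtalk.wav']
    wr = subprocess.Popen(aplay)


def say_datetime():
    """Read the current date and time aloud."""
    d = datetime.now()
    # Note: OpenJtalk synthesizes Japanese speech, so the text passed to it should be Japanese
    text = '%s month%s day,%s time%s minutes%s seconds' % (d.month, d.day, d.hour, d.minute, d.second)
    jtalk(text)


if __name__ == '__main__':
    say_datetime()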
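If you want to change the speaking rate or the voice without editing jtalk.py each time, the command list could be built by a small function. This is just a sketch, assuming the same flags (-x, -m, -r, -ow) and paths as in jtalk.py above; build_open_jtalk_cmd is a hypothetical helper name, not something OpenJtalk provides.

```python
def build_open_jtalk_cmd(
    speed=1.0,
    dic_dir='/var/lib/mecab/dic/open-jtalk/naist-jdic',
    voice='/usr/share/hts-voice/mei/mei_normal.htsvoice',
    outwav='open_jtalk.wav',
):
    """Build the open_jtalk argument list (hypothetical helper)."""
    return [
        'open_jtalk',
        '-x', dic_dir,      # dictionary directory
        '-m', voice,        # voice file
        '-r', str(speed),   # speech rate, passed as a string
        '-ow', outwav,      # output WAV file
    ]


# Example: the same command as in jtalk.py, but with a slower speech rate
cmd = build_open_jtalk_cmd(speed=0.8)
print(cmd)
```

The resulting list can be passed to subprocess.Popen(cmd, stdin=subprocess.PIPE) in the same way as in jtalk().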
After that, import this module in the web scraping code, and you can output speech with jtalk.jtalk("what you want to say")!
Finally, combine the two. That said, it only takes adding a few lines. Here is the resulting code.
webscraping_jtalk_test.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Python 3
# Reads the scraped stock price aloud with OpenJtalk.

# Web scraping libraries
import urllib3  # this is installed by default
from bs4 import BeautifulSoup
import certifi

# jtalk.py from the same directory is imported here
import jtalk

# URL of the page to scrape
url = "https://www.nikkei.com/markets/kabu/"

# Fetch the page; the response body is the HTML, e.g.
# <html><head><title>Economic, stock, business and political news:Nikkei electronic version</title></head><body....
http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED',
    ca_certs=certifi.where())
r = http.request('GET', url)

# Parse the HTML with Beautiful Soup
soup = BeautifulSoup(r.data, "html.parser")

# Extract all span elements; find_all returns them as a list, e.g.
# [<span class="m-wficon triDown"></span>, <span class="l-h...
# (a span wraps inline text; unlike p or div, it does not introduce a line break)
spans = soup.find_all("span")

stockprice = None  # stays None if no matching span is found
for tag in spans:
    # tag.get("class") returns the tag's classes as a list, or None if no class is set
    classes = tag.get("class") or []
    if "mkc-stock_prices" in classes:
        stockprice = tag.string
        break

jtalk.jtalk(str(stockprice))  # read the price aloud with OpenJtalk
#print(stockprice)
Once you have confirmed that jtalk.py and webscraping_jtalk_test.py are in the same directory, run the following and it will speak.
$ ls
jtalk.py webscraping_jtalk_test.py
$ python3 webscraping_jtalk_test.py
It spoke! But it just reads the digits out one by one, lol. Still, I think this can be applied in all sorts of ways.
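I have not verified why the digits are read one by one, but one possible cause is the thousands separator in strings such as "23,456.78" (a made-up sample value, not a real quote). Under that assumption, a small cleanup before calling jtalk.jtalk might help. normalize_price is a hypothetical helper name, and whether this actually changes the reading is something you would need to try on your own setup.

```python
def normalize_price(text):
    """Strip surrounding whitespace and thousands separators from a price string."""
    return str(text).strip().replace(',', '')


sample = ' 23,456.78 '  # made-up sample value for illustration
cleaned = normalize_price(sample)
print(cleaned)
```

In the combined script, this would be used as jtalk.jtalk(normalize_price(stockprice)).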
This time, I tried web scraping + OpenJtalk! In the same way, you can also have it read aloud information obtained from an API. You could write a program that runs at a fixed time in the morning and reads out the weather and news you care about. See you next time!
[Addition] I wrote a program that reads out the weather in the following article, so please take a look if you are interested! URL: https://qiita.com/coffiego/items/ec050e6106a7424c048b
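For the idea of running at a fixed time in the morning, one simple approach is to compute how long to wait until the next occurrence of that time. The following is only a sketch; seconds_until is a hypothetical helper name, and 7:00 is just an example time.

```python
from datetime import datetime, timedelta


def seconds_until(hour, minute, now=None):
    """Return the number of seconds from `now` until the next hour:minute."""
    now = now or datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # The time has already passed today, so aim for tomorrow
        target += timedelta(days=1)
    return (target - now).total_seconds()


# Example with a fixed "now" so the result does not depend on the clock
wait = seconds_until(7, 0, now=datetime(2020, 1, 1, 6, 0, 0))
print(wait)
```

A script could call time.sleep(seconds_until(7, 0)) and then run the scraping and jtalk code. Registering the script with cron on the Raspberry Pi is another common way to do the same thing.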