Scraping with Python: Getting the base price (NAV) of investment trusts from the Investment Trusts Association website

About this article

This script uses Python and lxml to scrape the base price (NAV) of investment trusts from the website of the Investment Trusts Association (Comprehensive Investment Trust Search Library). I wrote it as a replacement for the Yahoo! Finance version I published earlier, because the Yahoo! Finance terms of use prohibit web scraping.

The latest version is published on GitHub: sawadyrr5 / PyFundJP (a script for acquiring information on investment trusts in Japan).

Funds are looked up by ISIN code instead of fund code, but the basic usage is exactly the same as in the Yahoo! Finance version.
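
To make the ISIN-based lookup concrete, here is a minimal sketch in Python 3 of how the query URL could be assembled. The endpoint and parameter names are copied from the URL in the script in this article. The helper name `build_nav_url` is hypothetical, and whether the site still accepts this URL has not been verified.

```python
from urllib.parse import urlencode

# Endpoint taken from the script in this article (not re-verified against the live site).
BASE_URL = 'http://tskl.toushin.or.jp/FdsWeb/view/FDST030004.seam'


def build_nav_url(isin, sy, sm, sd, ey, em, ed):
    """Build the search URL for one fund (by ISIN) and one date range.

    build_nav_url is a hypothetical helper name; the query parameter
    names are the ones that appear in the article's URL string.
    """
    params = [
        ('isinCd', isin),        # ISIN code of the fund
        ('stdDateFromY', sy),    # start year
        ('stdDateFromM', sm),    # start month
        ('stdDateFromD', sd),    # start day
        ('stdDateToY', ey),      # end year
        ('stdDateToM', em),      # end month
        ('stdDateToD', ed),      # end day
        ('showFlg', '1'),
        ('adminFlag', '1'),
    ]
    # A list of tuples keeps the parameters in the order given above.
    return BASE_URL + '?' + urlencode(params)


url = build_nav_url('JP90C000A931', '2015', '12', '01', '2015', '12', '31')
print(url)
```

Using `urlencode` rather than string formatting means the values are escaped automatically if they ever contain characters that are not URL-safe.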

getNAV


# -*- coding: utf-8 -*-
# python 2.7
#Scrape base price data from the Investment Trusts Association
import lxml.html
import datetime

def getNAV(isin, sy, sm, sd, ey, em, ed):
    #Pack the arguments into a dict
    d = dict(isin=isin, sy=sy, sm=sm, sd=sd, ey=ey, em=em, ed=ed)

    #Unpack dict to generate URL
    url = 'http://tskl.toushin.or.jp/FdsWeb/view/FDST030004.seam?isinCd={isin}\
&stdDateFromY={sy}&stdDateFromM={sm}&stdDateFromD={sd}\
&stdDateToY={ey}&stdDateToM={em}&stdDateToD={ed}&showFlg=1&adminFlag=1'.format(**d)

    #Get ElementTree
    tree = lxml.html.parse(url)

    #Get all the date, base price, and net assets elements, using map to convert each one to utf-8
    contents = map(lambda html: html.text.encode('utf-8').replace('\n',''), tree.xpath('//*[@id="showList"]//label'))

    #The result is one flat list, so split it into [[date, price, cap], [date, price, cap], ...]
    res = []
    for i in range(0, len(contents)-1, 3):
        date = datetime.datetime.strptime(contents[i], '%Y年%m月%d日').strftime('%Y%m%d')
        price = int(contents[i+1].replace(',','').replace('円',''))
        cap = contents[i+2].replace(',','').replace('億円','')
        res.append([date, price, cap])

    return res

if __name__ == '__main__':
    #Pack the parameters into a dict (fund: Japanese Stock Alpha Quartet (monthly distribution type))
    args = dict(isin='JP90C000A931', sy='2015', sm='12', sd='01', ey='2015', em='12', ed='31')
    #Pass the dict, unpacked into keyword arguments
    print getNAV(**args)
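
The grouping step in the script can be tried without network access. The following is a minimal Python 3 sketch of that step only. It assumes the page renders dates like 2015年12月01日, prices with a 円 suffix, and net assets with a 億円 suffix. The function name `parse_rows` and the sample values are hypothetical and are not real data from the site.

```python
import datetime


def parse_rows(labels):
    """Group a flat list of label texts into [date, price, cap] rows.

    parse_rows is a hypothetical name; the logic mirrors the loop in
    getNAV, but takes plain strings so it runs without lxml or a network.
    """
    # Drop newlines and surrounding whitespace from each label text.
    cleaned = [text.replace('\n', '').strip() for text in labels]

    rows = []
    # Walk over complete triples only; a trailing partial group is ignored.
    for i in range(0, len(cleaned) - len(cleaned) % 3, 3):
        # Date: '2015年12月01日' -> '20151201'
        date = datetime.datetime.strptime(cleaned[i], '%Y年%m月%d日').strftime('%Y%m%d')
        # Base price: '4,321円' -> 4321
        price = int(cleaned[i + 1].replace(',', '').replace('円', ''))
        # Net assets stay a string, as in the original script: '1,234.56億円' -> '1234.56'
        cap = cleaned[i + 2].replace(',', '').replace('億円', '')
        rows.append([date, price, cap])
    return rows


# Hypothetical sample values, made up for illustration only.
sample = [
    '2015年12月01日', '4,321円', '1,234.56億円',
    '2015年12月02日\n', '4,300円', '1,230.00億円',
]
rows = parse_rows(sample)
print(rows)
```

Keeping the parsing separate from the HTTP request makes it easier to check the date and number handling against saved label texts before pointing the script at the live page.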
