About this article

This is a script that uses Python and lxml to scrape the net asset value (NAV, or base price) of investment trusts from the website of the Investment Trusts Association (the Comprehensive Investment Trust Search Library). The Yahoo! Finance version I published earlier turned out to be against Yahoo! Finance's terms of use, which prohibit web scraping, so I wrote this one as a replacement.

The code is published on GitHub, so see the repository for the latest version: sawadyrr5 / PyFundJP: Script for acquiring information on investment trusts in Japan

Funds are looked up by ISIN code instead of fund code, but otherwise the usage is exactly the same as in the earlier version.
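
As a rough illustration of what looking up a fund by ISIN code amounts to, the request is just a URL carrying the ISIN and the date range as query parameters. The sketch below assumes Python 3 (the script in this article targets Python 2.7); the helper name `build_nav_url` and the use of `datetime.date` arguments are my own choices for illustration, while the endpoint and parameter names are the ones the script uses.

```python
from datetime import date
from urllib.parse import urlencode

# Endpoint used by the script in this article
BASE_URL = 'http://tskl.toushin.or.jp/FdsWeb/view/FDST030004.seam'


def build_nav_url(isin, start, end):
    """Build the search URL for one fund (hypothetical helper, not part of PyFundJP).

    isin  -- ISIN code of the fund, e.g. 'JP90C000A931'
    start -- first day of the period (datetime.date)
    end   -- last day of the period (datetime.date)
    """
    params = [
        ('isinCd', isin),
        ('stdDateFromY', start.strftime('%Y')),
        ('stdDateFromM', start.strftime('%m')),  # zero-padded month
        ('stdDateFromD', start.strftime('%d')),  # zero-padded day
        ('stdDateToY', end.strftime('%Y')),
        ('stdDateToM', end.strftime('%m')),
        ('stdDateToD', end.strftime('%d')),
        ('showFlg', '1'),
        ('adminFlag', '1'),
    ]
    # urlencode keeps the parameter order and escapes values if needed
    return BASE_URL + '?' + urlencode(params)


if __name__ == '__main__':
    print(build_nav_url('JP90C000A931', date(2015, 12, 1), date(2015, 12, 31)))
```

Passing dates as `date` objects rather than separate strings avoids having to remember the zero padding by hand, but that is only a design preference.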

`getNAV`


# -*- coding: utf-8 -*-
# Python 2.7
# Scrape NAV (base price) data from the Investment Trusts Association
import lxml.html
import datetime

def getNAV(isin, sy, sm, sd, ey, em, ed):
    # Collect the arguments into a dict
    d = dict(isin=isin, sy=sy, sm=sm, sd=sd, ey=ey, em=em, ed=ed)

    # Unpack the dict to build the URL
    url = 'http://tskl.toushin.or.jp/FdsWeb/view/FDST030004.seam?isinCd={isin}\
&stdDateFromY={sy}&stdDateFromM={sm}&stdDateFromD={sd}\
&stdDateToY={ey}&stdDateToM={em}&stdDateToD={ed}&showFlg=1&adminFlag=1'.format(**d)

    # Fetch the page and parse it into an ElementTree
    tree = lxml.html.parse(url)

    # Collect every date, NAV, and net-assets label, using map to encode each as UTF-8 and strip newlines
    contents = map(lambda html: html.text.encode('utf-8').replace('\n',''), tree.xpath('//*[@id="showList"]//label'))

    # contents is one flat list, so split it into [[date, price, cap], [date, price, cap], ...]
    res = []
    for i in range(0, len(contents)-1, 3):
        date = datetime.datetime.strptime(contents[i], '%Y年%m月%d日').strftime('%Y%m%d')
        price = int(contents[i+1].replace(',','').replace('円',''))
        cap = contents[i+2].replace(',','').replace('億円','')
        res.append([date, price, cap])

    return res

if __name__ == '__main__':
    # Put the parameters into a dict (fund: Japanese Stock Alpha Quartet (monthly distribution type))
    args = dict(isin='JP90C000A931', sy='2015', sm='12', sd='01', ey='2015', em='12', ed='31')
    # Pass the dict to the function, unpacking it into keyword arguments
    print getNAV(**args)
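
The step that turns the flat list of label texts into rows can be tried without any network access. The following is a minimal sketch, assuming Python 3 and assuming the page's labels carry the Japanese suffixes 年/月/日, 円 and 億円; the helper name `parse_rows` and the sample values are made up purely for illustration and are not real fund data.

```python
import datetime


def parse_rows(labels):
    """Group a flat [date, price, cap, date, price, cap, ...] list into rows.

    Hypothetical helper mirroring the loop in getNAV; any trailing
    incomplete group (fewer than 3 items) is ignored.
    """
    rows = []
    usable = len(labels) - len(labels) % 3
    for i in range(0, usable, 3):
        date_text, price_text, cap_text = (s.strip() for s in labels[i:i + 3])
        # '2015年12月01日' -> '20151201'
        date = datetime.datetime.strptime(date_text, '%Y年%m月%d日').strftime('%Y%m%d')
        # '10,000円' -> 10000
        price = int(price_text.replace(',', '').replace('円', ''))
        # '1,234.56億円' -> '1234.56' (kept as a string, as in getNAV)
        cap = cap_text.replace(',', '').replace('億円', '')
        rows.append([date, price, cap])
    return rows


if __name__ == '__main__':
    # Made-up sample labels, not actual data from the site
    sample = ['2015年12月01日', '10,000円', '1,234.56億円',
              '2015年12月02日', '9,950円', '1,230.00億円']
    for row in parse_rows(sample):
        print(row)
```

Keeping this step separate from the download makes it easy to check the parsing against a few hand-written labels before pointing the script at the live site.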

Scraping with Python-Getting the base price of mutual funds from the investment trust association web
