Output product information to csv using Rakuten product search API [Python]

Introduction

I used Rakuten Ichiba's API to output product information matching a keyword to csv.

I used this "Rakuten product search API". Rakuten Web Service: Rakuten Product Search API (version: 2017-07-06) \ | API List

Development environment and libraries to use

I used Jupyter Notebook as the development environment. For a large-scale tool or one you want to run regularly, another editor may be a better fit, but for a small one-shot tool like this, Jupyter Notebook is very convenient: you can write the script bit by bit, trying each step, and execute it immediately.

The libraries used are requests and pandas. I used requests to call the API and pandas to manipulate the retrieved data and output it to csv.

Purpose

This was done as a price survey for selling agricultural products. The plan is to analyze the acquired information further and make decisions based on it (this time, we only acquire the information).

There are various direct-sales sites, but Rakuten Ichiba is familiar, carries a large number of products, and has an API, so I thought the data would be easy to acquire.

Preparing to handle Rakuten API

To use the API, you must first create an app from Rakuten's developer page and obtain an app ID before you start writing a script.

This is the Rakuten Developers site: Rakuten Web Service: API List.

Create an app from "+ Issue App ID" at the upper right. By using the app ID obtained here in your own script, you will be able to access and retrieve information from Rakuten Ichiba.

There are also APIs for other Rakuten services (Rakuten Travel, Rakuten Recipes, etc.) besides Rakuten Ichiba. I would like to try them if I get the chance.

Script to get product information

(1) Enter keywords to get product information

This time, we will acquire product information that matches the potato variety name "May Queen" (メークイン) as the keyword.

First, import the required libraries.

import requests
import numpy as np
import pandas as pd

I will use NumPy later, so I import it here as well; there is no problem if you leave it out. Next is the script that calls the API to get the information.

REQUEST_URL = "https://app.rakuten.co.jp/services/api/IchibaItem/Search/20170706"
APP_ID="<Enter the app ID obtained from Rakuten's site here>"

serch_keyword = 'Make-in'

serch_params={
    "format" : "json",
    "keyword" : serch_keyword,
    "applicationId" : [APP_ID],
    "availability" : 0,
    "hits" : 30,
    "page" : 1,
    "sort" : "-updateTimestamp"
}

response = requests.get(REQUEST_URL, serch_params)
result = response.json()

Now you can get the information as a list of dicts via result['Items']. This time, 30 products were acquired (the value specified by "hits": 30 in serch_params; this is the maximum number that can be acquired in a single request).
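As a quick sanity check, something like this (a small sketch using the result from above) makes the structure easier to see:

items = result['Items']
print(len(items))                         # number of items returned, up to the "hits" value (30 here)
print(list(items[0]['Item'].keys())[:5])  # peek at the first few output-parameter keys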

Furthermore, for example, result['Items'][2]['Item'] gives you the item at index 2 of the acquired list as a dict. Taking a quick look at the script:

REQUEST_URL is the request URL listed in Rakuten Web Service: Rakuten Product Search API (version: 2017-07-06), and APP_ID is where you enter the app ID obtained earlier from Rakuten's developer page.

By specifying the string you want to search for in serch_keyword, products matching that keyword will be searched. It would also be easy to accept user input here with Python's input() function.
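For instance, a variation like this (a small sketch, not part of my actual script) would let the user type the keyword:

serch_keyword = input('Enter a search keyword: ')  # hypothetical: take the keyword from user input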

In serch_params, write the request parameters as a dict. The details are listed in the "Input Parameters" section of Rakuten Web Service: Rakuten Product Search API (version: 2017-07-06). Of these parameters, applicationId (the app ID) is required, and at least one of keyword, shopCode, itemCode, or genreId must also be specified. This time I want to get product information by search keyword, so I set keyword to the serch_keyword defined earlier.

For example, "page": 1 is the page to acquire, so it should be easy to collect a large amount of product information across multiple pages by looping over this number with a for statement.
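A loop along these lines could do it (a rough sketch building on the serch_params above; the three-page range and the one-second pause are my own choices, not from the original script):

import time

all_items = []
for page in range(1, 4):  # e.g. the first 3 pages; adjust the range as needed
    serch_params['page'] = page
    response = requests.get(REQUEST_URL, serch_params)
    result = response.json()
    all_items.extend(result['Items'])  # accumulate the items from each page
    time.sleep(1)  # small pause between requests so as not to hammer the API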

(2) Create a dict containing only the necessary product information

By the way, the dict obtained by calling the API contains, as its keys and values, the items listed in the "Output Parameters" section of Rakuten Web Service: Rakuten Product Search API (version: 2017-07-06) (https://webservice.rakuten.co.jp/api/ichibaitemsearch/).

For example, if you specify the key like result['Items'][2]['Item']['itemName'], you can get the product name.
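A tiny sketch, again using the result obtained above:

item = result['Items'][2]['Item']
print(item['itemName'], item['itemPrice'])  # product name and price of the item at index 2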

The information acquired at this stage is inconvenient to handle as-is because it contains extra information, so I will make a dict containing only what is needed.

The data we need this time are itemName, itemPrice, itemCaption, shopName, shopUrl, and itemUrl. (Later I realized that the shipping flag postageFlag was also needed, but it is not reflected in the script below.)

# Loop over the results and keep only the keys we need for each item
item_key = ['itemName', 'itemPrice', 'itemCaption', 'shopName', 'shopUrl', 'itemUrl']
item_list = []
for i in range(0, len(result['Items'])):
    tmp_item = {}
    item = result['Items'][i]['Item']
    for key, value in item.items():
        if key in item_key:
            tmp_item[key] = value
    item_list.append(tmp_item)

Now you have a list of dicts, one per product, each containing only the necessary information.
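As an aside, if you also want the shipping flag postageFlag mentioned above, adding it to item_key should be enough (a small sketch, not what I actually ran):

item_key = ['itemName', 'itemPrice', 'itemCaption', 'shopName', 'shopUrl', 'itemUrl', 'postageFlag']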

What tripped me up here was having to use the copy() method, as in item_list.append(tmp_item.copy()). If the same dict is reused across loop iterations and you append it with item_list.append(tmp_item) without copying, every element of the list ends up referring to the same dict, so you get one product repeated many times; that left me scratching my head for days. (In the script above, tmp_item is created fresh inside the loop, so the copy is not strictly necessary there.)

The following article helped me.

When you append a dict type variable to a Python list, the variable behaves like a pointer ... · GitHub

I feel I need to understand this behavior properly, so I would like to summarize it in a separate post.
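A minimal illustration of that behavior, separate from the Rakuten data:

# A dict appended to a list is stored by reference, so later changes show up in the list
tmp = {}
shared = []
for n in range(3):
    tmp['value'] = n
    shared.append(tmp)         # every element points at the same dict
print(shared)                  # [{'value': 2}, {'value': 2}, {'value': 2}]

copied = []
for n in range(3):
    tmp['value'] = n
    copied.append(tmp.copy())  # appending a copy keeps each element independent
print(copied)                  # [{'value': 0}, {'value': 1}, {'value': 2}]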

(3) Format data with pandas

Once you have the list of dicts, the rest is not difficult; basic pandas operations are enough. Create a DataFrame and format it a little to make it easier to use.

# Create a DataFrame from the list of dicts
items_df = pd.DataFrame(item_list)

# Change the order of the columns
items_df = items_df.reindex(columns=['itemName', 'itemPrice', 'itemCaption', 'itemUrl', 'shopName', 'shopUrl'])

# Rename the columns and number the rows serially starting from 1
items_df.columns = ['Product name', 'Product price', 'Product description', 'Product URL', 'Store name', 'Store URL']
items_df.index = np.arange(1, len(items_df) + 1)

(4) csv output

Output the created data frame to a csv file.

items_df.to_csv('./rakuten_mayqueen.csv')

In the argument of the df.to_csv() method, specify the destination path (directory and file name). This time I used a relative path to create the csv file directly under the directory where this script is located.
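One aside that depends on your environment rather than on the original script: the product names and descriptions are Japanese text, so if Excel garbles them you can write the csv with a BOM, which Excel recognizes as UTF-8:

items_df.to_csv('./rakuten_mayqueen.csv', encoding='utf-8-sig')  # BOM helps Excel detect UTF-8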

Now, let's open the output data in Excel or Google Sheets.

(Screenshot: the output csv opened in a spreadsheet.)

I was able to get it nicely!

Conclusion

For the time being, I was able to get product information from Rakuten Ichiba and output it to csv. As next steps:

**(1) Data collection and shaping** Collect as much data as needed and shape it into a usable form.

**(2) Analysis and decision-making** Attempt reasonable pricing using the collected data as the basis for decisions.

So next time, I'd like to tackle slightly more complex data collection and shaping.
