Data science It is essential to set data in RDB.
-[Python beginners tried data science] Data acquisition from API [day1] -[Basics of data science] Collecting data from RSS with python
For the time being, the series continues. I plan to continue writing analysis with the acquired data in the future. (plans)
The flow is common because you just put it from csv to db --db set --parse csv --Connect to mysql --Hit sql with csv row data It feels like. I'm a python beginner and I usually enjoy querying with rails, so I tried it while checking each one.
It's usually basic mysql.
import csv
Put the csv module in
with open('path/to/csvfile.csv') as csvfile:
reader = csv.reader(csvfile)
In `csvfile``` contains ``` <open file'path / to / csvfile.csv', mode'r' at 0x1098da030>
`object, and the reader contains csvfile
Contains an object that iterates over the line.
Return a reader object which will iterate over lines in the given csvfile. csvfile can be any object which supports the iterator protocol and returns a string each time its next() method is called — file objects and list objects are both suitable.
From here
I didn't want to include the header in the csvfile, so I skipped the header.
next(reader, None)
The explanation of next () is here.
next(iterarot[, default]) Retrieve the next item from the iterator by calling its next() method. If default is given, it is returned if the iterator is exhausted, otherwise StopIteration is raised.
for row in reader:
#Put line by line in db
import mysql.connector
Put the module in
dbcon = mysql.connector.connect(
database=inifile.get("database", "db"),
user=inifile.get("database", "user"),
password=inifile.get("database", "password"),
host=inifile.get("database", "host")
)
dbcur = dbcon.cursor()
dbcur.execute('INSERT INTO table name(col name, col name) VALUES(%s, "%s")' % ('v1', 'v2'))
dbcon.commit()
Enter with. If you read the air, you can understand it (sorry, it was troublesome to check the behavior of each module's method). It feels like sql is actually being executed by commit. One of the reasons is probably that it is more efficient to hit sql at once when connecting once rather than disconnecting by connecting mysql many times when there are multiple sqls. (speculation)
Like this. (The table name and config are for my environment this time, so please change them appropriately.)
import mysql.connector
import csv
import ConfigParser
inifile = ConfigParser.ConfigParser()
inifile.read("config/database.ini")
dbcon = mysql.connector.connect(
database=inifile.get("database", "db"),
user=inifile.get("database", "user"),
password=inifile.get("database", "password"),
host=inifile.get("database", "host")
)
dbcur = dbcon.cursor()
with open('RCdata/chefmozaccepts.csv') as csvfile:
reader = csv.reader(csvfile)
next(reader, None)
for row in reader:
dbcur.execute('INSERT INTO restaurants_payments_methods (restaurant_id, payment_method) VALUES(%s, "%s")' % tuple(row))
dbcon.commit()
Other things I did not know about python.
Difference between tuple and list from here
just like lists. The differences between tuples and lists are, the tuples cannot be changed unlike lists and tuples use parentheses, whereas lists use square brackets.
I didn't know the tuple.
Recommended Posts