[PYTHON] Sample code summary when working with Google Spreadsheets from Google Colab

Recently, I often touch Google spreadsheets from Google Colab.

(I also made a video like this, so please take a look if you like! → [[Learning Python with Safa Colab -9] How to write values from Google Colab to Google Spreadsheet (YouTube)](https:: //www.youtube.com/watch?v=4RCWYQk9bfM))

When I touch it, I often type in the usual code, so I decided to put the frequently used descriptions together in Qiita as sample code.

Referenced document

Basically, I will write while referring to the official document of gspread. gspread(Docs)

Install gspread

The library gspread, which is essential for working with Google Spreadsheets. On Google Colab, when you hit a command, add ! At the beginning.

!pip install gspread

As I wrote so far, I noticed that Google Colab seemed to have gspread installed from the beginning. By the way, the installed version is 3.0.1.

import gspread
print(gspread.__version__)
# => 3.0.1

One thing to keep in mind here is that this 3.0.1, which is installed from the beginning, is not the latest version, so there are some features that cannot be used. burnash/gspread(Releases)

Therefore, if you want to use the latest gspread function, you need to install the latest gspread by hitting the following command.

!pip install --upgrade gspread
import gspread
print(gspread.__version__)
# => 3.6.0

** However, this time I thought about writing with Google Colab in mind to handle Google spreadsheets immediately, so I will write using 3.0.1 which is installed by default. .. ** ** (Since the functions added in the latest version are also attractive, I would like to write that as a separate post.)

Authentication process required to handle Google Spreadsheets from code

Essential processing for handling Google Spreadsheets from Google Colab. Running this code will give you the authentication required to work with Google Sheets from within Google Colab. (Almost like magic, I use it when writing code that handles Google spreadsheets. If you have trouble writing, you can paste this description on Colab with copy and execute it.)

from google.colab import auth
from oauth2client.client import GoogleCredentials
import gspread

auth.authenticate_user()
gc = gspread.authorize(GoogleCredentials.get_application_default())

I'm not sure if this authentication is because both Google Colab and Google Spreadsheets are Google products, but it's really smart, so it's always great without any hassle.

Specifically, the steps are as follows.

When you execute the above code, the URL for authentication will be displayed, so click it

スクリーンショット 2020-06-13 15.21.03.png

You will be taken to the authentication screen on the Google side, so authenticate. When the authentication is completed, the verification code will be displayed, so copy it.

スクリーンショット 2020-06-13 15.21.19.png

If you paste the code you copied earlier on the Colab side, authentication is completed.

スクリーンショット 2020-06-13 15.21.31.png

It's easy and wonderful.

How to get Google Sheets

Regarding the method of obtaining the Google spreadsheet to be operated, gspread provides several methods, so I will introduce each of them. From here, I will write the code on the assumption that the authentication described above has been completed. (If you run it without being authenticated, you will get an authentication error.)

Get by file name

This is a method to open directly by file name.

ss_name = "Spreadsheet file name"
workbook = gc.open(ss_name)

Get by spreadsheet ID

How to open with the ID included in the spreadsheet URL. (As an aside, as far as I looked up the information in Japanese, I got the impression that the most common way to open it was this way.)

# https://docs.google.com/spreadsheets/d/{This is the spreadsheet ID}/edit#gid=0
ss_id = "Spreadsheet ID"
workbook = gc.open_by_key(ss_id)

Get by URL

It is a method to enter the URL of the spreadsheet as it is and open it

ss_url = "Spreadsheet URL"
workbook = gc.open_by_url(ss_url)

Get all spreadsheets

There was also a way to get all the spreadsheets in Google Drive. https://gspread.readthedocs.io/en/latest/api.html#gspread.Client.openall

When run, it returns a list of spreadsheets.

workbook_list = gc.openall()
for workbook in workbook_list:
  print(workbook.title) #Display the title of the acquired file

How to get a sheet

For the spreadsheet obtained above, you can get a specific sheet by the following method.

How to get by passing the sheet name to be acquired to worksheet ()

ss_name = "Spreadsheet file name"
workbook = gc.open(ss_name)
worksheet = workbook.worksheet("Sheet 1")

How to get by passing the index of the sheet to be acquired to get_worksheet ()

ss_name = "Spreadsheet file name"
workbook = gc.open(ss_name)
worksheet = workbook.get_worksheet(0)

How to get the value in the sheet

How to get by specifying a cell (acell)

ss_name = "Spreadsheet file name"
workbook = gc.open(ss_name)
worksheet = workbook.worksheet("Sheet 1")

print(worksheet.acell("A1").value)
# =>Value of cell A1

print(worksheet.acell("B2").value)
# =>B2 cell value

print(worksheet.cell(3,2).value)
# =>B3 cell value

How to get by specifying a cell (range)

ss_name = "Spreadsheet file name"
workbook = gc.open(ss_name)
worksheet = workbook.worksheet("Sheet 1")
print(worksheet.range(3,2)[0].value)
# =>For range, an array of cells is returned
# =>In the case of this specification, only one cell information of B3 is included in the array, so like this[0]The value is obtained by specifying with

#Get cell information in the range you want to specify. Since it changes with an array, all values are output in a loop
cell_list = worksheet.range("A1:B3")
for cell in cell_list:
  print(cell.value)

About Cell Object

By the way, the cell acquired here has the following three properties.

For example, the following is sample code to get and display the value of a cell in the specified range ("A1: B3 ").

ss_name = "Spreadsheet file name"
workbook = gc.open(ss_name)
worksheet = workbook.worksheet("Sheet 1")

#Get cell information in the range you want to specify. Since it changes with an array, all values are output in a loop
cell_list = worksheet.range("A1:B3")
for cell in cell_list:
  print('%s line%The value in column s is%s' % (cell.row, cell.col, cell.value))

How to get the values in the sheet (get all)

There are the following methods to get the value in the specified sheet.

get_all_records returns a list of dicts (dictionaries)

If you use get_all_records, the value of the target sheet will be returned as a list of dict (dictionary).

ss_name = "Spreadsheet file name"
workbook = gc.open(ss_name)
worksheet = workbook.worksheet("Sheet name you want to get")

print(worksheet.get_all_records())
# => [{'item1': 2, 'item2': 4, 'item3': 31}, {'item1': 3, 'item2': 6, 'item3': 32}, ...

If you call it without specifying any options, the first line will be treated as header. Please refer to here for how to specify options.

For example, if you want to convert the value in the sheet to json, you will get the following code.

#When converting to the specified sheet value json
import json
ss_name = "Spreadsheet file name"
workbook = gc.open(ss_name)
worksheet = workbook.worksheet("Target sheet name")

dict_list = worksheet.get_all_records()
print(json.dumps(dict_list))

get_all_values returns the values in the sheet as an array

Another option is to simply get it as an array of values by using get_all_values.

ss_name = "Spreadsheet file name"
workbook = gc.open(ss_name)
worksheet = workbook.worksheet("Sheet name you want to get")

print(worksheet.get_all_values())
# => [['item1', 'item2', 'item3'], ['2', '4', '31'], ['3', '6', '32'], ...

How to add a value to a sheet

Use ʻappend_row` to append values to the sheet with rows. The sample below is sample code that adds 100 lines with the value to the specified sheet. (For example, if there is already a description in the sheet, a new row will be added from there)

import random

ss_name = "Spreadsheet file name"
workbook = gc.open(ss_name)
worksheet = workbook.worksheet("Sheet 2")

for i in range(100):
  worksheet.append_row([i, i * 2, random.randint(1, 100)])

Below is a capture of the spreadsheet that was actually entered.

シートに値を行とともに値を追加した際のキャプチャ

If you want to delete all the contents of the sheet (impractical)

If you want to delete all the contents of the specified sheet, it seems that you need to execute as follows.

However, since this is a very delicate code (the number of existing lines is deleted line by line), it is better to delete it at once by the process of deleting the sheet itself written below. Looks good. </ font>

#If you want to delete all the values in the specified sheet
ss_name = "Spreadsheet file name"
workbook = gc.open(ss_name)
worksheet = workbook.worksheet("Sheet name")

for i in range(1, len(worksheet.get_all_values())):
  worksheet.delete_row(1)

If you want to use the latest gspread, delete_rows is prepared, so please use that. If you use it, it seems possible to delete all the specified line range at once.

If you want to delete the sheet itself

If you want to delete the target sheet itself, pass the target sheet itself to del_worksheet.

ss_name = "Spreadsheet file name"
workbook = gc.open(ss_name)
worksheet = workbook.worksheet("Sheet name you want to delete")

workbook.del_worksheet(worksheet)

Afterword

That's all.

I haven't tried gspread yet, so I hope I can add it to this post little by little.

Recommended Posts

Sample code summary when working with Google Spreadsheets from Google Colab
Code snippets often used when using BigQuery with Google Colab
A memo when executing the deep learning sample code created from scratch with Google Colaboratory
Cheat sheet when scraping with Google Colaboratory (Colab)
BigQuery-Python was useful when working with BigQuery from Python
Code snippets often used when processing videos with Google Colaboratory
About learning with google colab
Summary of advantages (disadvantages) when switching from CUnit to Google Test
[SEO] Flow / sample code when using Google Analytics API in Python
Python parallel / parallel processing sample code summary
Play with Turtle on Google Colab
How to deal with OAuth2 error when using Google APIs from Python
Impressions and memorandums when working with VS code for the first time
How to use VS Code (code server) with Google Colab in just 3 lines
Use MeCab and neologd with Google Colab
Working with Azure CosmosDB from Python Part.2
Summary of snippets when developing with Go
Machine learning with Pytorch on Google Colab