👀 googly 👀

4 minutes

September 16, 2024

A year and a half between blog entries? That’s pretty standard, although I suppose I did move to Canada in that time.

The Zen of Python refers to a list of principles for Python-y code. Some of these were on my mind when I was trying to figure out some Google APIs.

Simple is better than complex.

Flat is better than nested.

Readability counts.

Here’s Google’s provided code from their quickstart for retrieving values from a spreadsheet.

import os.path

from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

# If modifying these scopes, delete the file token.json.
SCOPES = ["https://www.googleapis.com/auth/spreadsheets.readonly"]

# The ID and range of a sample spreadsheet.
SAMPLE_SPREADSHEET_ID = "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms"
SAMPLE_RANGE_NAME = "Class Data!A2:E"


def main():
  """Shows basic usage of the Sheets API.
  Prints values from a sample spreadsheet.
  """
  creds = None
  # The file token.json stores the user's access and refresh tokens, and is
  # created automatically when the authorization flow completes for the first
  # time.
  if os.path.exists("token.json"):
    creds = Credentials.from_authorized_user_file("token.json", SCOPES)
  # If there are no (valid) credentials available, let the user log in.
  if not creds or not creds.valid:
    if creds and creds.expired and creds.refresh_token:
      creds.refresh(Request())
    else:
      flow = InstalledAppFlow.from_client_secrets_file(
          "credentials.json", SCOPES
      )
      creds = flow.run_local_server(port=0)
    # Save the credentials for the next run
    with open("token.json", "w") as token:
      token.write(creds.to_json())

  try:
    service = build("sheets", "v4", credentials=creds)

    # Call the Sheets API
    sheet = service.spreadsheets()
    result = (
        sheet.values()
        .get(spreadsheetId=SAMPLE_SPREADSHEET_ID, range=SAMPLE_RANGE_NAME)
        .execute()
    )
    values = result.get("values", [])

    if not values:
      print("No data found.")
      return

    print("Name, Major:")
    for row in values:
      # Print columns A and E, which correspond to indices 0 and 4.
      print(f"{row[0]}, {row[4]}")
  except HttpError as err:
    print(err)


if __name__ == "__main__":
  main()

Much of the code is boilerplate for getting/saving authentication information. The actual call to get values involves a lot of nested subcomponents.

service
spreadsheets
values
get
execute
get

It was all too complex, and too unreadable for me. I’d copy and paste half the code every time I had to access Google’s APIs. So given my recent foray into releasing Python libraries in pip I decided to make an easy-to-use library for Google APIs. Here’s my equivalent code, using the googly library.

from googly import SheetsAPI

# The ID and range of a sample spreadsheet.
SAMPLE_SPREADSHEET_ID = "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms"
SAMPLE_RANGE_NAME = "Class Data!A2:E"


def main():
  """Shows basic usage of the Sheets API.
  Prints values from a sample spreadsheet.
  """
  service = SheetsAPI()
  service.set_sheet_id(SAMPLE_SPREADSHEET_ID)
  values = service.get_range(SAMPLE_RANGE_NAME)

  if not values:
    print("No data found.")
    return

  print("Name, Major:")
  for row in values:
    # Print columns A and E, which correspond to indices 0 and 4.
    print(f"{row[0]}, {row[4]}")


if __name__ == "__main__":
  main()

Right now, there are seven modules, basically corresponding with various tasks I’ve needed the library for over the years.

GMail - The first version of the library I wrote back in early 2016 to automatically monitor how badly I was failing at Inbox Zero. I haven’t actually made it to zero, but it proved helpful for motivating me to keep my inbox clean. I’ve also used it for automating some GMail labels that I can’t do with filters and a reliable way of sending emails.
Drive - I’ve been a long time user of k4dirstat and had some files in my Google Drive that were taking too much space, but it wasn’t until I reimplemented k4dirstat for my Google Drive that I realized what the problem was. Then just this past week I used my own library to download a boat-load of files while keeping the directory structure.
People - I got annoyed by all my contacts having differently formatted names/phone numbers/birthdays, so I automated a cleanup.
Photos - I maintain a Google Photo Album with all the photos from my family Instagram account, which is kept in sync semi-automatically.
Sheets - A large portion of the logic behind this library was being annoyed at the number of times I had to export a Google Sheet as csv, do some offline adjustments with a script, and then copy/paste the values back into Google Sheets.
YouTube - At one point, all I wanted was an RSS feed of each of the YouTube channels I was subscribed to.
Calendar - I had a reason for this at some point, I swear.

Extending it to other APIs is relatively trivial, I just haven’t had reason to do so yet. However, I have come across a few things that the official APIs just won’t do.

As someone with way too many videos in my “Watch Later” playlist on YouTube, it drives me crazy that I can’t programmatically access this list. Just how many minutes have I queued up, YouTube!?
Google Photos is unsurprisngly reticient to let me delete from their service automatically, which I get, but also makes testing very hard.
Hey Google Sheets, why can’t you tell me what color a cell is?

ProbablyDavid
.com

[personal website of David Lu!!]

ProbablyDavid
.com

[personal website of David Lu!!]

👀 googly 👀