How do I access (read, write) Google Sheets spreadsheets with Python?

Abhi picture Abhi · Mar 13, 2012 · Viewed 107.7k times · Source

I am wondering if you can point me to an example of reading/writing to/from a google doc/spreadsheet using python.

I did look at google docs API here https://developers.google.com/google-apps/spreadsheets/ but not sure if I hit the right link. Also an example will be of great help.

What I am trying to do is query spreadsheets based on the different columns more like a SQL query and then do some downstream parsing with the data and put it into another spreadsheet or doc at google docs.

Best, -Abhi

Answer

wescpy picture wescpy · Jun 16, 2016

(Jun-Dec 2016) Most answers here are now out-of-date as: 1) GData APIs are the previous generation of Google APIs, and that's why it was hard for @Josh Brown to find that old GData Docs API documentation. While not all GData APIs have been deprecated, all newer Google APIs do not use the Google Data protocol; and 2) Google released a new Google Sheets API (not GData). In order to use the new API, you need to get the Google APIs Client Library for Python (it's as easy as pip install -U google-api-python-client [or pip3 for Python 3]) and use the latest Sheets API v4+, which is much more powerful & flexible than older API releases.

Here's one code sample from the official docs to help get you kickstarted. However, here are slightly longer, more "real-world" examples of using the API you can learn from (videos plus blog posts):

The latest Sheets API provides features not available in older releases, namely giving developers programmatic access to a Sheet as if you were using the user interface (create frozen rows, perform cell formatting, resizing rows/columns, adding pivot tables, creating charts, etc.), but NOT as if it was some database that you could perform searches on and get selected rows from. You'd basically have to build a querying layer on top of the API that does this. One alternative is to use the Google Charts Visualization API query language, which does support SQL-like querying. You can also query from within the Sheet itself. Be aware that this functionality existed before the v4 API, and that the security model was updated in Aug 2016. To learn more, check my G+ reshare to a full write-up from a Google Developer Expert.

Also note that the Sheets API is primarily for programmatically accessing spreadsheet operations & functionality as described above, but to perform file-level access such as imports/exports, copy, move, rename, etc., use the Google Drive API instead. Examples of using the Drive API:

(*) - TL;DR: upload plain text file to Drive, import/convert to Google Docs format, then export that Doc as PDF. Post above uses Drive API v2; this follow-up post describes migrating it to Drive API v3, and here's a developer video combining both "poor man's converter" posts.

To learn more about how to use Google APIs with Python in general, check out my blog as well as a variety of Google developer videos (series 1 and series 2) I'm producing.

ps. As far as Google Docs goes, there isn't a REST API available at this time, so the only way to programmatically access a Doc is by using Google Apps Script (which like Node.js is JavaScript outside of the browser, but instead of running on a Node server, these apps run in Google's cloud; also check out my intro video.) With Apps Script, you can build a Docs app or an add-on for Docs (and other things like Sheets & Forms).

UPDATE Jul 2018: The above "ps." is no longer true. The G Suite developer team pre-announced a new Google Docs REST API at Google Cloud NEXT '18. Developers interested in getting into the early access program for the new API should register at https://developers.google.com/docs.

UPDATE Feb 2019: The Docs API launched to preview last July is now available generally to all... read the launch post for more details.

UPDATE Nov 2019: In an effort to bring G Suite and GCP APIs more inline with each other, earlier this year, all G Suite code samples were partially integrated with GCP's newer (lower-level not product) Python client libraries. The way auth is done is similar but (currently) requires a tiny bit more code to manage token storage, meaning rather than our libraries manage storage.json, you'll store them using pickle (token.pickle or whatever name you prefer) instead, or choose your own form of persistent storage. For you readers here, take a look at the updated Python quickstart example.