How to Download Your Google Sheet into Python in One Line of Code

August 11, 2017

How to Download Your Google Sheet into Python in One Line of Code

This example will show you how to connect your Google Sheets data with Python and download the results as a Pandas dataframe. This is an ideal choice for small datasets or for collaboration with colleagues who are comfortable with spreadsheets, but unwilling to work in code or databases. Thank you to Artem Zhukov whose Stackoverflow answer let me know this was possible.

Before you can download your Google Sheet into Python you need to publish your Sheet to the web. First open the Google Sheet you wish to download. Then in the toolbar under “File” select “Publish to the web…” This will open a menu prompting you to select the “Entire Document” for publication, or a selection in addition to the format the Sheet will be published in. Select “Entire Document” and “Comma-separated values (.csv)” (“Tab-separated values” should work as well if you prefer).

1: Publish to the web

Navegate to the Google Sheet you wish to download. Under “File” select “Publish to the web…”

alt publish_to_the_web

2: Select CSV and click “Publish”

Select “Entire Document” and “Comma-separated values (.csv)” then click “Publish”

alt publish_to_the_web_select_csv

You will now see a notification in grey “This document is published to the web.” Copy the link in the middle of the menu. This is the link to your data we will use in Python.

alt publish_to_the_web_copy_link

4: Import Pandas and download your Google Sheet

Start a Python session and import Pandas. Pandas will present our Google Sheet in tabular form where we can perform statistical analysis and plot the data.

import pandas as pd

Next we will load the Google Sheet using our link and Pandas “read_csv” function. Here it is in one line!

pd.read_csv('https://docs.google.com/spreadsheets/d/1dvCz2yFWeJFj2Z4mmgnY9pKsAFCgJ2aAlkAAJNDjxEo/pub?output=csv')
Park Camping Boating Biking_and_or_Hiking Fishing Equestrian Golf Museums OHV Winter_Sports Wildlife
0 Historic Union Pacific Rail Trail 0 0 1 0 1 0 0 0 1 1
1 This Is The Place 0 0 1 0 0 0 1 0 1 1
2 Anasazi 0 0 0 0 0 0 1 0 1 0
3 Antelope Island 1 1 1 0 1 0 1 0 1 1
4 Bear Lake Marina 0 1 0 1 0 0 0 0 1 1
5 Camp Floyd 0 0 0 0 0 0 1 0 1 0
6 Dead Horse Point 1 0 1 0 0 0 0 0 1 1
7 Deer Creek 1 1 1 1 0 0 0 0 1 1
8 East Canyon 1 1 1 1 0 0 0 1 1 1
9 Escalante 1 1 1 1 0 0 0 0 1 1
10 Fremont Indian 1 0 1 1 1 0 1 1 1 1
11 Goblin Valley 1 0 1 0 0 0 0 1 1 0
12 Great Salt Lake Marina 1 1 1 0 0 0 0 0 1 1
13 Green River 1 1 0 1 0 1 0 0 1 0
14 Huntington 1 1 0 1 0 0 0 0 1 1
15 Hyrum 1 1 1 1 0 0 0 0 1 1
16 Jordanelle 1 1 1 1 1 0 0 0 1 1
17 Otter Creek 1 1 0 1 0 0 0 0 1 1
18 Palisade 1 1 0 1 0 1 0 1 1 0
19 Piute 1 1 0 1 0 0 0 0 1 0
20 Red Fleet 1 1 0 1 0 0 0 0 1 1
21 Rockport 1 1 1 1 0 0 0 0 1 1
22 Scofield 1 1 0 1 0 0 0 0 1 0
23 Soldier Hollow 1 0 1 0 1 1 0 0 1 1
24 Stagecoach Inn 0 0 0 0 0 0 1 0 1 0
25 Starvation 1 1 0 1 0 0 0 1 1 0
26 Steinaker 1 1 1 1 0 0 0 1 1 1
27 Utah Field House Of Natural History 0 0 0 0 0 0 1 0 1 0
28 Utah Lake 1 1 1 1 0 0 0 0 1 0
29 Wasatch Mountain 1 0 1 0 1 1 0 1 1 1
30 Willard - North Marina 1 1 1 1 0 0 0 0 1 1
31 Yuba Lake 1 1 1 1 0 0 0 1 1 1
32 Territorial Statehouse 0 0 0 0 0 0 1 0 1 0
33 Jordan River OHV 0 0 1 0 0 0 0 1 1 1
34 Millsite 1 1 0 1 0 0 0 1 1 1
35 Fort Deseret 0 0 0 0 0 0 0 0 0 0
36 Flight Park State Recreation Area 0 0 1 0 0 0 0 0 1 1
37 Bear Lake - Rendezvous Beach 1 1 1 1 0 0 0 0 1 1
38 Bear Lake - First Point 0 1 1 1 1 0 0 0 1 1
39 Bear Lake - South Eden 1 1 1 1 1 0 0 0 1 1
40 Bear Lake - Cisco Beach 0 1 1 1 1 0 0 0 1 1
41 Bear Lake - Rainbow Cove 0 1 1 1 1 0 0 0 1 1
42 Bear Lake - North Eden 1 1 1 1 1 0 0 0 1 1
43 Bear Lake - Hodges Canyon Trailhead 0 0 1 0 1 0 0 1 1 1
44 Willard - South Marina 1 1 1 1 0 0 0 0 1 1

A cleaner, yet more verbose version:

Though it is nice to have it work in one line, defining the url beforehand and adding some arguments to read_csv can improve readability. Setting “index_col=0” will make our first column the index.

url = 'https://docs.google.com/spreadsheets/d/1dvCz2yFWeJFj2Z4mmgnY9pKsAFCgJ2aAlkAAJNDjxEo/pub?output=csv'
df = pd.read_csv(url, index_col=0)
df
Camping Boating Biking_and_or_Hiking Fishing Equestrian Golf Museums OHV Winter_Sports Wildlife
Park
Historic Union Pacific Rail Trail 0 0 1 0 1 0 0 0 1 1
This Is The Place 0 0 1 0 0 0 1 0 1 1
Anasazi 0 0 0 0 0 0 1 0 1 0
Antelope Island 1 1 1 0 1 0 1 0 1 1
Bear Lake Marina 0 1 0 1 0 0 0 0 1 1
Camp Floyd 0 0 0 0 0 0 1 0 1 0
Dead Horse Point 1 0 1 0 0 0 0 0 1 1
Deer Creek 1 1 1 1 0 0 0 0 1 1
East Canyon 1 1 1 1 0 0 0 1 1 1
Escalante 1 1 1 1 0 0 0 0 1 1
Fremont Indian 1 0 1 1 1 0 1 1 1 1
Goblin Valley 1 0 1 0 0 0 0 1 1 0
Great Salt Lake Marina 1 1 1 0 0 0 0 0 1 1
Green River 1 1 0 1 0 1 0 0 1 0
Huntington 1 1 0 1 0 0 0 0 1 1
Hyrum 1 1 1 1 0 0 0 0 1 1
Jordanelle 1 1 1 1 1 0 0 0 1 1
Otter Creek 1 1 0 1 0 0 0 0 1 1
Palisade 1 1 0 1 0 1 0 1 1 0
Piute 1 1 0 1 0 0 0 0 1 0
Red Fleet 1 1 0 1 0 0 0 0 1 1
Rockport 1 1 1 1 0 0 0 0 1 1
Scofield 1 1 0 1 0 0 0 0 1 0
Soldier Hollow 1 0 1 0 1 1 0 0 1 1
Stagecoach Inn 0 0 0 0 0 0 1 0 1 0
Starvation 1 1 0 1 0 0 0 1 1 0
Steinaker 1 1 1 1 0 0 0 1 1 1
Utah Field House Of Natural History 0 0 0 0 0 0 1 0 1 0
Utah Lake 1 1 1 1 0 0 0 0 1 0
Wasatch Mountain 1 0 1 0 1 1 0 1 1 1
Willard - North Marina 1 1 1 1 0 0 0 0 1 1
Yuba Lake 1 1 1 1 0 0 0 1 1 1
Territorial Statehouse 0 0 0 0 0 0 1 0 1 0
Jordan River OHV 0 0 1 0 0 0 0 1 1 1
Millsite 1 1 0 1 0 0 0 1 1 1
Fort Deseret 0 0 0 0 0 0 0 0 0 0
Flight Park State Recreation Area 0 0 1 0 0 0 0 0 1 1
Bear Lake - Rendezvous Beach 1 1 1 1 0 0 0 0 1 1
Bear Lake - First Point 0 1 1 1 1 0 0 0 1 1
Bear Lake - South Eden 1 1 1 1 1 0 0 0 1 1
Bear Lake - Cisco Beach 0 1 1 1 1 0 0 0 1 1
Bear Lake - Rainbow Cove 0 1 1 1 1 0 0 0 1 1
Bear Lake - North Eden 1 1 1 1 1 0 0 0 1 1
Bear Lake - Hodges Canyon Trailhead 0 0 1 0 1 0 0 1 1 1
Willard - South Marina 1 1 1 1 0 0 0 0 1 1

Documentation for read_csv can be found at: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html

Sources:

comments powered by Disqus