Googlesheet APIv4 getting empty cells

GregMa picture GregMa · Jul 18, 2016 · Viewed 8.7k times · Source

I have a googlesheet where a column may contain no information in it. While iterating through the rows and looking at that column, if the column is blank, it's not returning anything. Even worse, if I do a get of a full row and include that common, say get 5 columns, I get back only 4 columns when any of the columns are empty. How do I return either NULL or an empty string if I'm getting a row of columns and one of the cells in a column is empty?

// Build a new authorized API client service.
Sheets service = GoogleSheets.getSheetsService();
range = "Functional Users!A3:E3";
response = service.spreadsheets().values().get(spreadsheetId, range).execute();
values = response.getValues();
cells = values.get(0);

I am getting 5 cells in the row. cells.size() should ALWAYS return five. However if any of the 5 cells are blank, it will return fewer cells. Say only the cell at B3 is empty. cells.size() will be 4. Next iteration, I get A4:E4 and cell D4 is empty. Again, cells.size() will be 4. With no way to know just which cell is missing. If A4 AND D4 AND E4 are empty, cells.size() will be 2.

How do I get it to return 5 cells regardless of empty cells?

Answer

Chase Wright picture Chase Wright · Nov 2, 2016

The way I solved this issue was converting the values into a Pandas dataframe. I fetched the particular columns that I wanted in my Google Sheets, then converted those values into a Pandas dataframe. Once I converted my dataset into a Pandas dataframe, I did some data formatting, then converted the dataframe back into a list. By converting the list to a Pandas dataframe, each column is preserved. Pandas already creates null values for empty trailing rows and columns. However, I needed to also convert the non trailing rows with null values to keep consistency.

# Authenticate and create the service for the Google Sheets API
credentials = ServiceAccountCredentials.from_json_keyfile_name(KEY_FILE_LOCATION, SCOPES)
http = credentials.authorize(Http())
discoveryUrl = ('https://sheets.googleapis.com/$discovery/rest?version=v4')
service = discovery.build('sheets', 'v4',
    http=http,discoveryServiceUrl=discoveryUrl)

spreadsheetId = 'id of your sheet'
rangeName = 'range of your dataset'
result = service.spreadsheets().values().get(
    spreadsheetId=spreadsheetId, range=rangeName).execute()
values = result.get('values', [])

#convert values into dataframe
df = pd.DataFrame(values)

#replace all non trailing blank values created by Google Sheets API
#with null values
df_replace = df.replace([''], [None])

#convert back to list to insert into Redshift
processed_dataset = df_replace.values.tolist()