I have been trying the following code:
var response = UrlFetchApp.fetch("https://www.google.com/#q=this+is+a+test");
var contentText = response.getContentText();
Logger.log(contentText);
var thisdoc=DocumentApp.getActiveDocument().getBody() ;
thisdoc.setText(contentText);
Logger.log(contentText.indexOf("About"));
But it only seems to return the header and an empty body, with none of the search results. At a minimum I should be able to see the "About xxx results" line shown at the top of the results page, but this doesn't appear in the text, nor does indexOf return a non-negative index. I'm wondering whether the search results are populated after the page loads, meaning the body tag would indeed be empty, and if so, is there a workaround?
Edit: No, it doesn't break the TOS, as this is a GAFE app (which is a business app), and for business accounts they have both free and premium models of access to their API.
Google provides an API for authorized searches, so don't fuss with scraping web pages.
For example, you can use the Custom Search API with UrlFetch().
From the script editor, go to Resources -> Developer's Console Project... -> View Developer's Console. Create a new key for Public API access. Follow the instructions from the Custom Search API docs to create a custom search engine. Enter the key and ID into the script where indicated. (More details below.)
This example script will return an object containing the results of a successful search; you can navigate the object to pull out whatever info you want.
/**
 * Use Google's customsearch API to perform a search query.
 * See https://developers.google.com/custom-search/json-api/v1/using_rest.
 *
 * @param {string} query Search query to perform, e.g. "test"
 *
 * @returns {object} See response data structure at
 *   https://developers.google.com/custom-search/json-api/v1/reference/cse/list#response
 */
function searchFor( query ) {
  // Base URL to access customsearch
  var urlTemplate = "https://www.googleapis.com/customsearch/v1?key=%KEY%&cx=%CX%&q=%Q%";

  // Script-specific credentials & search engine
  var ApiKey = "--get from developer's console--";
  var searchEngineID = "--get from developer's console--";

  // Build custom url
  var url = urlTemplate
    .replace("%KEY%", encodeURIComponent(ApiKey))
    .replace("%CX%", encodeURIComponent(searchEngineID))
    .replace("%Q%", encodeURIComponent(query));

  var params = {
    muteHttpExceptions: true
  };

  // Perform search
  Logger.log( UrlFetchApp.getRequest(url, params) );  // Log query to be sent
  var response = UrlFetchApp.fetch(url, params);
  var respCode = response.getResponseCode();

  if (respCode !== 200) {
    throw new Error("Error " + respCode + " " + response.getContentText());
  }
  else {
    // Successful search, log & return results
    var result = JSON.parse(response.getContentText());
    Logger.log( "Obtained %s search results in %s seconds.",
               result.searchInformation.formattedTotalResults,
               result.searchInformation.formattedSearchTime);
    return result;
  }
}
Example log output:
[15-05-04 18:26:35:958 EDT] {
"headers": {
"X-Forwarded-For": "216.191.234.70"
},
"useIntranet": false,
"followRedirects": true,
"payload": "",
"method": "get",
"contentType": "application/x-www-form-urlencoded",
"validateHttpsCertificates": true,
"url": "https://www.googleapis.com/customsearch/v1?key=--redacted--&cx=--redacted--&q=test"
}
[15-05-04 18:26:36:812 EDT] Obtained 132,000,000 search results in 0.74 seconds.
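As a rough illustration of navigating the returned object, the sketch below reduces a response to the title, link, and snippet of each hit. The helper names extractResults and logResults are made up for this example (they are not part of any Google API), and it assumes the response carries its hits in an items array, which may be missing when a search finds nothing.

```javascript
/**
 * Reduce a parsed Custom Search response to an array of
 * { title, link, snippet } objects.
 * Hypothetical helper for illustration; not part of any Google API.
 */
function extractResults(result) {
  // Assumption: hits live in result.items, which may be absent
  // when the search returns nothing, so fall back to an empty array.
  var items = (result && result.items) || [];
  return items.map(function (item) {
    return {
      title: item.title,
      link: item.link,
      snippet: item.snippet
    };
  });
}

/**
 * Example use inside Apps Script. Assumes the searchFor() function
 * shown earlier is defined in the same script project.
 */
function logResults(query) {
  extractResults(searchFor(query)).forEach(function (hit) {
    Logger.log("%s <%s>", hit.title, hit.link);
  });
}
```

Keeping the extraction separate from the fetch means it can be tried against a saved response without spending any of the API quota.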
Creating the API key (excerpted from Google's documentation):
Go to the Google Developers Console.
Select a project, or create a new one.
In the sidebar on the left, expand APIs & auth. Next, click APIs. In the list of APIs, make sure the status is ON for the Custom Search API.
. . .
In the sidebar on the left, select Credentials.
Create your application's API key by clicking Create new Key under Public API access. For Google Script use, create a Browser key.
Once the Key for browser applications is created, copy the API key into your code.
Follow the instructions here. Once you've created your custom search engine, copy the Search engine ID into your code.