How and why is Google OAuth token validation performed?

Jakob · Jun 21, 2013

Between access tokens, refresh tokens, scopes, audiences, and client IDs, I was confused when the Google OAuth documentation instructed me to validate all tokens in order to prevent the confused deputy problem. The Wikipedia article it links to describes the general problem only at a high level; it is not specific to OAuth or even to network authentication. If I understand correctly, token validation is not even part of OAuth2 itself but depends on the specific implementation. So here is my question:

How and why is Google OAuth token validation performed?

A concrete example of the confused deputy problem in this context would be especially appreciated. Also note that I ask this in the context of entirely client-side apps, if that makes a difference.

Answer

Andre D · Jul 3, 2013

Google is referring specifically to the access token.

In the context of OAuth 2.0, the confused deputy problem applies to the Implicit Grant protocol flow when used for authentication. What Google calls "OAuth 2.0 for Client-side Applications" is based on the implicit grant protocol flow.

Since the implicit flow exposes the access token to the end user through the URI fragment, it introduces the possibility that the access token might be tampered with or replaced. A legitimate app (an OAuth client) can become the confused deputy by accepting an access token that was issued to a different (malicious) app, thereby giving an attacker access to a victim's account.
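To make the exposure concrete, here is a minimal sketch of how a token arrives in the URI fragment. The host name, path, and token value are hypothetical, chosen only for illustration; the point is that whoever builds the URL decides which token the app receives.

```python
from urllib.parse import urlsplit, parse_qs


def extract_fragment_params(redirect_url: str) -> dict:
    """Return the key/value pairs carried in the URI fragment (after '#')."""
    fragment = urlsplit(redirect_url).fragment
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(fragment).items()}


# Hypothetical redirect URL. Nothing in the URL itself proves which
# app the token was issued to.
url = (
    "https://filestore.example/oauth2callback"
    "#access_token=TOKEN_FROM_ANYWHERE&token_type=Bearer&expires_in=3600"
)
params = extract_fragment_params(url)
```

Because the fragment is ordinary text in the address bar, the app has to treat `params["access_token"]` as untrusted input until it has been validated.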

The critical step in validating the access token is that the app verifies that the access token was not originally issued to a different app. Google calls attention to this when they say:

"Note: When verifying a token, it is critical to ensure the audience field in the response exactly matches your client_id registered in the APIs Console. This is the mitigation for the confused deputy issue, and it is absolutely vital to perform this step."
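A rough sketch of that check follows. It assumes the validation response is JSON with an `audience` field, as the note above describes; the endpoint URL and the helper names are my assumptions, so check them against Google's current documentation before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the token validation endpoint. Verify against Google's docs.
TOKENINFO_URL = "https://www.googleapis.com/oauth2/v1/tokeninfo"


def fetch_token_info(access_token: str) -> dict:
    """Ask the validation endpoint to describe the token (network call).

    urlopen raises urllib.error.HTTPError when the server answers with
    an error status, for example for an invalid or expired token.
    """
    query = urlencode({"access_token": access_token})
    with urlopen(TOKENINFO_URL + "?" + query) as response:
        return json.load(response)


def audience_matches(token_info: dict, expected_client_id: str) -> bool:
    """True only if the token was issued to this exact client ID."""
    audience = token_info.get("audience")
    return isinstance(audience, str) and audience == expected_client_id
```

An app would call `fetch_token_info(token)` and reject the token unless `audience_matches(info, MY_CLIENT_ID)` is true, before trusting any user ID in the response.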

As a simplified example, imagine there are two apps: (1) FileStore, a legitimate file storage app, and (2) EvilApp. Both apps use Google's authentication process for client-side apps. Alice is an innocent end user, and her Google user ID is XYZ.

  1. Alice signs into FileStore using Google.
  2. After the auth process, FileStore creates an account for Alice and associates it with Google user ID XYZ.
  3. Alice uploads some files to her FileStore account. So far everything is fine.
  4. Later, Alice signs into EvilApp, which offers games that look kind of fun.
  5. As a result, EvilApp gains an access token that is associated with Google user ID XYZ.
  6. The owner of EvilApp can now construct the redirect URI for FileStore, inserting the access token it was issued for Alice's Google account.
  7. The attacker connects to FileStore, which will take the access token and check with Google to see what user it is for. Google will say that it is user XYZ.
  8. FileStore will give the attacker access to Alice's files because the attacker has an access token for Google user XYZ.

FileStore's mistake was not verifying with Google that the access token it was given was truly issued to FileStore; the token was really issued to EvilApp.
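The scenario can be simulated in a few lines. Everything here is a stand-in: the `ISSUED_TOKENS` table plays the role of Google's validation response, and the client IDs, token strings, and file names are invented for illustration.

```python
# Stand-in for what Google would report about each token.
ISSUED_TOKENS = {
    "token-for-filestore": {"user_id": "XYZ", "audience": "filestore-client-id"},
    "token-for-evilapp": {"user_id": "XYZ", "audience": "evilapp-client-id"},
}

FILESTORE_CLIENT_ID = "filestore-client-id"
FILES_BY_USER = {"XYZ": ["alice-taxes.pdf"]}


def lookup_token(token):
    """Simulated validation call: returns token details, or None if unknown."""
    return ISSUED_TOKENS.get(token)


def naive_login(token):
    """FileStore's mistake: trust the user ID, ignore who the token was issued to."""
    info = lookup_token(token)
    if info is None:
        return None
    return FILES_BY_USER.get(info["user_id"])


def checked_login(token):
    """Reject any token whose audience is not FileStore's own client ID."""
    info = lookup_token(token)
    if info is None or info["audience"] != FILESTORE_CLIENT_ID:
        return None
    return FILES_BY_USER.get(info["user_id"])


# The attacker replays the token that EvilApp obtained for Alice.
leaked = naive_login("token-for-evilapp")
blocked = checked_login("token-for-evilapp")
legit = checked_login("token-for-filestore")
```

With the naive check, the replayed token yields Alice's files; with the audience check, the same token is refused while Alice's own FileStore token still works.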

I hope this explains the why part of access token validation with client-side apps, and how it relates to the confused deputy problem.