What is Authorization Code with Proof Key for Code Exchange?
The Authorization Code flow with Proof Key for Code Exchange (PKCE) is an authentication method. It’s part of OAuth2. It is used to authenticate end-users.
The OAuth2 protocol has been patched a couple of times over time. Some authentication methods turned out to be less secure than expected.
Read about the history of OAuth2 and the purpose of the protocol to understand how PKCE came to be.
This article covers:
- The history of the OAuth2 protocol
- How it got pwned
- How it has been patched
The earlier versions of OAuth
OAuth was born in 2006/2007, at Twitter. They were trying to solve a problem. They had numerous servers on which they ran their site. Traffic was routed to these servers randomly. They needed some mechanism to make their APIs stateless and still know who is “logged in”. So they invented a protocol.
The OAuth2 protocol (rfc6749), does two things:
- Authenticate humans (h2m)
- Authenticate machines (m2m)
The protocol has been designed for machine-to-machine (m2m) communication and human-to-machine (h2m) communication. To make this possible, the earlier OAuth versions had four “authentication methods” (grant types):
- Client Credentials (m2m)
Useful for a daemon service, for example. It posts the client_id + client_secret to the authorization server, and it responds with tokens. The tokens do not represent a user. - Resource Owner (m2m)
Useful for a daemon service, for example. It posts a username + password to the authorization server, and it responds with tokens. The tokens represent a user. - Implicit (h2m)
The Single-Page-Application-flow. Used this flow to identify end-users at the client side. - Authorization Code (h2m)
The server-side-website-flow. In the earlier versions of the protocol, this flow was explicitly used to identify an end-user on the server side.
Authentication manifests in three tokens:
- Access tokens
This is an opaque string that contains the users’ authorizations. - Refresh tokens
This is an opaque string that can be used to get a new access token.
How web authentication works
When a user wants to access a resource (e.g. invoke an API endpoint), the user must authenticate first. In the earlier versions of OAuth, with Single-Page-Applications specifically, the end-user obtains an access_token
at the client side. This authentication method is called the implicit
grant:
To get an access token, the user uses his browser to navigate to the authorization server via a special endpoint:
https://{your_auth_server}/authorize?client_id={your_client_id}&response_type=token&redirect_uri=http://localhost/callback
When the end user navigates to this address, the end user will see a log-in page. After a successful log-in, the user is redirected back to the site, to the redirect_uri
. The authorization server will include the tokens in the redirect URL as follows:
http://localhost/callback#access_token=2YotnFZFEjr1zCsicMWpAA&state=xyz&token_type=example&expires_in=3600
Note the #
character in the URL. This is to make sure the token is not logged on the server side. The Single-Page-Application can read the payload after the #
, and include the token in the requests to the APIs.
Think twice if you want to use the Implicit flow
Over time, it turned out that this approach had some down-sides:
- The token will be in the browser history
- There is no way of telling who is receiving the token
PKCE at the client-side
Later, PKCE
was introduced: rfc7636. Instead of using the Implicit
flow, Single-Page Applications started using it to obtain tokens.
PKCE
works as follows:
Just like with the Implicit flow, the user navigates to the /authorize
endpoint. But unlike Implicit, the Single-Page Application generates a password first and adds it to the authorize-request. Also, note that the response type is no longer token
. Instead, the Single-Page Application requests a Code
which it will exchange for a token
with the password it just generated:
https://{your_auth_server}/authorize?client_id={your_client_id}&redirect_uri=http://localhost/callback&response_type=code&code_challenge=elU6u5zyqQT2f92GRQUq6PautAeNDf4DQPayyR0ek_c&code_challenge_method=S256
Just like with the implicit flow, when the user navigates to this URL, the user will see a log-in page. After a successful log-in, the authorization server redirects to the provided redirect URL. Unlike the implicit flow, the URL will not contain an access token. Instead, it contains a code that can be exchanged for an access token:
http://localhost/callback#code=adgkjhedkjasdgfhsadjgalk
Now, the Single-Page Application must post this code and the secret password (code_challenge) to the authorization server. If they check out, the authorization server will respond with the requested token:
POST https://{your_auth_server}/token
client_id={your_client_id}&
code_verifier=7073d688b6dcb02b9a2332e0792be265b9168fda7a6&
redirect_uri=http%3A%2F%2Flocalhost%2Fcallback&
grant_type=authorization_code&
code=adgkjhedkjasdgfhsadjgalk
This mechanism has the following advantages:
- The code can be exchanged for a token only once
- Only whoever has the secret password will get the token
So, as a result, it is much safer to say that only the Single-Page Application has gotten the access_token
. And if somebody manages to get both the code and the password, there’s a very short window these can be abused.
Still not safe enough
But still, PKCE on the client side is considered not to be safe enough.
- You still can not be sure the
access_token
ended up on the intended computer. It might still be possible to ‘steal’ the code and the code_verifier. - Since the
access_token
is stored at the client-side (usually in localstorage or session storage), there are ways to ‘steal’ the access_token.
So, generally speaking, currently, the statement in the community is:
Do not store access_tokens at the client-side.
PKCE on the server side
The problem with PKCE on the client side is that the client_secret
is not used. The client_secret
may never be used on the client side. With the client_secret
one can ensure the token is given to a trusted party.
Therefore, the only way to ensure the token ends up at the intended machine is to move the process of obtaining a token to a trusted machine: your server. As a result, you are going to need some authentication gateway:
Summary
For more secure applications, do not use the Implicit flow or the PKCE flow on the client-side. Instead, use the PKCE-flow with client_secret at the server side.
Implementing API Authorization
If you have acquired an access_token
, you can use it to grant users authorization to access resources. To understand how this process works and how to implement it in ASP.NET Core, read this article.