Introduction

Over the last few days I spent quite a bit of time reading up on OAuth2 and OpenID Connect. There are a lot of pointers online on this topic, but unfortunately it's not easy to find easily digestible explanations. In case you are still in an unenlightened state and don't want to read all those dry RFC documents, I can highly recommend the talk OAuth 2.0 and OpenID Connect (in plain English) by Nate Barbettini, which gives a very good introduction to OAuth2 and OpenID Connect and how they should be used for authorization and authentication.

When I was looking into the OAuth implicit flow to use OpenID Connect in a Single Page Application kind of setup, I quickly stumbled upon articles recommending against the implicit flow because of security issues. Instead, one should use the authorization code flow with PKCE ("Proof Key for Code Exchange", apparently pronounced "pixy"). PKCE replaces the static client secret used in the authorization code flow with a temporary one-time challenge, making the flow feasible for public clients.

Step by step walkthrough in Python

In this notebook, I will walk through the OAuth 2.0 Authorization Code flow with PKCE step by step in Python, using a local Keycloak setup as authorization provider. Basic knowledge of OAuth flows and PKCE is assumed, as the discussion will not go into much theoretical detail. There is already enough material online on this, written by more knowledgeable people. The focus lies on practical, step-by-step, low-level HTTP operations. We won't even use an actual browser, nor do we need an actual HTTP server for the redirect URL.

Setup

We don't require special libraries: the standard library will do, except for the well-known requests library, which makes the HTTP operations a bit simpler.

In [1]:
import base64
import hashlib
import html
import json
import os
import re
import urllib.parse
import requests

We need an OAuth/OpenID Connect provider obviously. Let's run a local instance of Keycloak through docker, for example as follows:

docker run --rm -it -p 9090:8080 \
    -e KEYCLOAK_USER=admin -e KEYCLOAK_PASSWORD=admin \
    jboss/keycloak:7.0.0

A detailed discussion of Keycloak is a bit out of scope for this walkthrough. The most important information is that it will act as an OAuth/OpenID Connect provider. Through the administration console (at http://localhost:9090/auth/admin/), we:

  • set up a client "pkce-test" (in the "master" realm) with access type "public" (to be able to use PKCE) and the catch-all "*" as valid redirect URIs (which is a required field).
  • create a user "john" with a non-temporary password.
In [2]:
provider = "http://localhost:9090/auth/realms/master"
client_id = "pkce-test"
username = "john"
password = "j0hn"
redirect_uri = "http://localhost/foobar"

Connect to authentication provider

The first phase of the flow is to connect to the OAuth/OpenID Connect provider and authenticate. For a PKCE-enabled flow we need some PKCE ingredients from the start.

PKCE code verifier and challenge

We need a code verifier: a sufficiently long random string (RFC 7636 requires 43 to 128 characters) that is only to be used "client side". We'll use a simple urandom/base64 trick to generate an alphanumeric one:

In [3]:
code_verifier = base64.urlsafe_b64encode(os.urandom(40)).decode('utf-8')
code_verifier = re.sub('[^a-zA-Z0-9]+', '', code_verifier)
code_verifier, len(code_verifier)
Out[3]:
('zo6yP8H9te4I0lk2Uclcry47yPbTT9jRbdnIZPdMUfazH5iD8vkNw', 53)
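
As an alternative to stripping characters with a regex, the standard library's secrets module can produce a verifier directly, since the base64url alphabet is already a subset of the characters PKCE allows. The following is only a minimal sketch; the helper name make_code_verifier and the choice of 64 random bytes are assumptions for illustration, not something the notebook or Keycloak prescribes.

```python
import re
import secrets


def make_code_verifier(n_bytes=64):
    """Return a random PKCE code verifier (hypothetical helper name).

    secrets.token_urlsafe(n) base64url-encodes n random bytes without
    padding, so 64 bytes yield 86 characters from [A-Za-z0-9_-].
    """
    verifier = secrets.token_urlsafe(n_bytes)
    # RFC 7636: 43 to 128 characters from the unreserved URL character set.
    if not re.fullmatch(r"[A-Za-z0-9\-._~]{43,128}", verifier):
        raise ValueError("verifier does not satisfy the RFC 7636 constraints")
    return verifier


code_verifier_alt = make_code_verifier()
```

The explicit check is there mostly to catch an unsuitable n_bytes value; with the default it should always pass.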

To create the PKCE code challenge, we hash the code verifier with SHA-256 and encode the result in URL-safe base64 (without padding).

In [4]:
code_challenge = hashlib.sha256(code_verifier.encode('utf-8')).digest()
code_challenge = base64.urlsafe_b64encode(code_challenge).decode('utf-8')
code_challenge = code_challenge.replace('=', '')
code_challenge, len(code_challenge)
Out[4]:
('hjooUY_1tBlE_dBuCKGUK8XuSRrc_zNByH-roC5sIXA', 43)
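
To make the role of the challenge a bit more tangible: at the token endpoint, the provider can recompute the challenge from the verifier we send later and compare it with the one it stored from the first request. The sketch below mimics that check. The function names are hypothetical, and this is not Keycloak's actual implementation, just the S256 transformation as described in RFC 7636.

```python
import base64
import hashlib
import hmac


def s256_challenge(verifier):
    """Compute BASE64URL(SHA256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verifier_matches(verifier, stored_challenge):
    """Hypothetical server-side check: does the verifier fit the stored challenge?"""
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(s256_challenge(verifier), stored_challenge)
```

Someone who intercepts the authorization code but never saw the verifier cannot pass this check, which is what lets PKCE stand in for the static secret.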

Request login page

We now have all the pieces for the initial request, which will give us the login page of the authentication provider. Adding the code challenge signals to the OAuth provider that we expect the PKCE-based flow.

In [5]:
state = "fooobarbaz"
resp = requests.get(
    url=provider + "/protocol/openid-connect/auth",
    params={
        "response_type": "code",
        "client_id": client_id,
        "scope": "openid",
        "redirect_uri": redirect_uri,
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    },
    allow_redirects=False
)
resp.status_code
Out[5]:
200
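
In a real application the browser would be sent to this URL instead of us fetching it with requests. As a rough sketch of what that URL looks like, here is the same set of parameters assembled with the standard library only. The function name build_authorization_url is made up for this example; the endpoint path and parameters are the ones used in the request above.

```python
import urllib.parse


def build_authorization_url(provider, client_id, redirect_uri, state, code_challenge):
    """Assemble the authorization request URL (hypothetical helper name)."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "scope": "openid",
        "redirect_uri": redirect_uri,
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    }
    # urlencode percent-encodes the values, e.g. the ':' and '/' in redirect_uri.
    return provider + "/protocol/openid-connect/auth?" + urllib.parse.urlencode(params)


auth_url = build_authorization_url(
    "http://localhost:9090/auth/realms/master",
    "pkce-test",
    "http://localhost/foobar",
    "fooobarbaz",
    "hjooUY_1tBlE_dBuCKGUK8XuSRrc_zNByH-roC5sIXA",
)
```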

Parse login page (response)

Get cookie data from response headers (requires a bit of manipulation).

In [6]:
cookie = resp.headers['Set-Cookie']
cookie = '; '.join(c.split(';')[0] for c in cookie.split(', '))
cookie
Out[6]:
'AUTH_SESSION_ID=af09d80d-9901-445f-b789-c6dfa33ec175.4a821131b1a5; KC_RESTART=eyJhbGciOiJIUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICIwZjBlNjZiZS1hOWNkLTRhMjktODdiNS00NTAwMGZjYzk1NjcifQ.eyJjaWQiOiJwa2NlLXRlc3QiLCJwdHkiOiJvcGVuaWQtY29ubmVjdCIsInJ1cmkiOiJodHRwOi8vbG9jYWxob3N0L2Zvb2JhciIsImFjdCI6IkFVVEhFTlRJQ0FURSIsIm5vdGVzIjp7InNjb3BlIjoib3BlbmlkIiwiaXNzIjoiaHR0cDovL2xvY2FsaG9zdDo5MDkwL2F1dGgvcmVhbG1zL21hc3RlciIsInJlc3BvbnNlX3R5cGUiOiJjb2RlIiwiY29kZV9jaGFsbGVuZ2VfbWV0aG9kIjoiUzI1NiIsInJlZGlyZWN0X3VyaSI6Imh0dHA6Ly9sb2NhbGhvc3QvZm9vYmFyIiwic3RhdGUiOiJmb29vYmFyYmF6IiwiY29kZV9jaGFsbGVuZ2UiOiJoam9vVVlfMXRCbEVfZEJ1Q0tHVUs4WHVTUnJjX3pOQnlILXJvQzVzSVhBIn19.fh7zYDiNvXtZrzAW9mfDp1PgoSkIWm_zZhQvZHPZ2NM'
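
A word of caution on the split trick: it works for the cookies Keycloak sends here, but it would trip over a cookie with an Expires attribute, because the date itself contains a comma followed by a space. The sketch below shows the problem on a made-up header value and a small helper (hypothetical name cookie_header) that builds the header from a name-to-value mapping instead. With requests, such a mapping should be available through resp.cookies.get_dict(), so we don't have to parse the raw header ourselves.

```python
def cookie_header(cookies):
    """Join a name -> value mapping into a Cookie request header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


# Made-up combined Set-Cookie value, only to illustrate the pitfall.
raw = "A=1; Expires=Wed, 21 Oct 2015 07:28:00 GMT; Path=/, B=2; Path=/"
naive = "; ".join(c.split(";")[0] for c in raw.split(", "))
# naive == "A=1; 21 Oct 2015 07:28:00 GMT; B=2"  (a date fragment slipped in)

# In the notebook this would be: cookie = cookie_header(resp.cookies.get_dict())
clean = cookie_header({"A": "1", "B": "2"})
# clean == "A=1; B=2"
```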

Extract the login URL to post to from the page's HTML code. Because the Keycloak login page is straightforward HTML, we can get away with a simple regex.

In [7]:
page = resp.text
form_action = html.unescape(re.search(r'<form\s+.*?\s+action="(.*?)"', page, re.DOTALL).group(1))
form_action
Out[7]:
'http://localhost:9090/auth/realms/master/login-actions/authenticate?session_code=CVE8m4drDjMKiwNLmEYR_iHsIoPIn45X9xBM38-U3Fc&execution=368ecffb-b7a2-4c42-a339-53a88db0b897&client_id=pkce-test&tab_id=BkeNO89TWqw'
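
If the regex feels too brittle, the standard library's html.parser offers a slightly more robust way to get at the form action, and it unescapes character references in attribute values for us. This is just a sketch: the class and function names are invented for this example, and the sample page is made up rather than real Keycloak output.

```python
from html.parser import HTMLParser


class FormActionParser(HTMLParser):
    """Collect the action attribute of every <form> tag encountered."""

    def __init__(self):
        super().__init__()
        self.actions = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            action = dict(attrs).get("action")
            if action is not None:
                self.actions.append(action)


def first_form_action(page):
    """Return the action URL of the first form in the page."""
    parser = FormActionParser()
    parser.feed(page)
    if not parser.actions:
        raise ValueError("no <form action=...> found in page")
    return parser.actions[0]


# Made-up sample page; note the escaped ampersand in the attribute value.
sample_page = (
    '<html><body><form id="login" method="post" '
    'action="http://localhost:9090/login?a=1&amp;b=2"></form></body></html>'
)
sample_action = first_form_action(sample_page)
```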

Do the login (aka authenticate)

Now, we post the login form with the user we created earlier, passing it the extracted cookie as well.

In [8]:
resp = requests.post(
    url=form_action, 
    data={
        "username": username,
        "password": password,
    }, 
    headers={"Cookie": cookie},
    allow_redirects=False
)
resp.status_code
Out[8]:
302

As expected, we are redirected. Let's get the redirect URL.

In [9]:
redirect = resp.headers['Location']
redirect
Out[9]:
'http://localhost/foobar?state=fooobarbaz&session_state=af09d80d-9901-445f-b789-c6dfa33ec175&code=478296a3-c3ea-4a2f-bd2e-69f3181fb78c.af09d80d-9901-445f-b789-c6dfa33ec175.294e60a6-9596-406b-9eae-1912aebe04dd'
In [10]:
assert redirect.startswith(redirect_uri)

Extract authorization code from redirect

The redirect URL contains the authorization code.

In [11]:
query = urllib.parse.urlparse(redirect).query
redirect_params = urllib.parse.parse_qs(query)
redirect_params
Out[11]:
{'code': ['478296a3-c3ea-4a2f-bd2e-69f3181fb78c.af09d80d-9901-445f-b789-c6dfa33ec175.294e60a6-9596-406b-9eae-1912aebe04dd'],
 'session_state': ['af09d80d-9901-445f-b789-c6dfa33ec175'],
 'state': ['fooobarbaz']}
In [12]:
auth_code = redirect_params['code'][0]
auth_code
Out[12]:
'478296a3-c3ea-4a2f-bd2e-69f3181fb78c.af09d80d-9901-445f-b789-c6dfa33ec175.294e60a6-9596-406b-9eae-1912aebe04dd'
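
We used a fixed state value and only looked at the code parameter. A real client would normally generate a random state per request and check that the redirect echoes it back, which protects against forged redirects. Below is a minimal sketch of that check; the helper names new_state and extract_code are assumptions for illustration.

```python
import hmac
import secrets
import urllib.parse


def new_state():
    """Generate an unguessable state value for one authorization request."""
    return secrets.token_urlsafe(16)


def extract_code(redirect, expected_state):
    """Return the authorization code after checking the echoed state value."""
    params = urllib.parse.parse_qs(urllib.parse.urlparse(redirect).query)
    # OAuth 2.0 reports failures through an 'error' query parameter.
    if "error" in params:
        raise RuntimeError("authorization failed: " + params["error"][0])
    returned_state = params.get("state", [""])[0]
    if not hmac.compare_digest(returned_state.encode(), expected_state.encode()):
        raise ValueError("state mismatch, the redirect may not belong to our request")
    return params["code"][0]

# In the notebook this would be: auth_code = extract_code(redirect, state)
```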

Exchange authorization code for an access token

We can now exchange the authorization code for an access token. In the regular OAuth authorization code flow we would include a static client secret here; instead we provide the code verifier, which acts as proof that the initial request was made by us.

In [13]:
resp = requests.post(
    url=provider + "/protocol/openid-connect/token",
    data={
        "grant_type": "authorization_code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "code": auth_code,
        "code_verifier": code_verifier,
    },
    allow_redirects=False
)
resp.status_code
Out[13]:
200

In the response we get, among others, the access token and id token:

In [14]:
result = resp.json()
result
Out[14]:
{'access_token': 'eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICJjd05fNm5WaEQyM0U4WWVnUGJob1pBU2c2Ynd4a2ktSkNsMHJlWFdJUEE4In0.eyJqdGkiOiIyZDA3OTBhNS1hNjMwLTQ2MDYtOTI5Ni1kNGQ0Y2M4NDg5NDgiLCJleHAiOjE1NjkzOTg0OTksIm5iZiI6MCwiaWF0IjoxNTY5Mzk4NDM5LCJpc3MiOiJodHRwOi8vbG9jYWxob3N0OjkwOTAvYXV0aC9yZWFsbXMvbWFzdGVyIiwiYXVkIjoiYWNjb3VudCIsInN1YiI6IjEwMzMzNmJmLWM0NzEtNGRkNS1iMzllLTQ2NTJhMDAzMmJlOCIsInR5cCI6IkJlYXJlciIsImF6cCI6InBrY2UtdGVzdCIsImF1dGhfdGltZSI6MTU2OTM5ODQzOCwic2Vzc2lvbl9zdGF0ZSI6ImFmMDlkODBkLTk5MDEtNDQ1Zi1iNzg5LWM2ZGZhMzNlYzE3NSIsImFjciI6IjEiLCJyZWFsbV9hY2Nlc3MiOnsicm9sZXMiOlsib2ZmbGluZV9hY2Nlc3MiLCJ1bWFfYXV0aG9yaXphdGlvbiJdfSwicmVzb3VyY2VfYWNjZXNzIjp7ImFjY291bnQiOnsicm9sZXMiOlsibWFuYWdlLWFjY291bnQiLCJtYW5hZ2UtYWNjb3VudC1saW5rcyIsInZpZXctcHJvZmlsZSJdfX0sInNjb3BlIjoib3BlbmlkIGVtYWlsIHByb2ZpbGUiLCJlbWFpbF92ZXJpZmllZCI6ZmFsc2UsInByZWZlcnJlZF91c2VybmFtZSI6ImpvaG4ifQ.C0723Ejex8k8dVGzTT2IRtEYXymAONBMkpoRCAGwd_E253L8WZGsEJ5-qkGgzpafgen85XpAD6c_x44QsD_q0P74J_9FqQPikY6JmpcUMNYD9eXjzZo21USVD2DKV7JOZ9Wp3N9GwcV50KCYZIcoIgHfGpHbCnhVppdHn5tuH936WsBGQL7tQ5zDFuT3fO1Op01XdJBg77LT91HTDq1zh42kH1fzgTO3zDzKKxlOJN6d7yBiMDCSIdZ3CVDRMSl65FK-7433SLWoJmNAQHIlH8RYrtvkNIfUZmABXe3CVBWQ2HJXG4Y-gocxkaiFxDoRwoYC6YfwiXKmjnita2vfSw',
 'expires_in': 60,
 'id_token': 'eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICJjd05fNm5WaEQyM0U4WWVnUGJob1pBU2c2Ynd4a2ktSkNsMHJlWFdJUEE4In0.eyJqdGkiOiI5NmFkMmFiOS02ZDU0LTRjYmItYTIzYi01Y2ZkYWY1MDI0OTciLCJleHAiOjE1NjkzOTg0OTksIm5iZiI6MCwiaWF0IjoxNTY5Mzk4NDM5LCJpc3MiOiJodHRwOi8vbG9jYWxob3N0OjkwOTAvYXV0aC9yZWFsbXMvbWFzdGVyIiwiYXVkIjoicGtjZS10ZXN0Iiwic3ViIjoiMTAzMzM2YmYtYzQ3MS00ZGQ1LWIzOWUtNDY1MmEwMDMyYmU4IiwidHlwIjoiSUQiLCJhenAiOiJwa2NlLXRlc3QiLCJhdXRoX3RpbWUiOjE1NjkzOTg0MzgsInNlc3Npb25fc3RhdGUiOiJhZjA5ZDgwZC05OTAxLTQ0NWYtYjc4OS1jNmRmYTMzZWMxNzUiLCJhY3IiOiIxIiwiZW1haWxfdmVyaWZpZWQiOmZhbHNlLCJwcmVmZXJyZWRfdXNlcm5hbWUiOiJqb2huIn0.HAPQmX_ZxmTNxhxOst4U5STJZEEP-GgSfOh303p5oCYZ4y-jhk1SG4BMXW1dU7GWaTh9ccI2aVt8kYjOOsqin3jYvELoZRUxnMk0VftgARNcmb0vb-v2uCdSftSYUGvxmqU0TXeYL2hz7lELIJQSbH3C_DGg476yvRzWh7LEk2bdx8K3yS07jA6w0clDoB79uztfSrwnmtsB1S0soIsE14CaNwI93kiD40m6p9WU5EdPfIu0VaNqQrsCzQrt4LojqN5zAIwDLdBScZBukhWYn0WKmTqcw1djGZBWKHvwV9kP4m27T_0DKpa9Bwi0AomlFjhDK_b41ERuE-3-7MNH5A',
 'not-before-policy': 0,
 'refresh_expires_in': 1800,
 'refresh_token': 'eyJhbGciOiJIUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICIwZjBlNjZiZS1hOWNkLTRhMjktODdiNS00NTAwMGZjYzk1NjcifQ.eyJqdGkiOiJiOGYxZDJmZi1mNmYzLTRiYTEtYjU3OC0zMmMxZDZlMjllN2IiLCJleHAiOjE1Njk0MDAyMzksIm5iZiI6MCwiaWF0IjoxNTY5Mzk4NDM5LCJpc3MiOiJodHRwOi8vbG9jYWxob3N0OjkwOTAvYXV0aC9yZWFsbXMvbWFzdGVyIiwiYXVkIjoiaHR0cDovL2xvY2FsaG9zdDo5MDkwL2F1dGgvcmVhbG1zL21hc3RlciIsInN1YiI6IjEwMzMzNmJmLWM0NzEtNGRkNS1iMzllLTQ2NTJhMDAzMmJlOCIsInR5cCI6IlJlZnJlc2giLCJhenAiOiJwa2NlLXRlc3QiLCJhdXRoX3RpbWUiOjAsInNlc3Npb25fc3RhdGUiOiJhZjA5ZDgwZC05OTAxLTQ0NWYtYjc4OS1jNmRmYTMzZWMxNzUiLCJyZWFsbV9hY2Nlc3MiOnsicm9sZXMiOlsib2ZmbGluZV9hY2Nlc3MiLCJ1bWFfYXV0aG9yaXphdGlvbiJdfSwicmVzb3VyY2VfYWNjZXNzIjp7ImFjY291bnQiOnsicm9sZXMiOlsibWFuYWdlLWFjY291bnQiLCJtYW5hZ2UtYWNjb3VudC1saW5rcyIsInZpZXctcHJvZmlsZSJdfX0sInNjb3BlIjoib3BlbmlkIGVtYWlsIHByb2ZpbGUifQ.KmzF-T3meYK1m72ShHQGA3G0VGVA6GYXNIMZVoHkx9Q',
 'scope': 'openid email profile',
 'session_state': 'af09d80d-9901-445f-b789-c6dfa33ec175',
 'token_type': 'bearer'}

Decode the JWT tokens

The access and ID tokens turn out to be JWTs. Let's decode their payloads (without verifying the signatures).

In [15]:
def _b64_decode(data):
    # JWT segments are base64url-encoded with the padding stripped; restore it.
    data += '=' * (-len(data) % 4)
    return base64.urlsafe_b64decode(data).decode('utf-8')

def jwt_payload_decode(jwt):
    _, payload, _ = jwt.split('.')
    return json.loads(_b64_decode(payload))
In [16]:
jwt_payload_decode(result['access_token'])
Out[16]:
{'acr': '1',
 'aud': 'account',
 'auth_time': 1569398438,
 'azp': 'pkce-test',
 'email_verified': False,
 'exp': 1569398499,
 'iat': 1569398439,
 'iss': 'http://localhost:9090/auth/realms/master',
 'jti': '2d0790a5-a630-4606-9296-d4d4cc848948',
 'nbf': 0,
 'preferred_username': 'john',
 'realm_access': {'roles': ['offline_access', 'uma_authorization']},
 'resource_access': {'account': {'roles': ['manage-account',
    'manage-account-links',
    'view-profile']}},
 'scope': 'openid email profile',
 'session_state': 'af09d80d-9901-445f-b789-c6dfa33ec175',
 'sub': '103336bf-c471-4dd5-b39e-4652a0032be8',
 'typ': 'Bearer'}
In [17]:
jwt_payload_decode(result['id_token'])
Out[17]:
{'acr': '1',
 'aud': 'pkce-test',
 'auth_time': 1569398438,
 'azp': 'pkce-test',
 'email_verified': False,
 'exp': 1569398499,
 'iat': 1569398439,
 'iss': 'http://localhost:9090/auth/realms/master',
 'jti': '96ad2ab9-6d54-4cbb-a23b-5cfdaf502497',
 'nbf': 0,
 'preferred_username': 'john',
 'session_state': 'af09d80d-9901-445f-b789-c6dfa33ec175',
 'sub': '103336bf-c471-4dd5-b39e-4652a0032be8',
 'typ': 'ID'}
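
Decoding the payload is not the same as validating the token. A real client would verify the signature against the provider's public keys (not shown here) and also check a few claims. The following sketch covers only the claim checks on an already decoded payload; the function name check_id_token_claims is hypothetical, and the choice of checks (issuer, audience, expiry) follows what OpenID Connect asks clients to validate.

```python
import time


def check_id_token_claims(claims, issuer, client_id, now=None):
    """Return a list of problems found in decoded ID token claims (empty if none).

    This does NOT verify the token signature.
    """
    now = time.time() if now is None else now
    problems = []
    if claims.get("iss") != issuer:
        problems.append("unexpected issuer")
    # 'aud' may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if client_id not in audiences:
        problems.append("client_id not in audience")
    if claims.get("exp", 0) <= now:
        problems.append("token expired")
    return problems

# In the notebook this would be:
# check_id_token_claims(jwt_payload_decode(result['id_token']), provider, client_id)
```

Since the tokens above expire after 60 seconds, running this check on them much later would report them as expired.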

Conclusion

That's it. Working purely client-side, we managed, thanks to PKCE, to complete a non-implicit authorization flow without needing a static secret.