Push notifications help with user retention and engagement, and they are now available on the web through the Push protocols. Information is pretty scattered and most of it is about specific libraries, so in this article I'll cover how to create a modern Push Notification server in a language-agnostic way using the Push API.
A few years ago, if you were asked where the trend was going, you’d probably have said apps: an app for pizza, an app for clothes, an app for everything. That isn’t wrong; apps have taken off and vendors have provided many conveniences, notably Push Notifications to deliver updates. The web, however, hasn’t gone away. The age of ‘web apps’ is making a sort-of comeback, as we see in newly popular things such as AMP and, specifically, PWAs (Progressive Web Apps), which are backed by Service Workers.
As a totally non-technical explanation, service workers are essentially ‘downloaded’ resources that allow you to run JS for your site without the page being active (including offline). It is impractical to keep a heavy V8 instance running JavaScript with a WebSocket open to every server, so you’ll quickly find service workers can’t receive Push Notifications that way. Instead, vendors use alternative protocols (which are up to the implementation) to receive the message and then hand it to your JavaScript handler for the Push Notification.
#VAPID
VAPID (RFC link) is one of the key (no pun intended) ways your server is authenticated when sending Push Notifications. “VAPID keys” are essentially your regular old key pair on the P-256 elliptic curve.
These keys can be easily generated with the openssl command-line utility:
openssl ecparam -name prime256v1 -genkey -noout -out PRIVATE_KEY.pem
openssl ec -in PRIVATE_KEY.pem -pubout -out PUBLIC_KEY.pem
What we’ll be doing with these keys is:
- Provide the public key to the client
- Use the private key to sign our “JWT”, which lets the push server validate that the Push Notifications we send are indeed coming from us
#Client
Now let’s go back to the client. You’ll have two pieces of JavaScript on the frontend:
- Setup code: ordinary JS which prompts the user and obtains permission to display notifications^[Please, please, please don’t show a modal on page-load offering PNs or show the permission request directly. Nobody cares about PNs until they do, so leave it for later in the UI flow.]
- Service Worker: to display the notification
Now your client needs the public key in raw (unencoded) form. Both the PEM and DER key formats wrap the key in additional structure (not just base64), so what I recommend is using an OpenSSL library^[If you are using Python, M2Crypto] to read the key and having an endpoint which returns the raw key (it should begin with 0x04).
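For the server side of that endpoint, here's a rough sketch of pulling the raw key out of the PEM file. It assumes the `cryptography` package (used later in this article) rather than M2Crypto, and `raw_public_key` is just a name I made up for illustration.

```python
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.serialization import (
    Encoding,
    PublicFormat,
    load_pem_public_key,
)


def raw_public_key(pem_data: bytes) -> bytes:
    """Return the uncompressed EC point (0x04 + X + Y) from a PEM public key."""
    public_key = load_pem_public_key(pem_data, default_backend())
    return public_key.public_bytes(Encoding.X962, PublicFormat.UncompressedPoint)


# Demo with a throwaway key; in practice you'd read PUBLIC_KEY.pem instead.
demo_pem = (
    ec.generate_private_key(ec.SECP256R1(), default_backend())
    .public_key()
    .public_bytes(Encoding.PEM, PublicFormat.SubjectPublicKeyInfo)
)
raw = raw_public_key(demo_pem)
```

For a P-256 key the result is 65 bytes: the 0x04 marker followed by the two 32-byte coordinates.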
If you do have an endpoint here’s some sample code:
const response = await fetch("/my/pubkey/endpoint", { method: "POST" });
const key = new Uint8Array(await response.arrayBuffer());
Now, when appropriate, prompt the user for permission to display Notifications. The Notification API uses Promises in some browsers and a callback in others, so to support both we’ll use this:
const permission = await new Promise((resolve, reject) => {
  const promise = Notification.requestPermission(resolve);
  if (promise) promise.then(resolve, reject);
});
const allowedToDisplayNotifs = permission === "granted";
Now we can ‘set up’ Push Notifications. This step DOES NOT talk to your server; you must send the resulting subscription information to your server yourself. Here registration is your Service Worker registration:
const pushSubscription = await registration.pushManager.subscribe({
  userVisibleOnly: true,
  applicationServerKey: key,
});
The userVisibleOnly key specifies that we’ll only use this channel to send user-facing notifications. This is required to be true in some browsers (including Chrome at time of writing).
This pushSubscription object will look a little like:
{
  "endpoint": "https://push.example.com/unique-auth-token",
  "keys": {
    "p256dh": "<long string>",
    "auth": "<shorter string>"
  }
}
The endpoint is all we need for sending PNs, but if we wish to send a payload (which, yeah, we probably want to do) we’ll use the p256dh and auth fields to encrypt it. This way only your server and the client can ever read the PNs (the intermediate PN server run by Chrome/Firefox can’t).
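Those two fields arrive as base64url text, typically without the trailing `=` padding, so they need a little care before they're usable as bytes. Here's a minimal sketch using only the standard library; the helper names (`b64url_decode`, `decode_subscription_keys`) and the zero-filled demo values are mine, not part of any API.

```python
from base64 import urlsafe_b64decode, urlsafe_b64encode


def b64url_decode(value: str) -> bytes:
    """Decode base64url text, restoring any '=' padding that was left off."""
    return urlsafe_b64decode(value + "=" * (-len(value) % 4))


def decode_subscription_keys(keys: dict) -> tuple:
    """Return (client_public_key, auth_secret) as raw bytes, sanity-checked."""
    client_public_key = b64url_decode(keys["p256dh"])
    auth_secret = b64url_decode(keys["auth"])
    # An uncompressed P-256 point is 0x04 followed by two 32-byte coordinates.
    if len(client_public_key) != 65 or client_public_key[0] != 0x04:
        raise ValueError("p256dh is not an uncompressed P-256 point")
    # The auth secret is 16 bytes.
    if len(auth_secret) != 16:
        raise ValueError("auth secret should be 16 bytes")
    return client_public_key, auth_secret


# Placeholder values with the right shape (not a real key or secret).
example_keys = {
    "p256dh": urlsafe_b64encode(b"\x04" + bytes(64)).rstrip(b"=").decode(),
    "auth": urlsafe_b64encode(bytes(16)).rstrip(b"=").decode(),
}
example_public_key, example_auth = decode_subscription_keys(example_keys)
```

Rejecting malformed values at this point is cheaper than finding out at send time that a stored subscription can't be encrypted for.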
Now what you should do is set up an endpoint which receives all these values and stores them in your database.
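What that storage looks like depends entirely on your stack. As one possible shape, here's a sketch using the standard library's `sqlite3`; the table name, column names, and both function names are assumptions for illustration, and the request body is assumed to be the JSON form of the subscription shown above.

```python
import json
import sqlite3


def create_table(conn: sqlite3.Connection) -> None:
    """Create a table keyed by endpoint, which is unique per subscription."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS push_subscriptions ("
        " endpoint TEXT PRIMARY KEY, p256dh TEXT NOT NULL, auth TEXT NOT NULL)"
    )


def store_subscription(conn: sqlite3.Connection, request_body: str) -> None:
    """Parse the JSON the client posted and insert or replace it by endpoint."""
    data = json.loads(request_body)
    endpoint = data["endpoint"]
    keys = data["keys"]
    if not endpoint.startswith("https://"):
        raise ValueError("push endpoints are expected to be HTTPS URLs")
    conn.execute(
        "INSERT OR REPLACE INTO push_subscriptions VALUES (?, ?, ?)",
        (endpoint, keys["p256dh"], keys["auth"]),
    )
    conn.commit()


# Demo against an in-memory database.
conn = sqlite3.connect(":memory:")
create_table(conn)
body = json.dumps({
    "endpoint": "https://push.example.com/unique-auth-token",
    "keys": {"p256dh": "<long string>", "auth": "<shorter string>"},
})
store_subscription(conn, body)
store_subscription(conn, body)  # subscribing again replaces the existing row
rows = conn.execute(
    "SELECT endpoint, p256dh, auth FROM push_subscriptions"
).fetchall()
```

Keying on the endpoint means a client that subscribes twice doesn't end up receiving every notification twice.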
#Sending the Notification
Now when it’s time to send a notification we have to do quite some work to prepare the payload. There are two crypto-things we need to do:
- Tell the PN server that we are in fact the server
- Encrypt the payload so the PN server can’t snoop in on the notifications
#Key Derivation
So we’ll need to create some EC keys based on the client’s keys. By the end we’ll have 5 keys going around:
    Server                  +
    and          Client     |     Server
    Client       Public     |     Public
    know                    |
               +------------+------------+
    Only                    |
    owner        Client     |     Server
    knows        Private    |     Private
                            +
Combining diagonally, we’re able to derive another key, and in this case both diagonals give the same result: `Client Public + Server Private = Server Public + Client Private`. This combination is known as a ‘Shared Secret’. The private keys are never exposed, yet only the client and the server know the secret (even in the presence of a MITM). This process is known as ‘ECDH’ (Diffie-Hellman with Elliptic Curve key pairs).
Here’s a code example in Python (there are 10000 examples for Node.js on Google’s dev blog if that’s your thing):
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.asymmetric import ec
from base64 import urlsafe_b64decode

# Server-side key pair used for the ECDH exchange
private_key = ec.generate_private_key(ec.SECP256R1(), default_backend())

# KEY_DATA is the subscription's p256dh string; it is base64url, usually
# without '=' padding, which urlsafe_b64decode needs restored
client_public_key = urlsafe_b64decode(KEY_DATA + '=' * (-len(KEY_DATA) % 4))

shared_secret = private_key.exchange(
    ec.ECDH(),
    # The key class's from_encoded_point supersedes the deprecated
    # EllipticCurvePublicNumbers.from_encoded_point(...).public_key(...)
    ec.EllipticCurvePublicKey.from_encoded_point(
        ec.SECP256R1(),
        client_public_key
    )
)
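To convince yourself that both diagonals really do give the same value, here's a small sketch with two throwaway key pairs standing in for the server and the browser. In real life you never hold the client's private key; this is purely a demonstration, and the variable names are mine.

```python
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.asymmetric import ec

# Throwaway key pairs standing in for the two parties.
server_private = ec.generate_private_key(ec.SECP256R1(), default_backend())
client_private = ec.generate_private_key(ec.SECP256R1(), default_backend())

# Each side combines its own private key with the other side's public key.
server_view = server_private.exchange(ec.ECDH(), client_private.public_key())
client_view = client_private.exchange(ec.ECDH(), server_private.public_key())

# Both sides arrive at the same 32-byte secret without sharing a private key.
assert server_view == client_view
```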
We’ll use this ‘shared secret’ as the starting point for symmetric encryption. Since both we (the server) and the client can derive it, we can derive a symmetric key from it and use that to encrypt the payload.
#Parameter Derivation
At the very end we’ll use AES to encrypt the message; however, we first have to produce the encryption parameters to pass to it. We’ll need a ‘key’ and an ‘iv’. This part uses a binary format; for the official source I’ll refer you to [the RFC section][2].
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from base64 import urlsafe_b64decode

# AUTH_DATA is the subscription's base64url "auth" string; restore the
# '=' padding before decoding
auth_secret = urlsafe_b64decode(AUTH_DATA + '=' * (-len(AUTH_DATA) % 4))

prk = HKDF(
    algorithm=hashes.SHA256(),
    length=32,
    salt=auth_secret,
    info=b'Content-Encoding: auth\0',
    backend=default_backend()
).derive(shared_secret)
This prk is a secure, ‘random enough’ key that both we and the client know. We’ll use it to generate two other values that we’ll pass into the AES function:
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat
from struct import pack

# Raw (uncompressed point) form of the public key matching the private_key
# used for ECDH above; the client can only derive the same secret from this key
server_public_key = private_key.public_key().public_bytes(
    Encoding.X962,
    PublicFormat.UncompressedPoint
)

# Label, then each key prefixed with its 2-byte big-endian length
context = (
    b'P-256\0'
    + pack('!H', len(client_public_key)) + client_public_key
    + pack('!H', len(server_public_key)) + server_public_key
)
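Since this is a binary format, it helps to see the byte layout spelled out. The sketch below builds a context from placeholder keys (zero-filled, not real curve points) and checks the sizes; `build_context` is a name I've made up for illustration.

```python
from struct import pack


def build_context(client_public_key: bytes, server_public_key: bytes) -> bytes:
    """Label, then each key prefixed with its length as a 2-byte big-endian int."""
    return (
        b"P-256\0"
        + pack("!H", len(client_public_key)) + client_public_key
        + pack("!H", len(server_public_key)) + server_public_key
    )


# Placeholder 65-byte values shaped like uncompressed P-256 points.
placeholder_key = b"\x04" + bytes(64)
example_context = build_context(placeholder_key, placeholder_key)

# 6 bytes of label + (2 + 65) for the client key + (2 + 65) for the server key
assert len(example_context) == 140
# The first length prefix is 65 encoded as two big-endian bytes.
assert example_context[6:8] == b"\x00\x41"
```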
Now that we have this ‘context’, which records the keys used for the encryption, we’ll create the two encryption parameters by combining it with a type label. For this we’ll need a salt:
from M2Crypto.m2 import rand_bytes
salt = rand_bytes(16)
Now we’ll generate the two params (nonce and CEK) from the prk. We only vary the Content-Encoding label and the output length between the two:
nonce = HKDF(
    algorithm=hashes.SHA256(),
    length=12,
    salt=salt,
    info=b'Content-Encoding: nonce\0' + context,
    backend=default_backend()
).derive(prk)
cek = HKDF(
    algorithm=hashes.SHA256(),
    length=16,
    salt=salt,
    info=b'Content-Encoding: aesgcm\0' + context,
    backend=default_backend()
).derive(prk)
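Since the goal here is to stay language-agnostic, it may help to see that HKDF itself is only a couple of HMAC calls. The following is a sketch of the whole derivation chain using nothing but the standard library, so it should port to any language that has HMAC-SHA256. The function names and the zero-filled demo inputs are my own placeholders.

```python
import hashlib
import hmac


def hkdf_sha256(salt: bytes, ikm: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand it."""
    # Extract: the salt is the HMAC key, the input keying material the message.
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output has been produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_nonce_and_cek(shared_secret, auth_secret, salt, context):
    """Mirror the three HKDF steps above: PRK first, then nonce and CEK."""
    prk = hkdf_sha256(auth_secret, shared_secret, b"Content-Encoding: auth\0", 32)
    nonce = hkdf_sha256(salt, prk, b"Content-Encoding: nonce\0" + context, 12)
    cek = hkdf_sha256(salt, prk, b"Content-Encoding: aesgcm\0" + context, 16)
    return nonce, cek


# Demo with zero-filled placeholders instead of real secrets.
example_nonce, example_cek = derive_nonce_and_cek(
    shared_secret=bytes(32), auth_secret=bytes(16), salt=bytes(16), context=b"P-256\0"
)
```

The same inputs always give the same outputs, which is exactly what lets the client reproduce the nonce and CEK on its side.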
#Diagram
Here’s a diagram of what we just did:
    Server Private           Shared          Client
          +          --->    Secret          'auth'
    Client Public              |               |
                               +-------+-------+
                                       |
                                       v
                                      PRK
    Server Public                      |
          +                  +---------+---------+
    Client Public            |                   |
          |                  v                   v
          +------------->  nonce <--- Salt ---> CEK
                             +                   +
                             |      payload      |
                             |         |         |
                             |         v         |
                             +---> encrypted <---+
                                    payload
#Encrypting the payload
This is the final part. We take the ‘nonce’ and CEK that we generated before and use them as the parameters to the AES cipher. The implementation is simple enough; we are using AES128-GCM, which we can implement with:
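from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend
from struct import pack

cipher = Cipher(algorithms.AES(cek), modes.GCM(nonce), default_backend())
encryptor = cipher.encryptor()
# The aesgcm encoding expects a 2-byte padding length first (0 = no padding)
padded_payload = pack('!H', 0) + PAYLOAD_DATA
# The GCM authentication tag goes right after the ciphertext
encrypted_payload = encryptor.update(padded_payload) + encryptor.finalize() + encryptor.tag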
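Because GCM is authenticated, a quick round trip is a cheap way to check that your key, nonce, and ciphertext-plus-tag layout all line up. Here's a sketch using the `AESGCM` helper from the same `cryptography` package, which also returns the ciphertext with the tag appended. The random key and nonce are placeholders for the derived `cek` and `nonce`.

```python
from os import urandom

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Placeholders standing in for the derived 16-byte CEK and 12-byte nonce.
example_cek = urandom(16)
example_nonce = urandom(12)
example_plaintext = b'{"title": "Hello"}'

# encrypt() returns the ciphertext with the 16-byte tag appended.
sealed = AESGCM(example_cek).encrypt(example_nonce, example_plaintext, None)

# decrypt() verifies the tag and raises InvalidTag if anything was altered.
opened = AESGCM(example_cek).decrypt(example_nonce, sealed, None)
```

If decryption fails here with the values from your own pipeline, the problem is in the key derivation rather than in the push service.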
That’s it for the payload! Now we just have to create the JWT header.