Integrating Google OAuth 2.0 in a Desktop Application for Rich-Content Editing
A few months ago, I resumed work on LetterFlow, a side project aimed at building a flexible, Tauri-powered CMS for content creation. I wanted full control over data representation without relying on an off-the-shelf CMS, so I designed LetterFlow as a rich-content editor with a range of extensions, enabling authors to create structured documents. These documents are stored in a unified JSON format within a Firebase-backed Firestore collection, allowing easy retrieval and custom representation across various clients.
Currently, LetterFlow is still in development in a private GitHub repository, and key design decisions have yet to be made, such as data modeling, database adapter options, and integrating ProseMirror extensions within Angular.
ProseMirror provides a solid foundation for document editing, and Tiptap simplifies this further, though a full Angular implementation of Tiptap has yet to be developed.
LetterFlow’s current stack consists of:
- Rust with Tauri
- Angular (including Tailwind CSS)
- Firebase with Google Cloud Platform (GCP) integration
- ProseMirror / Tiptap
To connect users with their data, Google OAuth 2.0 is used because of its seamless integration with GCP.
The Challenge
On paper, the task sounds relatively straightforward:
A user downloads the LetterFlow desktop app, signs in via “Sign in with Google” to connect to their Firebase and GCP resources, and then uses the interface to create and edit rich-content documents.
In practice, however, implementing OAuth 2.0 in a desktop application built with Tauri brings its own set of complexities, especially in terms of user experience.
Interface and Data Models
Besides defining the data models (“Author,” “Post,” etc.), a key aspect of the desktop application is building an interface that links users to their data. Without a robust and scalable connection between users and their Cloud Firestore data, the app would be of limited use.
Authorization Alternatives
While OAuth 2.0 was selected for its versatility, I initially considered other authorization and authentication methods. Many of these options ultimately proved unnecessary, at least for now, as they introduced more complexity than required. Below are a few alternative specifications and the reasons they weren’t chosen or why they became irrelevant to the implementation:
API Keys
API keys are a straightforward way to authenticate by using a unique string to grant access to an API. This method is simple to implement but generally provides less security than the other standards mentioned, often lacking the same granularity and protection.
This option was ruled out from the start because it would have required managing a separate database, which would reduce the flexibility of the app in terms of usage and security. Additionally, if an API key were lost or exposed in a data breach, anyone holding it would keep its access until the key was revoked.
OpenID Connect (OIDC)
OpenID Connect is a layer built on top of OAuth 2.0, making it easy to authenticate users and obtain basic profile information. It allows applications to retrieve identity information from an identity provider (IdP) without users having to re-enter credentials. OIDC uses JWTs (JSON Web Tokens) for exchanging information, offering a standardized authentication process.
OpenID Connect wasn’t required for LetterFlow because it’s integrated into Google OAuth 2.0. Additionally, OIDC alone wouldn’t suffice for the operations the desktop application would need. While profile information is part of the authentication flow and interface, it alone doesn’t provide enough permission to communicate with Google Cloud Platform.
FIDO (Fast Identity Online)
FIDO is a protocol for secure, password-free authentication, using public key cryptography and, often, biometrics like fingerprint or facial recognition. I find this form of authentication promising, given the growing availability of devices with the necessary hardware. However, since not all users may have access to the latest devices, and given the additional complexity of implementing FIDO, I decided to place it on the “nice-to-have” feature list.
SAML (Security Assertion Markup Language)
SAML is an XML-based protocol mainly used for Single Sign-On (SSO) in enterprise environments. It enables secure exchange of authentication and authorization data between an identity provider and a service provider. SAML is common in large organizations and federated identity systems.
Since LetterFlow requires a Firebase or GCP account, implementing SAML was unnecessary. I’m not managing any separate infrastructure, so at this point, OAuth is a perfectly fitting choice. If the application expands to support other cloud providers, SAML could become a viable alternative to OAuth, especially if self-managed servers are added to the infrastructure.
JWT (JSON Web Tokens)
While JWTs are often used alongside OAuth, they can also serve as a standalone method for authentication and authorization. JWTs enable secure data transmission between parties by encoding JSON data in a compact format that can be digitally signed. This is especially useful for stateless authentication in modern web applications.
Based on the app’s design, JWT would be an appropriate solution. I aimed to give users greater control over permissions for GCP services and to ease potential future access to additional GCP resources. For example, connecting to Google Cloud Scheduler, which allows the creation of Cron jobs for scheduled content publication, would be straightforward with JWTs alone.
Ideally, the desktop app would use both OAuth and JWT. Initial authentication and authorization (for GCP resources) would occur with OAuth 2.0, while all subsequent HTTP requests would carry a JWT in the “Authorization” header, typically as a “Bearer” token.
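Fortunately, this is close to how Google’s OAuth 2.0 implementation works: once the flow completes, every request carries the access token as a “Bearer” token in the “Authorization” header. One caveat is that Google’s access tokens are opaque strings rather than JWTs; the JWT in the flow is the ID token issued through OpenID Connect.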
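As a minimal sketch of the “Bearer” convention, the helper below builds the header pair that each request to a Google API would carry. The function name authorization_header is hypothetical and not taken from LetterFlow; the token is passed through unchanged.

```rust
/// Builds the `Authorization` header for an authenticated request.
/// `access_token` is whatever the token endpoint returned; this helper
/// does not inspect or validate it.
fn authorization_header(access_token: &str) -> (&'static str, String) {
    ("Authorization", format!("Bearer {access_token}"))
}
```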
OAuth 2.0 Flow
So what actually happens during an OAuth 2.0 flow within the desktop app?
When a user starts the app, it checks in the background whether credentials from a previous session exist. Session data is stored locally using keyring, a Rust crate that leverages platform-specific credential storage: the “Windows Credential Manager” on Windows, the DBus-based “Secret Service” or “kernel keyutils” on Linux, and the local keychain on macOS. Users must approve this storage on initial use or intermittently.
A session’s data structure includes an OAuth 2.0 token (access token) and, if applicable, a refresh token, as well as basic user details like name and the last-selected Firebase project.
If a valid access token from a prior session exists, the OAuth 2.0 flow is skipped, and the user gains access to their documents directly.
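The session check described above can be sketched roughly as follows. The struct and its field names are assumptions for illustration, as is the expires_at field, which is one way to decide locally whether a stored access token is still usable; LetterFlow’s actual schema may differ.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical shape of the locally stored session.
#[derive(Debug, Clone)]
struct Session {
    access_token: String,
    /// Only present if the provider issued one.
    refresh_token: Option<String>,
    /// Assumed field: the moment the access token stops being accepted.
    expires_at: SystemTime,
    user_name: String,
    last_project_id: Option<String>,
}

impl Session {
    /// True if the access token is still usable at `now`. The `margin`
    /// keeps a request from starting with a token that is about to expire.
    fn has_valid_access_token(&self, now: SystemTime, margin: Duration) -> bool {
        match self.expires_at.duration_since(now) {
            Ok(remaining) => remaining > margin,
            Err(_) => false, // `expires_at` already lies in the past
        }
    }
}
```

If has_valid_access_token returns true on startup, the OAuth 2.0 flow can be skipped entirely.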
Access Tokens and Renewal
To obtain an access token, the app first sends the user to Google’s OAuth 2.0 authorization endpoint, where they consent to LetterFlow’s access to their GCP resources. Upon approval, Google redirects back to the app with an authorization code.
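A rough sketch of how the authorization request could be assembled. The endpoint is Google’s documented authorization URL; the function names, the state value, and the choice of access_type=offline (Google’s parameter for also requesting a refresh token) are assumptions for illustration, not LetterFlow’s actual code.

```rust
/// Percent-encodes a query value, keeping only RFC 3986 unreserved characters.
fn percent_encode(input: &str) -> String {
    let mut out = String::new();
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{byte:02X}")),
        }
    }
    out
}

/// Builds the URL the user is sent to in order to grant consent.
/// `state` is a random value the app generates and later compares
/// against the value echoed back in the redirect.
fn authorization_url(client_id: &str, redirect_uri: &str, scopes: &[&str], state: &str) -> String {
    let params = [
        ("client_id", client_id.to_string()),
        ("redirect_uri", redirect_uri.to_string()),
        ("response_type", "code".to_string()),
        ("scope", scopes.join(" ")),
        ("state", state.to_string()),
        // Assumption: a refresh token is wanted as well.
        ("access_type", "offline".to_string()),
    ];
    let query: Vec<String> = params
        .iter()
        .map(|(key, value)| format!("{key}={}", percent_encode(value)))
        .collect();
    format!("https://accounts.google.com/o/oauth2/v2/auth?{}", query.join("&"))
}
```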
This authorization code is then exchanged for an access token via a second HTTP request to Google’s token endpoint. If successful, Google returns an access token and possibly a refresh token.
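The exchange is a form-encoded POST to Google’s token endpoint (https://oauth2.googleapis.com/token). The sketch below only builds the request body; the function names are hypothetical, and depending on the client configuration Google may also expect a client_secret or a PKCE code_verifier, both of which are left out here.

```rust
/// Encodes key/value pairs as an `application/x-www-form-urlencoded` body.
fn form_body(pairs: &[(&str, &str)]) -> String {
    fn encode(value: &str) -> String {
        value
            .bytes()
            .map(|b| match b {
                b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                    (b as char).to_string()
                }
                _ => format!("%{b:02X}"),
            })
            .collect()
    }
    pairs
        .iter()
        .map(|&(key, value)| format!("{}={}", encode(key), encode(value)))
        .collect::<Vec<_>>()
        .join("&")
}

/// Body for exchanging an authorization code for tokens.
/// `redirect_uri` must match the one used in the authorization request.
fn code_exchange_body(client_id: &str, code: &str, redirect_uri: &str) -> String {
    form_body(&[
        ("grant_type", "authorization_code"),
        ("code", code),
        ("client_id", client_id),
        ("redirect_uri", redirect_uri),
    ])
}
```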
LetterFlow stores and uses these tokens to access the user’s Cloud Firestore resources. The access token has a limited lifespan (usually one hour), so it’s crucial to renew it in time. This is done with the refresh token, if available, to generate a new access token without needing the user to sign in again.
If the refresh token has expired or is invalid, the user must repeat the OAuth 2.0 flow.
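The renewal logic boils down to a three-way decision, sketched below. The enum, the function name, and the safety margin are assumptions for illustration; the point is only the order of the checks.

```rust
use std::time::{Duration, SystemTime};

/// What the app should do before its next request.
#[derive(Debug, PartialEq)]
enum NextStep {
    UseAccessToken,
    Refresh,
    SignInAgain,
}

/// Decides how to proceed based on the stored token state.
/// `margin` triggers a refresh slightly before the token actually expires.
fn next_step(
    expires_at: SystemTime,
    has_refresh_token: bool,
    now: SystemTime,
    margin: Duration,
) -> NextStep {
    let still_valid = expires_at
        .duration_since(now)
        .map(|remaining| remaining > margin)
        .unwrap_or(false); // Err means `expires_at` is in the past
    if still_valid {
        NextStep::UseAccessToken
    } else if has_refresh_token {
        NextStep::Refresh
    } else {
        NextStep::SignInAgain
    }
}
```

If a refresh attempt is rejected by the token endpoint, the caller would fall back to SignInAgain as well.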
Authentication and Authorization in LetterFlow
The app expects users to sign in with a Google account linked to Firebase resources. Users may have multiple Google accounts, and not all of them will be associated with these resources, so signing in with an unrelated account serves no purpose.
Authentication and authorization are complex processes, especially when cross-platform compatibility is required. For LetterFlow, it took several weeks of testing and fine-tuning, with ongoing improvements for a stable release.
OAuth 2.0 specifies that users are redirected to a redirect URL registered by the application upon completing the sign-in process. Various OAuth flows exist, but the underlying mechanisms are similar.
Some OAuth flows use pop-up windows: a browser opens and guides the user through the process. To support pop-up OAuth flows, native apps need to handle “deep links,” where the redirect URL uses a scheme registered to the application so that the operating system hands the redirect over to it.
LetterFlow employs an “in-app” flow, which I find significantly improves user experience.
Instead of running the OAuth flow in a pop-up, we start a temporary local server specifically for this process, which hosts the redirect URL. Before starting the OAuth flow, a random, available port number is generated to prevent conflicts with other applications.
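One way to get a free port, sketched here with the standard library only, is to bind to port 0 and let the operating system choose. This is a substitute technique rather than a description of LetterFlow’s code, and the function name is hypothetical.

```rust
use std::net::{Ipv4Addr, TcpListener};

/// Binds to port 0 so the OS assigns a free port, then returns the
/// listener together with the redirect URL that points at it.
fn bind_redirect_listener() -> std::io::Result<(TcpListener, String)> {
    let listener = TcpListener::bind((Ipv4Addr::LOCALHOST, 0))?;
    let port = listener.local_addr()?.port();
    Ok((listener, format!("http://127.0.0.1:{port}")))
}
```

Keeping the listener open from this point on avoids the race in which another process claims the port between picking a number and binding to it.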
When initializing the OAuth client in Rust, we configure the temporary redirect URL and start the sign-in process. This keeps the user within the app throughout the flow. Once completed, the temporary server shuts down.
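When the redirect arrives, the temporary server only needs the code parameter from the first line of the HTTP request. A minimal sketch of that parsing step follows; the function names are hypothetical, and a real handler should also compare the returned state value with the one it sent.

```rust
/// Decodes `%XX` escapes; malformed escapes are kept as-is.
fn percent_decode(input: &str) -> String {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        let hi = bytes.get(i + 1).and_then(|b| (*b as char).to_digit(16));
        let lo = bytes.get(i + 2).and_then(|b| (*b as char).to_digit(16));
        match (bytes[i], hi, lo) {
            (b'%', Some(hi), Some(lo)) => {
                out.push((hi * 16 + lo) as u8);
                i += 3;
            }
            (byte, _, _) => {
                out.push(byte);
                i += 1;
            }
        }
    }
    String::from_utf8_lossy(&out).into_owned()
}

/// Extracts the `code` query parameter from an HTTP request line such as
/// `GET /?state=xyz&code=4%2Fabc HTTP/1.1`. Returns `None` if the line is
/// not a GET request or carries no `code` parameter.
fn extract_code(request_line: &str) -> Option<String> {
    let mut parts = request_line.split_whitespace();
    if parts.next()? != "GET" {
        return None;
    }
    let target = parts.next()?;
    let query = target.split_once('?')?.1;
    query
        .split('&')
        .filter_map(|pair| pair.split_once('='))
        .find(|(key, _)| *key == "code")
        .map(|(_, value)| percent_decode(value))
}
```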
Limitations
One drawback of the in-app OAuth flow is that the app loses its reactivity, which is natively supported when using AngularFire (the Angular bindings for Firebase’s JavaScript SDK). Regaining it, however, would mean sacrificing control over infrastructure and data, since users would have to create an account with LetterFlow, which is not the direction I want to go.
Final words
I am still figuring out how best to bring three things together: storing OAuth credentials on the user’s system through LetterFlow, handling automatic token renewal, and providing a seamless experience while the user goes through the OAuth process.