Common problems with Keycloak: Prevent Keycloak from becoming a single point of failure

Navigate safely through the challenges of Keycloak! In our first article in the new information series “Common problems with Keycloak”, we tackle what is perhaps the most important question: How can I avoid a failure?

Reading time: 4 minutes

Three common problems with Keycloak

Welcome to the new year and the first post in our new blog series: “Common problems with Keycloak”. Today we’re looking at how we can prevent Keycloak from becoming a single point of failure.

Preventing Keycloak from becoming a single point of failure

Keycloak secures web applications and takes over user management for them. As an integral part of the security architecture, its reliability is of the utmost importance. In our experience, Keycloak works very reliably and smoothly. However, if it fails, access to all connected applications can be blocked, which in the worst case leads to significant downtime and loss of revenue.

The following tips should help to make Keycloak operation simple and stable.

Problem: Database

On the technical side, Keycloak requires a persistent database for permanent user storage. If Keycloak is operated with the standard image from Docker Hub, for example, an embedded H2 database intended only for development is used by default. Because its data lives inside the container, everything is lost as soon as the container is removed and recreated, for example during an update. Even if users are integrated via user federation, e.g. from an Active Directory, a persistent database is highly advisable, because Keycloak keeps data such as multi-factor authentication credentials in its own database.
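
As a rough illustration of pointing Keycloak at an external database, the following Python sketch assembles the environment variables read by the current Quarkus-based Keycloak container (KC_DB, KC_DB_URL, KC_DB_USERNAME, KC_DB_PASSWORD). The host name, database name and credentials are placeholders, and older WildFly-based images use different variable names, so treat this as a starting point under those assumptions rather than a definitive configuration.

```python
def keycloak_db_env(vendor: str, host: str, port: int, database: str,
                    username: str, password: str) -> dict[str, str]:
    """Build environment variables that point Keycloak at an external database."""
    # Maps the Keycloak vendor name to the JDBC sub-protocol of its driver.
    jdbc_protocols = {"postgres": "postgresql", "mariadb": "mariadb"}
    if vendor not in jdbc_protocols:
        raise ValueError(f"unsupported vendor in this sketch: {vendor}")
    return {
        "KC_DB": vendor,
        "KC_DB_URL": f"jdbc:{jdbc_protocols[vendor]}://{host}:{port}/{database}",
        "KC_DB_USERNAME": username,
        "KC_DB_PASSWORD": password,
    }


# Placeholder values; in practice the password would come from a secret store.
env = keycloak_db_env("postgres", "db.example.internal", 5432,
                      "keycloak", "keycloak", "change-me")
for name, value in env.items():
    print(f"{name}={value}")
```

The resulting key/value pairs can be passed to the container runtime, for example as an env file for Docker or as environment entries in a Kubernetes deployment.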

Many hosters offer managed databases, including PostgreSQL and MariaDB, both of which are compatible with Keycloak. If you operate the database yourself, we recommend MariaDB for Docker environments, as high-performance cluster operation is easier to set up there. For Kubernetes, there are Helm charts from Bitnami that deploy a PostgreSQL database alongside Keycloak.

Problem: High Availability

How high the reliability requirements should be always depends on the deployment scenario. At a minimum, however, a test environment should be available so that you can check before each update whether all components involved still work together smoothly. A simple option for production is to run Keycloak in standalone clustered mode, e.g. based on Docker, which offers better protection against failure than single-host operation.

But how do we ensure that these systems also remain resilient? One solution is to operate two virtual machines in different availability zones (AZs), each running Keycloak. In this scenario, a managed database from the hoster that is replicated across both AZs can serve as the database. Alternatively, a MariaDB Galera cluster can be used, with one Galera node installed as a Docker container alongside each Keycloak instance.
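
To make the two-AZ idea a little more concrete, here is a minimal Python sketch of the kind of readiness check a load balancer or monitoring job would run against each Keycloak node. It assumes that Keycloak's health endpoints are enabled and reachable under `/health/ready`; the node URLs and the port are hypothetical, and the exact port depends on the Keycloak version and configuration.

```python
import urllib.request
from typing import Callable, Iterable


def http_ready(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection errors, timeouts and HTTP error statuses.
        return False


def ready_nodes(nodes: Iterable[str],
                probe: Callable[[str], bool] = http_ready) -> list[str]:
    """Return the nodes whose readiness endpoint responds successfully."""
    return [node for node in nodes if probe(f"{node}/health/ready")]


# Hypothetical node addresses, one per availability zone.
NODES = ["http://keycloak-az1.example.internal:9000",
         "http://keycloak-az2.example.internal:9000"]

if __name__ == "__main__":
    healthy = ready_nodes(NODES)
    print(f"{len(healthy)} of {len(NODES)} Keycloak nodes are ready")
```

Passing the probe as a parameter keeps the check easy to test without a network and lets you swap in a different transport later.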

If Kubernetes is available, running Keycloak in “Standalone Clustered Mode” is recommended there as well, and it brings the usual advantages of Kubernetes with it. Many providers also allow Kubernetes clusters to span several AZs.

Problem: Scalability

The scalability requirements of Keycloak depend heavily on user behaviour and the intended use. What is the login behaviour of the users? Do users log in evenly throughout the day? How long can a user session remain valid?

In the case of an app, for example, it is desirable to provide users with offline tokens so that they do not have to log in again every time they open the app. Each user then essentially logs in only once, and the offline token is checked whenever the app is used. The load is presumably spread over the course of the day and week.
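
The offline-token flow can be sketched as follows: the app requests the `offline_access` scope at the first login, stores the resulting offline token, and later exchanges it for a fresh access token using the standard OpenID Connect refresh grant. The Python snippet below only builds the request for that exchange. It assumes a public client (no client secret) and the path layout of recent Keycloak versions; the base URL, realm and client ID are hypothetical.

```python
from urllib.parse import urlencode


def token_endpoint(base_url: str, realm: str) -> str:
    """Token endpoint of a realm (path layout of recent Keycloak versions)."""
    return f"{base_url.rstrip('/')}/realms/{realm}/protocol/openid-connect/token"


def refresh_request_body(client_id: str, offline_token: str) -> str:
    """Form-encoded body that exchanges a stored offline token for an access token."""
    return urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "refresh_token": offline_token,
    })


# Hypothetical values for illustration only.
url = token_endpoint("https://sso.example.com/", "demo")
body = refresh_request_body("mobile-app", "stored-offline-token")
print("POST", url)
print(body)
```

Each such exchange is far cheaper for Keycloak than a full interactive login, which is why offline tokens flatten the load profile.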

In business or B2B scenarios, user logins are often concentrated in just a few hours in the morning. In these cases, running Keycloak in a Kubernetes cluster with autoscaling is particularly effective. With this technique, additional Keycloak instances are started automatically once a defined utilisation threshold (CPU/memory) is reached and are shut down again when the load drops, which keeps resource usage efficient.
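
For orientation, the scaling decision of the Kubernetes Horizontal Pod Autoscaler can be approximated in a few lines of Python. The sketch follows the documented rule of scaling the replica count by the ratio of current to target utilisation and skipping changes within a tolerance band (0.1 by default). It deliberately ignores details such as stabilisation windows and pods that are not yet ready, and the replica limits and utilisation figures in the example are made up.

```python
import math


def desired_replicas(current_replicas: int, current_utilisation: float,
                     target_utilisation: float, min_replicas: int,
                     max_replicas: int, tolerance: float = 0.1) -> int:
    """Approximate the Horizontal Pod Autoscaler's replica calculation."""
    ratio = current_utilisation / target_utilisation
    # Close enough to the target: leave the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Never go below or above the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


# Morning login peak: 2 replicas at 90 % CPU with a 60 % target.
print(desired_replicas(2, 90, 60, min_replicas=2, max_replicas=10))
# Quiet afternoon: 4 replicas at 20 % CPU with the same target.
print(desired_replicas(4, 20, 60, min_replicas=2, max_replicas=10))
```

Note that this mechanism scales pods; adding or removing cluster nodes is a separate autoscaling layer that most managed Kubernetes offerings provide.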

In addition, it is always advisable to perform load tests to validate the correct dimensioning of your Keycloak setup or to check the correct autoscaling in Kubernetes. There is a Gatling-based project on GitHub that provides a good basis for such load tests.
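
Before running a load test, it helps to estimate the login rate the test should reach. The following back-of-the-envelope calculation in Python is only a sketch: the user count, the login window, the peak factor and the per-instance throughput are all hypothetical inputs, and the per-instance figure in particular is exactly what a load test is meant to measure.

```python
import math


def peak_logins_per_second(users: int, window_hours: float,
                           peak_factor: float = 1.0) -> float:
    """Average login rate over the window, scaled by an assumed peak factor."""
    average = users / (window_hours * 3600)
    return average * peak_factor


def instances_needed(peak_rate: float, logins_per_second_per_instance: float,
                     spare: int = 1) -> int:
    """Instances required for the peak rate, plus spare capacity for failures."""
    return math.ceil(peak_rate / logins_per_second_per_instance) + spare


# Hypothetical B2B scenario: 20,000 users log in within 2 hours, with short
# bursts assumed to be three times the average rate.
rate = peak_logins_per_second(20_000, window_hours=2, peak_factor=3)
print(f"target rate for the load test: {rate:.1f} logins/s")
print("instances:", instances_needed(rate, logins_per_second_per_instance=5))
```

The spare instance reflects the high-availability goal: the setup should still carry the peak if one instance fails.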

Conclusion

Keycloak is a powerful and reliable solution for web application security. However, careful configuration, coupled with a suitable failover and scaling strategy, is crucial to minimise the risk of a single point of failure. Since operating Keycloak is not without its pitfalls, especially under high load, it is advisable to leave the operation to experts if in doubt. You are also welcome to contact us for help with specific Keycloak problems.

In the next blog post, we will dive deeper into specific problems and how to solve them. Stay tuned to learn how you can make your web applications even more secure!
