Validation of Keycloak clustering deployment on Azure Container Apps #47099
Replies: 3 comments 3 replies
Hello again,
Following up on this: since we haven't received feedback yet, we wanted to simplify the question. Given the setup described in the original post:
Is there any known limitation or architectural reason why this setup (Keycloak on Azure Container Apps with embedded clustering) would not be considered reliable in production? In other words, are there known edge cases (e.g. related to container lifecycle, networking, or JGroups behavior) that might not be visible in our current testing scenarios and validations? Any guidance would be greatly appreciated.
Hello @roisanchezriveira, Thank you for your response. In our setup, we are using the Workload Profiles (Consumption_v2) model in Azure Container Apps. Could you please clarify whether your setup was used in a production environment before you encountered the issue? Additionally, it would be helpful to know whether you have tested the -Djgroups.bind.address=GLOBAL configuration in your setup, and if so, whether it had any impact. Thank you in advance for your insights.
Hi there, We had the same problem with replicas failing to communicate with each other for caches on ACA. After many failed attempts, I found this post by accident. I have tried the settings above and can confirm that they work well in our setup. I scaled the replica count up to 10 and back down to 1, and the replicas had no internal communication issues. Logs and server info showed the correct cluster membership as well. I looked further into the changes to understand which flag makes the difference, and I will continue watching the behavior. From my tests, the solution looks promising. Thanks for sharing the snippet, it helps a lot!
Hello,
We would like to ask for feedback regarding a deployment architecture we are currently using for Keycloak clustering on Azure Container Apps, in order to achieve high availability. Since this platform is not commonly referenced in the official documentation, we would like confirmation from the community and maintainers that our approach is considered technically supported.
Environment
Keycloak version: 26.5.0
Hosting platform: Azure Container Apps
Number of instances: 2 to 5
Database: PostgreSQL
Deployment approach
Keycloak is deployed to Azure Container Apps using the following approach:
Keycloak is deployed as a single Azure Container App (azurerm_container_app) named "keycloak", running in Single revision mode with 2–5 replicas for HA.
Container Image: quay.io/keycloak/keycloak:26.5.0
Cluster configuration
To enable clustering, we configured the following values:
KC_CACHE=ispn
KC_CACHE_STACK=jdbc-ping
(These are the defaults for this version of Keycloak.)
All the following configuration is passed via environment variables:
KC_DB=postgres
KC_DB_URL=...
KC_DB_USERNAME=...
KC_DB_PASSWORD=...
KC_HTTP_ENABLED=true
KC_PROXY_HEADERS=xforwarded
KC_HTTP_PORT=8080
KC_HOSTNAME=...
KC_HOSTNAME_STRICT=... (we update this after the Keycloak FQDN has been created by ACA; see phase 1 -> phase 2 in the README)
JAVA_OPTS_APPEND with -Djgroups.* system properties (e.g. "-Djgroups.jdbc.connection_url={...} -Djgroups.jdbc.connection_username={...} -Djgroups.jdbc.connection_password={...} -Djgroups.jdbc.driver_name=postgresql -Djgroups.bind.address=GLOBAL -Djgroups.bind.port=7800 -Djava.net.preferIPv4Stack=true -Djgroups.use.mcast_addr=false")
KC_BOOTSTRAP_ADMIN_USERNAME=...
KC_BOOTSTRAP_ADMIN_PASSWORD=...
KC_HEALTH_ENABLED=true
KC_METRICS_ENABLED=true
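For readers following along, the settings above map to env blocks inside the azurerm_container_app container template in Terraform. A trimmed sketch of the pattern (resource references and the secret name here are illustrative, not the exact ones from our repository):

```hcl
resource "azurerm_container_app" "keycloak" {
  name                         = "keycloak"
  resource_group_name          = azurerm_resource_group.main.name           # illustrative reference
  container_app_environment_id = azurerm_container_app_environment.main.id  # illustrative reference
  revision_mode                = "Single"

  template {
    # 2-5 replicas for HA, as described above
    min_replicas = 2
    max_replicas = 5

    container {
      name   = "keycloak"
      image  = "quay.io/keycloak/keycloak:26.5.0"
      cpu    = 1.0
      memory = "2Gi"

      env {
        name  = "KC_DB"
        value = "postgres"
      }
      env {
        name        = "KC_DB_PASSWORD"
        secret_name = "kc-db-password" # illustrative secret reference
      }
      env {
        name  = "JAVA_OPTS_APPEND"
        value = "-Djgroups.bind.address=GLOBAL -Djgroups.bind.port=7800 ..." # remaining -Djgroups.* flags as listed above
      }
      # ... remaining KC_* variables as listed above
    }
  }
}
```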
Other configured options:
Plain start (not start --optimized) is used, meaning Keycloak performs its auto-build/augmentation phase at startup.
Resource Allocation:
CPU: 1.0 vCPU per replica
Memory: 2 GiB per replica
Ingress:
External HTTPS ingress is enabled via ACA's built-in Envoy load balancer, terminating TLS and forwarding plain HTTP to port 8080 internally.
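In Terraform terms, the ingress described above corresponds roughly to the following block on the azurerm_container_app resource (a sketch; for external ingress, ACA's built-in Envoy terminates TLS before traffic reaches the container):

```hcl
ingress {
  external_enabled = true
  target_port      = 8080   # plain HTTP internally; TLS is terminated by ACA's Envoy
  transport        = "auto"

  traffic_weight {
    latest_revision = true
    percentage      = 100
  }
}
```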
Observed behavior
After deploying multiple instances, our observations indicate that the nodes appear to successfully form a cluster.
Specifically:
Verify HA Cluster
Some results for the HA testing we performed and verified can be found here:
https://github.com/hoolser/terraform-azurerm-keycloak-aca/blob/master/testing-ha-results.md
(Additional detailed tests we performed are documented here:
https://github.com/hoolser/terraform-azurerm-keycloak-aca/blob/master/testing-ha.md
)
Question
Based on the above setup and behavior:
Is this deployment architecture considered a valid and supported Keycloak clustering configuration, or are there potential limitations or risks we should be aware of when running Keycloak on Azure Container Apps in this way?
We have added our working Terraform implementation for Keycloak on ACA to the following GitHub repository (with a detailed README) to help others who are facing similar issues or concerns:
https://github.com/hoolser/terraform-azurerm-keycloak-aca
We would appreciate any feedback or confirmation from the community or maintainers.
Thank you.