Avoid O(N) session-cache scan when removing a user#48758
Open
renekloth wants to merge 1 commit intokeycloak:mainfrom
Open
Avoid O(N) session-cache scan when removing a user#48758renekloth wants to merge 1 commit intokeycloak:mainfrom
renekloth wants to merge 1 commit intokeycloak:mainfrom
Conversation
0fd36ef to
1fcab32
Compare
1fcab32 to
01e3f02
Compare
Member
|
@renekloth, thank you for your PR. Can you share your benchmark? I want to run it myself because it may negatively affect users with multi-az deployments: one network call versus K network calls in networks with >10ms latency. |
01e3f02 to
81201ba
Compare
Closes keycloak#48755 * PersistentUserSessionProvider.onUserRemoved scanned the entire user-session and client-session caches to evict the user's entries -> O(N) where N is number of total sessions in cache. Thus cost scaled with total cached sessions. To fix this, look up the user's session ids first and then remove the cache entries directly -> O(K) where k is number of offline Sessions for specific user * Use still old behavior (full-cache-scan) for useBatches `--spi-user-sessions--infinispan--use-batches=true` option, because cache entries can exist whose INSERT is still in the writer queue and not yet visible to the persister. * additional: rename namedQuery of PersistentUserSessionEntity: `findUserSessionsByUserId` -> `findUserSessionsByUserIdLastSessionRefresh`, because it is conflicting the naming convention and a query without lastSessionRefresh is required Signed-off-by: Rene Buchholz <rene.buchholz@spree.de>
57d4c00 to
ef6e35e
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes per-user delete latency that grows with Infinispan session-cache size
Problem
PersistentUserSessionProvider.onUserRemovedinvokes a fullentrySet().stream()over both the user-session and client-session caches (online and offline) to evict the user's entries. The scan cost scales with the total number of cached sessions, not with the user's own.In a 26.x cluster with
PERSISTENT_USER_SESSIONSenabled and many active offline-token users, the offline-session cache fills up over uptime as sessions are lazily loaded on refresh. Observed delete latency for a user with no sessions of their own:Independent of how many users have been deleted, only of how full the cache is.
Fix
Replace the full cache scan with O(K) targeted eviction:
findUserSessionsByUserId(nolastSessionRefreshfilter — idle-expired-but-still-cached entries must also be evicted).UserSessionPersisterProvider.onUserRemoved.getAdvancedCache().withFlags(...).remove, bypassingJpaChangesPerformerso no per-key DB delete is issued.Cost is now O(K) over the user's own sessions, independent of total cache size.
Async-batch fallback
When
--spi-user-sessions--infinispan--use-batches=trueis active, the cache may legitimately hold entries whose INSERT is still in the writer queue and not yet visible to the persister. In that mode the previous full-cache-scan eviction path is retained as a fallback to remain correct.Naming convention
@NamedQuerynames now make thelastSessionRefreshfilter explicit:findUserSessionsByUserId(with filter) → renamed tofindUserSessionsByUserIdLastSessionRefreshfindUserSessionsByUserIdThe persister method follows the same convention. Discussion item: the other
find*queries with the same filter (findUserSession,findUserSessionsByBrokerSessionId,findUserSessionsByClientId,findUserSessionsByExternalClientId) could be renamed for consistency in a follow-up cleanup.Tests
UserSessionPersisterProviderTest.testFindUserSessionsByUserId— verifies the new query returns idle-expired sessions thatloadUserSessionsStreamwould filter out.UserSessionPersisterProviderTest.testFindUserSessionsByUserIdWithoutClientSessions— regression guard for the LEFT JOIN: a user session without any client session must still be returned (NULL clientId case).Reviewer notes
UserSessionPersisterProvider.onUserRemoved(bulk DELETE byuserId) is unchanged — it remains the authoritative DB cleanup. The new code only changes the cache-eviction path.findUserSessionsByUserIdpersister method has a default implementation in the interface that throwsIllegalStateException; onlyJpaUserSessionPersisterProviderand the persistent-session caller need the dedicated implementation.IDX_OFFLINE_USS_BY_USER(USER_ID, REALM_ID, OFFLINE_FLAG) onOFFLINE_USER_SESSIONexists since 14.0.0 and supports the new query; no schema change needed.