Skip to content

KEYCLOAK-18727 Improve user search query#8346

Merged
hmlnarik merged 1 commit intokeycloak:mainfrom
bosch-io:feature/KEYCLOAK-18727-Improve-user-search-query
Jan 26, 2022
Merged

KEYCLOAK-18727 Improve user search query#8346
hmlnarik merged 1 commit intokeycloak:mainfrom
bosch-io:feature/KEYCLOAK-18727-Improve-user-search-query

Conversation

@artur-baltabayev
Copy link
Copy Markdown
Contributor

@artur-baltabayev artur-baltabayev commented Aug 4, 2021

The purpose of this PR is to improve the performance of the user search (specifically when you search for a user in the UI),
since customers were facing bad performance on systems with a large amount of users. The implementation is following
a suggestion from @hmlnarik . You can find the full discussion here: https://groups.google.com/g/keycloak-dev/c/jn5Yy9Y4Lw4 .

TL;DR :
You can now search for a user in 3 different ways:

  1. Prefix, e.g : foo or foo* (new default)
  2. Exact, e.g : "foo"
  3. Infix, e.g : *foo* (old default)

However, we would like to make the following remarks and like to hear your feedback:

  1. Changing the default behavior of the user search endpoint is technically breaking the API. One might also consider making a new endpoint and let the UI use it instead.

  2. For consistency, we also changed the user count (if a search string is part of the input) to work similarly to the user search.

  3. We only adjusted the JpaUserProvider and MapUserProvider because we wanted to hear your feedback on whether it is also necessary to adjust the LdapStorageProvider.

  4. While adjusting the JpaUserProvider and MapUserProvider, we decided that it would be consistent to unify their queries. This means the JpaUserProvider now searches for the search string on both the FIRST_NAME and LAST_NAME columns separately instead of concatenating them with coalesc. Also the search string gets split on both providers now.

@artur-baltabayev
Copy link
Copy Markdown
Contributor Author

I also wanted to add that i'm on vacation for three weeks from 14.08 until 5.09. I won't be able to respond during that time.

Comment thread model/map/src/main/java/org/keycloak/models/map/user/MapUserProvider.java Outdated
@artur-baltabayev artur-baltabayev force-pushed the feature/KEYCLOAK-18727-Improve-user-search-query branch 2 times, most recently from 714cec2 to a622a50 Compare August 5, 2021 09:35
@gmorse81
Copy link
Copy Markdown

gmorse81 commented Jan 4, 2022

Is there any chance this change can be merged? I also have this issue searching for users.

@hmlnarik hmlnarik self-assigned this Jan 4, 2022
@hmlnarik hmlnarik added area/storage Indicates an issue that touches storage (change in data layout or data manipulation) kind/enhancement Categorizes a PR related to an enhancement labels Jan 4, 2022
@hmlnarik hmlnarik requested review from sguilhen and vramik January 4, 2022 20:31
Copy link
Copy Markdown
Contributor

@vramik vramik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @artur-baltabayev for the PR, could you please do the rebase? There has been done some changes in codebase what will probably affect your PR. Thanks.

@artur-baltabayev artur-baltabayev force-pushed the feature/KEYCLOAK-18727-Improve-user-search-query branch from a622a50 to 0680a01 Compare January 10, 2022 15:02
@artur-baltabayev
Copy link
Copy Markdown
Contributor Author

artur-baltabayev commented Jan 10, 2022

@vramik Thank you for your response. I just rebased the branch and it's up to date with master. But somehow there is still a merge conflict displayed. Maybe it just needs time to synchronize.

@hmlnarik
Copy link
Copy Markdown
Contributor

@artur-baltabayev I believe that it has been synced with master. Keycloak's main branch is now called main which causes showing the conflicts. Could you please rebase to the latest main?

@vramik
Copy link
Copy Markdown
Contributor

vramik commented Jan 11, 2022

I've started discussion regarding the change of the default behavior: #9507

@artur-baltabayev artur-baltabayev force-pushed the feature/KEYCLOAK-18727-Improve-user-search-query branch from 0680a01 to 600c2d6 Compare January 11, 2022 15:42
@artur-baltabayev
Copy link
Copy Markdown
Contributor Author

@hmlnarik Thanks for the hint. Now it's rebased on the main branch,

@vramik
Copy link
Copy Markdown
Contributor

vramik commented Jan 12, 2022

@hmlnarik Thanks for the hint. Now it's rebased on the main branch,

Thank you @artur-baltabayev, it seems there has happened some issue as the branch you're sending the PR from seems to be 118 commits behind keycloak:main, could you please double-check it?

Copy link
Copy Markdown
Contributor

@vramik vramik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it seems good to me from the first look. I have 2 comments for now (please see inline) and I'll look at it once it'll be rebased, thank you @artur-baltabayev

Comment on lines 848 to 849
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried it and it seems the issue is not reproducible on current main.

Copy link
Copy Markdown
Contributor Author

@artur-baltabayev artur-baltabayev Jan 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed for undertow. See comment below.

@artur-baltabayev artur-baltabayev force-pushed the feature/KEYCLOAK-18727-Improve-user-search-query branch 8 times, most recently from b4df847 to 5ae27ab Compare January 13, 2022 15:08
@artur-baltabayev
Copy link
Copy Markdown
Contributor Author

@vramik Implemented your feedback. I also removed some imports that were unused in the test classes.

@artur-baltabayev
Copy link
Copy Markdown
Contributor Author

artur-baltabayev commented Jan 14, 2022

@vramik Ok i remember now that the single wildcard workaround was needed for undertow. As you can see, currently every undertow test that searches for a empty string fails. That is because it defaults to a 'empty' prefix search which is represented by a single wildcard character (i.e. '%'). We couldn't come up with another fix for this problem, so i would appreciate suggestions. Otherwise i can add the workaround again.

@vramik
Copy link
Copy Markdown
Contributor

vramik commented Jan 19, 2022

Thank you @artur-baltabayev. Please correct me if I'm wrong, but I think search for empty string is equivalent to getting all users. In that case it could be checked in UsersResource e.g. following way:

if (search != null && ! search.trim().isEmpty())

@artur-baltabayev
Copy link
Copy Markdown
Contributor Author

@vramik You mean we should move the workaround to the UsersResource instead of being MapProvider specific ?

@vramik
Copy link
Copy Markdown
Contributor

vramik commented Jan 19, 2022

@vramik You mean we should move the workaround to the UsersResource instead of being MapProvider specific ?

I don't think there should be added any kind of workaround. When you asked for suggestions regarding the empty string searches I've realized that empty string search should be considered as equal to getting all users.

It should work for both legacy-jpa and map-storage, why exactly it is failing for map storage and not for legacy-jpa?

@artur-baltabayev
Copy link
Copy Markdown
Contributor Author

@vramik My observation was that the test cases failed after i removed the workaround. But since you were pointing out (https://issues.redhat.com/browse/KEYCLOAK-18969), that the mentioned single wildcard issue no longer exists i will have to do some further testing before i can answer you question.

@artur-baltabayev
Copy link
Copy Markdown
Contributor Author

Hello @vramik , after i did some more research i can now explain why you couldn't recreate the mentioned single wildcard issue (https://issues.redhat.com/browse/KEYCLOAK-18969) and why it's failing in the first place :

You can only recreate the issue on this branch (not on main) since it is tied to how the new user search works. I recreated it locally by running the following command : mvn clean install -nsu -B -Pauth-server-undertow -Pmap-storage -Dtest=UserTest -Dkeycloak.logging.level=info -f testsuite/integration-arquillian/tests/base/pom.xml .

The error is map storage specific because it comes from the 'like' and 'ilike' methods in the CriteriaOperator class. There is a substring call that wasn't made for input strings that only contain one character (in our case it's the single wildcard).
Here is the actual code snippet (affected logic is similar between the 'like' and 'ilike' method) :

public static Predicate<Object> ilike(Object[] value) {
    Object value0 = getFirstArrayElement(value);
    if (value0 instanceof String) {
        String sValue = (String) value0;
        boolean anyBeginning = sValue.startsWith("%");
        boolean anyEnd = sValue.endsWith("%");

        Pattern pValue = Pattern.compile(
          (anyBeginning ? ".*" : "")
          + Pattern.quote(sValue.substring(anyBeginning ? 1 : 0, sValue.length() - (anyEnd ? 1 : 0)))
          + (anyEnd ? ".*" : ""),
          Pattern.CASE_INSENSITIVE + Pattern.DOTALL
        );
        return o -> {
            return o instanceof String && pValue.matcher((String) o).matches();
        };
    }
    return ALWAYS_FALSE;
}

If value0 equals to %, both anyBeginning and anyEnd will be evaluated to true which leads to a substring(1, 0) call. This produces a java.lang.StringIndexOutOfBoundsException: String index out of range: -1 exception. Now considering all search queries originally started and ended with a wildcard, there wasn't a need to do a corner case check for a single character string. But now every search without a asterisk (*) in it defaults to a prefix search, and prefix searches are implemented to end with a single wildcard. This means that a empty search string will get translated to a single wildcard character. I hope the issue is clearer now.

I have several suggestions to fix this issue :

  1. Going back to the old map storage specific fix.
  2. Defaulting every empty search string to a double wildcard string instead of just the map storage.
  3. Fixing the issue in the map storage directly (maybe it could be part of this PR ?).

@hmlnarik
Copy link
Copy Markdown
Contributor

Option 3 please. If value0 consists only of % characters (i.e. matches ^%+$ regex), the predicate should be just returning true unconditionally.

@artur-baltabayev artur-baltabayev force-pushed the feature/KEYCLOAK-18727-Improve-user-search-query branch from 5ae27ab to 9a1c9dc Compare January 21, 2022 09:10
@artur-baltabayev
Copy link
Copy Markdown
Contributor Author

@hmlnarik @vramik Implemented the fix in the CriteriaOperator class. Now every string that matches the suggested pattern will be evaluated to true.

Copy link
Copy Markdown
Contributor

@vramik vramik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @artur-baltabayev for the update. The code looks good to me. Approving.

@hmlnarik
Copy link
Copy Markdown
Contributor

@artur-baltabayev Thank you for the code changes, they now look good to me. Could you please prepare a PR for release notes in https://github.com/keycloak/keycloak-documentation/blob/main/release_notes/topics for version 17 (if not present, please introduce file 17_0_0.adoc)? The note in the release notes should cover the information stated in the description of this PR.

@artur-baltabayev
Copy link
Copy Markdown
Contributor Author

artur-baltabayev commented Jan 24, 2022

@vramik @hmlnarik Thank you two a lot for all the feedback! I just made the documentation PR : keycloak/keycloak-documentation#1378 .

Copy link
Copy Markdown
Contributor

@hmlnarik hmlnarik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR!

@hmlnarik hmlnarik merged commit 9621d51 into keycloak:main Jan 26, 2022
@hmlnarik hmlnarik linked an issue Jan 26, 2022 that may be closed by this pull request
@sguilhen sguilhen removed their request for review February 8, 2022 19:53
@rosario-vandebron
Copy link
Copy Markdown

Just out of curiosity using % in the search parameter instead of * will produce the infix/prefix search result. Is it wanted?

@mkopeyka
Copy link
Copy Markdown

Good improvement. Just one question why it doesn't include the case *value?

@danielFesenmeyer
Copy link
Copy Markdown
Contributor

Good improvement. Just one question why it doesn't include the case *value?

It's maybe not that intuitive, but you should be able to handle that case like that: "%value"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/storage Indicates an issue that touches storage (change in data layout or data manipulation) kind/enhancement Categorizes a PR related to an enhancement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve user search query

8 participants