Skip to content

Querying MVP in Client v2#48342

Draft
Pepo48 wants to merge 2 commits intokeycloak:mainfrom
Pepo48:issue-48294
Draft

Querying MVP in Client v2#48342
Pepo48 wants to merge 2 commits intokeycloak:mainfrom
Pepo48:issue-48294

Conversation

@Pepo48
Copy link
Copy Markdown
Contributor

@Pepo48 Pepo48 commented Apr 21, 2026

Closes #48294

The query string is parsed by ANTLR4 into a parse tree, then evaluated in-memory against the loaded client representations. This follows the same approach used by the SCIM filter parser and the workflow expression evaluator in model/jpa - both use ANTLR4 grammars with custom error listeners and visitor-based evaluation. The lexer uses two modes (default + list) so that colons inside bracket expressions are treated as part of the entry rather than as field separators.

Queryable fields are restricted to a whitelist in FieldResolver - unknown fields return 400. This mirrors the JpaClientProviderFactory.searchableAttributes pattern where the set of searchable attributes is explicitly controlled. The whitelist currently covers 12 fields across BaseClientRepresentation and OIDCClientRepresentation. For MVP I did not include SAML-specific fields, I guess that's for a follow-up.

Filtering currently happens in-memory after the v1 bridge loads all clients , so .toList() is called before JAX-RS serialization. I labeled it as TODO, we can get back to it once #48286 (comment) is established and we have a better idea of the performance implications. The query is validated upfront (QueryParseUtils.validate()) so errors surface before the stream is consumed - .toList()is not needed (see the review discussion below).

Closes keycloak#48294

Signed-off-by: Peter Zaoral <[email protected]>
Copy link
Copy Markdown
Contributor

@shawkins shawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good start @Pepo48.

I'm hoping there will eventually be re-usable logic with the scim predicate pushdown.

If we want to give clients the ability to directly support the second option from #48286 (comment) we'll need to account for other comparison operators (COLON means equality at this point right?). If not, we'd need pagination control parameters for that.

Do we know that all of field names conform to BAREWORD? I guess we can just make the grammar more json like and allow for the fieldpath to contain quoted strings if needed.

And to confirm, the application of these filters will always just be against the root object in a conjunctive way? There won't for example be a way to filter children. e.g. give me all clients that have a role with description foo vs. give me all clients with only roles that have description foo

Comment on lines +119 to +123
// TODO .toList() materializes all results in memory
return stream
.filter(client -> ClientQueryEvaluator.matches(queryCtx, client))
.toList()
.stream();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// TODO .toList() materializes all results in memory
return stream
.filter(client -> ClientQueryEvaluator.matches(queryCtx, client))
.toList()
.stream();
return stream
.filter(client -> ClientQueryEvaluator.matches(queryCtx, client));

Copy link
Copy Markdown
Contributor Author

@Pepo48 Pepo48 Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deliberately kept toList() here because the validation lives in the visitor - following the same pattern used by SCIM and workflow evaluators that keep logic in visitors rather than in separate validation steps. Since the visitor runs inside stream.filter(), without toList() the exception escapes the try-catch and the user gets a misleading "Cannot parse the JSON" instead of "Unknown query field: unknownField". The toList() forces the evaluation so the exception appears.

If we want to remove it, the validation would need to move out of the visitor into a separate pre-validation step. I tried it earlier but decided to keep the pattern consistent.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the user gets a misleading "Cannot parse the JSON"

Relates to #34857 in particular the most recent comments

If we want to remove it, the validation would need to move out of the visitor into a separate pre-validation step. I tried it earlier but decided to keep the pattern consistent.

We definitely want to keep Streaming where possible. The validation doesn't seem necessary against every single element of the stream as you can validate the accessors and match lists upfront - this is typically logic that "compiles" the actual visitor to use for matching.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's a different approach from what I tried. I tried to address your suggestion. PTAL. Thanks!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Pepo48 that looks good.

@Pepo48
Copy link
Copy Markdown
Contributor Author

Pepo48 commented Apr 21, 2026

I'm hoping there will eventually be re-usable logic with the scim predicate pushdown.

Actually, this was the reason, why I studied how core team uses SCIM filter parser to implement it as close as possible. From what I understand, the main advantage of the visitor pattern is that it gives us an opportunity to be as generic by design - in our case the grammar and parse infrastructure are generic. The ClientQueryEvaluator is a visitor, so a second visitor producing JPA predicates (like ScimJPAPredicateEvaluator does) can be added without changing the grammar. Only FieldResolver is client-specific. So the answer is - I hope so too 😄 because that was my goal - all we would need is a new visitor with its own field to column mapping.

COLON means equality at this point right?

Yes, currently it means strict equality. So correct, the comparison operators would be needed for a cursor-based pagination. The proposal mention them under "key differences" compared to GH syntax - we considered it as a possible improvement. I don't see a problem with it - specifically for the comparison operators the grammar could be just extended (without breaking the existing queries).

Do we know that all of field names conform to BAREWORD?

Currently yes, all 12 whitelisted fields.

guess we can just make the grammar more json like and allow for the fieldpath to contain quoted strings if needed.

Sure, the same as the comparison operators.

And to confirm, the application of these filters will always just be against the root object in a conjunctive way? There won't for example be a way to filter children.

Correct and also correct. Per my current understanding SCIM solves this with bracket syntax on the field path, e.g. roles[description eq "Full access"]. Our brackets are on the value side, e.g. roles:[admin,user], not the field side.

Signed-off-by: Peter Zaoral <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Querying MVP in Client v2

2 participants