ObjectId Sidebar Filter Fixes #5415

benjaminpkane · 2025-01-21T15:14:29Z

What changes are proposed in this pull request?

Allow invalid ObjectId searches

Allow > 24 hexadecimal character searches
Allow non-hexadecimal character searches
Fixes cursor matching for id fields in embedded lists, e.g. ground_truth.detections.id

How is this patch tested? If it is not, please explain why.

GraphQL tests

Release Notes

Fixed hard errors related to invalid ID searches in the sidebar

What areas of FiftyOne does this PR affect?

App: FiftyOne application changes
Build: Build and test infrastructure changes
Core: Core fiftyone Python library changes
Documentation: FiftyOne documentation changes
Other

Summary by CodeRabbit

New Features
- Enhanced search functionality for ObjectId fields
- Improved query capabilities in lightning server
- Added more flexible search matching criteria
Bug Fixes
- Improved handling of ObjectId search validation
- Fixed search term processing for complex queries
Tests
- Added new unit tests for ObjectId lightning queries
- Expanded test coverage for search functionality

The changes optimize search performance and provide more robust handling of ObjectId fields across the application's query and filtering systems.

coderabbitai · 2025-01-21T15:14:36Z

Walkthrough

This pull request introduces improvements to handling object ID searches across multiple components. The changes focus on enhancing the search functionality for object ID fields in the FiftyOne platform. The modifications span several files, including utility functions, server-side query logic, and test cases. The primary goal is to provide more robust and flexible handling of object ID searches, with specific attention to validation, matching, and query performance.

Changes

File	Change Summary
`app/packages/core/src/components/Filters/StringFilter/state.ts`	Added `isObjectIdString` import and updated `stringSearchResults` selector to handle object ID field searches
`app/packages/core/src/components/Filters/StringFilter/useOnSelect.ts`	Updated type imports for `RecoilState` and `Result`
`app/packages/utilities/src/type-check.ts`	Modified `isObjectIdString` function to support strict and non-strict object ID validation
`fiftyone/server/lightning.py`	Enhanced distinct query functionality with new search handling for object ID fields
`tests/unittests/lightning_tests.py`	Added test class for object ID lightning queries and updated query execution methods

Possibly related PRs

Coinciding filter results for QP sidebar #5121: Modifications to stringSearchResults selectorFamily
Merge release/v1.1.0 to develop #5130: Updates to stringSearchResults selector
Fix bounds for embedded lists in QP #5202: Bounds computations for embedded lists
Improve search results for multikey indexes in group datasets #5208: Search functionality for multikey indexes
Do not collapse valid list field filters #5280: Sidebar summary fields behavior

Suggested Labels

enhancement

Suggested Reviewers

minhtuev
CamronStaley

Poem

🐰 Hop, hop, through the code we go,
Object IDs dancing to and fro,
Searching with precision and might,
Making queries clean and bright!
A rabbit's quest for perfect parse, 🔍
Filtering magic without a sparse!

✨ Finishing Touches

📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

kaixi-wang · 2025-01-21T15:25:02Z

fiftyone/server/lightning.py

@@ -423,6 +425,23 @@ async def _do_distinct_pipeline(
    return values


+def _add_search(query: DistinctQuery):
+    # strip chars after 24


Does this mean sidebar only filters by first 24 chars? So long file path filters could potentially not include the file name?

What's the purpose of limiting to the first 24

Good comment. I refactored this incorrectly. It should only apply to ObjectId fields

I think I will add frontend validation to omit these queries completely. That is a more holistic approach.

An object id is always a 24 char hexadecimal string. The search results can be omitted when the search does not comply

I will undraft when I'm happy with the solution

benjaminpkane · 2025-01-30T22:33:59Z

Undrafting

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (2)

tests/unittests/lightning_tests.py (1)

1175-1218: Enhance test coverage for ObjectId searches.

The current test only covers basic scenarios. Consider adding tests for:

Invalid hex characters
Empty strings
Null values
Mixed case hex strings
Strings shorter than 24 characters

Here's an example of additional test cases:

@drop_async_dataset
async def test_object_ids_edge_cases(self, dataset: fo.Dataset):
    keys = _add_samples(
        dataset,
        dict(id="0" * 24),
        dict(id="a" * 24),
        dict(id="f" * 24)
    )
    
    # Test invalid hex
    result = await _execute(
        query, dataset, fo.ObjectIdField, keys,
        frames=False, search="0123456789abcdefghijklmn"
    )
    for path in result.data["lightning"]:
        self.assertListEqual(path["values"], [])
    
    # Test empty string
    result = await _execute(
        query, dataset, fo.ObjectIdField, keys,
        frames=False, search=""
    )
    for path in result.data["lightning"]:
        self.assertListEqual(path["values"], [])
    
    # Test mixed case
    result = await _execute(
        query, dataset, fo.ObjectIdField, keys,
        frames=False, search="A" * 24
    )
    for path in result.data["lightning"]:
        self.assertListEqual(path["values"], ["a" * 24])

app/packages/core/src/components/Filters/StringFilter/state.ts (1)

60-63: Improve code organization and documentation.

Consider extracting the ObjectId validation logic to a separate function and adding documentation.

Apply this diff:

+      // Helper function to validate ObjectId search strings
+      const isValidObjectIdSearch = (path: string, search: string): boolean => {
+        return !(get(fos.isObjectIdField(path)) && !isObjectIdString(search, false));
+      };
+
-      // for object id searches, skip request when the string is not <= 24 hex
-      if (get(fos.isObjectIdField(path)) && !isObjectIdString(search, false)) {
+      if (!isValidObjectIdSearch(path, search)) {
         return { values: [] };
       }

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd1cfe8 and 8b28480.

📒 Files selected for processing (5)

app/packages/core/src/components/Filters/StringFilter/state.ts (2 hunks)
app/packages/core/src/components/Filters/StringFilter/useOnSelect.ts (1 hunks)
app/packages/utilities/src/type-check.ts (1 hunks)
fiftyone/server/lightning.py (4 hunks)
tests/unittests/lightning_tests.py (5 hunks)

✅ Files skipped from review due to trivial changes (1)

app/packages/core/src/components/Filters/StringFilter/useOnSelect.ts

🧰 Additional context used

📓 Path-based instructions (2)

app/packages/core/src/components/Filters/StringFilter/state.ts (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

app/packages/utilities/src/type-check.ts (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

🪛 Ruff (0.8.2)

fiftyone/server/lightning.py

348-348: Do not assign a lambda expression, use a def

Rewrite matcher as a def

(E731)

351-351: Do not assign a lambda expression, use a def

Rewrite matcher as a def

(E731)

354-354: Do not assign a lambda expression, use a def

Rewrite matcher as a def

(E731)

437-437: Do not use bare except

(E722)

coderabbitai · 2025-01-30T22:36:13Z

app/packages/utilities/src/type-check.ts

+export function isObjectIdString(value: string, strict = true) {
+  return isHex(value) && strict ? value.length === 24 : value.length <= 24;


🛠️ Refactor suggestion

Improve the ObjectId validation logic.

The current implementation has a few issues that could be improved:

The regex pattern should be anchored and shouldn't use the global flag

The function logic can be simplified

Apply this diff to improve the implementation:

-export function isObjectIdString(value: string, strict = true) { - return isHex(value) && strict ? value.length === 24 : value.length <= 24; +export function isObjectIdString(value: string, strict = true) { + if (!value || (strict ? value.length !== 24 : value.length > 24)) { + return false; + } + return /^[0-9a-f]+$/.test(value); +}

This implementation:

Validates the length first (fail fast)

Uses an anchored regex without global flag

Handles empty/null values

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

export function isObjectIdString(value: string, strict = true) {

return isHex(value) && strict ? value.length === 24 : value.length <= 24;

export function isObjectIdString(value: string, strict = true) {

if (!value || (strict ? value.length !== 24 : value.length > 24)) {

return false;

}

return /^[0-9a-f]+$/.test(value);

}

coderabbitai · 2025-01-30T22:36:13Z

fiftyone/server/lightning.py

+    matcher = lambda v: False
    if query.search:
        match = query.search
+        matcher = lambda v: match not in v
+        if query.is_object_id_field:
+            match = match[:_TWENTY_FOUR]
+            matcher = lambda v: v < match


🛠️ Refactor suggestion

Replace lambda functions with regular functions.

Using lambda functions for complex logic reduces readability and maintainability. Convert these to regular functions.

Apply this diff:

- matcher = lambda v: False + def matcher(v): + return False if query.search: match = query.search - matcher = lambda v: match not in v + def matcher(v): + return match not in v if query.is_object_id_field: match = match[:_TWENTY_FOUR] - matcher = lambda v: v < match + def matcher(v): + return v < match

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

matcher = lambda v: False

if query.search:

match = query.search

matcher = lambda v: match not in v

if query.is_object_id_field:

match = match[:_TWENTY_FOUR]

matcher = lambda v: v < match

def matcher(v):

return False

if query.search:

match = query.search

def matcher(v):

return match not in v

if query.is_object_id_field:

match = match[:_TWENTY_FOUR]

def matcher(v):

return v < match

🧰 Tools

🪛 Ruff (0.8.2)

348-348: Do not assign a lambda expression, use a def

Rewrite matcher as a def

(E731)

351-351: Do not assign a lambda expression, use a def

Rewrite matcher as a def

(E731)

354-354: Do not assign a lambda expression, use a def

Rewrite matcher as a def

(E731)

coderabbitai · 2025-01-30T22:36:14Z

fiftyone/server/lightning.py

+def _add_search(query: DistinctQuery):
+    # strip chars after 24
+    if query.is_object_id_field:
+        search = query.search[:_TWENTY_FOUR]
+        add = (_TWENTY_FOUR - len(search)) * "0"
+        if add:
+            search = f"{search}{add}"
+        try:
+            value = {"$gte": ObjectId(search)}
+        except:
+            # search is not valid
+            value = {"$lt": ObjectId("0" * _TWENTY_FOUR)}
+    else:
+        value = Regex(f"^{search}")
+    return {"$match": {query.path: value}}


⚠️ Potential issue

Improve error handling in ObjectId search logic.

The current implementation has a few issues:

Uses bare except clause which can hide errors

Doesn't validate the hex string before attempting ObjectId conversion

Apply this diff:

def _add_search(query: DistinctQuery): - # strip chars after 24 if query.is_object_id_field: search = query.search[:_TWENTY_FOUR] add = (_TWENTY_FOUR - len(search)) * "0" if add: search = f"{search}{add}" try: + if not all(c in '0123456789abcdef' for c in search.lower()): + raise ValueError("Invalid hex string") value = {"$gte": ObjectId(search)} - except: - # search is not valid + except (ValueError, TypeError) as e: + # search is not a valid hex string or ObjectId value = {"$lt": ObjectId("0" * _TWENTY_FOUR)} else: value = Regex(f"^{search}") return {"$match": {query.path: value}}

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def _add_search(query: DistinctQuery):

# strip chars after 24

if query.is_object_id_field:

search = query.search[:_TWENTY_FOUR]

add = (_TWENTY_FOUR - len(search)) * "0"

if add:

search = f"{search}{add}"

try:

value = {"$gte": ObjectId(search)}

except:

# search is not valid

value = {"$lt": ObjectId("0" * _TWENTY_FOUR)}

else:

value = Regex(f"^{search}")

return {"$match": {query.path: value}}

def _add_search(query: DistinctQuery):

if query.is_object_id_field:

search = query.search[:_TWENTY_FOUR]

add = (_TWENTY_FOUR - len(search)) * "0"

if add:

search = f"{search}{add}"

try:

if not all(c in '0123456789abcdef' for c in search.lower()):

raise ValueError("Invalid hex string")

value = {"$gte": ObjectId(search)}

except (ValueError, TypeError) as e:

# search is not a valid hex string or ObjectId

value = {"$lt": ObjectId("0" * _TWENTY_FOUR)}

else:

value = Regex(f"^{search}")

return {"$match": {query.path: value}}

🧰 Tools

🪛 Ruff (0.8.2)

437-437: Do not use bare except

(E722)

benjaminpkane added 2 commits January 21, 2025 10:06

improve object id filtering for qp

5505649

allow non hex search

9ddc487

benjaminpkane added bug Bug fixes app Issues related to App features labels Jan 21, 2025

benjaminpkane self-assigned this Jan 21, 2025

kaixi-wang reviewed Jan 21, 2025

View reviewed changes

bug, add frontend check

8b28480

benjaminpkane marked this pull request as ready for review January 30, 2025 22:34

coderabbitai bot reviewed Jan 30, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ObjectId Sidebar Filter Fixes #5415

ObjectId Sidebar Filter Fixes #5415

benjaminpkane commented Jan 21, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 21, 2025 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

Documentation and Community

kaixi-wang Jan 21, 2025

benjaminpkane Jan 21, 2025

benjaminpkane Jan 21, 2025

benjaminpkane commented Jan 30, 2025

coderabbitai bot left a comment

coderabbitai bot Jan 30, 2025

coderabbitai bot Jan 30, 2025

coderabbitai bot Jan 30, 2025

		export function isObjectIdString(value: string, strict = true) {
		return isHex(value) && strict ? value.length === 24 : value.length <= 24;

ObjectId Sidebar Filter Fixes #5415

Are you sure you want to change the base?

ObjectId Sidebar Filter Fixes #5415

Conversation

benjaminpkane commented Jan 21, 2025 • edited by coderabbitai bot Loading

What changes are proposed in this pull request?

How is this patch tested? If it is not, please explain why.

Release Notes

What areas of FiftyOne does this PR affect?

Summary by CodeRabbit

coderabbitai bot commented Jan 21, 2025 • edited Loading

Walkthrough

Changes

Possibly related PRs

Suggested Labels

Suggested Reviewers

Poem

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

Documentation and Community

kaixi-wang Jan 21, 2025

Choose a reason for hiding this comment

benjaminpkane Jan 21, 2025

Choose a reason for hiding this comment

benjaminpkane Jan 21, 2025

Choose a reason for hiding this comment

benjaminpkane commented Jan 30, 2025

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jan 30, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 30, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 30, 2025

Choose a reason for hiding this comment

benjaminpkane commented Jan 21, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 21, 2025 •

edited

Loading