-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ObjectId Sidebar Filter Fixes #5415
base: develop
Are you sure you want to change the base?
Conversation
WalkthroughThis pull request introduces improvements to handling object ID searches across multiple components. The changes focus on enhancing the search functionality for object ID fields in the FiftyOne platform. The modifications span several files, including utility functions, server-side query logic, and test cases. The primary goal is to provide more robust and flexible handling of object ID searches, with specific attention to validation, matching, and query performance. Changes
Possibly related PRs
Suggested Labels
Suggested Reviewers
Poem
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
@@ -423,6 +425,23 @@ async def _do_distinct_pipeline( | |||
return values | |||
|
|||
|
|||
def _add_search(query: DistinctQuery): | |||
# strip chars after 24 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean sidebar only filters by first 24 chars? So long file path filters could potentially not include the file name?
What's the purpose of limiting to the first 24
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good comment. I refactored this incorrectly. It should only apply to ObjectId fields
I think I will add frontend validation to omit these queries completely. That is a more holistic approach.
An object id is always a 24 char hexadecimal string. The search results can be omitted when the search does not comply
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will undraft when I'm happy with the solution
Undrafting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (2)
tests/unittests/lightning_tests.py (1)
1175-1218
: Enhance test coverage for ObjectId searches.The current test only covers basic scenarios. Consider adding tests for:
- Invalid hex characters
- Empty strings
- Null values
- Mixed case hex strings
- Strings shorter than 24 characters
Here's an example of additional test cases:
@drop_async_dataset async def test_object_ids_edge_cases(self, dataset: fo.Dataset): keys = _add_samples( dataset, dict(id="0" * 24), dict(id="a" * 24), dict(id="f" * 24) ) # Test invalid hex result = await _execute( query, dataset, fo.ObjectIdField, keys, frames=False, search="0123456789abcdefghijklmn" ) for path in result.data["lightning"]: self.assertListEqual(path["values"], []) # Test empty string result = await _execute( query, dataset, fo.ObjectIdField, keys, frames=False, search="" ) for path in result.data["lightning"]: self.assertListEqual(path["values"], []) # Test mixed case result = await _execute( query, dataset, fo.ObjectIdField, keys, frames=False, search="A" * 24 ) for path in result.data["lightning"]: self.assertListEqual(path["values"], ["a" * 24])app/packages/core/src/components/Filters/StringFilter/state.ts (1)
60-63
: Improve code organization and documentation.Consider extracting the ObjectId validation logic to a separate function and adding documentation.
Apply this diff:
+ // Helper function to validate ObjectId search strings + const isValidObjectIdSearch = (path: string, search: string): boolean => { + return !(get(fos.isObjectIdField(path)) && !isObjectIdString(search, false)); + }; + - // for object id searches, skip request when the string is not <= 24 hex - if (get(fos.isObjectIdField(path)) && !isObjectIdString(search, false)) { + if (!isValidObjectIdSearch(path, search)) { return { values: [] }; }
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
app/packages/core/src/components/Filters/StringFilter/state.ts
(2 hunks)app/packages/core/src/components/Filters/StringFilter/useOnSelect.ts
(1 hunks)app/packages/utilities/src/type-check.ts
(1 hunks)fiftyone/server/lightning.py
(4 hunks)tests/unittests/lightning_tests.py
(5 hunks)
✅ Files skipped from review due to trivial changes (1)
- app/packages/core/src/components/Filters/StringFilter/useOnSelect.ts
🧰 Additional context used
📓 Path-based instructions (2)
app/packages/core/src/components/Filters/StringFilter/state.ts (1)
Pattern **/*.{ts,tsx}
: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.
app/packages/utilities/src/type-check.ts (1)
Pattern **/*.{ts,tsx}
: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.
🪛 Ruff (0.8.2)
fiftyone/server/lightning.py
348-348: Do not assign a lambda
expression, use a def
Rewrite matcher
as a def
(E731)
351-351: Do not assign a lambda
expression, use a def
Rewrite matcher
as a def
(E731)
354-354: Do not assign a lambda
expression, use a def
Rewrite matcher
as a def
(E731)
437-437: Do not use bare except
(E722)
export function isObjectIdString(value: string, strict = true) { | ||
return isHex(value) && strict ? value.length === 24 : value.length <= 24; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve the ObjectId validation logic.
The current implementation has a few issues that could be improved:
- The regex pattern should be anchored and shouldn't use the global flag
- The function logic can be simplified
Apply this diff to improve the implementation:
-export function isObjectIdString(value: string, strict = true) {
- return isHex(value) && strict ? value.length === 24 : value.length <= 24;
+export function isObjectIdString(value: string, strict = true) {
+ if (!value || (strict ? value.length !== 24 : value.length > 24)) {
+ return false;
+ }
+ return /^[0-9a-f]+$/.test(value);
+}
This implementation:
- Validates the length first (fail fast)
- Uses an anchored regex without global flag
- Handles empty/null values
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
export function isObjectIdString(value: string, strict = true) { | |
return isHex(value) && strict ? value.length === 24 : value.length <= 24; | |
export function isObjectIdString(value: string, strict = true) { | |
if (!value || (strict ? value.length !== 24 : value.length > 24)) { | |
return false; | |
} | |
return /^[0-9a-f]+$/.test(value); | |
} |
matcher = lambda v: False | ||
if query.search: | ||
match = query.search | ||
matcher = lambda v: match not in v | ||
if query.is_object_id_field: | ||
match = match[:_TWENTY_FOUR] | ||
matcher = lambda v: v < match |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Replace lambda functions with regular functions.
Using lambda functions for complex logic reduces readability and maintainability. Convert these to regular functions.
Apply this diff:
- matcher = lambda v: False
+ def matcher(v):
+ return False
if query.search:
match = query.search
- matcher = lambda v: match not in v
+ def matcher(v):
+ return match not in v
if query.is_object_id_field:
match = match[:_TWENTY_FOUR]
- matcher = lambda v: v < match
+ def matcher(v):
+ return v < match
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
matcher = lambda v: False | |
if query.search: | |
match = query.search | |
matcher = lambda v: match not in v | |
if query.is_object_id_field: | |
match = match[:_TWENTY_FOUR] | |
matcher = lambda v: v < match | |
def matcher(v): | |
return False | |
if query.search: | |
match = query.search | |
def matcher(v): | |
return match not in v | |
if query.is_object_id_field: | |
match = match[:_TWENTY_FOUR] | |
def matcher(v): | |
return v < match |
🧰 Tools
🪛 Ruff (0.8.2)
348-348: Do not assign a lambda
expression, use a def
Rewrite matcher
as a def
(E731)
351-351: Do not assign a lambda
expression, use a def
Rewrite matcher
as a def
(E731)
354-354: Do not assign a lambda
expression, use a def
Rewrite matcher
as a def
(E731)
def _add_search(query: DistinctQuery): | ||
# strip chars after 24 | ||
if query.is_object_id_field: | ||
search = query.search[:_TWENTY_FOUR] | ||
add = (_TWENTY_FOUR - len(search)) * "0" | ||
if add: | ||
search = f"{search}{add}" | ||
try: | ||
value = {"$gte": ObjectId(search)} | ||
except: | ||
# search is not valid | ||
value = {"$lt": ObjectId("0" * _TWENTY_FOUR)} | ||
else: | ||
value = Regex(f"^{search}") | ||
return {"$match": {query.path: value}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improve error handling in ObjectId search logic.
The current implementation has a few issues:
- Uses bare except clause which can hide errors
- Doesn't validate the hex string before attempting ObjectId conversion
Apply this diff:
def _add_search(query: DistinctQuery):
- # strip chars after 24
if query.is_object_id_field:
search = query.search[:_TWENTY_FOUR]
add = (_TWENTY_FOUR - len(search)) * "0"
if add:
search = f"{search}{add}"
try:
+ if not all(c in '0123456789abcdef' for c in search.lower()):
+ raise ValueError("Invalid hex string")
value = {"$gte": ObjectId(search)}
- except:
- # search is not valid
+ except (ValueError, TypeError) as e:
+ # search is not a valid hex string or ObjectId
value = {"$lt": ObjectId("0" * _TWENTY_FOUR)}
else:
value = Regex(f"^{search}")
return {"$match": {query.path: value}}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def _add_search(query: DistinctQuery): | |
# strip chars after 24 | |
if query.is_object_id_field: | |
search = query.search[:_TWENTY_FOUR] | |
add = (_TWENTY_FOUR - len(search)) * "0" | |
if add: | |
search = f"{search}{add}" | |
try: | |
value = {"$gte": ObjectId(search)} | |
except: | |
# search is not valid | |
value = {"$lt": ObjectId("0" * _TWENTY_FOUR)} | |
else: | |
value = Regex(f"^{search}") | |
return {"$match": {query.path: value}} | |
def _add_search(query: DistinctQuery): | |
if query.is_object_id_field: | |
search = query.search[:_TWENTY_FOUR] | |
add = (_TWENTY_FOUR - len(search)) * "0" | |
if add: | |
search = f"{search}{add}" | |
try: | |
if not all(c in '0123456789abcdef' for c in search.lower()): | |
raise ValueError("Invalid hex string") | |
value = {"$gte": ObjectId(search)} | |
except (ValueError, TypeError) as e: | |
# search is not a valid hex string or ObjectId | |
value = {"$lt": ObjectId("0" * _TWENTY_FOUR)} | |
else: | |
value = Regex(f"^{search}") | |
return {"$match": {query.path: value}} |
🧰 Tools
🪛 Ruff (0.8.2)
437-437: Do not use bare except
(E722)
What changes are proposed in this pull request?
Allow invalid ObjectId searches
Allow > 24 hexadecimal character searches
Allow non-hexadecimal character searches
Fixes cursor matching for id fields in embedded lists, e.g.
ground_truth.detections.id
How is this patch tested? If it is not, please explain why.
GraphQL tests
Release Notes
What areas of FiftyOne does this PR affect?
fiftyone
Python library changesSummary by CodeRabbit
New Features
Bug Fixes
Tests
The changes optimize search performance and provide more robust handling of ObjectId fields across the application's query and filtering systems.