-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: ✨ scale_detections function added to adjust bbox,masks,obb for scaled images #1711
base: develop
Are you sure you want to change the base?
Conversation
bounding box coordinates and masks and obb for scaled images Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
from supervision.detection.core import ORIENTED_BOX_COORDINATES, Detections | ||
|
||
|
||
def scale_detections( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@onuralpszr can we change function name to specific to letterbox otherwise it will confuse users?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hardikdava scale_letterbox_detections ? (maybe) what do you think @LinasKo ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Un-letterboxing feels like a very narrow use case.
We should provide general functions which also enable letterbox reversal, and highlight that case in the examples & docs.
While it might be frustrating, let me take the time to review this in-depth over the weekend & start of next week, for reversing API decisions is hard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LinasKo I can add enum parameter and change other two to src_res and target_res params and based on enum parameter it can be letterbox depended scale or normal scale, I can also add other extra cases to handle normal scale cases without "padding" ?
@onuralpszr a minor change required. Everything else looks good. |
Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a deeper look at the PR.
@SkalskiP, I need your thoughts on this.
- This PR submits the first function to
detection_utils.py
- utils for operations on entire detections objects, rather than their components. We'll move more utils here soon. - The PR makes a common operation simpler:
- Letterbox an image
- Send to a model
- (new) un-letterbox detections, to fit original image
- I'm suggesting we reuse some utils we had (e.g.
move_boxes
), but cannot use others. For example,scale_boxes
cannot be used, as it anchors to detection centers and scales them in-place, whereas here we assume the image changed size, so gaps between detections need to increase as well.
Because of the last point, splitting this entire function into multiple more general methods is tricky.
Ultimately, this is helpful for other projects, but very low value for us right now. Yet if we merge the wrong API, we'll struggle to remove it later.
|
||
boxes = detections.xyxy.copy() | ||
boxes[:, [0, 2]] -= padding_left | ||
boxes[:, [1, 3]] -= padding_top |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use move_boxes
from utils
mask = mask[ | ||
padding_top : padding_top + height_new, | ||
padding_left : padding_left + width_new, | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use move_masks
despite #1715 - we know that padding will be positive.
if ORIENTED_BOX_COORDINATES in detections.data: | ||
obbs = np.array(detections.data[ORIENTED_BOX_COORDINATES]).copy() | ||
obbs[:, :, 0] -= padding_left | ||
obbs[:, :, 1] -= padding_top |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use move_oriented_boxes
from utils
.
from supervision.detection.core import ORIENTED_BOX_COORDINATES, Detections | ||
|
||
|
||
def scale_detections( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest renaming to either scale_detections_to_fill
or undo_letterboxing_detections
.
I'd like some advice from @SkalskiP here - I've explained more in the global review comment.
|
||
Returns: | ||
Detections: A new Detections object with scaled to target resolution. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The doc should come with an example showing how to use this. It should make it evident that letterboxing can be undone using this operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also find a place in the docs where we could showcase the scenario of:
- Letterbox an image
- Send the appropriately sized image to the model
- Undo the letterboxing
Without this, discovery of the method will be very low.
Hey @hardikdava, I took a deeper look. I know I asked for the PR, and it solves your problem, but we need time to think. Most importantly, if we just add it as is, the discovery for it will be very low. I need to speak with @SkalskiP about this first. |
@LinasKo Thanks for taking a look. I have already duplicated the function in other required project. So you can think from the future perspective and based only on supervision project. |
Description
Instead of just handling xyxy values based on previous PR #1708, I approached with Detections object and I handle also mask and obb translations so that it can work for all cases
Test Cases
Test Seg and Bbox Case
Test OBB Case