Replies: 4 comments
-
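I think it has to do with how the memory is managed:

    use std::mem::ManuallyDrop;
    use wasm_bindgen::prelude::*;

    #[wasm_bindgen]
    pub struct Pointer {
        ptr: *const u8,
        len: usize,
    }

    #[wasm_bindgen]
    pub fn f(bytes: Vec<u8>) -> Pointer {
        // Variant A (low maximum): the Vec is dropped when `f` returns,
        // so its buffer is freed and `ptr` dangles afterwards.
        // let result = bytes.to_vec();

        // Variant B (high maximum, like Vec/Box): the buffer is never dropped by Rust.
        let result = ManuallyDrop::new(bytes.to_vec());

        Pointer {
            ptr: result.as_ptr(),
            len: result.len(),
        }
    }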
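To make the ownership question concrete, here is a minimal sketch in plain Rust (no wasm-bindgen involved) of handing a buffer out as raw parts and taking it back later so it can be freed. The names `RawBytes`, `leak_bytes`, and `reclaim_bytes` are hypothetical and only for illustration; this is not how wasm-bindgen itself is implemented.

```rust
use std::mem::ManuallyDrop;

/// Hypothetical raw view of a byte buffer that Rust no longer owns.
pub struct RawBytes {
    ptr: *mut u8,
    len: usize,
    cap: usize,
}

/// Give up ownership of the Vec without freeing its buffer.
/// Until `reclaim_bytes` is called, the allocation stays alive.
pub fn leak_bytes(bytes: Vec<u8>) -> RawBytes {
    let mut bytes = ManuallyDrop::new(bytes);
    RawBytes {
        ptr: bytes.as_mut_ptr(),
        len: bytes.len(),
        cap: bytes.capacity(),
    }
}

/// Rebuild the Vec so that dropping it releases the buffer.
///
/// # Safety
/// `raw` must come from `leak_bytes` and must not be reclaimed twice.
pub unsafe fn reclaim_bytes(raw: RawBytes) -> Vec<u8> {
    // SAFETY: the caller guarantees the parts describe a live Vec<u8> allocation.
    unsafe { Vec::from_raw_parts(raw.ptr, raw.len, raw.cap) }
}
```

If nothing ever calls the reclaiming side, every call leaves one more live allocation behind, which is the situation variant B above ends up in.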
-
The same issue also occurs on the latest Chrome and Firefox, Node 20.3.1, and Deno 1.36.1.
-
I found out that when the returned bytes are freed, as wasm-bindgen does by default, the benchmarks are normal:

    try {
        const retptr = wasm.__wbindgen_add_to_stack_pointer(-16);
        const ptr0 = passArray8ToWasm0(bytes, wasm.__wbindgen_malloc);
        const len0 = WASM_VECTOR_LEN;
        const ptr1 = passArray8ToWasm0(mask, wasm.__wbindgen_malloc);
        const len1 = WASM_VECTOR_LEN;
        wasm.xor_mod_unsafe(retptr, ptr0, len0, ptr1, len1);
        var r0 = getInt32Memory0()[retptr / 4 + 0];
        var r1 = getInt32Memory0()[retptr / 4 + 1];
        wasm.__wbindgen_free(r0, r1 * 1);
    } finally {
        wasm.__wbindgen_add_to_stack_pointer(16);
    }

So it seems that the more objects there are in memory, the slower the function becomes. Why is this happening?
-
My first guess would be that it isn't slower because there are more objects in memory, but because it can't reuse memory: it has to allocate new memory on every call. Allocation performance is a common thing to optimize first in an application. In theory, your browser's developer tools should be able to tell you exactly what it is spending time on. Did you try that yet?
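As a rough illustration of the "reuse memory" point, here is a sketch in plain Rust contrasting a function that allocates a fresh output on every call with one that writes into a caller-owned buffer. The names `xor_alloc` and `xor_into` are made up for this example, and the repeating-mask XOR is only an assumption about what a function like `xor_mod_unsafe` might do.

```rust
/// Allocates a new Vec for the result on every call.
pub fn xor_alloc(bytes: &[u8], mask: &[u8]) -> Vec<u8> {
    bytes
        .iter()
        .zip(mask.iter().cycle())
        .map(|(b, m)| b ^ m)
        .collect()
}

/// Writes the result into `out`, reusing its existing capacity when it is large enough.
/// Note: an empty `mask` produces an empty result in both functions.
pub fn xor_into(out: &mut Vec<u8>, bytes: &[u8], mask: &[u8]) {
    out.clear(); // keeps the allocation, only resets the length
    out.extend(bytes.iter().zip(mask.iter().cycle()).map(|(b, m)| b ^ m));
}
```

With `xor_into`, a caller can keep one `Vec` around and pass it to every call, so after the first call no further allocation should be needed for inputs of the same size.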
-
Let's suppose a function that takes some Box/Vec in...
...and a custom glue code that just passes the bytes...
...that I run with the following code (using any benchmarking lib)
When benchmarking this, we get the following results
If I change the function to return a boolean and keep the same glue code...
...the benchmark is almost the same
If I change the function to return the bytes as a custom "pointer" struct, and still keep the same glue code...
...the benchmark is still great
But if I return the bytes as a Vec or Box<[u8]>, and still keep the same glue code...
...the minimum is still great, but the maximum is almost 10x worse
WHY
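For anyone who wants to reproduce the minimum/maximum comparison without a benchmarking library, a minimal harness could look like the sketch below. It is plain Rust and assumes a native target where `std::time::Instant` is available; `min_max` is a hypothetical helper name, not part of any library mentioned in this thread.

```rust
use std::time::{Duration, Instant};

/// Runs `f` the given number of times and returns the fastest and slowest call.
/// With zero iterations it returns (Duration::MAX, Duration::ZERO).
pub fn min_max<F: FnMut()>(iterations: u32, mut f: F) -> (Duration, Duration) {
    let mut min = Duration::MAX;
    let mut max = Duration::ZERO;
    for _ in 0..iterations {
        let start = Instant::now();
        f();
        let elapsed = start.elapsed();
        min = min.min(elapsed);
        max = max.max(elapsed);
    }
    (min, max)
}
```

A call such as `min_max(10_000, || { std::hint::black_box(vec![0u8; 1024]); })` measures a closure that allocates on every iteration; a large gap between the two returned values is the kind of spread described above.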