Any example of online inference of S4 block? #158

Open
traidn opened this issue Dec 27, 2024 · 5 comments

traidn commented Dec 27, 2024

Are there any examples of how to run inference on an S4 block in recurrent mode? I tried using the step function, but it raises errors. I'm attaching my script. What could be the problem?

import torch
from s4 import S4
from sashimi import ResidualBlock

def s4_block(dim):
    layer = S4(
        d_model=dim,
        d_state=16,
        bidirectional=False,
        dropout=0.0,
        transposed=True,
    )
    return ResidualBlock(
        d_model=dim,
        layer=layer,
        dropout=0.0,
    )

# Set up recurrent stepping on every S4 module
model = s4_block(16)
for module in model.modules():
    if hasattr(module, 'setup_step'):
        module.setup_step(mode="diagonal")
model.eval()

input_seg = torch.randn(1, 16, 100)  # (batch, d_model, seq_len)

# Full-sequence (convolutional) forward pass
full_out, _ = model(input_seg)
print(full_out)

# Streaming (recurrent) inference, one timestep at a time
s4_state = model.default_state()
stream_res = []
for i in range(input_seg.shape[-1]):
    part_input = input_seg[:, :, i]
    print(part_input.shape)
    part_res, s4_state = model.step(part_input, s4_state)
    stream_res.append(part_res)

stream_res = torch.cat(stream_res, dim=2)
print(stream_res)
print(torch.allclose(full_out, stream_res))

sendeniz commented Jan 4, 2025

Hi @traidn, I am working on something similar at the moment. Could you post your error? I tried running your code; it runs despite this error at the beginning:

Diagonalization error: tensor(0.2134, grad_fn=<DistBackward0>)
Diagonalization error: tensor(0.2134, grad_fn=<DistBackward0>)

Is this the error you are referring to? If so, you can find the following in the documentation of sashimi.py:

S4 recurrence mode. Using `diagonal` can speed up generation by 10-20%.
`linear` should be faster theoretically but is slow in practice since it
dispatches more operations (could benefit from fused operations).
Note that `diagonal` could potentially be unstable if the diagonalization is numerically unstable
(although we haven't encountered this case in practice), while `dense` should always be stable.

So setting it to module.setup_step(mode="linear") instead of "diagonal" resolves this issue, even though it is slower than "diagonal". Please let me know if this works for you.
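For reference, a minimal sketch of that change, reusing the s4_block helper from your script above (treat it as a starting point rather than a verified fix):

model = s4_block(16)
for module in model.modules():
    # "linear" avoids the diagonalization that produced the warning above;
    # it is slower than "diagonal" but numerically safer.
    if hasattr(module, 'setup_step'):
        module.setup_step(mode="linear")
model.eval()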

Best Deniz


traidn commented Jan 13, 2025

Hello, thanks for the answer. But I still have problems. I'm trying to run the code above on Windows and on CPU; I guess that could be the reason. But I'm going to run inference on CPU in the future, so I need to check it now. My error looks like this:

CUDA extension for cauchy multiplication not found. Install by going to extensions/cauchy/ and running python setup.py install. This should speed up end-to-end training by 10-50%
[KeOps] Warning : 
    The default C++ compiler could not be found on your system.
    You need to either define the CXX environment variable or a symlink to the g++ command.
    For example if g++-8 is the command you can do
      import os
      os.environ['CXX'] = 'g++-8'
[KeOps] Warning : Cuda libraries were not detected on the system or could not be loaded ; using cpu only mode
Falling back on slow Cauchy kernel. Install at least one of pykeops or the CUDA extension for efficiency.
Falling back on slow Vandermonde kernel. Install pykeops for improved memory efficiency.
tensor([[[-1.7425, -0.1773, -0.7354,  ...,  1.5515, -0.7987, -0.0329],
         [-1.5545, -1.2401,  1.4183,  ...,  3.3138,  0.4148,  1.6980],
         [ 0.9791,  0.7177, -1.2344,  ..., -0.8843, -2.8430,  1.2774],
         ...,
         [-0.0802, -1.6018, -0.6896,  ..., -0.0973,  0.6955,  0.4760],
         [-0.3425,  0.6922,  0.0774,  ..., -0.2113,  1.2491,  0.0076],
         [-1.1684,  1.2272,  0.6717,  ...,  0.1415,  1.3729,  0.5358]]],
       grad_fn=<AddBackward0>)
torch.Size([1, 16])
Traceback (most recent call last):
  File "D:\Python_Projects\aTENNuate_test\S4M\dummy_func.py", line 34, in <module>
    part_res, s4_state = model.step(part_input, s4_state)
  File "D:\Python_Projects\aTENNuate_test\S4M\sashimi.py", line 196, in step
    z, state = self.layer.step(z, state, **kwargs)
  File "D:\Python_Projects\aTENNuate_test\S4M\s4.py", line 1557, in step
    y, next_state = self.kernel.step(u, state) # (B C H)
  File "D:\Python_Projects\aTENNuate_test\S4M\s4.py", line 1353, in step
    y, state = self.kernel.step(u, state, **kwargs)
  File "D:\Python_Projects\aTENNuate_test\S4M\s4.py", line 1021, in step
    new_state = self._step_state(u, state)
  File "D:\Python_Projects\aTENNuate_test\S4M\s4.py", line 928, in _step_state
    b = self.input_contraction(self.dB, u)
  File "D:\Python_Projects\aTENNuate_test\.venv\lib\site-packages\opt_einsum\contract.py", line 763, in __call__
    return self._contract(ops, out, backend, evaluate_constants=evaluate_constants)
  File "D:\Python_Projects\aTENNuate_test\.venv\lib\site-packages\opt_einsum\contract.py", line 693, in _contract
    return _core_contract(list(arrays),
  File "D:\Python_Projects\aTENNuate_test\.venv\lib\site-packages\opt_einsum\contract.py", line 591, in _core_contract
    new_view = _einsum(einsum_str, *tmp_operands, backend=backend, **einsum_kwargs)
  File "D:\Python_Projects\aTENNuate_test\.venv\lib\site-packages\opt_einsum\sharing.py", line 151, in cached_einsum
    return einsum(*args, **kwargs)
  File "D:\Python_Projects\aTENNuate_test\.venv\lib\site-packages\opt_einsum\contract.py", line 353, in _einsum
    return fn(einsum_str, *operands, **kwargs)
  File "D:\Python_Projects\aTENNuate_test\.venv\lib\site-packages\opt_einsum\backends\torch.py", line 45, in einsum
    return torch.einsum(equation, operands)
  File "D:\Python_Projects\aTENNuate_test\.venv\lib\site-packages\torch\functional.py", line 381, in einsum
    return einsum(equation, *_operands)
  File "D:\Python_Projects\aTENNuate_test\.venv\lib\site-packages\torch\functional.py", line 386, in einsum
    return _VF.einsum(equation, operands)  # type: ignore[attr-defined]
RuntimeError: einsum(): the number of subscripts in the equation (1) does not match the number of dimensions (2) for operand 0 and no ellipsis was given

Do you have any idea? Or do I have to switch to Linux and CUDA?


sendeniz commented Jan 13, 2025

Hey @traidn, you have one warning and one error.
I think you can ignore the KeOps warning for now, as it only impacts efficiency or speed; we can look into that later. The error that causes the code to abort is the mismatch in dimensions. It's hard to debug without seeing your code. Could you share how the S4 model is defined and how you initialize it in your code? I think I have an idea of how to solve it.


traidn commented Jan 13, 2025

@sendeniz Oh, it seems I found the error: I was using an old version of S4 that I found in another repo. When I use the code from this repo, the script above works fine. I'll check one more time and close the issue.
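For anyone landing here from the issue title, here is a minimal end-to-end sketch of online (recurrent) inference with the S4 block from this repo. It assumes model.step consumes and returns (batch, d_model) tensors per timestep; if your version returns a trailing time dimension instead, concatenate with torch.cat along the last dimension as in the original script.

import torch
from s4 import S4
from sashimi import ResidualBlock

dim = 16
layer = S4(d_model=dim, d_state=16, bidirectional=False, dropout=0.0, transposed=True)
model = ResidualBlock(d_model=dim, layer=layer, dropout=0.0)

# Prepare recurrent stepping ("linear" is slower but sidesteps diagonalization issues)
for module in model.modules():
    if hasattr(module, 'setup_step'):
        module.setup_step(mode="linear")
model.eval()

x = torch.randn(1, dim, 100)  # (batch, d_model, seq_len)

with torch.no_grad():
    # Full-sequence (convolutional) pass for reference
    full_out, _ = model(x)

    # Streaming (recurrent) pass, one timestep at a time
    state = model.default_state()
    outs = []
    for t in range(x.shape[-1]):
        y_t, state = model.step(x[:, :, t], state)  # assumed (batch, d_model) in and out
        outs.append(y_t)
    stream_out = torch.stack(outs, dim=-1)  # back to (batch, d_model, seq_len)

print(torch.allclose(full_out, stream_out, atol=1e-4))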

@sendeniz

@traidn any updates? Did it work successfully? Is the performance acceptable? Let me know.
