
Regarding the class att_rnn #6

Open

infected4098 opened this issue Aug 4, 2023 · 0 comments

infected4098 commented Aug 4, 2023

Hi. This is Yong Joon Lee. I am implementing the LAS model based on your code. I know you might not remember the details, since you wrote this about three years ago, but I think the class att_rnn has a small mistake in its code ordering. In the call part of att_rnn, you update s twice in a row and only then compute c, the attention context.

Your ordering is as follows:

s       = self.rnn(inputs = inputs, states = states) # s = m_{t}, [m_{t}, c_{t}] #m is memory(hidden) and c is carry(cell)
s       = self.rnn2(inputs=s[0], states = s[1])[1] # s = m_{t+1}, c_{t+1}
c       = self.attention_context([s[0], h])

But isn't it supposed to be as follows?

s       = self.rnn(inputs = inputs, states = states) # s = m_{t}, [m_{t}, c_{t}]
c       = self.attention_context([s[0], h]) 
s       = self.rnn2(inputs=s[0], states = s[1])[1] # s = m_{t+1}, c_{t+1}

As the original paper suggests, the attention context vector at timestep t is computed by applying attention to s_t and h, where h is the output of the pBLSTM (the listener). With your ordering, however, the attention context vector is derived from s_{t+1} and h. Thank you for your great work.
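
To make the suggestion concrete, here is a minimal, self-contained sketch of the corrected ordering as I understand it. The LSTMCell layers and the dot-product attention below are only my placeholders for illustration; your actual rnn, rnn2, and attention_context layers may be defined differently.

import tensorflow as tf

class AttRNN(tf.keras.layers.Layer):
    def __init__(self, units):
        super().__init__()
        self.rnn = tf.keras.layers.LSTMCell(units)   # first speller LSTM cell
        self.rnn2 = tf.keras.layers.LSTMCell(units)  # second speller LSTM cell

    @staticmethod
    def attention_context(query_and_values):
        # Placeholder dot-product attention: score s_t against every listener frame h_u.
        s, h = query_and_values                    # s: (batch, units), h: (batch, U, units)
        scores = tf.einsum('bd,bud->bu', s, h)     # alignment scores over listener timesteps
        alpha = tf.nn.softmax(scores, axis=-1)     # attention weights
        return tf.einsum('bu,bud->bd', alpha, h)   # context vector c_t

    def call(self, inputs, states, h):
        s = self.rnn(inputs=inputs, states=states)  # s = m_{t}, [m_{t}, c_{t}]
        c = self.attention_context([s[0], h])       # c_t from s_{t} and h, as in the paper
        s = self.rnn2(inputs=s[0], states=s[1])[1]  # advance to m_{t+1}, c_{t+1}
        return c, s

The only change from the current code is that attention_context is called on s[0] before rnn2 advances the state, so the context at step t really comes from s_t.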
