Hi. This is Yong Joon Lee. I am implementing the LAS model based on your code. I know you might not remember the actual code, since you implemented it three years ago, but I think I found a small mistake in the code ordering of the class att_rnn. In the call part of att_rnn, you define s twice in a row and then compute c, the attention context.
Your ordering is as below:
```python
s = self.rnn(inputs=inputs, states=states)    # s = m_t, [m_t, c_t]; m is memory (hidden), c is carry (cell)
s = self.rnn2(inputs=s[0], states=s[1])[1]    # s = [m_{t+1}, c_{t+1}]
c = self.attention_context([s[0], h])
```
But isn't it supposed to be as below?
```python
s = self.rnn(inputs=inputs, states=states)    # s = m_t, [m_t, c_t]
c = self.attention_context([s[0], h])
s = self.rnn2(inputs=s[0], states=s[1])[1]    # s = [m_{t+1}, c_{t+1}]
```
As the original paper suggests, the attention context vector at timestep t is computed by applying attention to s_t and h, where h is the output of the pBLSTM encoder. With your ordering, however, the attention context vector is derived from s_{t+1} and h. Thank you for your great work.
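For reference, here is a minimal sketch of the ordering I am proposing, written as a Keras layer. The LSTMCell sub-layers and the dot-product attention helper are my own assumptions standing in for your actual rnn, rnn2, and attention_context; only the call ordering is the point.

```python
import tensorflow as tf

class AttRNNSketch(tf.keras.layers.Layer):
    """Sketch of one decoder step: attend with the current state s_t,
    then advance the RNN to s_{t+1}."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.rnn = tf.keras.layers.LSTMCell(units)   # stand-in for the repo's rnn
        self.rnn2 = tf.keras.layers.LSTMCell(units)  # stand-in for the repo's rnn2

    def attention_context(self, inputs):
        # Hypothetical dot-product attention; the repo's attention_context
        # layer may differ, only its [s, h] interface is assumed here.
        s, h = inputs                             # s: (batch, units), h: (batch, T, units)
        scores = tf.einsum('bu,btu->bt', s, h)    # alignment energies over encoder steps
        alpha = tf.nn.softmax(scores, axis=-1)    # attention weights
        return tf.einsum('bt,btu->bu', alpha, h)  # context vector c_t

    def call(self, inputs, states, h):
        s = self.rnn(inputs=inputs, states=states)  # s = (m_t, [m_t, c_t])
        c = self.attention_context([s[0], h])       # c_t from s_t, matching the paper
        s = self.rnn2(inputs=s[0], states=s[1])[1]  # advance to [m_{t+1}, c_{t+1}]
        return c, s
```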