Followup about recent question about Pickle #925
Unanswered
bobby-c-shen
asked this question in
1. Help
This is a follow-up to issue #652, from about 25 days ago.
If this is an inappropriate place to ask, I completely understand. (In that case, I would be very grateful if you could point me to somewhere I might find the answers, as I don't know where to look.)
Following that issue, I am wondering whether my outline of decoding, at least for the AAC and AC3 audio codecs, is accurate.
Let's assume that all sample rates are 48000 and that all audio is mono.
Main section 1. Is my outline of the decoding process correct? Which points are wrong? Some of the points are from https://libav.org/documentation/doxygen/master/group__lavc__encdec.html.
6.1. The internal state of D is potentially very complex. (how complex?)
https://wiki.multimedia.cx/index.php/Understanding_AAC
--
Main section 2. Let's suppose that my outline in section 1 is accurate. If not, then the rest of my message might be moot.
Suppose we have a freshly initialized decoder object D for either the AAC or AC3 codec, and packets P[0], P[1], ..., P[999]. Assuming that the decoder state matters a lot, I'd like to consider 3 orders of passing the packets to D.
Order 1: The original packet order. P[0], P[1], ..., P[999]
Order 2: We remove P[0] completely. P[1], P[2], ..., P[999]
Order 3: We replace P[0] with an arbitrary packet, P_new (e.g. P_new = P[1], but P_new could be any packet, including one not in the list). P_new, P[1], ..., P[999]
In order 1, suppose that the output arrays are Y[0], Y[1], ..., Y[999]
In order 2, since the state may matter, we can't assume that the first output array is Y[1]. Instead, we use different symbols: Y2[1], Y2[2], ..., Y2[999] (indexed from 1, so this output list has 999 elements).
In order 3, suppose that the output arrays are Y3[0], Y3[1], ..., Y3[999]. (1000 elements).
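To make the notation concrete, here is a minimal sketch of the three orders and their output lists. The `ToyDecoder` class is purely hypothetical: it is not AAC, AC3, or anything from PyAV, and its only state is one value carried over from the previous packet. The packet contents and the 4-sample packet size are also made up for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Packet = List[float]


class ToyDecoder:
    """Hypothetical stand-in for D: output depends on the packet and one carried value."""

    def __init__(self) -> None:
        self.carry = 0.0  # the entire internal state of this toy

    def decode(self, packet: Packet) -> List[float]:
        out = [x + self.carry for x in packet]
        self.carry = packet[-1]  # state is fully overwritten by each packet
        return out


def decode_in_order(indexed_packets: Sequence[Tuple[int, Packet]]) -> Dict[int, List[float]]:
    """Feed (index, packet) pairs to a fresh decoder; key each output by packet index."""
    decoder = ToyDecoder()
    return {i: decoder.decode(p) for i, p in indexed_packets}


# 1000 made-up packets of 4 samples each (the question assumes 1024 per packet).
P = [[float(i + k) for k in range(4)] for i in range(1000)]
P_new = [0.0, 0.0, 0.0, 0.0]  # arbitrary replacement packet

order1 = list(enumerate(P))           # P[0], P[1], ..., P[999]
order2 = order1[1:]                   # P[1], P[2], ..., P[999]
order3 = [(0, P_new)] + order1[1:]    # P_new, P[1], ..., P[999]

Y = decode_in_order(order1)    # keys 0..999
Y2 = decode_in_order(order2)   # keys 1..999
Y3 = decode_in_order(order3)   # keys 0..999
```

In this toy, Y[1] differs from Y2[1] and Y3[1], and all three agree from index 2 onward. That is a property of the toy's one-packet memory only; it says nothing about how long real AAC or AC3 decoder state persists.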
My main questions are: Is the state of D flushed fairly quickly, or is it so persistent that any sequence 'mutation' will significantly change it, or is it somewhere in between? Although the lists Y, Y2, and Y3 are clearly similar waveforms perceptually, are they completely different at a low level, or do they converge?
If, hypothetically, the state of D is flushed after 50 packets, would Y[n], Y2[n], and Y3[n] be approximately equal length-1024 float arrays for n >= 51? Is there any such value of n? Or maybe the state of D also depends on how many packets have been decoded, and is otherwise flushed after 50 packets? If so, is Y[n] ~ Y3[n] for n >= 51 but Y[n] != Y2[n] for any large n, because the decoder processed n packets before outputting Y[n] but only n-1 packets before Y2[n]?
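The question "is there any such value of n" can be phrased as a small check on two output lists. The sketch below is only a measurement helper, assuming each output list is stored as a dict from packet index to a list of floats; the function name `first_stable_index` and the tolerance are illustrative choices, and the sample data is made up rather than taken from a real decoder.

```python
import math
from typing import Dict, List, Optional


def first_stable_index(
    a: Dict[int, List[float]],
    b: Dict[int, List[float]],
    tol: float = 1e-6,
) -> Optional[int]:
    """Return the smallest n such that a[m] ~ b[m] for every shared index m >= n.

    Returns None if even the last shared outputs differ (no convergence).
    """
    shared = sorted(set(a) & set(b))
    stable_from = None
    # Walk backwards from the last shared index until the first mismatch.
    for m in reversed(shared):
        same = len(a[m]) == len(b[m]) and all(
            math.isclose(x, y, rel_tol=0.0, abs_tol=tol)
            for x, y in zip(a[m], b[m])
        )
        if not same:
            break
        stable_from = m
    return stable_from


# Made-up example: the lists differ at index 1 and agree from index 2 onward.
Y_demo = {0: [1.0, 2.0], 1: [3.0, 4.0], 2: [5.0, 6.0], 3: [7.0, 8.0]}
Y2_demo = {1: [3.5, 4.0], 2: [5.0, 6.0], 3: [7.0, 8.0]}
print(first_stable_index(Y_demo, Y2_demo))  # 2
```

Comparing by packet index assumes each output array lines up with the packet that produced it. If a decoder delays its output by some number of packets, the indices would need to be shifted before comparing.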
Note that I have experimented with PyAV, and I suspect that for the AC3 codec and a deletion mutation, there is no such value of n: the decoder states will always be different. I do not know about a substitution mutation or the AAC codec, or whether I am doing my PyAV analysis correctly. I have only experimented with PyAV since I am not used to using C.
Sincerely,
Bobby