Implement concat command based on merge command #1010
This proposal seems specific to your use-case. MCAP files don't define a clear start and end of recording time, they only record the time of the first message and the time of the last message. Therefore, when joining start-to-end like this, you have to introduce some "gap" (you choose 100ms here) so as not to have a discontinuity in the message rate. That gap is always going to be an approximation, and the correct value depends entirely on how the MCAPs were recorded. I could imagine decomposing this concept into |
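To make the gap issue concrete, here is a minimal sketch of start-to-end joining with rewritten timestamps. It is not the code from this PR: messages are modelled as hypothetical `(log_time, payload)` tuples rather than real MCAP records, `concat_rebased` is an illustrative name, and the 100 ms default is simply the value mentioned above. Timestamps are assumed to be in nanoseconds, as MCAP log times are.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical stand-in for an MCAP message: (log_time in nanoseconds, payload).
Message = Tuple[int, bytes]

GAP_NS = 100_000_000  # 100 ms, the approximate gap discussed above


def concat_rebased(
    recordings: Iterable[List[Message]], gap_ns: int = GAP_NS
) -> Iterator[Message]:
    """Yield messages from each recording in turn, rewriting log times so the
    output starts at 0 and each recording begins gap_ns after the previous
    recording's last message."""
    next_start = 0
    for messages in recordings:
        if not messages:
            continue  # an empty recording contributes nothing and no gap
        first = min(t for t, _ in messages)
        last = max(t for t, _ in messages)
        offset = next_start - first  # shift so the first message lands on next_start
        for log_time, payload in sorted(messages, key=lambda m: m[0]):
            yield (log_time + offset, payload)
        # The file only tells us when its last message was logged, not when
        # recording stopped, so the gap is necessarily a guess.
        next_start = (last + offset) + gap_ns
```

The only tunable is `gap_ns`, which is exactly the value that cannot be derived from the files themselves.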
@james-rms Our use-case is to combine multiple captures together that have been recorded at different times in a way that we can easily run through our playback infrastructure and get a single output that covers a lot of different scenarios. Your suggestion of a |
@kevswims are you able to go into more detail on what other use-cases you think this would be useful for?
The use cases we have identified are:
To create this file with this code, we have already written scripts that take a CSV file listing the source files and the start and end timestamps we want from each, then cut them up and combine them. Think of this like making a video: we go out and record lots of data, most of which is duplicate or useless, but we want to edit that down and combine multiple clips into one thing that tells the story. I could also see a tool like this integrating with the Foxglove Data Platform to allow anyone to easily trim and combine mcap files to create these datasets.
@james-rms did the use-cases I provided make sense for this?
@kevswims They make sense, but we haven't seen anyone else ask for this, or work their own solutions out for this problem. I think until we see more interest from the community on this, it makes the most sense for this patch to remain in your fork.
I think concatenation has good use cases, particularly if you can concatenate without decompressing chunks, which will require some more tricks than this but should be totally feasible. The use case I have in mind is more around compacting small files into larger ones for better/more consistent performance in cloud storage. For me it would suffice to have a smarter "merge" command that looks at the indexes and determines a plan of "concat" and "merge" operations that minimizes the number of file handles/chunks open at one time. If you supply it non-overlapping data, it will perform a pure concatenation. Otherwise it will concatenate where it can.
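One way to picture such a plan is the sketch below. It is an illustration under assumptions, not an existing mcap CLI feature: `FileRange` and `plan_groups` are hypothetical names, and each file is reduced to the first and last message times that its index would provide. Files whose time ranges overlap are grouped for a true merge; the groups themselves can then be concatenated in order.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileRange:
    """Hypothetical summary of one input file, taken from its index."""
    name: str
    start: int  # time of first message, nanoseconds
    end: int    # time of last message, nanoseconds


def plan_groups(files: List[FileRange]) -> List[List[str]]:
    """Return groups of file names in output order.

    Files inside a group have overlapping (or touching) time ranges and need
    an interleaving merge; consecutive groups do not overlap and can simply be
    concatenated.
    """
    groups: List[List[str]] = []
    group_end = None  # latest end time seen in the current group
    for f in sorted(files, key=lambda f: (f.start, f.end)):
        if group_end is not None and f.start <= group_end:
            # Overlaps the current group: it must be merged with it.
            groups[-1].append(f.name)
            group_end = max(group_end, f.end)
        else:
            # Starts after everything so far: begin a new concat segment.
            groups.append([f.name])
            group_end = f.end
    return groups
```

With non-overlapping inputs every group has a single file, which corresponds to a pure concatenation. The size of the largest group is an upper bound on how many inputs need to be open at once in this simplified model.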
I'm gonna close this out as not something we are pursuing right now. The best approach today for custom workflows like changing the timestamps in mcap files is to use one of the available libraries to read the files and produce the MCAP file suited to your needs.
Public-Facing Changes
Adds a `concat` command to the mcap CLI that combines files with the timestamps rewritten sequentially starting from 0.
Description
We are using this along with the `filter` command to take small chunks from a lot of files and combine them into one file that we can run simulations on.