New launch method for zip archives. #291

Open: 1 commit proposed for merging into master

saschaklick

"progressiveZipExtraction (PZE)" extracts only the file list at first and extracts only the selected files afterwards.
Might replace "makeLocalCopy". Much faster and less data transmitted especially on merged romset archives. For zip only; not 7z.

It works nicely on Linux. It is meant for low-storage devices that pull their files from slow storage, such as a Fire TV accessing merged archives on an SMB share. Getting the list of files in an archive is lightning fast with PZE. During extraction, a dialog displays the progress.

Behind the scenes it works like this: only the required parts of the zip are downloaded to temporary local storage and repaired to appear like a zip containing just that one (or more if requested) file. Python's zipfile will then process the local temporary archive.
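The "repair" step described above can be sketched roughly as follows. This is not the code from the PR, only a minimal illustration under stated assumptions: `read_at(offset, length)` is a hypothetical ranged reader standing in for whatever network access is available, and the archive is assumed to have no archive comment, no zip64 records and no encryption.

```python
import io
import struct
import zipfile

# Fixed-size parts of the three zip records involved (little-endian).
EOCD = struct.Struct("<4s4H2LH")      # end of central directory, 22 bytes
CDIR = struct.Struct("<4s6H3L5H2L")   # central directory entry, 46 bytes
LOCAL = struct.Struct("<4s5H3L2H")    # local file header, 30 bytes


def fetch_member_as_zip(read_at, total_size, wanted):
    """Return the bytes of a small zip that holds only `wanted`.

    `read_at(offset, length) -> bytes` is a hypothetical ranged reader.
    Assumes no archive comment, no zip64 and no encryption.
    """
    tail = read_at(total_size - EOCD.size, EOCD.size)
    sig, _, _, _, count, cd_size, cd_offset, _ = EOCD.unpack(tail)
    if sig != b"PK\x05\x06":
        raise ValueError("unsupported archive layout")
    cdir = read_at(cd_offset, cd_size)  # the whole file list

    pos = 0
    for _ in range(count):
        fields = list(CDIR.unpack_from(cdir, pos))
        flags, comp_size = fields[3], fields[8]
        name_len, extra_len, comment_len = fields[10:13]
        entry_end = pos + CDIR.size + name_len + extra_len + comment_len
        raw_name = cdir[pos + CDIR.size:pos + CDIR.size + name_len]
        name = raw_name.decode("utf-8" if flags & 0x800 else "cp437")
        if name == wanted:
            header_offset = fields[16]
            local = read_at(header_offset, LOCAL.size)
            local_fields = LOCAL.unpack(local)
            # Local name + local extra field + compressed data.
            body_len = local_fields[9] + local_fields[10] + comp_size
            member = local + read_at(header_offset + LOCAL.size, body_len)
            # "Repair": the member now sits at offset 0 of the new archive.
            fields[16] = 0
            entry = CDIR.pack(*fields) + cdir[pos + CDIR.size:entry_end]
            eocd = EOCD.pack(b"PK\x05\x06", 0, 0, 1, 1,
                             len(entry), len(member), 0)
            return member + entry + eocd
        pos = entry_end
    raise KeyError(wanted)


# Demo: an in-memory archive stands in for the remote file.
source = io.BytesIO()
with zipfile.ZipFile(source, "w", zipfile.ZIP_DEFLATED) as zf:
    for i in range(5):
        zf.writestr("game_v%d.rom" % i, bytes([i]) * 10000)
blob = source.getvalue()

mini = fetch_member_as_zip(lambda off, n: blob[off:off + n],
                           len(blob), "game_v3.rom")
with zipfile.ZipFile(io.BytesIO(mini)) as zf:
    print(zf.namelist(), len(zf.read("game_v3.rom")))
```

The result is an ordinary single-member zip, so the stock zipfile library can open it without knowing that most of the original archive was never transferred.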

It's basically "makeLocalCopy 2.0", but only for zip files. 7z would not profit from this approach, so makeLocalCopy is still the best choice for 7z archives.

@bruny
Collaborator

bruny commented Sep 10, 2017

I'm not a really big fan of having specific code for one type of zip archive but not another. I'm also not really sure what the problem is that this PR addresses - could you explain a bit more?

@saschaklick
Author

Here are the short answers to your questions/concerns. I have provided a long, long report of my struggles further below.

> I'm not a really big fan of having specific code for one type of zip archive but not another.

It only works on zip (as in PKZIP) files because of how they are structured; 7z, the only other archive type RCB supports, is laid out differently. If you do not want to or cannot use 7z, as in my case, zip is the only choice, and progressiveZipExtraction improves its performance with non-local files.

> I'm also not really sure what the problem is that this PR addresses

Downloading a merged ROM zip archive (up to 200MB for N64 fullsets) from a centralized network share or cloud storage took far too long for a pleasant user experience. progressiveZipExtraction cuts that time down considerably. It only reads the parts of the zip archive that are really required: the filelist and then, after the user chooses a file, the part of the zip that contains the chosen file. This makes loading from a merged zip archive as fast as loading from an unmerged romset, the latter not being what most people seem to be using these days.


Here is the long version.

I have developed this feature for my specific setup which is a FireTV 1. So Android, unrooted. py7zip library is not installed, meaning I cannot use 7zip. zipfile library works, but only on files stored locally. That's why I provided the "Make local copy" feature some time ago. Problem was, it copies the whole zip file. It works OK on the small NES archives, already iffy on merged SNES sets (FF6 gets over 120MB with all translations and patches), unacceptably slow on merged N64. Downloading that takes minutes on my weak Wifi, which is about as fast as cloud storage on a mobile. Then extract the filelist using the zipfile library, then extract the specific rom with zipfile again, then pass the extracted ROM to the emu. I had to download the whole zip before I could pass it to the zipfile lib.

Zip compression is bad for merged ROM sets because it stores each file individually, one after the other, then puts the filelist at the end. If one fullset copy of FF6 compresses down to ~2MB and I have 60 different translations/hacks/mods/betas, they all add up to 120MB in a single zip. Better than uncompressed, much worse than 7zip, which just stores the changes for each file. I think it's below 10MB for a full FF6 in 7zip.
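The size effect is easy to reproduce with synthetic data. In this small sketch, random bytes are an assumed stand-in for one game image, and a single LZMA stream over all variants is only a rough approximation of what a solid 7z archive does; the real ratios for real ROMs will differ.

```python
import io
import lzma
import random
import zipfile


def zip_size(files):
    """Size in bytes of a deflate-compressed zip holding `files` (name -> bytes)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return len(buf.getvalue())


# Random bytes stand in for one game image (synthetic, not a real ROM).
rng = random.Random(0)
base = bytes(rng.getrandbits(8) for _ in range(50000))

# Ten "hacks" that each differ from the base image in a single byte.
variants = {}
for i in range(10):
    patched = bytearray(base)
    patched[i * 1000] ^= 0xFF
    variants["game_v%d.rom" % i] = bytes(patched)

single_zip = zip_size({"game_v0.rom": variants["game_v0.rom"]})
merged_zip = zip_size(variants)
# One compression stream across all variants, so repeated data is found.
solid = len(lzma.compress(b"".join(variants.values())))

print("one variant zipped: ", single_zip)
print("ten variants zipped:", merged_zip)
print("ten variants, solid:", solid)
```

Because zip compresses every member on its own, the merged zip grows almost linearly with the number of variants, while the single-stream result stays close to the size of one variant.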

I wanted to at least be able to get everything up to the N64 to work nicely with my setup. So in came progressiveZipExtraction. I only download the filelist at the end of the zip, which is about 1-2 KB, show the filelist, pick the file I want and then only download the part of the zip that contains that particular ROM. I never download more than the filelist plus the compressed size of the desired file (~2MB in my FF6 example) before the game is ready to start.
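Fetching only the filelist could look roughly like the sketch below. It is an illustration rather than the PR's code: `read_at(offset, length)` is again a hypothetical ranged reader, `tail_guess` is an arbitrary starting size, and zip64 archives are not handled.

```python
import io
import struct
import zipfile

EOCD = struct.Struct("<4s4H2LH")      # end of central directory, 22 bytes
CDIR = struct.Struct("<4s6H3L5H2L")   # central directory entry, 46 bytes


def list_members(read_at, total_size, tail_guess=4096):
    """Return (name, compressed_size, header_offset) for every member.

    Only the tail of the archive is fetched, plus the central directory
    if it does not already fit inside that tail.
    """
    tail_len = min(total_size, tail_guess)
    tail_start = total_size - tail_len
    tail = read_at(tail_start, tail_len)
    at = tail.rfind(b"PK\x05\x06")
    if at < 0:
        raise ValueError("end-of-central-directory record not in fetched tail")
    _, _, _, _, count, cd_size, cd_offset, _ = EOCD.unpack_from(tail, at)

    if cd_offset >= tail_start:
        # The file list is already part of what was downloaded.
        start = cd_offset - tail_start
        cdir = tail[start:start + cd_size]
    else:
        cdir = read_at(cd_offset, cd_size)

    members, pos = [], 0
    for _ in range(count):
        fields = CDIR.unpack_from(cdir, pos)
        name_len, extra_len, comment_len = fields[10], fields[11], fields[12]
        raw_name = cdir[pos + CDIR.size:pos + CDIR.size + name_len]
        name = raw_name.decode("utf-8" if fields[3] & 0x800 else "cp437")
        members.append((name, fields[8], fields[16]))
        pos += CDIR.size + name_len + extra_len + comment_len
    return members


# Demo: an in-memory archive stands in for the remote file.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
    for i in range(3):
        zf.writestr("translation_%d.rom" % i, bytes([65 + i]) * 20000)
raw = archive.getvalue()

for name, size, offset in list_members(lambda off, n: raw[off:off + n], len(raw)):
    print(name, size, offset)
```

The returned offsets and compressed sizes are what a later step needs in order to request just the byte range of the chosen ROM.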

On my slow Wifi that really makes a difference. The filelist pops up basically immediately. The ROM is loaded in 10 seconds. With makeLocalCopy I had to wait a mind-numbing 4 minutes before the filelist showed up.

As described above, this can and will only work on zip files because of their specific structure. It only makes sense for merged ROM sets that need the makeLocalCopy feature to work, which quite a few people seem to use after all.

For me, using progressiveZipExtraction on a zipped FF6 would load faster than makeLocalCopy on a 7zipped FF6, because with 7zip the whole archive still has to be downloaded. That takes ~20 seconds, and then of course it fails to extract the filelist because I have no access to the py7zip lib on my FireTV. With progressiveZipExtraction FF6 runs in less than 5 seconds.

It might even be faster on local files, depending on how efficiently the zipfile library handles local file filelist extraction.

@saschaklick
Author

I forgot to mention that - just like the makeLocalCopy-feature - progressiveZipExtraction is optional and disabled by default on all ROM collections, so including it will not break working installs. It is accessible via the ROM collection's config dialog.

@bruny
Collaborator

bruny commented Sep 11, 2017

OK, so let me confirm what I understand your use case to be.

You have merged sets, i.e. multiple "versions" of a rom are in a single zip file. If I understand, similar to MAME where all clones are in the same zip file as the parent.

These are on network storage, and when trying to select a romset from the zip file to launch, for whatever reason zip is slow reading the table of contents (TOC) over the network for large files. Which is a problem for you because the collection is set to "Extract Zip Files" so that you can pick the specific romset in the merged romset to use.

Is that correct?

@bruny
Collaborator

bruny commented Sep 11, 2017

Also, are you using SMB or NFS for the network?

@saschaklick
Author

Yes, that is correct. "Extract Zip Files" was a very important feature on Android (at least in 2014), since many emus did not support zip natively. It is "Extract Zip Files" that makes RCB extract and display the filelist instead of sending the whole archive to the emu.

I am using FTP, but xbmcvfs would handle SMB and NFS as well. I used SMB before; it was too slow and cumbersome. I am not so much concerned with MAME roms as with console roms, where one file equals a complete game. Still, I made sure that progressiveZipExtraction can handle a request to extract multiple files from one archive, just like "Extract Zip Files" does, which is important for MAME roms, I believe.

The reading of the TOC was slow, because the whole zip file was transferred before RCB even started extracting the TOC via the zipfile lib. Transferring 100MB of data to extract a 3MB file was inefficient. I fixed that for me with progressiveZipExtraction.

@bruny
Collaborator

bruny commented Sep 11, 2017

Your romsets are on an ftp share? What's the performance like if they are on an nfs mountpoint?

I don't have any issues getting a file listing of a compressed zip file of >200MB on an NFS mount point, either using Python code or unzip -l.

Are you able to run unzip -l directly, without Kodi or RCB?

I haven't been able to find anything documenting that Python's zipfile module copies the whole file across the network before it can read the TOC. If it were a common problem I'd expect to find some known workarounds. Do you have any docs or references? I suspect it's more likely to be related to having your romsets on an FTP share.

I'll be honest and state that I'm reluctant to introduce code that:
a. Adds MORE config settings
b. Adds low level file system workarounds for stuff that would be better off handled by kodi or zip library

I do agree that we don't have to unzip the entire archive if you just want a single file.

@saschaklick
Author

saschaklick commented Sep 13, 2017

> Are you able to run unzip -l directly, without Kodi or RCB?

On FireTV one can do nothing but the most basic stuff on the command line. It's a totally locked-down Linux. That is why Kodi is so nice: it does all the heavy lifting when it comes to network file access. An unrooted FireTV (or another similarly locked-down OS) cannot mount any kind of network share.

Your NFS is already fast because, from Kodi's and zipfile's perspective, the file is local. The OS does the fetching, reading and seeking. On a locked-down Android, Kodi can do what the OS does not. A FireTV has no native FTP, NFS or SMB mounting, or any mounting support at all for that matter. But Kodi can access network shares with relative ease, just not as conveniently and not as transparently to the libraries as a real mount point would.

It just breaks when passing the Kodi file path to other Python libs, which do not know how to handle Kodi file paths, and why would they? xbmcvfs is a Kodi feature, not a general Python feature.
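To make the path-versus-library point concrete: zipfile accepts a seekable file-like object as well as a path, so one conceivable bridge is a thin adapter around ranged reads. The sketch below is an assumption-laden illustration, not something from the PR. `RangeFile` and `read_at` are hypothetical names, and the Kodi-side implementation of `read_at` is deliberately left out because it is untested here.

```python
import io
import zipfile


class RangeFile(io.RawIOBase):
    """Read-only, seekable file object backed by a ranged reader.

    `read_at(offset, length) -> bytes` is a hypothetical callable; in Kodi
    it would presumably wrap the add-on's virtual file access.
    """

    def __init__(self, read_at, size):
        super().__init__()
        self._read_at = read_at
        self._size = size
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_CUR:
            offset += self._pos
        elif whence == io.SEEK_END:
            offset += self._size
        self._pos = max(0, offset)
        return self._pos

    def readinto(self, buf):
        # Never ask the backend for bytes past the end of the file.
        want = min(len(buf), max(0, self._size - self._pos))
        data = self._read_at(self._pos, want) if want else b""
        buf[:len(data)] = data
        self._pos += len(data)
        return len(data)


# Demo: count how many bytes zipfile actually requests.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
    for i in range(4):
        zf.writestr("game_v%d.rom" % i, bytes([i]) * 50000)
raw = archive.getvalue()
requested = []


def read_at(offset, length):
    requested.append(length)
    return raw[offset:offset + length]


with zipfile.ZipFile(RangeFile(read_at, len(raw))) as zf:
    names = zf.namelist()
    first = zf.read(names[0])
print(names, len(first), sum(requested), len(raw))
```

With such an adapter zipfile itself decides which byte ranges to read, which on a mounted share is exactly what the OS already provides. Whether Kodi's file access performs well enough with many small seeks over FTP or SMB is a separate question that this sketch does not answer.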

This feature does not cover the use cases of Windows, Mac or Linux users. They have better options than routing everything through xbmcvfs when dealing with ROMs on network shares. Some users have no other option, though. Many of those Android HDMI stick users do not. People who sideload Kodi onto locked-down Android TVs do not. FireTV 1 users do not. Even FireTV 2 users do not, if they don't want to go through the hassle of rooting. Hey, maybe even Windows and Mac users might find Kodi's network share support more convenient than their OS's native methods. They would profit too.

It's for those users, I myself being one of them. I think it would make RCB better. It would not break anything. It is completely optional. It is about as cryptic as most of the other collection options and launch parameters. It makes very little sense in the main Kodi API unless we fork and rewrite zipfile completely to use xbmcvfs instead of the vanilla Python file access/stream libs. Is that something that is encouraged by the Kodi development guidelines?
