Separation of functionality in pyradiomics and pyradiomicsbatch command line tools #203

fedorov · 2017-02-16T17:59:18Z

@Radiomics/developers: is there a good reason to have these two, as opposed to having a single pyradiomics command line tool that would support operation on both a single input and a directory?

Among other things, having a single processing script might help make things more straightforward with Docker deployment.

JoostJM · 2017-02-17T14:35:31Z

@fedorov, I made 2 scripts at first because they represent two types of pyradiomics usage, with the difference mainly being in how the input and output are provided. I think it shouldn't be much of a problem to create 1 script. The main issue would be how to handle the fact that pyradiomics requires two separate input files (representing one combination), whereas pyradiomicsbatch requires only 1 file (a csv file listing the combinations).

fedorov · 2017-02-17T14:44:42Z

We could just have different command line flags, but perhaps you want to keep things simple by allowing usage when no options are needed. Alternatively, we could make the script automatically detect whether the input is a directory or a file. I don't have a strong preference, I just wanted to discuss this.

JoostJM · 2017-02-17T14:52:50Z

@fedorov, currently input files are positional arguments, with the output file optional (and parameterized) in pyradiomics, but positional and required in pyradiomicsbatch. As to automatically detecting folders, this would not work, as the input for the batch is also a file, not a directory. We could check for extensions, with the extension .csv pointing to batch processing and anything else to single image processing.
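The extension check suggested above could be sketched as follows. This is a minimal illustration, not code from pyradiomics: the helper name `detect_mode` and the example file names are assumptions made for this sketch.

```python
import os


def detect_mode(input_path):
    """Return 'batch' for a .csv input, otherwise 'single' (hypothetical helper)."""
    # Compare the extension case-insensitively so 'CASES.CSV' also counts as batch.
    _, ext = os.path.splitext(input_path)
    return "batch" if ext.lower() == ".csv" else "single"


# Hypothetical file names, used only to show the two outcomes.
for name in ("cases.csv", "image.nrrd"):
    print(name, "->", detect_mode(name))
```

Note that this only inspects the file name. It does not verify that a .csv file actually contains valid image and mask paths.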

fedorov · 2017-02-22T15:47:43Z

As discussed at the meeting, this is postponed for further discussion and for another release.

fedorov · 2017-02-22T15:49:43Z

Todo for myself - rebase after v1.1.0 is out

alannavial · 2017-02-28T05:09:03Z

Could you please provide an example input file for the pyradiomicsbatch command? The formatting for the batch file is confusing. It makes sense that you need to provide a path to the image and mask along with a patient ID to identify the separate files. However, I don't understand what you are supposed to put for 2) sequence name (image identifier) and 3) reader (segmentation identifier).

"The input file for batch processing is a CSV file where each row represents one combination of an image and a segmentation and contains 5 elements: 1) patient ID, 2) sequence name (image identifier), 3) reader (segmentation identifier), 4) path/to/image, 5) path/to/mask."

JoostJM · 2017-02-28T10:45:37Z

@alannavial, this is because a patient ID alone is not enough. A patient can have multiple sequences (images, e.g. multiparametric MRI, multiple phase CT) and each image can have multiple segmentations (different structures, different readers). To ensure each extraction has a unique identifier, the identifier is composed of patient-sequence-segmentation. These are the first three elements of each line and are not 'active' fields; they are just copied to the output. Only elements 4 and 5 (image and label location) are used for the extraction. If you don't have separate readers or sequences, you can fill in anything you want (I usually use "N/A" in these cases). However, the code expects 5 elements, so for the moment, ensure that you don't omit the sequence and reader.
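As an illustration of the five-element format described above, here is a small sketch with made-up rows and a strict reader. The patient IDs, sequence names, reader names, paths, and the function `read_cases` are all hypothetical and are not taken from pyradiomics.

```python
import csv
import io

# Hypothetical rows: patient ID, sequence, reader, path to image, path to mask.
EXAMPLE_CSV = """\
patient01,T1,reader1,path/to/patient01_T1.nrrd,path/to/patient01_T1_reader1.nrrd
patient01,T2,reader1,path/to/patient01_T2.nrrd,path/to/patient01_T2_reader1.nrrd
patient02,N/A,N/A,path/to/patient02.nrrd,path/to/patient02_mask.nrrd
"""


def read_cases(stream):
    """Parse rows of exactly 5 elements and build a patient-sequence-reader ID."""
    cases = []
    for row_number, row in enumerate(csv.reader(stream), start=1):
        if len(row) != 5:
            raise ValueError(
                "row %d has %d elements, expected 5" % (row_number, len(row))
            )
        patient, sequence, reader, image, mask = row
        # The first three fields only identify the case; they are not used
        # for the extraction itself.
        case_id = "-".join([patient, sequence, reader])
        cases.append({"id": case_id, "image": image, "mask": mask})
    return cases


cases = read_cases(io.StringIO(EXAMPLE_CSV))
for case in cases:
    print(case["id"], case["image"], case["mask"])
```

The third row shows the "N/A" placeholder for a case without separate sequences or readers.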

I will update the batch processing to be more flexible (i.e. copy every line, and use the last two fields as image and mask location). This will be part of a new release.
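The more flexible behaviour mentioned here (copy every field, treat the last two as image and mask location) might look roughly like the sketch below. It is an assumption about how such a reader could work, not the implementation that was later released; `read_flexible_cases` is a hypothetical name.

```python
import csv
import io


def read_flexible_cases(stream):
    """Yield (copied_fields, image, mask) for rows with two or more elements."""
    for row_number, row in enumerate(csv.reader(stream), start=1):
        if not row:
            continue  # ignore blank lines
        if len(row) < 2:
            raise ValueError("row %d needs at least an image and a mask" % row_number)
        # Everything before the last two fields is copied to the output as-is.
        *copied, image, mask = row
        yield copied, image, mask


# Hypothetical input with a different number of identifier fields per row.
example = io.StringIO(
    "p1,img1.nrrd,mask1.nrrd\n"
    "p2,T1,reader1,img2.nrrd,mask2.nrrd\n"
)
rows = list(read_flexible_cases(example))
```

With this approach the number of identifier columns no longer matters, so rows with only a patient ID and rows with several identifiers can share one file.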

alannavial · 2017-02-28T21:28:12Z

Hi @JoostJM, thank you for clearing that up. My main confusion was what you meant by the terms reader and sequence. I think it would be best to provide examples of what would go in these fields to make it clearer. As it is, I'm still unsure what would go in the reader (segmentation identifier) field.

Also as an additional question, have you looked into adapting your toolbox to read DICOM-RT file formats? Most institutions seem to be using DICOM-RT more commonly now.

JoostJM · 2017-02-28T21:35:02Z

@alannavial, simplest example for reader: the filename of the mask.

As to your additional question, we are looking into this, but there is no support at the moment. It is possible to build an extension for 3D Slicer which enables use of pyradiomics via the Slicer interface. This could potentially be combined with other Slicer modules which can read DICOM-RT. This has not been tested yet, however.

JoostJM · 2018-02-14T17:02:04Z

@fedorov As previously discussed, I'm going to take a second look at the commandline scripts with the goal of having 1 entry point with different subcommands (like the git commandline tools).
Currently I'm thinking about the following. Do you have any additions/comments?

pyradiomics: General entry point, only provides some information on the possible tools, maybe an XML description for use in SlicerCLIs?
pyradiomics single: Extract features for 1 set of image + segmentation
pyradiomics batch: Extract features for multiple sets of image + segmentation (supplied in .csv input; "batch mode")
pyradiomics voxel: (after merging ADD: Add voxel-based extraction #337) Extract voxel-based parameter maps for 1 set of image + segmentation
pyradiomics voxel-batch: (after merging ADD: Add voxel-based extraction #337) Extract voxel-based parameter maps in batch mode
pyradiomics model: (after merging Add PyRadiomics Models #338) Apply a model to 1 set of image + segmentation
pyradiomics model-batch: (after merging Add PyRadiomics Models #338) Apply a model in batch mode
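A git-style entry point with subcommands can be sketched with argparse subparsers. The snippet below only illustrates the idea for the `single` and `batch` subcommands from the list above. The argument names (`image`, `mask`, `input_csv`) and the function `build_parser` are assumptions for this sketch and do not describe the actual pyradiomics command line.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per operating mode (illustrative only)."""
    parser = argparse.ArgumentParser(prog="pyradiomics")
    subparsers = parser.add_subparsers(dest="command")

    # 'single' takes the image and segmentation directly.
    single = subparsers.add_parser(
        "single", help="extract features for 1 set of image + segmentation")
    single.add_argument("image")
    single.add_argument("mask")

    # 'batch' takes one csv file that lists the cases.
    batch = subparsers.add_parser(
        "batch", help="extract features for cases listed in a csv file")
    batch.add_argument("input_csv")
    return parser


parser = build_parser()
args = parser.parse_args(["single", "image.nrrd", "mask.nrrd"])
if args.command is None:
    # General entry point: only show information on the available tools.
    parser.print_help()
```

Each subparser carries its own arguments, so the differing input requirements of single and batch mode do not have to share one set of flags.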

fedorov · 2018-02-19T21:23:39Z

Few ideas to consider/discuss:

instead of using "single/batch/model", maybe better to use "--operation [single|batch|model]"? This would also match better the capabilities of the Slicer CLI XML
how about instead of adding the "voxel" mode, assume that a voxel map is requested when the label is not specified, and log warnings for those features that require the presence of the label?
I really like the idea of merging pyradiomics and pyradiomicsbatch, since this will reduce duplication and simplify maintenance

JoostJM · 2018-02-20T09:43:20Z

> instead of using "single/batch/model", maybe better to use "--operation [single|batch|model]"? This would also match better the capabilities of the Slicer CLI XML

The main issue I see here is that --operation identifies it as an optional argument, whereas it's arguably the most important one. Moreover, the different operating modes also have different command line requirements. For example, in single mode you specify the image and mask directly, but in batch mode you specify a csv file that lists the respective cases.

> how about instead of adding the "voxel" mode, assume that a voxel map is requested when the label is not specified, and log warnings for those features that require the presence of the label?

A labelmap is also required in voxel-based extraction (for now; later we can make it optional). There are two reasons for this: 1) especially in large images, you don't want to perform a voxel-based extraction on the whole image, as this is much too computationally intensive, and 2) in voxel-based extraction, it is possible to mask the kernel with the ROI, ensuring the features are still only calculated on the ROI intensities.

> I really like the idea of merging pyradiomics and pyradiomicsbatch, since this will reduce duplication and simplify maintenance

+1

JoostJM · 2018-02-20T09:45:40Z

Alternatively, I think I can tweak it a bit to have similar command lines for single and batch: I can make the labelmap argument optional and 'detect' whether to operate in batch mode by checking if the Image argument (which I will rename to "Input") is a csv file.

In that case we can also use the --operation argument you suggested: if omitted, extraction is segment-based (segment), and the total list of options would be: --operation=[segment|voxel|model]
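The alternative outlined in the two paragraphs above (one "Input" argument, an optional labelmap, batch mode detected from a .csv input, and an --operation option defaulting to segment) could be sketched as below. This is a hedged illustration under those assumptions only. The names `build_unified_parser` and `resolve` are hypothetical, and this is not the command line that was eventually implemented.

```python
import argparse


def build_unified_parser():
    """One entry point; the mask is optional so a csv input can stand alone."""
    parser = argparse.ArgumentParser(prog="pyradiomics")
    parser.add_argument("input", help="image file, or a .csv file listing cases")
    parser.add_argument("mask", nargs="?", default=None,
                        help="labelmap; only needed when input is an image")
    parser.add_argument("--operation", choices=["segment", "voxel", "model"],
                        default="segment")
    return parser


def resolve(argv):
    """Return (operation, is_batch) for a list of command line arguments."""
    parser = build_unified_parser()
    args = parser.parse_args(argv)
    # Batch mode is inferred from the extension of the input argument.
    is_batch = args.input.lower().endswith(".csv")
    if not is_batch and args.mask is None:
        parser.error("a mask is required when the input is not a .csv file")
    return args.operation, is_batch


print(resolve(["cases.csv"]))
print(resolve(["image.nrrd", "mask.nrrd", "--operation=voxel"]))
```

One consequence of this design is that the "mask is required" rule can no longer be expressed in the parser itself and has to be checked after parsing.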

fedorov · 2018-02-20T15:52:51Z

Indeed, I didn't think about those points you raised. I agree with your points.

I suggest we should not optimize the command line parameters to deal with the limitations of the Slicer CLI. The main goal should be to simplify the process for the users. Maybe it is indeed better to have separate command line tools rather than overloading one and making it too complicated to understand. Good discussion topic for tomorrow's call.

JoostJM · 2018-03-13T09:18:54Z

Fixed by #347

fedorov added the question label Feb 16, 2017

JoostJM added this to the PyRadiomics 2.0 Release milestone Feb 20, 2018

JoostJM closed this as completed Mar 13, 2018
