Segmentation

Some segmentation models (random_forest and unet) require a pre-trained model as a function argument.

segmentation module:

Module for blood vessel and tumor boundary segmentation.

This module consists of four main functions:

random_forest
segmentation_wrapper
thresholding
unet

`random_forest(data_path, output_directory, model_file)`

Segments blood vessels using a Random Forest model. The function can also be used for other targets (e.g. tumors), provided that model_file points to a model pre-trained for that target.

The segmentation results (individual segmented images) are saved to the output_directory path provided.

Parameters

data_path: (pathlib.PosixPath object) relative path to the images which should be segmented (note that the function works with preprocessed images in .npy format, not .tiff)

output_directory: (pathlib.PosixPath object) relative path to the folder where the results should be saved

model_file: (pathlib.PosixPath object) relative path to the pre-trained model binary file which should be used for the segmentation

Returns

None: Results are stored on the disk.

Example Usage

>>> from pathlib import Path
>>> from src.segmentation import random_forest
>>> random_forest(
...     data_path=Path('ppdm/data/5IT_DUMMY_STUDY/source/transformed/np_and_resized/vessel'),
...     output_directory=Path('ppdm/data/5IT_DUMMY_STUDY/RF/raw/vessel'),
...     model_file=Path('pretrained_models/rf_model.joblib'),
... )

Source code in src/segmentation.py

@log_step
def random_forest(
    data_path: pathlib.PosixPath,
    output_directory: pathlib.PosixPath,
    model_file,
) -> None:

    """
    Segments blood vessels using a Random Forest model. The function can also be used for other targets (e.g. tumors),
    provided that model_file points to a model pre-trained for that target.

    The segmentation results (individual segmented images) are saved to the output_directory path provided.


    Parameters
    ----------

    **data_path**: *(pathlib.PosixPath object)*
    relative path to the images which should be segmented (note that the function works with preprocessed images in .npy format, not .tiff)

    **output_directory**: *(pathlib.PosixPath object)*
    relative path to the folder where the results should be saved

    **model_file**: *(pathlib.PosixPath object)*
    relative path to the pre-trained model binary file which should be used for the segmentation


    Returns
    ------

    None: Results are stored on the disk.


    Example Usage
    --------------
    ```python

    >>> from pathlib import Path
    >>> from src.segmentation import random_forest
    >>> random_forest(
    ...     data_path=Path('ppdm/data/5IT_DUMMY_STUDY/source/transformed/np_and_resized/vessel'),
    ...     output_directory=Path('ppdm/data/5IT_DUMMY_STUDY/RF/raw/vessel'),
    ...     model_file=Path('pretrained_models/rf_model.joblib'),
    ... )
    ```
    """

    loaded_rf = load(model_file)

    segment_rf_partial = partial(
        _segment_random_forest,
        loaded_estimator_to_use=loaded_rf,
        output_directory=output_directory,
    )

    directory_ok = check_content_of_two_directories(
        data_path, output_directory
    )

    if directory_ok is False:
        if output_directory.exists():
            remove_content(output_directory)

        parallel(
            segment_rf_partial,
            sorted(list(data_path.glob("*.npy"))),
            n_workers=12,
            progress=True,
            threadpool=True,
        )
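
The listing above binds the arguments shared by all files with `partial` and then maps the resulting one-argument function over the sorted `.npy` paths in a thread pool. A minimal sketch of that pattern follows. It uses the standard library's `ThreadPoolExecutor` as a stand-in for the `parallel` helper, and `segment_one` and `run_parallel` are hypothetical names that only illustrate the idea; they are not part of src/segmentation.py.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from pathlib import Path


def segment_one(path: Path, estimator, output_directory: Path) -> Path:
    """Hypothetical per-file worker: returns where the mask would be written."""
    # A real worker would load the image, run the estimator, and save the mask.
    return output_directory / path.name


def run_parallel(paths, estimator, output_directory: Path, n_workers: int = 12):
    # Bind the arguments that are identical for every file ...
    worker = partial(
        segment_one, estimator=estimator, output_directory=output_directory
    )
    # ... so the pool only has to supply the per-file path.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(worker, paths))  # map preserves the input order


# Assumed file names, used only for illustration.
paths = sorted(Path("in") / name for name in ["b.npy", "a.npy"])
results = run_parallel(paths, estimator="dummy-model", output_directory=Path("out"))
```

Threads (rather than processes) avoid pickling the loaded estimator for every worker, which is one plausible reason for `threadpool=True` in the listing.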

`segmentation_wrapper(data_path, method, method_parameters)`

This function is the main segmentation wrapper used within the automated pipeline.

It selects one of the segmentation functions of this module by name and calls it. The output directory is derived from data_path, the method name, and the method parameters. To use a segmentation method directly in a script, call the individual functions instead: unet, thresholding, or random_forest.

Parameters

data_path : (pathlib.PosixPath object) relative path to the study which should be segmented (note that the function works with preprocessed images in .npy format, not .tiff)

method : (str) method to use for segmentation (unet, thresholding, random_forest)

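method_parameters: (Dict) keyword arguments passed on to the selected segmentation function, e.g. model_file for unet and random_forest, or method for thresholding

device: (str) not a direct argument of this function; for unet it can be passed inside method_parameters. It selects the graphic card which should be used (relevant only in a cloud computing environment). For local code the graphic card is selected automatically based on the PC hardware.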

Returns

pathlib.PosixPath: path to the output directory. The segmentation results themselves are stored on the disk.

Example Usage

>>> from pathlib import Path
>>> from src.segmentation import segmentation_wrapper
>>> segmentation_wrapper(
...     data_path=Path('ppdm/data/5IT_DUMMY_STUDY/source/transformed/np_and_resized/vessel'),
...     method='unet',
...     method_parameters={'model_file': Path('pretrained_models/unet_train_full_5IT_7IV_8IV.pt')},
... )

Source code in src/segmentation.py

@log_step
def segmentation_wrapper(
    data_path: pathlib.PosixPath, method: str, method_parameters: Dict
) -> pathlib.PosixPath:

    """

    This function is the main segmentation wrapper used within the automated pipeline.

    It selects one of the segmentation functions of this module by name and calls it. The output directory is derived from data_path, the method name, and the method parameters.
    To use a segmentation method directly in a script, call the individual functions instead: unet, thresholding, or random_forest.


    Parameters
    ----------

    **data_path** : *(pathlib.PosixPath object)*
    relative path to the study which should be segmented (note that the function works with preprocessed images in .npy format, not .tiff)

    **method** : *(str)*
    method to use for segmentation (unet, thresholding, random_forest)

    **method_parameters**: *(Dict)*
    keyword arguments passed on to the selected segmentation function, e.g. model_file for unet and random_forest, or method for thresholding

    **device**: *(str)*
    not a direct argument of this function; for unet it can be passed inside method_parameters. It selects the graphic card which should be used (relevant only in a cloud computing environment). For local code the graphic card is selected automatically based on the PC hardware.

    Returns
    ------

    pathlib.PosixPath: path to the output directory. The segmentation results themselves are stored on the disk.


    Example Usage
    --------------

    ```python

    >>> from pathlib import Path
    >>> from src.segmentation import segmentation_wrapper
    >>> segmentation_wrapper(
    ...     data_path=Path('ppdm/data/5IT_DUMMY_STUDY/source/transformed/np_and_resized/vessel'),
    ...     method='unet',
    ...     method_parameters={'model_file': Path('pretrained_models/unet_train_full_5IT_7IV_8IV.pt')},
    ... )
    ```

    """

    module_results_path = "results/segmentation"
    module_name = "segment"
    segmentation_function = _select_method(method)
    parameters_as_string = join_keys_and_values_to_list(method_parameters)
    output_folder_name = join_to_string(
        [module_name, method, *parameters_as_string]
    )

    output_directory = create_relative_path(
        data_path, module_results_path, output_folder_name
    )

    directory_ok = check_content_of_two_directories(
        data_path, output_directory
    )

    if directory_ok is False:
        if output_directory.exists():
            remove_content(output_directory)
        segmentation_function(data_path, output_directory, **method_parameters)
    return output_directory
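
The wrapper above does two things before calling a segmentation function: it looks the function up by its method name, and it builds an output folder name from the module name, the method, and the parameters. The helpers it relies on (`_select_method`, `join_keys_and_values_to_list`, `join_to_string`) are not shown on this page, so the sketch below is only one plausible way to write them. The stub functions, the dictionary dispatch, and the underscore separator are all assumptions.

```python
from typing import Callable, Dict


# Stand-ins for the real segmentation functions (hypothetical stubs).
def unet_stub(data_path, output_directory, **parameters):
    return "unet"


def thresholding_stub(data_path, output_directory, **parameters):
    return "thresholding"


def random_forest_stub(data_path, output_directory, **parameters):
    return "random_forest"


METHODS: Dict[str, Callable] = {
    "unet": unet_stub,
    "thresholding": thresholding_stub,
    "random_forest": random_forest_stub,
}


def select_method(method: str) -> Callable:
    """Return the segmentation function registered under `method`."""
    try:
        return METHODS[method]
    except KeyError:
        raise ValueError(
            f"unsupported method {method!r}; expected one of {sorted(METHODS)}"
        ) from None


def folder_name(module_name: str, method: str, parameters: Dict[str, object]) -> str:
    """Join module name, method, and every key/value pair (assumed '_' separator)."""
    parts = [module_name, method]
    for key, value in parameters.items():
        parts += [str(key), str(value)]
    return "_".join(parts)


name = folder_name("segment", "thresholding", {"method": "th_otsu"})
```

Encoding the parameters in the folder name means that two runs with different settings write to different folders, so an existing, complete folder can be reused instead of recomputed.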

`thresholding(data_path, output_directory, method, mask_path=None)`

Segments images by thresholding. This function can be used for any channel (e.g. vessels, tumors) because the threshold value is calculated automatically by the selected method.

The segmentation results (individual segmented images) are saved to the output_directory path provided.

Parameters

data_path: (pathlib.PosixPath object) relative path to the images which should be segmented (note that the function works with preprocessed images in .npy format, not .tiff)

output_directory: (pathlib.PosixPath object) relative path to the folder where the results should be saved

method: (str) thresholding method to be used. Supported methods are "th_otsu", "th_triangle", "th_yen".

Optional parameters:

mask_path: (pathlib.PosixPath object) relative path to the binary mask used to restrict the area of interest on which the threshold value is calculated.

Returns

None: Results are stored on the disk.

Example Usage

>>> from pathlib import Path
>>> from src.segmentation import thresholding
>>> thresholding(
...     data_path=Path('ppdm/data/5IT_DUMMY_STUDY/source/transformed/np_and_resized/vessel'),
...     output_directory=Path('msd_projects/ppdm/data/5IT_DUMMY_STUDY/THRES/raw/vessel'),
...     method="th_otsu",
... )

Source code in src/segmentation.py

@log_step
def thresholding(
    data_path: pathlib.PosixPath,
    output_directory: pathlib.PosixPath,
    method: str,
    mask_path: pathlib.PosixPath = None,
) -> None:

    """
    Segments images by thresholding. This function can be used for any channel
    (e.g. vessels, tumors) because the threshold value is calculated automatically by the selected method.

    The segmentation results (individual segmented images) are saved to the output_directory path provided.


    Parameters
    ----------

    **data_path**: *(pathlib.PosixPath object)*
    relative path to the images which should be segmented (note that the function works with preprocessed images in .npy format, not .tiff)

    **output_directory**: *(pathlib.PosixPath object)*
    relative path to the folder where the results should be saved

    **method**: *(str)* thresholding method to be used.
    Supported methods are "th_otsu", "th_triangle", "th_yen".


    Optional parameters:

    **mask_path**: *(pathlib.PosixPath object)*
    relative path to the binary mask used to restrict the area of interest on which the threshold value is calculated.



    Returns
    ------

    None: Results are stored on the disk.


    Example Usage
    --------------
    ```python

    >>> from pathlib import Path
    >>> from src.segmentation import thresholding
    >>> thresholding(
    ...     data_path=Path('ppdm/data/5IT_DUMMY_STUDY/source/transformed/np_and_resized/vessel'),
    ...     output_directory=Path('msd_projects/ppdm/data/5IT_DUMMY_STUDY/THRES/raw/vessel'),
    ...     method="th_otsu",
    ... )

    ```

    """

    supported_methods = ["th_otsu", "th_triangle", "th_yen"]
    if method not in supported_methods:
        raise ValueError(
            f"thresholding method must be one of {supported_methods}, you have provided: {method}"
        )

    threshold_value = _calculate_threshold_value(data_path, method, mask_path)
    print(f"threshold {method} value: {threshold_value}")
    _segment_and_save_masks_threshold(
        data_path, threshold_value, output_directory
    )
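
The listing above first computes one threshold value (optionally restricted by a mask) and then applies it to produce binary masks. The private helpers are not shown here, so the following is only a sketch of the idea for a single image, using an Otsu threshold written with NumPy alone. The function names, the 0/1 output encoding, and the per-image scope are assumptions; the real helpers may compute the value over a whole directory and may rely on a library implementation instead.

```python
import numpy as np


def otsu_threshold(values: np.ndarray, bins: int = 256) -> float:
    """Return the threshold that maximises the between-class variance."""
    counts, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    # Number of pixels in the low class / high class for every possible split.
    weight_low = np.cumsum(counts)
    weight_high = np.cumsum(counts[::-1])[::-1]
    # Mean intensity of each class for every possible split.
    mean_low = np.cumsum(counts * centers) / np.maximum(weight_low, 1)
    mean_high = (
        np.cumsum((counts * centers)[::-1]) / np.maximum(weight_high[::-1], 1)
    )[::-1]
    # Between-class variance when splitting between bin i and bin i + 1.
    variance = weight_low[:-1] * weight_high[1:] * (mean_low[:-1] - mean_high[1:]) ** 2
    return float(centers[np.argmax(variance)])


def segment(image: np.ndarray, roi=None):
    """Threshold `image`; `roi` (optional binary mask) limits the threshold estimate."""
    values = image[roi > 0] if roi is not None else image.ravel()
    threshold = otsu_threshold(values)
    # Pixels above the threshold become foreground (1), the rest background (0).
    return (image > threshold).astype(np.uint8), threshold


image = np.array([[10, 10, 200], [10, 200, 200]], dtype=np.uint8)
mask, threshold = segment(image)
```

In this sketch the mask only changes which pixels contribute to the threshold estimate; the threshold is still applied to the whole image, which mirrors how `mask_path` is passed only to the threshold calculation in the listing.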

`unet(input_directory, output_directory, model_file, device=None)`

Segments blood vessels using a pre-trained U-Net model. The function can also be used for other targets (e.g. tumors), provided that model_file points to a model pre-trained for that target.

The segmentation results (individual segmented images) are saved to the output_directory path provided.

Parameters

input_directory : (pathlib.PosixPath object) relative path to the images which should be segmented (note that the function works with preprocessed images in .npy format, not .tiff)

output_directory : (pathlib.PosixPath object) relative path to the folder where the results should be saved

model_file: (pathlib.PosixPath object) relative path to the pre-trained model binary file which should be used for the segmentation

device: (str, optional) graphic card which should be used, e.g. "cuda:0" (relevant only in a cloud computing environment). If not provided, an available graphic card is selected automatically.

Returns

None: Results are stored on the disk.

Example Usage

>>> from pathlib import Path
>>> from src.segmentation import unet
>>> unet(
...     input_directory=Path('msd_projects/ppdm/data/5IT_DUMMY_STUDY/source/transformed/np_and_resized/vessel'),
...     output_directory=Path('msd_projects/ppdm/data/5IT_DUMMY_STUDY/DOCUMENTATION/raw/vessel'),
...     model_file=Path('pretrained_models/unet_train_full_5IT_7IV_8IV.pt'),
... )

Source code in src/segmentation.py

@log_step
def unet(
    input_directory: pathlib.PosixPath,
    output_directory: pathlib.PosixPath,
    model_file: pathlib.PosixPath,
    device: str = None,
) -> None:
    """
    Segments blood vessels using a pre-trained U-Net model. The function can also be used for other targets (e.g. tumors),
    provided that model_file points to a model pre-trained for that target.

    The segmentation results (individual segmented images) are saved to the output_directory path provided.


    Parameters
    ----------

    **input_directory** : *(pathlib.PosixPath object)*
    relative path to the images which should be segmented (note that the function works with preprocessed images in .npy format, not .tiff)

    **output_directory** : *(pathlib.PosixPath object)*
    relative path to the folder where the results should be saved

    **model_file**: *(pathlib.PosixPath object)*
    relative path to the pre-trained model binary file which should be used for the segmentation

    **device**: *(str, optional)*
    graphic card which should be used, e.g. "cuda:0" (relevant only in a cloud computing environment). If not provided, an available graphic card is selected automatically.

    Returns
    ------

    None: Results are stored on the disk.


    Example Usage
    --------------

    ```python

    >>> from pathlib import Path
    >>> from src.segmentation import unet
    >>> unet(
    ...     input_directory=Path('msd_projects/ppdm/data/5IT_DUMMY_STUDY/source/transformed/np_and_resized/vessel'),
    ...     output_directory=Path('msd_projects/ppdm/data/5IT_DUMMY_STUDY/DOCUMENTATION/raw/vessel'),
    ...     model_file=Path('pretrained_models/unet_train_full_5IT_7IV_8IV.pt'),
    ... )
    ```
    """
    input_directory_paths = sorted(list(input_directory.glob("*.npy")))

    directory_ok = check_content_of_two_directories(
        input_directory, output_directory
    )

    if directory_ok is False:
        if output_directory.exists():
            remove_content(output_directory)

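        if device is None:
            device = f"cuda:{select_avail_gpu()}"
        # load the model whether the device was given or selected automatically;
        # otherwise `model` is undefined when a device is passed explicitly
        model = torch.load(model_file)
        model = model.to(device).eval()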

        for file in tqdm(input_directory_paths):
            image = np.load(file)
            # convert 16 bit images to 8bits
            if image.dtype == "uint16":
                image = image / 65535 * 255
                image = image.astype(np.uint8)
            image = np.expand_dims(image, axis=0)
            # split images to tiles
            tiler = VolumeSlicer(
                image.shape,
                voxel_size=(image.shape[0], 512, 512),
                voxel_step=(image.shape[0], 512, 512),
            )
            tiles = tiler.split(image)
            tiles_processed = _unet_runner(tiles, model, device)
            # merge tiles back to one image
            tiles_stitched = tiler.merge(tiles_processed)
            mask = tiles_stitched[0, :, :]
            # save mask
            save_path_mask = output_directory / file.name
            output_directory.mkdir(parents=True, exist_ok=True)
            np.save(save_path_mask, mask.astype(np.uint8))
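
Two preprocessing steps in the loop above are easy to miss: 16-bit images are rescaled to 8 bits, and each image is split into non-overlapping 512 x 512 tiles that are merged back after inference. The sketch below reproduces both steps with plain NumPy. The function names are hypothetical, the tile splitting is a simplified stand-in for `VolumeSlicer`, and it assumes that both image sides are exact multiples of the tile size.

```python
import numpy as np


def to_uint8(image: np.ndarray) -> np.ndarray:
    """Rescale a uint16 image to uint8 with the same arithmetic as the listing."""
    if image.dtype == np.uint16:
        # astype truncates toward zero, so values are floored, not rounded.
        image = (image / 65535 * 255).astype(np.uint8)
    return image


def split_tiles(image: np.ndarray, tile: int) -> np.ndarray:
    """Split a 2-D image into non-overlapping square tiles, row by row."""
    rows, cols = image.shape[0] // tile, image.shape[1] // tile
    return (
        image.reshape(rows, tile, cols, tile).swapaxes(1, 2).reshape(-1, tile, tile)
    )


def merge_tiles(tiles: np.ndarray, shape) -> np.ndarray:
    """Inverse of split_tiles: put the tiles back into an image of `shape`."""
    tile = tiles.shape[1]
    rows, cols = shape[0] // tile, shape[1] // tile
    return tiles.reshape(rows, cols, tile, tile).swapaxes(1, 2).reshape(shape)


# A tiny 4 x 4 example with 2 x 2 tiles instead of 512 x 512.
image16 = (np.arange(16, dtype=np.uint16) * 4369).reshape(4, 4)
image8 = to_uint8(image16)
tiles = split_tiles(image8, tile=2)
restored = merge_tiles(tiles, image8.shape)
```

Because the step equals the tile size, no pixel is processed twice and the merge is an exact inverse of the split.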