Leveraging Datoma’s potential

This section walks through a more complex example that shows advanced usage of the datoma library.

As the example shows, a great way to leverage Datoma’s potential is to link the inputs of one execution to the outputs of another.

In this example, we first submit a job to Datoma’s infrastructure. Then we use its output as input for a workflow.

We first create the DatomaJob and DatomaWorkflow objects.
Then we set the job’s inputs with local files and modify the parameters we want to change.
We submit the job to Datoma’s infrastructure.
Instead of just listing the outputs, we store them in the output_files variable.
Next, we set the workflow’s global input with our job’s output.
After that, we submit the workflow and download the output files.
Finally, we export a JSON file containing the information of our workflow.

# Make the necessary imports
from datoma import DatomaJob, DatomaWorkflow

# Create DatomaJob and DatomaWorkflow objects
job = DatomaJob("rmsistd", "rmsi")
dw = DatomaWorkflow(official_name = 'rmsiannotation')

# Set the job's input
job.set_input(
    input_dict={"input": [
        {"file": "path/to/file.ibd"},
        {"roi": "path/to/file.imzML"},
    ]},
    preserve_name=True,
)

# Pass a dictionary with the parameters to change from the task's default values
job.set_params(params_dict={
    'preprocessing:smoothing:enable': False,
    'preprocessing:smoothing:kernelsize': 8,
})

# Submit the job to Datoma's infrastructure; naming the job is optional
job.submit(job_name = "rmsi_execution")

# Store the locations of the "*.imzML" output files in a variable
# (top-level await requires an async context, such as a Jupyter notebook)
output_files, total_size = await job.list_outputs(regex=r".*\.imzML")

# Print the total size of the filtered output files
print(total_size)

# In this case, we use the job's output as input for the workflow
# The output files are stored on Datoma's infrastructure ('s3://...')
global_input_dict = {"imaging_files": [{"file": "/path/to/file.ibd"},
                                       {"file": output_files[0]}]}

# Set the global input of the workflow
dw.set_global_input(global_input_dict, preserve_name = True)

# Submit the workflow to Datoma's infrastructure; naming it is optional
dw.submit(name = "rmsiannotation_execution")

# Check the status of the workflow; once it finishes, the output files are downloaded
await dw.download(output_path="path/to/output/folder")

# You can also export the workflow to a JSON file
dw.export_json("path/to/save/file.json")
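
The regex passed to list_outputs decides which output files are linked to the workflow, so it can help to try the pattern locally first. The sketch below uses only Python's standard re module. The file locations and the filter_outputs helper are hypothetical, and applying the pattern with re.match is an assumption, since this section does not state how datoma applies it.

```python
import re

# Hypothetical output locations, for illustration only; the real ones
# are returned by job.list_outputs().
candidate_outputs = [
    "s3://example-bucket/job/sample.imzML",
    "s3://example-bucket/job/sample.ibd",
    "s3://example-bucket/job/report.json",
]

# Same pattern as in the example, written as a raw string so that
# the backslash reaches the regex engine unchanged.
pattern = re.compile(r".*\.imzML")

# Stricter variant: the name must END in ".imzML".
strict_pattern = re.compile(r".*\.imzML$")


def filter_outputs(paths, compiled_pattern):
    """Return the paths that the pattern matches from the start of the string."""
    return [p for p in paths if compiled_pattern.match(p)]


imzml_files = filter_outputs(candidate_outputs, pattern)
print(imzml_files)
```

Without the trailing `$`, the pattern also accepts names such as "sample.imzML.bak", because re.match only requires a match at the start of the string. The stricter variant avoids that.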

For an example of how to import a saved JSON file, see Importing a Datoma object.
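
The example indexes output_files[0] directly, which raises an IndexError if the filter matched nothing. A minimal sketch of a guard is shown below. The build_global_input helper and the file locations are hypothetical and not part of datoma; only the shape of the dictionary is taken from the example above.

```python
def build_global_input(local_ibd_path, output_files):
    """Build the workflow's global input from a local file and the job's first output."""
    if not output_files:
        # Fail early with a clear message instead of an IndexError later on.
        raise ValueError("No job outputs matched the filter; nothing to link to the workflow.")
    return {"imaging_files": [{"file": local_ibd_path},
                              {"file": output_files[0]}]}


# Hypothetical output location, for illustration only.
example = build_global_input("path/to/file.ibd",
                             ["s3://example-bucket/job/sample.imzML"])
print(example)
```

The returned dictionary can then be passed to dw.set_global_input in place of the hand-written global_input_dict.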