Discovery of valid parameters for OO-syntax #117

@FrithiofJensen

Hi!

When configuring a Ruffus pipeline with the OO syntax, it is easy to run into problems passing parameters to Ruffus Task objects.

Certain task types, such as 'Split()', reject an 'output_dir=' parameter that others, such as 'Transform()', happily accept. (Is this a bug or deliberate?)

One way to discover which parameters each task type accepts is to use the 'inspect' module.

In a Python session, one can look at the source of the Task's '_prepare_<task-type>' method, like so:

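import inspect
import ruffus

# Dummy task function; it only echoes whatever arguments it receives.
def tf(*args, **kwargs):
    print(args, kwargs)

pl = ruffus.Pipeline(name='testing')
task = pl.split(task_func=tf, name='atask', output='stuff')

# Print the source of the method that parses the arguments for split tasks.
print(inspect.getsource(task._prepare_split))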
    def _prepare_split(self, unnamed_args, named_args):
        """
        Common code for @split and pipeline.split
        """
        self.error_type = ruffus_exceptions.error_task_split
        self._set_action_type(Task._action_task_split)
        self._setup_task_func = Task._split_setup
        self.needs_update_func = self.needs_update_func or needs_update_check_modify_time
        self.job_wrapper = job_wrapper_io_files
        self.job_descriptor = io_files_one_to_many_job_descriptor
        self.single_multi_io = self._one_to_many
        # output is a glob
        self.indeterminate_output = 1

        #
        #   Parse named and unnamed arguments
        #
        self.parsed_args = parse_task_arguments(unnamed_args, named_args,
                                                ["input", "output", "extras"],
                                                self.description_with_args_placeholder)

print(inspect.getsource(task._prepare_transform))
    def _prepare_transform(self, unnamed_args, named_args):
        """
        Common function for pipeline.transform and @transform
        """
        self.error_type = ruffus_exceptions.error_task_transform
        self._set_action_type(Task._action_task_transform)
        self._setup_task_func = Task._transform_setup
        self.needs_update_func = self.needs_update_func or needs_update_check_modify_time
        self.job_wrapper = job_wrapper_io_files
        self.job_descriptor = io_files_job_descriptor
        self.single_multi_io = self._many_to_many

        #   Parse named and unnamed arguments
        self.parsed_args = parse_task_arguments(unnamed_args, named_args,
                                                ["input", "filter", "modify_inputs",
                                                 "output", "extras", "output_dir"],
                                                self.description_with_args_placeholder)


The pattern seems to be that a list of permitted parameters is passed to 'parse_task_arguments'; some of these may be optional, others required.
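Reading the printed source by eye works, but the same list can be pulled out programmatically. The following is a minimal sketch, not part of Ruffus: 'permitted_params' is a hypothetical helper, and it assumes the permitted names are always given as a literal list in the third positional argument of 'parse_task_arguments', as in the two listings above. 'SAMPLE' is a shortened copy of the '_prepare_split' source shown earlier, so the block runs without Ruffus installed.

```python
import ast
import textwrap


def permitted_params(method_source):
    """Return the literal list passed as the third positional argument
    to parse_task_arguments() in the given method source, or [] if no
    such call is found. (Hypothetical helper, not a Ruffus API.)"""
    # inspect.getsource() returns methods with their class indentation,
    # so dedent before parsing.
    tree = ast.parse(textwrap.dedent(method_source))
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "parse_task_arguments"
                and len(node.args) >= 3
                and isinstance(node.args[2], ast.List)):
            return [ast.literal_eval(elt) for elt in node.args[2].elts]
    return []


# Shortened copy of the _prepare_split source printed above.
SAMPLE = '''
    def _prepare_split(self, unnamed_args, named_args):
        self.parsed_args = parse_task_arguments(unnamed_args, named_args,
                                                ["input", "output", "extras"],
                                                self.description_with_args_placeholder)
'''

print(permitted_params(SAMPLE))
```

With Ruffus installed, the same helper could presumably be fed 'inspect.getsource(task._prepare_split)' or 'inspect.getsource(task._prepare_transform)' directly, and the two results compared to see which parameters one task type accepts and the other does not.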

Some parameters, such as 'name' and 'task_func', are not explicitly listed here but are always passed.

Anyway, hope this helps someone a little!
