Make options declaration mandatory #237

@pounard

In every multi-column anonymizer that accepts options other than column names, a column cannot share its name with an option, such as sample_size or source.

In #234 I introduced this in validation:

        $columns = $this->options->get('columns', null, true);
        if (!\is_array($columns)) {
            throw new ConfigurationException("'columns' must be an array of string or null values.");
        }
        // Option names that a column cannot reuse.
        $invalidNames = ['source', 'columns', 'file_csv_enclosure', 'file_csv_escape', 'file_csv_separator', 'file_skip_header'];
        foreach ($columns as $index => $column) {
            // Strict comparison, so that non-string values fall through to the type check below.
            if (\in_array($column, $invalidNames, true)) {
                throw new ConfigurationException(\sprintf("'columns' values cannot be one of ('%s') for column #%d.", \implode("', '", $invalidNames), $index));
            }
            if (!\is_string($column) && null !== $column) {
                throw new ConfigurationException(\sprintf("'columns' must be an array of string or null values (invalid type for column #%d).", $index));
            }
        }

This works, but I now have to repeat it a second, then a third time for new anonymizer implementations.

It's becoming hard to maintain:

  • code copy-pasted more than twice is wrong,
  • some other options can be set at the abstract anonymizer level (sample size, for example).

What I propose is to force anonymizers to declare their options, for example:

    protected function initialize(): void
    {
        $this->addOption(
            'sample_size',
            Option::TYPE_INT,
            "Some short description",
            null /* default value */,
            true /* Is optional (default true) */,
            "Some long description..."
        );
        // ...
    }

Then, in AbstractAnonymizer, when validation happens, check the given Options array and automatically validate at least that mandatory options are present and that each value has the declared type.
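
To make the idea concrete, here is a minimal sketch of what declared options with automatic validation could look like. Everything in it is an assumption made for illustration: the `Option` class shape, the `SketchAnonymizer` base class, the `validate()` method and the argument order of `addOption()` are hypothetical stand-ins and not the existing AbstractAnonymizer API.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the project's exception class.
final class ConfigurationException extends \InvalidArgumentException
{
}

// Hypothetical value object describing one declared option.
final class Option
{
    public const TYPE_INT = 'int';
    public const TYPE_STRING = 'string';

    public function __construct(
        public string $name,
        public string $type,
        public string $description,
        public mixed $default = null,
        public bool $optional = true,
    ) {
    }
}

// Hypothetical base class, standing in for AbstractAnonymizer.
abstract class SketchAnonymizer
{
    /** @var array<string, Option> */
    private array $declaredOptions = [];

    /** @param array<string, mixed> $values Raw, user-given option values. */
    public function __construct(private array $values = [])
    {
        $this->initialize();
    }

    abstract protected function initialize(): void;

    protected function addOption(string $name, string $type, string $description, mixed $default = null, bool $optional = true): void
    {
        $this->declaredOptions[$name] = new Option($name, $type, $description, $default, $optional);
    }

    /** @return array<string, Option> */
    public function getDeclaredOptions(): array
    {
        return $this->declaredOptions;
    }

    // Check presence of mandatory options and the type of every given value.
    public function validate(): void
    {
        foreach ($this->declaredOptions as $name => $option) {
            $value = $this->values[$name] ?? $option->default;
            if (null === $value) {
                if (!$option->optional) {
                    throw new ConfigurationException(\sprintf("'%s' option is mandatory.", $name));
                }
                continue;
            }
            if (\get_debug_type($value) !== $option->type) {
                throw new ConfigurationException(\sprintf("'%s' must be of type %s, %s given.", $name, $option->type, \get_debug_type($value)));
            }
        }
    }
}

// Example anonymizer declaring one optional and one mandatory option.
final class SampleAnonymizer extends SketchAnonymizer
{
    protected function initialize(): void
    {
        $this->addOption('sample_size', Option::TYPE_INT, "Some short description");
        $this->addOption('source', Option::TYPE_STRING, "Some short description", null, false);
    }
}
```

With this sketch, `(new SampleAnonymizer(['source' => 'file.csv']))->validate()` passes, while omitting `source` or giving `sample_size` as a string raises a ConfigurationException. Relying on `\get_debug_type()` keeps type names aligned with PHP's own, but a real implementation may want looser coercion rules.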

  • We can also generate documentation from this.
  • We can also get rid of the long class PHPDoc.
  • We can generate meaningful console commands that document it as well.
  • We can automate some validation in the check command.
  • We can use this declared option list for the custom validation I presented at the top of this issue: instead of iterating over a hard-coded array, we could iterate over \array_keys($this->getDeclaredOptions()), for example.
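
As a rough illustration of the last point and of documentation generation, the sketch below derives the reserved column names from a declared option list instead of a hard-coded array, and renders one documentation line per option. The function names `validateColumnNames()` and `describeOptions()` are hypothetical, the declared options are simplified to a name-to-description map, and the built-in \InvalidArgumentException stands in for ConfigurationException to keep the example self-contained.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: reject columns whose name collides with a declared option.
 *
 * @param array<int, mixed> $columns Values of the 'columns' option.
 * @param array<string, string> $declaredOptions Option descriptions keyed by option name.
 */
function validateColumnNames(array $columns, array $declaredOptions): void
{
    // Reserved names come from the declaration, not from a copy-pasted list.
    $reserved = \array_keys($declaredOptions);

    foreach ($columns as $index => $column) {
        if (!\is_string($column) && null !== $column) {
            throw new \InvalidArgumentException(\sprintf("'columns' must be an array of string or null values (invalid type for column #%d).", $index));
        }
        if (null !== $column && \in_array($column, $reserved, true)) {
            throw new \InvalidArgumentException(\sprintf("'columns' values cannot be one of ('%s') for column #%d.", \implode("', '", $reserved), $index));
        }
    }
}

/**
 * Hypothetical helper: render one documentation line per declared option.
 *
 * @param array<string, string> $declaredOptions Option descriptions keyed by option name.
 */
function describeOptions(array $declaredOptions): string
{
    $lines = [];
    foreach ($declaredOptions as $name => $description) {
        $lines[] = \sprintf("%s: %s", $name, $description);
    }

    return \implode("\n", $lines);
}
```

Calling `validateColumnNames(['name', null, 'source'], ['source' => '...', 'columns' => '...'])` would throw for column #2, since `source` is a declared option. The same declaration feeds `describeOptions()`, so validation and documentation cannot drift apart.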
