Usage#

This page details how to use the configuration framework of Neba.

Declaring parameters#

To use the configuration framework, you must first define your configuration in Python. Here is an example of how it will look:

from neba.config import Application, Section
from traitlets import Enum, Float, List, Unicode

class App(Application):
    """The application will retrieve and store parameters."""

    result_dir = Unicode("/data/results", help="Directory containing results")

    class model(Section):
        """A nested section."""
        coefficients = List(Float(), [0.5, 1.5, 10.0], help="Some coefficients for computation.")
        style = Enum(["serial", "parallel"], "serial", help="Only some values are accepted.")

app = App()

Traits#

The configuration is specified through Section classes. Each section contains parameters in the form of class attributes of type traitlets.TraitType (for instance Float, Unicode, or List).

Note

Traits can be confusing at first. They are a kind of descriptor: a container class has trait instances bound as class attributes. For instance:

class Container(Section):
    name = Float(default_value=1.)

We can access the trait instance from the container class, but it only contains the information used for its definition; it does not hold any actual value:

>>> Container.name
<traitlets.traitlets.Float>

However, if we access name from a container instance we will obtain a value, not a trait:

>>> c = Container()
>>> type(c.name)
float
>>> c.name = 2  # we can change it
>>> c.name
2.0

It behaves nearly like a normal float attribute. When we change the value, for example, the trait (which, again, is a class attribute) is used to validate the new value or to do more advanced things. But the value remains tied to the container instance c.
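The class-versus-instance behaviour described in this note comes from Python's descriptor protocol. Here is a minimal, stdlib-only sketch of that mechanism. FloatField and Box are hypothetical names used for illustration; they are not part of traitlets or Neba, and real traits do considerably more.

```python
class FloatField:
    """A toy descriptor that mimics how a Float trait behaves."""

    def __init__(self, default_value=0.0):
        self.default_value = float(default_value)

    def __set_name__(self, owner, name):
        # Remember under which attribute name we were bound.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed from the class: return the descriptor itself.
            return self
        # Accessed from an instance: return the value stored on it.
        return instance.__dict__.get(self.name, self.default_value)

    def __set__(self, instance, value):
        # Validate (and convert) before storing on the instance.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{self.name} expects a number, got {value!r}")
        instance.__dict__[self.name] = float(value)


class Box:
    size = FloatField(default_value=1.0)


box = Box()
box.size = 2  # validated and converted by the descriptor
```

Box.size returns the FloatField object, while box.size returns a plain float stored on the instance.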

Here are some of the basic trait types:

Int, Float, Bool

Unicode

For strings (Traitlets differentiates unicode and bytes strings).

List, Set

Containers can check the element type: List(Float()), or not: List().

Tuple

Tuples are fixed length. To check the types of their elements, you must specify every element: Tuple(Int(), Unicode())

Dict

Dict can specify keys and/or values: Dict(key_trait=Unicode(), value_trait=Int())

Enum

Must be one of the specified values: Enum(["a", "b"], default_value="a")

Union

Multiple types are permitted. Traitlets will try to convert values in the order the types are specified until it succeeds. For instance, prefer this order: Union([Int(), Float()]), otherwise integers will always be converted to floats.

Type

Type(klass=MyClass) will allow subclasses of MyClass. In your configuration files you can use an import string (“my_module.MyClass”).

Instance

This is currently unsupported.

Neba provides two new types of traits. Range is a list of integers or floats that can be parsed from a slice specification of the form start:stop[:step]. ‘stop’ is inclusive.

  • With year = Range(Int()), --year=2002:2004 will be parsed as [2002, 2003, 2004]

  • With coef = Range(Float()), --coef=0:1:0.5 will be parsed as [0.0, 0.5, 1.0].

To get a descending list, change the order of start and stop: --year=2008:2002:4 will be parsed as [2008, 2004]. It can still take in lists of values normally (--year 2002 2005 2006).
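The parsing rules above can be sketched in plain Python. This is only an illustration of the start:stop[:step] convention with an inclusive stop, not Neba's implementation. The function name parse_range is hypothetical, and taking the absolute value of the step is an assumption.

```python
import math


def parse_range(spec, kind=int):
    """Parse 'start:stop[:step]' into a list, with 'stop' inclusive."""
    parts = spec.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected start:stop[:step], got {spec!r}")
    start, stop = kind(parts[0]), kind(parts[1])
    # Assumption: the sign of the step is ignored.
    step = abs(kind(parts[2])) if len(parts) == 3 else kind(1)
    if step == 0:
        raise ValueError("step cannot be zero")
    # The direction comes from the order of start and stop.
    sign = 1 if stop >= start else -1
    # Count the steps up front to avoid accumulating float errors.
    count = math.floor(abs(stop - start) / step + 1e-9) + 1
    return [start + sign * i * step for i in range(count)]
```

With this sketch, parse_range("2002:2004") gives [2002, 2003, 2004], parse_range("0:1:0.5", float) gives [0.0, 0.5, 1.0], and parse_range("2008:2002:4") gives [2008, 2004].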

Fixable is meant to work with filefinder, for parameters defined in filename patterns. It can take

  • a single value

  • a string that will be interpreted as a range of values if the trait type allows it (Int or Float)

  • a string that will be interpreted as a regular expression (this is disabled by default as it can be dangerous: any value from the command line that cannot be parsed would still be allowed).

  • a list of values

Subsections#

A section can contain subsections, allowing a tree-like, nested configuration. This can be done in two ways:

  • Subsections can be defined directly inside another section class definition. The name of the nested class will be used to access the subsection and its traits. The class definition will be renamed and moved under the attribute _{name}SectionDef. For example:

    class MyConfig(Section):
    
        class log(Section):
            level = Unicode("INFO")
    
        class sst(Section):
            dataset = Enum(["a", "b"])
    
            class a(Section):
                location = Unicode("/somewhere")
                time_resolution = Int(8, help="in days")
    
            class b(Section):
                location = Unicode("/somewhere/else")
    
    MyConfig().sst.a.location = "/fantastic"
    

    A mypy plugin is provided to support these dynamic definitions. Add it to the list of plugins in your mypy configuration file, for instance in ‘pyproject.toml’:

    [tool.mypy]
    plugins = ['neba.config.mypy_plugin']
    
  • A more standard way is by using the Subsection class and setting it as an attribute in the parent section:

    from neba.config import Subsection
    
    class MySubsection(Section):
        b = Int(2)
    
    class MySection(Section):
        a = Int(1)
    
        sub = Subsection(MySubsection)
    
    >>> sec = MySection()
    >>> sec.a
    1
    >>> sec.sub.b
    2
    

Note

Like traits, Subsections are also descriptors: accessing from an instance will give the subsection instance (sec.sub is a MySubsection instance), and accessing from a class will give a Subsection object which contains information about the subsection type (MySection.sub.klass is MySubsection).

Aliases#

It is possible to define aliases with the Section.aliases attribute. It is a mapping of shortcut names to a deeper subsection:

{"short": "some.deeply.nested.subsection"}

Aliases can be used in configuration files, in the command line, and when accessing values as a mapping (section["short"]).
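As a rough illustration of what expanding an alias means for a dot-separated key, here is a stdlib-only sketch. The helper expand_alias is hypothetical, and it assumes that an alias is only matched against the first component of the key; Neba's actual resolution may differ.

```python
def expand_alias(key, aliases):
    """Replace a leading alias in a dot-separated key by its full path."""
    head, sep, tail = key.partition(".")
    if head in aliases:
        # Keep whatever follows the alias untouched.
        return aliases[head] + sep + tail
    return key


aliases = {"short": "some.deeply.nested.subsection"}
full_key = expand_alias("short.parameter", aliases)
```

Here full_key is "some.deeply.nested.subsection.parameter", and a key that does not start with an alias is returned unchanged.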

Documentation#

Traits can be documented with the help argument. Sections should be documented with a class docstring. Both will be re-used:

Application#

The principal section, at the root of the configuration tree, is the Application. As a subclass of Section, it can directly hold all your parameters and nested subsections. It is also responsible for gathering the parameters from configuration files and the command line, and more.

Starting the application#

By default, when the application is instantiated it executes its starting sequence with the start() method. It will:

  • Parse command line arguments

  • Read parameters from configuration files

  • Instantiate all subsections with the obtained parameters

This can be controlled with __init__ arguments start, ignore_cli, and instantiate.

Note

Even though some features are still available if the subsections are not instantiated (since the subsection classes contain information about the parameters), instantiating them is necessary to fully validate the parameters.

Logging#

The base application contains some parameters to easily log information. A logger instance is available at Application.log that by default will log to the console (stderr), and can be configured via the (trait) parameters log_level, log_format, and log_datefmt.

The configuration of the logging setup is kept minimal. Users needing to configure it further may look into Application._get_logging_config().

Note

The logger will have the application class fullname (module + class name), so logging inheritance rules will apply.

Generating configuration files#

The application can generate configuration files automatically with Application.write_config(), for any supported format. It will write the values currently held by the application.

For each parameter, it will add some information as comments (the trait location, type, and default value). You can reduce the amount of comments by passing comment="no-help" or comment="none". You can also comment the parameters if their value is equal to the trait default by passing comment_default=True.

If a file already exists, you can completely overwrite the file, or update it. In the latter case, the application will use the parameter values in the existing file but will replace everything else (comments, new traits, etc).

Neba tries its best to generate valid configuration files, but some traits cannot be serialized (an Instance, or a None value when using TOML for example) and will be commented.

Accessing parameters#

In a section, the value of a parameter can be accessed (or changed) just like any other attribute. Subsections are also accessible which allows for deeply nested access:

app.some.deeply.nested.trait = 2

Tip

It is possible to only show subsections and configurable traits in autocompletion. Set the class attribute Section._attr_completion_only_traits to True.

Sections also implement the interface of a MutableMapping and most of the interface of a dict. Parameters can be accessed with a single key of dot-separated attributes. This still benefits from all features of traitlets.

app["some.deeply.nested.trait"] = 2
# is equivalent to
app["some"]["deeply"]["nested"]["trait"] = 2

By default, Section.keys(), Section.values() and Section.items() do not list subsection objects or aliases, but this can be altered via keyword arguments. They also return a flat output; to obtain a nested dictionary, pass nest=True.
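To make the difference between flat and nested output concrete, here is a small stdlib-only sketch that converts a flat mapping with dot-separated keys into a nested dictionary. The function nest is a hypothetical helper for illustration; it is not the code behind nest=True.

```python
def nest(flat):
    """Turn {'a.b.c': 1} into {'a': {'b': {'c': 1}}}."""
    nested = {}
    for key, value in flat.items():
        *parents, leaf = key.split(".")
        node = nested
        for name in parents:
            # Create intermediate dictionaries on the way down.
            node = node.setdefault(name, {})
        node[leaf] = value
    return nested


flat = {"result_dir": "/data/results", "model.style": "serial"}
nested = nest(flat)
```

The result is {"result_dir": "/data/results", "model": {"style": "serial"}}.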

Important

The omission of subsections and aliases is done to allow a straightforward conversion with dict(section). Similarly, len and iter do not account for subsections and aliases.

However, other methods such as “get”, “set” and “contains” will allow subsection keys and aliases:

>>> "subsection" in section
True
>>> section["subsection"]  # No KeyError

Sections have an update() method that modifies the section from a mapping of several parameters (or from another section instance):

app.update({"computation.n_cores": 10, "physical.threshold": 5.})

Like Section.setdefault(), it can add new traits to the section when given some specific input; see the docstring for details.

Warning

Adding traits to a Section instance (via add_trait(), update(), or setdefault()) internally creates a new class and modifies the section instance in place; something along the lines of:

section.__class__ = type("NewClass", (section.__class__,), {<the new traits>})

References to section classes necessary to operate the nested structure are updated accordingly, but this is a potentially dangerous operation and it is preferable to define traits statically.
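The class swap mentioned in this warning is a plain Python technique and can be tried in isolation. The sketch below uses a hypothetical ToySection class with ordinary attributes instead of traits; it only shows that the change affects a single instance.

```python
class ToySection:
    a = 1


first, second = ToySection(), ToySection()

# Build a subclass holding the new attribute and swap it in on one instance.
first.__class__ = type("ToySectionWithB", (first.__class__,), {"b": 2})
```

After the swap, first.b is available and first is still an instance of ToySection, while second is left untouched.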

Note

When changing the value of a trait (with any method), traitlets will validate the new value and trigger callbacks if registered. This ensures the configuration stays valid. Refer to the traitlets documentation for more details on how to use these features.

Select parameters#

We can select only some of the parameters by name by using select():

>>> app.select("result_dir", "model.style")
{
    "result_dir": "/data/results",
    "model.style": "serial",
}

Each trait can be tagged. This can be used to group traits together. For instance if we tag some traits with:

some_parameter = Bool(True).tag(group_a=True)

we can recover them all across the configuration by using the metadata argument in many methods such as keys() or select() (app.select(group_a=True)). Use @tag_all_traits to tag all traits of a section:

class App(Application):

    @tag_all_traits(group_a=True)
    class subsection(Section):
        a = Int(0)
        b = Int(1).tag(group_a=False)  # will not be tagged as True

If some traits are meant to be used as arguments to a specific function, Section.values_from_func_signature() will find the parameters that share the same name as arguments from a function signature.
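The idea of matching parameters against a function signature can be sketched with the standard inspect module. The helper values_for and the function compute are hypothetical; this is not the implementation of Section.values_from_func_signature(), whose exact behaviour should be checked in its docstring.

```python
import inspect


def values_for(func, params):
    """Keep only the entries of `params` named like an argument of `func`."""
    names = inspect.signature(func).parameters
    return {name: params[name] for name in names if name in params}


def compute(threshold, n_cores=1):
    return threshold * n_cores


params = {"threshold": 5.0, "n_cores": 10, "result_dir": "/data/results"}
kwargs = values_for(compute, params)
result = compute(**kwargs)
```

Only threshold and n_cores are kept, so the selected values can be passed directly as keyword arguments.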

Input parameters#

During its start() sequence, the Application class will first parse command line (CLI) arguments (unless deactivated) and then read values from specified configuration files.

The configuration values are retrieved by ConfigLoader objects adapted for each source. Their output is a flat dictionary mapping keys to ConfigValue objects. Aliases are expanded so that each key is unique.

Note

The ConfigValue class stores more information about the value: its origin, the original string and parsed value if applicable, and a priority value used when merging configs. To obtain the value, use ConfigValue.get_value().

Parameters obtained from configuration files and from CLI are merged. Parameters are stored in file_config, cli_config and the merged version in config.
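A priority-based merge of flat configurations can be sketched as follows. The Value dataclass is a hypothetical stand-in for ConfigValue, the priority numbers are made up, and letting the later source win on ties is an assumption; Neba's merging rules may differ in detail.

```python
from dataclasses import dataclass


@dataclass
class Value:
    """Hypothetical stand-in for a configuration value with metadata."""

    value: object
    origin: str
    priority: int


def merge(*configs):
    """Merge flat configurations, keeping the highest priority per key."""
    merged = {}
    for config in configs:
        for key, new in config.items():
            old = merged.get(key)
            # On ties, the later source wins (an assumption of this sketch).
            if old is None or new.priority >= old.priority:
                merged[key] = new
    return merged


file_config = {
    "model.style": Value("serial", "file", 10),
    "result_dir": Value("/data/results", "file", 10),
}
cli_config = {"model.style": Value("parallel", "cli", 20)}
config = merge(cli_config, file_config)
```

Because the command line value carries a higher priority, config["model.style"].value is "parallel" regardless of the order in which the sources are passed.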

Finally, the application will recursively instantiate all sections with the retrieved configuration values. Unspecified values will take the trait default value. All values will undergo validation from traitlets.

Important

By default, this whole process is automatic. To use your application, you only have to instantiate it:

class App(Application):
    ...

app = App()
app.my_parameter  # retrieved from config files or CLI

From configuration files#

The application can retrieve parameters from configuration files by invoking Application.load_config_files(). It will load the file (or files) specified in Application.config_files. If multiple files are specified, they are read in order (i.e. if config_files = [first_file, second_file], the parameters of the second file will replace those from the first). The resulting configuration will be stored in the file_config attribute.

Note

The config_files attribute is a trait, which allows selecting configuration files from the command line. To specify it from your script, use:

class App(Application):
    pass

App.config_files.default_value = ...

or if you do not need to change the value using command line arguments:

class App(Application):
    config_files = ...

Different file formats require specific subclasses of FileLoader. A loader is selected by looking at the config file extension. As some loaders have external dependencies, they are only imported when needed, according to the import string in Application.file_loaders.

File extensions    Class                 Library
toml               toml.TomlkitLoader    tomlkit
py, ipy            python.PyLoader
yaml, yml          yaml.YamlLoader       ruamel
json               json.JsonLoader       json
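Selecting a loader from the file extension and importing it only when needed can be sketched with importlib. The mapping below uses standard library callables as stand-ins for the real loader classes, and the names FILE_LOADERS and select_loader are hypothetical; the actual content and format of Application.file_loaders may differ.

```python
import importlib
from pathlib import Path

# Extension -> "module.attribute" import string (stdlib stand-ins).
FILE_LOADERS = {
    "json": "json.loads",
    "ini": "configparser.ConfigParser",
}


def select_loader(filename):
    """Import and return the loader matching the file extension."""
    ext = Path(filename).suffix.lstrip(".")
    try:
        import_string = FILE_LOADERS[ext]
    except KeyError:
        raise ValueError(f"no loader for extension {ext!r}") from None
    module_name, _, attr = import_string.rpartition(".")
    # The module is only imported here, when it is actually needed.
    module = importlib.import_module(module_name)
    return getattr(module, attr)


loader = select_loader("config.json")
```

With this approach, a missing optional dependency only causes an error when a file of the corresponding format is actually loaded.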

Neba supports and recommends TOML configuration files. The format is both easily readable and unambiguous. Despite allowing nested configuration, it can be written without indentation, which makes it possible to add long comments for each parameter. The tomllib builtin module does not support writing, so we use (for both reading and writing) one of the recommended replacements: tomlkit.

The package also supports Python scripts as configuration files, similarly to traitlets. To load a configuration file, the file loader creates a PyConfigContainer object. That object will be bound to the c variable in the script/configuration file. It allows setting nested attributes so that the following syntax is valid:

c.section.subsection.parameter = 5
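An object that accepts this kind of chained assignment can be sketched in a few lines. The Container class below is a hypothetical illustration of the technique (creating child containers on first attribute access); it is not the PyConfigContainer class itself.

```python
class Container:
    """Toy container that creates nested attributes on first access."""

    def __getattr__(self, name):
        # Only called when the attribute does not exist yet.
        if name.startswith("_"):
            raise AttributeError(name)
        child = Container()
        setattr(self, name, child)
        return child

    def flatten(self, prefix=""):
        """Return the stored values as a flat, dot-separated mapping."""
        flat = {}
        for name, value in vars(self).items():
            key = prefix + name
            if isinstance(value, Container):
                flat.update(value.flatten(key + "."))
            else:
                flat[key] = value
        return flat


script = "c.section.subsection.parameter = 5\nc.section.other = 'a'\n"
c = Container()
exec(script, {"c": c})  # runs arbitrary code: only for trusted scripts
values = c.flatten()
```

After executing the script, values is {"section.subsection.parameter": 5, "section.other": "a"}.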

Important

Remember that this script will be executed, so arbitrary code can be run inside, maybe changing some value depending on the OS, the hostname, or more advanced logic.

Of course, running arbitrary code dynamically is a security liability: do not load parameters from a Python script unless you trust it.

The loader does not support the traitlets feature of configuration file inheritance via (in the config file) load_subconfig("some_other_script.py"). This would be doable, but for the moment we recommend instead that you specify multiple configuration files in Application.config_files, remembering that each configuration file replaces the values of the previous one in the list.

YAML is supported via YamlLoader and the third-party module ruamel.yaml.

Despite not being easily readable, the JSON format is also supported via JsonLoader and the builtin module json. The decoder and encoder class can be customized.

From the command line#

Parameters can be set by parsing command line arguments, although this can be skipped by setting either the Application.ignore_cli attribute or the ignore_cli argument in Application.start(). The configuration obtained will be stored in the cli_config attribute and will take priority over parameters from configuration files.

The keys are indicated following one or two hyphens. Any subsequent hyphen is replaced by an underscore, so -computation.n_cores and --computation.n-cores are equivalent. Parameter keys are dot-separated paths leading to a trait. Aliases can be used for brevity.

Note

This behavior can be changed with attributes CLILoader.allow_kebab and CLILoader.prefix.
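The default key handling (one or two leading hyphens, later hyphens read as underscores) can be sketched as below. The function normalize_key is hypothetical and only mirrors the rule stated in the text; it does not reproduce the CLILoader options.

```python
def normalize_key(arg):
    """Turn '--computation.n-cores' into 'computation.n_cores'."""
    for prefix in ("--", "-"):
        if arg.startswith(prefix):
            key = arg[len(prefix):]
            break
    else:
        raise ValueError(f"{arg!r} does not start with a hyphen")
    # Hyphens after the prefix are treated as underscores.
    return key.replace("-", "_")


short_form = normalize_key("-computation.n_cores")
long_form = normalize_key("--computation.n-cores")
```

Both calls return "computation.n_cores".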

Command line arguments need to be parsed. The corresponding trait object will deal with the parsing, using its from_string or from_string_list (for containers) methods.

Note

Nested container parameters (e.g. a list of lists) are not currently supported.

Note

The list of command line arguments is obtained by Application.get_argv(). It tries to detect if python was launched from IPython or Jupyter, in which case it ignores the arguments before the first --.

List arguments#

For every parameter, the argument action is “append”, with type str (since the parsing is left to traitlets), and nargs="*", meaning that any parameter can receive any number of values. To indicate multiple values, for a List trait for instance, the following syntax is to be used:

--physical.years 2015 2016 2017

and not as is the case with vanilla traitlets:

--physical.years 2015 --physical.years 2016 ...

That would raise an error since duplicates are forbidden to avoid possible mistakes in user input.
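The combination of action "append" and nargs="*" can be reproduced with the standard argparse module, which also shows how a repeated key can be detected. This is a sketch under those settings only; the helper get_values and the error type are assumptions, not Neba's actual code.

```python
import argparse

parser = argparse.ArgumentParser()
# Values stay strings: parsing them is left to the trait.
parser.add_argument("--physical.years", action="append", nargs="*", type=str)


def get_values(argv, dest="physical.years"):
    """Return the values of one parameter, rejecting duplicated keys."""
    # With "append", each occurrence of the key adds one list of values.
    occurrences = getattr(parser.parse_args(argv), dest) or []
    if len(occurrences) > 1:
        raise ValueError(f"parameter {dest!r} was specified more than once")
    return occurrences[0] if occurrences else []


years = get_values(["--physical.years", "2015", "2016", "2017"])
```

Here years is ["2015", "2016", "2017"], while passing the key twice raises an error because two separate lists were collected.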

Extra parameters#

Extra parameters to the argument parser can be added with the class method Application.add_extra_parameters(). This will add traits to a section named “extra”, created if needed. This is useful when needing parameters for a single script for instance. If in our script we write:

App.add_extra_parameters(threshold=Float(5.0))

we can then pass a parameter from the command line with --extra.threshold and retrieve it with app.extra.threshold.

Autocompletion#

Command line autocompletion for parameters is available via argcomplete. Install argcomplete and either register the scripts you need or activate global completion. In both cases you will need to add # PYTHON_ARGCOMPLETE_OK to the beginning of your scripts.

Note

Completion is not available when using IPython, as it shadows our application. I do not know if this is fixable.

From a dictionary#

The loader DictLoader can transform any nested mapping into a proper configuration.

Note

The loaders TomlkitLoader, YamlLoader and JsonLoader are based on it.
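Turning a nested mapping into a flat configuration with dot-separated keys can be sketched as follows. The function flatten is a hypothetical helper; unlike a real loader, it has no knowledge of the traits, so it cannot tell a subsection from a dictionary that is itself a parameter value (for a Dict trait, for example).

```python
from collections.abc import Mapping


def flatten(nested, prefix=""):
    """Turn {'a': {'b': 1}} into {'a.b': 1}."""
    flat = {}
    for name, value in nested.items():
        key = f"{prefix}{name}"
        if isinstance(value, Mapping):
            # Assumption: every nested mapping is a subsection.
            flat.update(flatten(value, key + "."))
        else:
            flat[key] = value
    return flat


nested = {
    "result_dir": "/data/results",
    "model": {"style": "serial", "coefficients": [0.5, 1.5, 10.0]},
}
flat = flatten(nested)
```

The resulting keys are "result_dir", "model.style" and "model.coefficients".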