Usage#
This page details how to use the configuration framework of Neba.
Declaring parameters#
To use the configuration framework, you must first define your configuration in Python. Here is an example of how it will look:
from neba.config import Application, Section
from traitlets import Enum, Float, List, Unicode

class App(Application):
    """The application will retrieve and store parameters."""

    result_dir = Unicode("/data/results", help="Directory containing results")

    class model(Section):
        """A nested section."""

        coefficients = List(Float(), [0.5, 1.5, 10.0], help="Some coefficients for computation.")
        style = Enum(["serial", "parallel"], "serial", help="Only some values are accepted.")

app = App()
Traits#
The configuration is specified through Section classes. Each
section contains parameters in the form of class attributes of type
traitlets.TraitType (for instance Float,
Unicode, or List).
Note
Traits can be confusing at first. They are a kind of descriptor. A container class has trait instances bound as class attributes. For instance:
class Container(Section):
    name = Float(default_value=1.)
We can access the trait instance from the container class, but it only contains the information used for its definition; it does not hold any actual value:
>>> Container.name
<traitlets.traitlets.Float>
However, if we access name from a container instance we will obtain a
value, not a trait:
>>> c = Container()
>>> type(c.name)
float
>>> c.name = 2 # we can change it
>>> c.name
2.0
It behaves nearly like a normal float attribute. When we change the value,
for example, the trait (which, again, is a class attribute) is used to
validate the new value, or to do some more advanced things. But the value
remains tied to the container instance c.
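The descriptor behaviour described in this note can be reproduced with plain Python. The following is only a minimal sketch using the standard library: `FloatTrait` and `Holder` are hypothetical names, and the real traitlets implementation is considerably richer.

```python
class FloatTrait:
    """Hypothetical, simplified stand-in for a traitlets Float."""

    def __init__(self, default_value=0.0):
        self.default_value = default_value

    def __set_name__(self, owner, name):
        # Remember the attribute name so the value can be stored on the instance.
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed from the class: return the descriptor itself.
            return self
        # Accessed from an instance: return the value held by that instance.
        return obj.__dict__.get(self.name, self.default_value)

    def __set__(self, obj, value):
        # Validate (and convert) the value before storing it on the instance.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{self.name!r} expects a float, got {value!r}")
        obj.__dict__[self.name] = float(value)


class Holder:
    name = FloatTrait(default_value=1.0)


h = Holder()
h.name = 2  # validated and converted by the descriptor
```

Here `Holder.name` is the `FloatTrait` object, while `h.name` is a plain float stored on the instance `h`.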
Here are some of the basic trait types:
| Trait | Description |
|---|---|
| Unicode, Bytes | For strings (traitlets differentiates unicode and bytes strings). |
| List, Set | Containers can check the element type. |
| Tuple | Tuples are fixed length. To check the types of its elements, you must specify every element. |
| Dict | Dict can specify keys and/or values. |
| Enum | Must be one of the specified values. |
| Union | Multiple types are permitted. Will try to convert values in the order types are specified until it succeeds, so the order in which the types are listed matters. |
| | This is currently unsupported. |
Neba provides two new types of traits. Range is a list of integers or
floats that can be parsed from a slice specification of the form
start:stop[:step]. ‘stop’ is inclusive.
With year = Range(Int()), --year=2002:2004 will be parsed as [2002, 2003, 2004].
With coef = Range(Float()), --coef=0:1:0.5 will be parsed as [0.0, 0.5, 1.0].
To get a descending list, change the order of start and stop:
--year=2008:2002:4 will be parsed as [2008, 2004]. It can still take in
lists of values normally (--year 2002 2005 2006).
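To make the slice semantics concrete, here is a minimal sketch of how a `start:stop[:step]` specification with an inclusive stop could be expanded. `parse_range` is a hypothetical helper written for this page, not Neba's implementation. It assumes that the direction is given by the order of start and stop and that the sign of the step is ignored.

```python
from decimal import Decimal


def parse_range(spec, cast=int):
    """Expand 'start:stop[:step]' into a list, with an inclusive stop."""
    parts = spec.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected start:stop[:step], got {spec!r}")
    # Decimal avoids accumulating floating-point error for steps such as 0.1.
    start, stop = Decimal(parts[0]), Decimal(parts[1])
    step = abs(Decimal(parts[2])) if len(parts) == 3 else Decimal(1)
    if step == 0:
        raise ValueError("step cannot be zero")
    # Assumption: the direction is given by the order of start and stop.
    if start > stop:
        step = -step
    values = []
    current = start
    while (current <= stop) if step > 0 else (current >= stop):
        values.append(cast(current))
        current += step
    return values
```

With this sketch, `parse_range("2002:2004")`, `parse_range("0:1:0.5", float)` and `parse_range("2008:2002:4")` reproduce the three examples given above.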
Fixable is meant to work with filefinder, for parameters defined in filename
patterns. It can take:
a single value,
a string that will be interpreted as a range of values if the trait type allows it (Int or Float),
a string that will be interpreted as a regular expression (this is disabled by default as it can be dangerous: any value from the command line that cannot be parsed would still be allowed),
a list of values.
Subsections#
A section can contain subsections, allowing a tree-like, nested configuration. This can be done in two ways:
Subsections can be defined directly inside another section class definition. The name of the nested class will be used to access the subsection and its traits. The class definition will be renamed and moved under the attribute _{name}SectionDef. For example:
class MyConfig(Section):
    class log(Section):
        level = Unicode("INFO")

    class sst(Section):
        dataset = Enum(["a", "b"])

        class a(Section):
            location = Unicode("/somewhere")
            time_resolution = Int(8, help="in days")

        class b(Section):
            location = Unicode("/somewhere/else")

MyConfig().sst.a.location = "/fantastic"
A mypy plugin is provided to support these dynamic definitions. Add it to the list of plugins in your mypy configuration file, for instance in ‘pyproject.toml’:
[tool.mypy]
plugins = ['neba.config.mypy_plugin']
A more standard way is by using the Subsection class and setting it as an attribute in the parent section:
from neba.config import Subsection

class MySubsection(Section):
    b = Int(2)

class MySection(Section):
    a = Int(1)
    sub = Subsection(MySubsection)

>>> sec = MySection()
>>> sec.a
1
>>> sec.sub.b
2
Note
Like traits, Subsections are also descriptors: accessing from an instance
will give the subsection instance (sec.sub is a MySubsection instance),
and accessing from a class will give a Subsection object which
contains information about the subsection type (MySection.sub.klass
is MySubsection).
Aliases#
It is possible to define aliases with the Section.aliases attribute.
It is a mapping of shortcut names to a deeper subsection:
{"short": "some.deeply.nested.subsection"}
Aliases can be used in configuration files, in the command line, and when
accessing values as a mapping (section["short"]).
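As an illustration of what an alias does to a key, here is a minimal sketch. `expand_alias` is a hypothetical helper, and it assumes the alias is the first component of a dot-separated key; Neba may resolve aliases at other levels of the tree.

```python
def expand_alias(key, aliases):
    """Replace a leading alias in a dot-separated key by its full path."""
    first, _, rest = key.partition(".")
    if first in aliases:
        first = aliases[first]
    return f"{first}.{rest}" if rest else first


aliases = {"short": "some.deeply.nested.subsection"}
full_key = expand_alias("short.trait", aliases)
```

Keys that do not start with an alias are returned unchanged.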
Documentation#
Traits can be documented with the help argument. Sections should be
documented with a class docstring. Both will be re-used:
in the Python console, with Section.emit_help(),
in the command line, with python my_script.py --help or --list-arguments,
in configuration files generated by the application (see Generating configuration files),
in Sphinx, using the Autodoc_trait extension.
Application#
The principal section, at the root of the configuration tree, is the
Application. As a subclass of
Section, it can directly hold all your parameters and nested
subsections. It is also responsible for gathering the parameters from
configuration files and the command line, and more.
Starting the application#
By default, when the application is instantiated it executes its starting
sequence with the start() method. It will:
Parse command line arguments
Read parameters from configuration files
Instantiate all subsections with the obtained parameters
This can be controlled with __init__ arguments start, ignore_cli,
and instantiate.
Note
Even though some features are still available if the subsections are not instantiated (since the subsection classes contain information about the parameters), instantiating them is necessary to fully validate the parameters.
Logging#
The base application contains some parameters to easily log information. A
logger instance is available at Application.log that by default will
log to the console (stderr), and can be configured via the (trait) parameters
log_level, log_format, and
log_datefmt.
The configuration of the logging setup is kept minimal. Users needing to
configure it further may look into Application._get_logging_config().
Note
The logger is named after the full name of the application class (module + class name), so logging inheritance rules apply.
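The consequence of this naming can be sketched with the standard logging module alone. `App` below is a hypothetical stand-in for an application class, and the way the name is built is an assumption based on the description above.

```python
import logging


class App:
    """Hypothetical stand-in for an Application subclass."""


# Logger named "<module>.<class name>", as described above.
fullname = f"{App.__module__}.{App.__qualname__}"
log = logging.getLogger(fullname)

# Configuring the parent logger (named after the module) also affects the
# application logger, because logger names form a dot-separated hierarchy.
parent = logging.getLogger(App.__module__)
parent.setLevel(logging.WARNING)
```

Since the application logger has no level of its own here, it inherits the WARNING level set on the logger of its module.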
Generating configuration files#
The application can generate configuration files automatically with
Application.write_config(), for any supported format. It will write the
values currently held by the application.
For each parameter, it will add some information as comments (the trait
location, type, and default value). You can reduce the amount of comments by
passing comment="no-help" or comment="none".
You can also comment out the parameters whose value is equal to the trait default
by passing comment_default=True.
If a file already exists, you can either completely overwrite it or update it. In the latter case, the application will use the parameter values in the existing file but will replace everything else (comments, new traits, etc).
Neba tries its best to generate valid configuration files, but some traits
cannot be serialized (an Instance, or a None value
when using TOML for example) and will be commented out.
Accessing parameters#
In a section, the value of a parameter can be accessed (or changed) just like any other attribute. Subsections are also accessible which allows for deeply nested access:
app.some.deeply.nested.trait = 2
Tip
It is possible to only show subsections and configurable traits in
autocompletion. Set the class attribute
Section._attr_completion_only_traits to True.
Sections also implement the interface of a
MutableMapping and most of the interface of a
dict. Parameters can be accessed with a single key of dot-separated
attributes. This still benefits from all features of traitlets.
app["some.deeply.nested.trait"] = 2
# is equivalent to
app["some"]["deeply"]["nested"]["trait"] = 2
By default Section.keys(), Section.values() and
Section.items() do not list subsection objects or aliases, but this can
be altered via keyword arguments. They also return a flat output; to obtain a
nested dictionary, pass nest=True.
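The relation between the flat output and the nested one can be sketched as follows. `nest` is a hypothetical helper written for illustration; it is not the function Neba uses internally.

```python
def nest(flat):
    """Turn {"a.b.c": 1} into {"a": {"b": {"c": 1}}}."""
    nested = {}
    for key, value in flat.items():
        *parents, leaf = key.split(".")
        node = nested
        for name in parents:
            # Create intermediate dictionaries as needed.
            node = node.setdefault(name, {})
        node[leaf] = value
    return nested


flat = {"result_dir": "/data/results", "model.style": "serial"}
nested = nest(flat)
```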
Important
The omission of subsections and aliases is done to allow a straightforward
conversion with dict(section). Similarly, len and iter do not
account for subsections and aliases.
However, other methods such as “get”, “set” and “contains” will allow subsection keys and aliases:
>>> "subsection" in section
True
>>> section["subsection"] # No KeyError
Sections have an update() method that modifies the section from a
mapping of several parameters (or from another section instance):
app.update({"computation.n_cores": 10, "physical.threshold": 5.})
Similarly to Section.setdefault(), it can add new traits to the section
with some specific input, see the docstring for details.
Warning
Adding traits to a Section instance (via add_trait(),
update(), or setdefault()) internally creates
a new class and modifies the section instance in place; something along the
lines of:
section.__class__ = type("NewClass", (section.__class__,), {<the new traits>})
References to section classes necessary to operate the nested structure are updated accordingly, but this is a potentially dangerous operation, and setting traits statically is preferable.
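The class replacement mentioned in this warning can be tried in isolation with plain Python objects. This is only a sketch of the mechanism: `Section` below is a hypothetical stand-in, plain attributes replace traits, and none of Neba's bookkeeping for nested sections is reproduced.

```python
class Section:
    """Hypothetical stand-in for a section class."""

    a = 1


def add_attributes(instance, **new_attrs):
    """Give `instance` new class-level attributes without touching its original class."""
    cls = instance.__class__
    # Note the one-element tuple of bases: (cls,) and not (cls).
    instance.__class__ = type(cls.__name__, (cls,), new_attrs)


first = Section()
second = Section()
add_attributes(first, b=2)
```

Only `first` gains the attribute `b`. Other instances of the original class, such as `second`, are unaffected, which is also why two instances that look alike may end up with different classes.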
Note
When changing the value of a trait (with any method), traitlets will validate the new value and trigger callbacks if registered. This ensures the configuration stays valid. Refer to the traitlets documentation for more details on how to use these features.
Select parameters#
We can select only some of the parameters by name by using
select():
>>> app.select("result_dir", "model.style")
{
    "result_dir": "/data/results",
    "model.style": "serial",
}
Each trait can be tagged. This can be used to group traits together. For instance if we tag some traits with:
some_parameter = Bool(True).tag(group_a=True)
we can recover them all across the configuration by using the metadata
argument in many methods such as keys() or
select() (app.select(group_a=True)). Use
@tag_all_traits to tag all traits of a section:
class App(Application):
    @tag_all_traits(group_a=True)
    class subsection(Section):
        a = Int(0)
        b = Int(1).tag(group_a=False)  # will not be tagged as True
If some traits are meant to be used as arguments to a specific function,
Section.values_from_func_signature() will find the parameters that share
the same name as arguments from a function signature.
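The idea of matching parameters to a function signature can be sketched with the standard inspect module. `values_from_signature` and `compute` are hypothetical names, and a plain dictionary stands in for a section; the actual method may behave differently.

```python
import inspect


def values_from_signature(func, params):
    """Keep the entries of `params` whose name matches an argument of `func`."""
    names = inspect.signature(func).parameters
    return {name: value for name, value in params.items() if name in names}


def compute(threshold, n_cores=1):
    return threshold * n_cores


params = {"threshold": 5.0, "n_cores": 10, "result_dir": "/data/results"}
kwargs = values_from_signature(compute, params)
result = compute(**kwargs)
```

The unrelated `result_dir` entry is left out, so the result can be passed directly as keyword arguments.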
Input parameters#
During its start() sequence, the Application class
will first parse command line (CLI) arguments (unless deactivated) and then
read values from specified configuration files.
The configuration values are retrieved by ConfigLoader objects adapted
for each source. Their output is a flat dictionary mapping keys to
ConfigValue objects. Aliases are expanded so that each key is unique.
Note
The ConfigValue class stores more information about the
value: its origin, the original string and parsed value if applicable, and a
priority value used when merging configs. To obtain the value, use
ConfigValue.get_value().
Parameters obtained from configuration files and from CLI are merged. Parameters
are stored in file_config, cli_config
and the merged version in config.
Finally, the application will recursively instantiate all sections with the retrieved configuration values. Unspecified values will take the trait default value. All values will undergo validation from traitlets.
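The merging step described above can be pictured with a small sketch. The `Value` class, the `merge` function and the priority numbers are all hypothetical; they only illustrate how a priority attached to each value lets command line parameters win over file parameters.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Value:
    """Hypothetical, reduced version of a configuration value."""

    value: Any
    origin: str
    priority: int


def merge(*configs):
    """Merge flat configs, keeping for each key the value of highest priority."""
    merged = {}
    for config in configs:
        for key, new in config.items():
            old = merged.get(key)
            # On equal priority, the config given later wins.
            if old is None or new.priority >= old.priority:
                merged[key] = new
    return merged


file_config = {
    "model.style": Value("parallel", "file", priority=1),
    "result_dir": Value("/scratch", "file", priority=1),
}
cli_config = {"model.style": Value("serial", "CLI", priority=2)}
config = merge(cli_config, file_config)
```

Because the decision relies on the priority rather than on the order of the arguments, the command line value of `model.style` is kept even though the file configuration is merged last.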
Important
By default, this whole process is automatic. To use your application, you only have to instantiate it:
class App(Application):
    ...

app = App()
app.my_parameter  # retrieved from config files or CLI
From configuration files#
The application can retrieve parameters from configuration files by invoking
Application.load_config_files(). It will load the file (or files)
specified in Application.config_files. If multiple files are specified,
they are read in order (i.e. if config_files = [first_file, second_file], the
parameters of the second file will replace those from the first). The resulting
configuration will be stored in the file_config attribute.
Note
The config_files attribute is a trait, which makes it possible to
select configuration files from the command line. To specify it from your
script use:
class App(Application):
    pass

App.config_files.default_value = ...
or if you do not need to change the value using command line arguments:
class App(Application):
    config_files = ...
Different file formats require specific subclasses of FileLoader. A
loader is selected by looking at the config file extension. As some loaders have
external dependencies, they are only imported when needed, according to the
import string in Application.file_loaders.
| File extensions | Class | Library |
|---|---|---|
| toml | TomlkitLoader | tomlkit |
| py, ipy | | |
| yaml, yml | YamlLoader | ruamel.yaml |
| json | JsonLoader | json |
Neba supports and recommends TOML configuration
files. The format is both easily readable and unambiguous. Despite allowing nested
configuration, it can be written without indentation, which leaves room for long
comments on each parameter. The tomllib builtin module
does not support writing, so we use (for both reading and writing) one of the
recommended replacements: tomlkit.
The package also supports Python scripts as configuration files, similarly to
traitlets. To load a configuration file, the file loader creates a
PyConfigContainer object. That object will be bound to the c
variable in the script/configuration file. It allows setting nested attributes so
that the following syntax is valid:
c.section.subsection.parameter = 5
Important
Remember that this script will be executed, so arbitrary code can be run inside, for example to change some values depending on the OS, the hostname, or more advanced logic.
Of course, running arbitrary code dynamically is a security liability: do not load parameters from a Python script unless you trust it.
The loader does not support the traitlets feature of configuration file
inheritance via (in the config file) load_subconfig("some_other_script.py").
This would be doable, but for the moment we recommend instead that you specify
multiple configuration files in Application.config_files, remembering
that each configuration file replaces the values of the previous one in the
list.
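To show how an object bound to c can accept assignments such as c.section.subsection.parameter = 5, here is a minimal sketch. `Recorder` is a hypothetical class written for illustration and is not the actual PyConfigContainer; it simply records every assignment under a dot-separated key.

```python
class Recorder:
    """Hypothetical container recording nested attribute assignments."""

    def __init__(self, store=None, prefix=""):
        # Bypass our own __setattr__ for internal bookkeeping.
        object.__setattr__(self, "_store", {} if store is None else store)
        object.__setattr__(self, "_prefix", prefix)

    def __getattr__(self, name):
        # Only called for unknown attributes: descend one level.
        if name.startswith("_"):
            raise AttributeError(name)
        return Recorder(self._store, f"{self._prefix}{name}.")

    def __setattr__(self, name, value):
        # Record the assignment under its full dot-separated key.
        self._store[f"{self._prefix}{name}"] = value


source = """
import sys
c.section.subsection.parameter = 5
c.section.platform = sys.platform
"""
c = Recorder()
# The file content is executed: only do this with trusted files.
exec(source, {"c": c})
flat = c._store
```

The second assignment in `source` illustrates that the script can compute values, here from the platform it runs on.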
YAML is supported via YamlLoader and the
third-party module ruamel.yaml.
Despite not being easily readable, the JSON format is also supported via
JsonLoader and the builtin module json. The
decoder and encoder class can be customized.
From the command line#
Parameters can be set by parsing command line arguments. Parsing can be
skipped by setting either the Application.ignore_cli attribute or the
ignore_cli argument in Application.start(). The configuration
obtained will be stored in the cli_config attribute and
will take priority over parameters from configuration files.
The keys are indicated by a prefix of one or two hyphens. Any subsequent hyphen is
replaced by an underscore, so -computation.n_cores and
--computation.n-cores are equivalent. Parameter keys are dot-separated
paths leading to a trait. Aliases can be used for brevity.
Note
This behavior can be changed with attributes CLILoader.allow_kebab
and CLILoader.prefix.
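The default key rules described above (one or two leading hyphens, later hyphens turned into underscores) can be sketched as follows. `normalize_key` is a hypothetical helper, not part of Neba.

```python
def normalize_key(arg):
    """Turn '-a.b-c' or '--a.b-c' into the key 'a.b_c'."""
    if arg.startswith("--"):
        key = arg[2:]
    elif arg.startswith("-"):
        key = arg[1:]
    else:
        raise ValueError(f"{arg!r} is not a parameter key")
    if not key:
        raise ValueError("empty parameter key")
    # Hyphens after the prefix are replaced by underscores.
    return key.replace("-", "_")


short_form = normalize_key("-computation.n_cores")
long_form = normalize_key("--computation.n-cores")
```

Both forms lead to the same key, `computation.n_cores`.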
Command line arguments need to be parsed. The corresponding trait object
will deal with the parsing, using its from_string or from_string_list
(for containers) methods.
Note
Nested container parameters (e.g. a list of lists) are not currently supported.
Note
The list of command line arguments is obtained by
Application.get_argv(). It tries to detect if python was launched
from IPython or Jupyter, in which case it ignores the arguments before the
first --.
List arguments#
For every parameter, the argument action is
“append”, with type str (since the parsing is left to traitlets), and
nargs="*", meaning that any parameter can receive any number of values. To
indicate multiple values, for a List trait for instance, the following syntax is
to be used:
--physical.years 2015 2016 2017
and not, as is the case with vanilla traitlets:
--physical.years 2015 --physical.years 2016 ...
That would raise an error since duplicates are forbidden to avoid possible mistakes in user input.
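The combination of the “append” action with nargs="*" can be reproduced with the standard argparse module. This is only a sketch of the mechanism under that assumption: the `parse` function and its error message are hypothetical, and Neba's own parser setup may differ.

```python
import argparse

parser = argparse.ArgumentParser()
# Every value stays a string; conversion is left to the trait.
parser.add_argument("--physical.years", action="append", type=str, nargs="*")


def parse(argv):
    """Parse argv and reject keys given more than once."""
    namespace = parser.parse_args(argv)
    params = {}
    for key, occurrences in vars(namespace).items():
        if occurrences is None:
            continue  # argument absent from the command line
        # "append" stores one list of values per occurrence of the key.
        if len(occurrences) > 1:
            raise ValueError(f"parameter {key!r} was specified more than once")
        params[key] = occurrences[0]
    return params


params = parse(["--physical.years", "2015", "2016", "2017"])
```

With the “append” action, each occurrence of a key adds a separate list of values, which is what makes duplicates detectable.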
Extra parameters#
Extra parameters to the argument parser can be added with the class method Application.add_extra_parameters(). This will add traits to a section
named “extra”, created if needed. This is useful when needing parameters for a
single script for instance. If in our script we write:
App.add_extra_parameters(threshold=Float(5.0))
we can then pass a parameter from the command line with --extra.threshold
and retrieve it with app.extra.threshold.
Autocompletion#
Command line autocompletion for parameters is available via argcomplete. Install argcomplete and either
register the scripts you need or activate global completion. In both cases you
will need to add # PYTHON_ARGCOMPLETE_OK to the beginning of your scripts.
Note
Completion is not available when using IPython, as it shadows our application. I do not know if this is fixable.
From a dictionary#
The loader DictLoader can transform any nested mapping into a proper
configuration.
Note
The loaders TomlkitLoader, YamlLoader and
JsonLoader are based on it.
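The core transformation, from a nested mapping to flat dot-separated keys, can be sketched as follows. `flatten` is a hypothetical helper. Unlike a real loader, it cannot tell a subsection from a dictionary-valued parameter, so it treats every mapping as a subsection.

```python
from collections.abc import Mapping


def flatten(nested, prefix=""):
    """Turn {"a": {"b": 1}} into {"a.b": 1}."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{prefix}{key}"
        if isinstance(value, Mapping):
            # Assumption: every mapping is a subsection, never a parameter value.
            flat.update(flatten(value, prefix=f"{full_key}."))
        else:
            flat[full_key] = value
    return flat


nested = {"result_dir": "/data/results", "model": {"style": "serial", "coefficients": [0.5, 1.5]}}
flat = flatten(nested)
```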