Jobman enables you to store experiments, their original hyper-parameters, and some results, in a database. It is useful, for instance, to schedule experiments to be launched later, launch them (for instance, use a cluster to launch them all in parallel), and finally collect some measurements (reconstruction error, classification error, training time, for instance) and add them in the database. Then, we can use queries on that database to do some analyses of the results.
For the moment, what can be inside the database is limited to simple data types, and things that are not too big in memory: integers, floating-point numbers, strings, and dictionaries of the above (if the keys are strings).
See Jobman documentation for installation instructions, and information on how to use it.
Such an experiment function is in Pylearn2 pylearn2/scripts/jobman/experiment.py, and will use a specially-crafted state dictionary, that specifies:
For the moment, only one such experiment function exists: pylearn2.scripts.jobman.train_experiment, that trains a model the same way scripts/train.py would. Other functions may be implemented in the future, with the same structure in state.
Here’s the structure for the state dictionary-like object, that represents a single experiment. For your convenience, you can access to state‘s properties either as items of a dictionary (state["some_attr"]) or as attributes of an object (state.some_attr).
String value
This is a string containing a template for the YAML file that will represent your experiment. This template will typically be shared among different experiments, when these experiments differ only by the value of some hyper-parameters.
Instead of specifying the values of hyper-parameters in the YAML file, you can specify them using the Python substitution syntax. For instance, if one of your hyper-parameters is a learning rate, you can specify the following in your template:
"learning_rate": %(learning_rate)d
If you want to specify a whole object as a hyper-parameter (for instance, if you want to try different classes), you can take advantage from the fact that Python’s serialization for dictionaries is close to YAML. For instance, you can specify:
"termination_criterion": %(term_crit_builder)s %(term_crit_args)s
The identifier inside the parentheses is the name you give to your hyper-parameter, the letter afterwards identifies its type (here, “s” for string, “d” for decimal number, every type can usually be converted automatically into a string).
These “place-holders” in the template will be replaced by actual values, that are provided in state.hyper_parameters.
Dictionary, all keys must be strings.
This dictionary contains the mapping from hyper-parameter names, as used in the yaml_template string, to the actual value you want to give them in that particular experiment.
For instance, for a YAML template like the one above, a possible value of state.hyper_parameters could be:
{
"learning_rate": 1e-4,
"term_crit_builder": "!obj:pylearn2.training_algorithms.sgd.EpochCounter",
"term_crit_args": {
"max_epochs": 2
}
}
The resulting YAML file that will be used in your experiment will look like:
{
[...]
"learning_rate": 1e-4,
[...]
"termination_criterion": !obj:pylearn2.training_algorithms.sgd.EpochCounter {
max_epochs: 2,
},
}
You can achieve the same effect by specifying only one place-holder for a complete YAML object. If your template is:
{ "termination_criterion": %(term_crit)s }
and you have, in state.term_crit:
{
'__builder__': 'pylearn2.training_algorithms.sgd.EpochCounter',
'max_epochs': 2
}
The special key '__builder__' will be used as the constructor function of the object to build, which will result in the same final YAML string:
{ "termination_criterion": !obj:pylearn2.training_algorithms.sgd.EpochCounter {
max_epochs: 2,
} }
When that YAML string gets generated, it will be parsed to build a train_object, that will be trained (train_object.main_loop() is called), then returned. That object will depend on what is described in the YAML file. Usually, you will want to extract some information from that object, and save that in the database, to be used as some metric. This is done by state.extract_results.
String, containing the name of a function to call on “train_object”.
The function named in state.extract_results takes one argument, and returns a dictionary. Like state.hyper_parameters, this dictionary’s keys are strings, and values can be ints, floats, strings or dictionaries (with integer keys, and they can be nested). The returned dictionary will be used as state.results. This function has to be in some place where it can be importable.
For instance, say we want to record the best epoch according to some validation error, and the training and validation errors (classification and NLL) at that training epoch. Those will kept in train_obj.model.monitor.channels, so our function may look like:
def result_extractor(train_obj):
import numpy
channels = train_obj.model.monitor.channels
best_index = numpy.argmin(channels["valid_nll"].val_record)
return dict(
best_epoch=best_index,
train_nll=channels["train_nll"].val_record[best_index],
train_classerr=channels["classerr"].val_record[best_index],
valid_nll=channels["valid_nll"].val_record[best_index],
valid_classerr=channels["valid_classerr"].val_record[best_index])
Note
The channels and their name will depend on the content of your YAML template string.
If that function is placed in pylearn2/scripts/examples/extractor.py, we will specify:
state.extract_results = "pylearn2.scripts.examples.extractor.result_extractor"
For a working example, you can have a look at pylearn2/scripts/jobman/tester.py
Dictionary, all keys are strings.
This dictionary will be empty initially, but will be filled after the experiment finishes, by the call to state.extract_results.
For instance, the result specification above may give the following result:
{
'best_epoch': 47,
'train_nll': 0.02,
'train_classerr': 0.0,
'valid_nll': 0.05,
'valid_classerr': 0.03,
}
Having those informations in the database allows you to efficiently search for the model that has the lowest classification error, for instance.
You can see a complete example of a state dictionary in pylearn2/scripts/jobman/tester.py.
Here is an outline of what is really happening inside the train_experiment function.