
Tuning Tips

This section explains useful features that enhance automatic tuning with ZenithTune.

Switch to Maximize

You can switch from a minimization problem to a maximization problem by setting the maximize argument to True when initializing the Tuner. This is useful, for example, when maximizing training throughput.

tuner = GeneralTuner(maximize=True)
tuner.optimize(objective, n_trials=10)
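For instance, when tuning for throughput, the objective can simply return the measured value; with maximize=True, larger returns are treated as better. A minimal sketch (run_training_and_measure is a hypothetical helper, not part of ZenithTune):

def objective(trial, **kwargs):
    batch_size = trial.suggest_categorical("batch_size", [8, 16, 32])
    # Hypothetical helper that runs a short training job and returns
    # samples/sec; replace with your own measurement logic.
    throughput = run_training_and_measure(batch_size)
    return throughput  # maximized because maximize=True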

Exclude Trial from Tuning

In all Tuners, trials where an error or exception occurs inside the objective function are automatically excluded. For example, you don't need to handle cases where the program terminates abnormally due to an out-of-memory error during command execution. If you instead want to exclude a trial intentionally, such as when the cost value cannot be computed or a constraint is not met, return None from the objective function.

def objective(trial, **kwargs):
    x = trial.suggest_int("x", low=2, high=10)
    y = trial.suggest_int("y", low=1, high=10)

    if x % y != 0:
        return None
    return x**2 + y**2

Resume/Analyze from Existing Tuning Results

By specifying the path to existing tuning data (the SQLite database output by Optuna) in the Tuner's db_path, you can resume tuning from the last trial.

tuner = GeneralTuner(db_path="path/to/study.db")
tuner.optimize(objective, n_trials=10)

You can also analyze existing tuning results without running new trials by specifying db_path in the same way.

tuner = GeneralTuner(db_path="path/to/study.db")
tuner.analyze()

Set Initial Values for Tuning Parameters

You can set initial values by passing the hyperparameter values defined in the objective function, in dict format, to the default_params argument of the Tuner's optimize function. Providing good initial values reduces the number of trials spent on bad parameters and increases the likelihood of efficient tuning. Setting a known baseline as the initial values also makes it easier to analyze how much tuning improved on it.

def objective(trial, **kwargs):
    x = trial.suggest_int("x", low=-10, high=10)
    y = trial.suggest_int("y", low=-10, high=10)
    return x**2 + y**2

tuner = GeneralTuner()
tuner.optimize(
    objective,
    n_trials=10,
    default_params={
        "x": 1,
        "y": 2,
    },
)

Receive Metadata from Objective Function

The following values are provided as metadata in the keyword arguments of the objective function; refer to them as needed. A short usage sketch follows the list.

  • trial_id: Trial index (0-indexed)
  • dist_info: Dictionary data representing distributed execution information. Contains the following key-values:
    • rank: Process rank
    • local_rank: Process local rank
    • world_size: Number of processes
    • launcher: Distributed execution method. One of "mpi", "torch", or None
  • study_dir: Directory where tuning logs and results are output. Can also be used when you want to generate working files.
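For example, an objective function can accept these keyword arguments directly. A minimal sketch based on the keys listed above (the body is illustrative only):

def objective(trial, trial_id, dist_info, study_dir, **kwargs):
    # In distributed runs, log from rank 0 only
    if dist_info["rank"] == 0:
        print(f"Trial {trial_id} writes results under {study_dir}")
    x = trial.suggest_int("x", low=-10, high=10)
    return x**2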

Tune Environment Variables

You can tune environment variables by assigning hyperparameters defined in the objective function to environment variables via os.environ.

import os

def command_generator(trial, **kwargs):
    num_workers = trial.suggest_int("num_workers", low=1, high=10)
    omp_num_threads = trial.suggest_categorical("omp_num_threads", [1, 2, 4, 8, 16, 32])
    # Set the environment variable before the command is executed
    os.environ["OMP_NUM_THREADS"] = str(omp_num_threads)

    command = f"python train.py --num-workers {num_workers}"
    return command

When using CommandOutputTuner, the environment variables of the current context are listed at the beginning of the execution command's log file, so you can confirm that they were set properly.

Tune Scripts and Configuration Files

You can also tune values written in scripts and configuration files using the replace_params_to_file function provided by ZenithTune.

As preparation, create a copy of the file you want to tune, and turn each value you want to change into a placeholder by wrapping the corresponding parameter name in {{ }}.

Original file:

train:
  batch_size: 4
  learning_rate: 1e-5

Copy with a placeholder (saved here as edited.yaml, used below):

train:
  batch_size: {{batch_size}}
  learning_rate: 1e-5

Then, define the corresponding tuning parameters in the objective function and use the replace_params_to_file function to fill in the placeholder values. Its arguments are the input file path containing placeholders, the output file path, and a dict of the parameters to set and their values.

def replace_params_to_file(
    input_filepath: str,
    output_filepath: str,
    params: Dict[str, Any],
) -> None:

Putting it all together, it looks like this. By receiving metadata from the objective function, a separate yaml file is written out for each trial.

import os

from zenith_tune.utils import replace_params_to_file

def command_generator(trial, trial_id, study_dir, **kwargs):
    num_workers = trial.suggest_int("num_workers", low=1, high=10)
    batch_size = trial.suggest_int("batch_size", low=2, high=8)
    # Write a per-trial yaml with the placeholder filled in
    tuning_yaml = os.path.join(study_dir, f"trial_{trial_id}.yaml")
    replace_params_to_file(
        "edited.yaml",
        tuning_yaml,
        {"batch_size": batch_size},
    )

    command = f"python train.py --num-workers {num_workers} --config {tuning_yaml}"
    return command

This way, the files written to output_filepath remain available after tuning and can be used directly.