# CommandBuilder

`CommandBuilder` is a powerful utility class for dynamically manipulating command-line options in ZenithTune. It parses an existing command and allows options to be easily added, updated, and removed, making it particularly useful when dealing with complex command lines during hyperparameter tuning.
## Overview

CommandBuilder can be utilized in the following scenarios:
- Dynamically modify command-line options of existing training scripts
- Handle multiple option formats in a unified manner
- Dynamically construct commands in Kubernetes PyTorchJob and similar environments
## Supported Option Formats

CommandBuilder automatically recognizes and properly handles four main option formats:
### 1. EQUALS Format

```
--key=value
--nproc_per_node=4
--config=/path/to/config.yaml
```

### 2. SPACE Format

```
--key value
--batch-size 32
--learning-rate 0.001
```

### 3. KEYVALUE Format

```
--env KEY1=value1 KEY2=value2
--define batch_size=32 learning_rate=0.001
```

### 4. FLAG Format

```
--verbose
--use-amp
--debug
```
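Regardless of which of these formats an option uses, the same `update` and `remove` calls apply. The sketch below illustrates this with an EQUALS-format option and a FLAG-format option; the exact output string, in particular whether the original EQUALS formatting is preserved, is an assumption rather than documented behavior.

```python
from zenith_tune.command import CommandBuilder

# Hypothetical base command mixing EQUALS, SPACE, and FLAG options
builder = CommandBuilder("torchrun --nproc_per_node=4 train.py --batch-size 32 --verbose")

# Update the EQUALS-format option and remove the FLAG-format option
builder.update("--nproc_per_node=8")
builder.remove("--verbose")

print(builder.get_command())
# Assumed output: "torchrun --nproc_per_node=8 train.py --batch-size 32"
```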
## Basic Usage

### Command Initialization and Operations
```python
from zenith_tune.command import CommandBuilder

# Initialize from an existing command
builder = CommandBuilder("python train.py --epochs 10 --batch-size 32")

# Append an option (adds it even if the option already exists)
builder.append("--learning-rate 0.001")

# Update an option (replaces the existing value)
builder.update("--epochs 20")  # Update from 10 to 20

# Remove an option
builder.remove("--batch-size")

# Get the final command
command = builder.get_command()
print(command)  # "python train.py --epochs 20 --learning-rate 0.001"
```
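The difference between `append` and `update` matters when the option is already present: `append` adds another occurrence, while `update` replaces the existing value. A minimal sketch of this difference follows; the outputs shown are assumptions based on the descriptions above, not verified behavior.

```python
from zenith_tune.command import CommandBuilder

# append keeps the existing option and adds another occurrence
builder = CommandBuilder("python train.py --epochs 10")
builder.append("--epochs 20")
print(builder.get_command())  # assumed: "python train.py --epochs 10 --epochs 20"

# update replaces the value of the existing option instead
builder = CommandBuilder("python train.py --epochs 10")
builder.update("--epochs 20")
print(builder.get_command())  # assumed: "python train.py --epochs 20"
```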
### Method Chaining

All operation methods return `self`, enabling method chaining:
```python
command = (CommandBuilder("python train.py")
           .append("--epochs 10")
           .append("--batch-size 32")
           .update("--learning-rate 0.001")
           .remove("--debug")
           .get_command())
```
## Practical Usage Examples

### Hyperparameter Tuning with ZenithTune
```python
from optuna.trial import Trial
from zenith_tune.command import CommandBuilder


def command_generator(trial: Trial, **kwargs):
    # Start from an existing command as the base
    base_command = "torchrun --nproc_per_node=8 --master_port=29500 train.py --epochs 100"
    builder = CommandBuilder(base_command)

    # Sample hyperparameters
    batch_size = trial.suggest_int("batch_size", 16, 128)
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True)
    dropout = trial.suggest_float("dropout", 0.1, 0.5)

    # Update existing options
    builder.update(f"--batch-size {batch_size}")
    builder.update(f"--lr {lr}")

    # Add new options
    builder.append(f"--dropout {dropout}")

    # Conditionally add flags
    if trial.suggest_categorical("use_amp", [True, False]):
        builder.append("--use-amp")

    return builder.get_command()
```
### Usage with Kubernetes PyTorchJob
```python
from optuna.trial import Trial
from zenith_tune.command import CommandBuilder
from zenith_tune.job_tuning.kubernetes import PyTorchJob


def job_converter(trial: Trial, job: PyTorchJob) -> PyTorchJob:
    # Get the current command (e.g., ["sh", "-c", "actual command"])
    current_command = job.get_command()
    actual_command = current_command[2]

    # Manipulate it with CommandBuilder
    builder = CommandBuilder(actual_command)

    # Update parameters
    num_workers = trial.suggest_int("num_workers", 0, 8)
    builder.update(f"--num-workers {num_workers}")

    # Add GPU memory settings
    if trial.suggest_categorical("gradient_checkpointing", [True, False]):
        builder.append("--gradient-checkpointing")

    # Set the rebuilt command on the job
    new_command = current_command.copy()
    new_command[2] = builder.get_command()
    job.set_command(new_command)

    return job
```