
CommandBuilder

CommandBuilder is a powerful utility class for dynamically manipulating command-line options in ZenithTune. It parses existing commands and allows easy addition, update, and removal of options, making it particularly useful when dealing with complex command lines in hyperparameter tuning.

Overview

CommandBuilder can be utilized in the following scenarios:

  • Dynamically modify command-line options of existing training scripts
  • Handle multiple option formats in a unified manner
  • Dynamically construct commands in Kubernetes PyTorchJob and similar environments

Supported Option Formats

CommandBuilder automatically recognizes and properly handles four main option formats:

1. EQUALS Format

--key=value
--nproc_per_node=4
--config=/path/to/config.yaml

2. SPACE Format

--key value
--batch-size 32
--learning-rate 0.001

3. KEYVALUE Format

--env KEY1=value1 KEY2=value2
--define batch_size=32 learning_rate=0.001

4. FLAG Format

--verbose
--use-amp
--debug
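
Regardless of which format an option uses, the same builder calls apply. Below is a minimal sketch using only the methods introduced in the next section; the output shown assumes CommandBuilder preserves each option's original format when rewriting it:

from zenith_tune.command import CommandBuilder

# A command mixing EQUALS, SPACE, and FLAG formats
builder = CommandBuilder("torchrun --nproc_per_node=4 train.py --batch-size 32 --verbose")

# The same calls work regardless of the option format
builder.update("--nproc_per_node=8")  # EQUALS format
builder.update("--batch-size 64")     # SPACE format
builder.remove("--verbose")           # FLAG format

print(builder.get_command())
# Expected (assuming each option keeps its original format):
# "torchrun --nproc_per_node=8 train.py --batch-size 64"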

Basic Usage

Command Initialization and Operations

from zenith_tune.command import CommandBuilder

# Initialize from an existing command
builder = CommandBuilder("python train.py --epochs 10 --batch-size 32")

# Append option (adds even if the option already exists)
builder.append("--learning-rate 0.001")

# Update option (replaces existing option)
builder.update("--epochs 20") # Update from 10 to 20

# Remove option
builder.remove("--batch-size")

# Get the final command
command = builder.get_command()
print(command) # "python train.py --epochs 20 --learning-rate 0.001"

Method Chaining

All operation methods return self, enabling method chaining:

command = (
    CommandBuilder("python train.py")
    .append("--epochs 10")
    .append("--batch-size 32")
    .update("--learning-rate 0.001")
    .remove("--debug")
    .get_command()
)

Practical Usage Examples

Hyperparameter Tuning with ZenithTune

from optuna.trial import Trial
from zenith_tune.command import CommandBuilder

def command_generator(trial: Trial, **kwargs):
    # Start with an existing command as the base
    base_command = "torchrun --nproc_per_node=8 --master_port=29500 train.py --epochs 100"
    builder = CommandBuilder(base_command)

    # Sample hyperparameters
    batch_size = trial.suggest_int("batch_size", 16, 128)
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True)
    dropout = trial.suggest_float("dropout", 0.1, 0.5)

    # Update existing options
    builder.update(f"--batch-size {batch_size}")
    builder.update(f"--lr {lr}")

    # Add new options
    builder.append(f"--dropout {dropout}")

    # Conditionally add flags
    if trial.suggest_categorical("use_amp", [True, False]):
        builder.append("--use-amp")

    return builder.get_command()
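
Independent of how ZenithTune consumes this generator, it can be exercised for a quick standalone sanity check with plain Optuna and subprocess. The harness below is a hypothetical illustration, not part of the ZenithTune API, and assumes the training script prints its final validation loss as the last line of stdout:

import shlex
import subprocess

import optuna

def objective(trial):
    command = command_generator(trial)
    # Run the generated command and parse the metric from stdout
    # (assumes the last stdout line is the validation loss).
    result = subprocess.run(shlex.split(command), capture_output=True, text=True)
    return float(result.stdout.strip().splitlines()[-1])

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=20)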

Usage with Kubernetes PyTorchJob

from optuna.trial import Trial

from zenith_tune.command import CommandBuilder
from zenith_tune.job_tuning.kubernetes import PyTorchJob

def job_converter(trial: Trial, job: PyTorchJob) -> PyTorchJob:
    # Get the current command (e.g., ["sh", "-c", "actual command"])
    current_command = job.get_command()
    actual_command = current_command[2]

    # Manipulate it with CommandBuilder
    builder = CommandBuilder(actual_command)

    # Update parameters
    num_workers = trial.suggest_int("num_workers", 0, 8)
    builder.update(f"--num-workers {num_workers}")

    # Conditionally enable gradient checkpointing to reduce GPU memory usage
    if trial.suggest_categorical("gradient_checkpointing", [True, False]):
        builder.append("--gradient-checkpointing")

    # Set the rebuilt command on the job
    new_command = current_command.copy()
    new_command[2] = builder.get_command()
    job.set_command(new_command)

    return job
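
Note that this converter assumes the job's entrypoint wraps the actual training command as ["sh", "-c", "<command>"], which is why the string at index 2 is extracted and replaced; if your job defines its command differently, adjust that index accordingly.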