Version: v2510

observability.nccl_pattern_analyzer.run

update_comm_unique_patterns

def update_comm_unique_patterns(routine: str, curr_status: dict, content: str)

Updates the tracked patterns when a specific pattern completes. Decides whether the finished pattern is unique and saves it if so.

routine: name of the NCCL collective routine currently being tracked.
curr_status: dictionary that keeps track of the NCCL routines currently being tracked.
line (named content in the signature): the log line from the tracer, split into a list of 2 elements [0: process info, 1: comm info].
returns: dict maintaining the comm patterns currently being tracked.

track_comm_patterns

def track_comm_patterns(routine: str, curr_status: dict, content: str)

If a new comm pattern is discovered in the log, starts tracking it in the curr_status dict. If a comm pattern ends, removes it from the tracked status and updates the unique patterns.
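
The start/track/finish flow described above can be illustrated with a minimal sketch. This is not the module's implementation: the START and END markers, the module-level unique_patterns set, and the function names track_sketch and update_unique_sketch are all hypothetical, chosen only to show how the two functions might cooperate.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical markers and storage, chosen only for this illustration.
# The real tracer log format and pattern store are not shown in the source.
START, END = "START", "END"
unique_patterns: Set[Tuple[str, ...]] = set()


def update_unique_sketch(
    routine: str, curr_status: Dict[str, List[str]], content: str
) -> Dict[str, List[str]]:
    """Stop tracking `routine` and keep its pattern if it is new."""
    events = curr_status.pop(routine, [])
    # Adding to a set is a no-op when the same pattern was saved before.
    unique_patterns.add((routine, *events))
    return curr_status


def track_sketch(
    routine: str, curr_status: Dict[str, List[str]], content: str
) -> Dict[str, List[str]]:
    """Start, extend, or finish tracking a pattern for `routine`."""
    if content == END and routine in curr_status:
        return update_unique_sketch(routine, curr_status, content)
    if content == START:
        curr_status[routine] = []  # a new pattern begins
    elif routine in curr_status:
        curr_status[routine].append(content)  # extend the tracked pattern
    return curr_status


# Replay the same (made-up) pattern twice; only one copy should be kept.
status: Dict[str, List[str]] = {}
for _ in range(2):
    for event in (START, "send 0->1", "recv 1->0", END):
        status = track_sketch("AllReduce", status, event)
```

After the loop, status is empty because every tracked pattern has ended, and unique_patterns holds a single entry even though the pattern was observed twice.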