-----------  Configuration Arguments -----------
gpus: 0,1
heter_worker_num: None
heter_workers:
http_port: None
ips: 127.0.0.1
log_dir: log
...
------------------------------------------------
...
+=======================================================================================+
| Distributed Envs                          Value                                       |
+---------------------------------------------------------------------------------------+
| PADDLE_TRAINER_ID                         0                                           |
| PADDLE_CURRENT_ENDPOINT                   127.0.0.1:12464                             |
| PADDLE_TRAINERS_NUM                       2                                           |
| PADDLE_TRAINER_ENDPOINTS                  127.0.0.1:12464,127.0.0.1:43227             |
| FLAGS_selected_gpus                       0                                           |
+=======================================================================================+
...
+==============================================================================+
|                                                                              |
|                         DistributedStrategy Overview                         |
|                                                                              |
+==============================================================================+
|                          lars=True <-> lars_configs                          |
+------------------------------------------------------------------------------+
| lars_coeff                              0.0010000000474974513                |
| lars_weight_decay                       0.0005000000237487257                |
| epsilon                                 0.0                                  |
| exclude_from_weight_decay               batch_norm                           |
|                                         .b_0                                 |
+==============================================================================+
...
W0114 18:07:51.588716 16234 device_context.cc:346] Please NOTE: device: 4, GPU Compute Capability: 7.0, Driver API Version: 11.0, Runtime API Version: 10.0
W0114 18:07:51.593963 16234 device_context.cc:356] device: 4, cuDNN Version: 7.6.
[Epoch 0, batch 0] loss: 0.14651, acc1: 0.00000, acc5: 0.00000
[Epoch 0, batch 5] loss: 1.82926, acc1: 0.00000, acc5: 0.00000
[Epoch 0, batch 10] loss: 0.00000, acc1: 0.00000, acc5: 0.00000
[Epoch 0, batch 15] loss: 0.13787, acc1: 0.03125, acc5: 0.03125
[Epoch 0, batch 20] loss: 0.12400, acc1: 0.03125, acc5: 0.06250
[Epoch 0, batch 25] loss: 0.17749, acc1: 0.00000, acc5: 0.00000
...
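
To track loss and accuracy from a log like this, the per-batch lines can be pulled out with a regular expression. The following is a minimal sketch using only the Python standard library. The names `Metric` and `parse_metrics` are hypothetical helpers, not part of Paddle, and the pattern assumes the `[Epoch N, batch M] loss: ..., acc1: ..., acc5: ...` format exactly as printed above; a training script that prints a different format would need a different pattern.

```python
import re
from typing import Iterator, NamedTuple


class Metric(NamedTuple):
    """One per-batch record from the training log (hypothetical helper type)."""
    epoch: int
    batch: int
    loss: float
    acc1: float
    acc5: float


# Assumed line format, taken from the log output shown above.
_METRIC_RE = re.compile(
    r"\[Epoch (\d+), batch (\d+)\] "
    r"loss: ([\d.]+), acc1: ([\d.]+), acc5: ([\d.]+)"
)


def parse_metrics(log_text: str) -> Iterator[Metric]:
    """Yield a Metric for every per-batch line found in log_text.

    Lines that do not match (banners, warnings, "...") are skipped.
    """
    for match in _METRIC_RE.finditer(log_text):
        epoch, batch, loss, acc1, acc5 = match.groups()
        yield Metric(int(epoch), int(batch), float(loss), float(acc1), float(acc5))


# Two lines copied from the log above, plus a warning line that is ignored.
sample = """\
W0114 18:07:51.593963 16234 device_context.cc:356] device: 4, cuDNN Version: 7.6.
[Epoch 0, batch 15] loss: 0.13787, acc1: 0.03125, acc5: 0.03125
[Epoch 0, batch 20] loss: 0.12400, acc1: 0.03125, acc5: 0.06250
"""
rows = list(parse_metrics(sample))
```

Because `finditer` scans the whole text, the same function also works when the log has been flattened onto a single line, as long as the bracketed prefix and field names are intact.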