LocFedMix-SL: Localize, Federate, and Mix for Improved Scalability, Convergence, and Latency in SL

RAMO
March 16, 2022

Authors: S. Oh, J. Park, P. Vepakomma, S. Baek, R. Raskar, M. Bennis and S. -L. Kim

The goal of this article is to improve upon conventional parallel Split Learning (SL) [1] and the state-of-the-art LocSplitFed [2], yielding a scalable parallel SL algorithm with fast convergence and low latency. As a first step, we identify that the fundamental bottleneck of existing parallel SL comes from the model-split and parallel computing architectures, under which the server-client model updates are often imbalanced and the client models are prone to detach from the server's model. To fix this problem, we carefully integrate local parallelism, federated learning, and mixup augmentation techniques and propose a novel parallel SL framework, coined LocFedMix-SL.

Conventional works such as SplitFed and LocSplitFed suffer from limited scalability, and the fundamental reasons are as follows.

Server-Client update imbalance problem
Client model detachment problem

To overcome these drawbacks, this work draws on key component techniques used in existing SL algorithms, together with other techniques from DNN algorithms, to target the aforementioned update imbalance problem with little or no additional cost. The three components are described below.

Smashed Data Augmentation

The server combines the smashed data (the cut-layer activations) uploaded by two clients into mixed-up smashed data, following the well-known manifold mixup algorithm [3].
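As a rough illustration of the mixing step, the sketch below convexly combines the smashed data and one-hot labels of two clients, which is the interpolation that manifold mixup applies to hidden states. The function names `mix_smashed` and `sample_lambda`, the flat-list representation, and the default `alpha` are assumptions made for this example, not details taken from the paper.

```python
import random


def sample_lambda(alpha=2.0, rng=random):
    """Draw a mixing ratio from Beta(alpha, alpha), as in manifold mixup.

    The default alpha is an arbitrary choice for this sketch.
    """
    return rng.betavariate(alpha, alpha)


def mix_smashed(h_a, y_a, h_b, y_b, lam):
    """Mix two clients' smashed data and one-hot labels with ratio lam.

    h_a, h_b: flat lists of cut-layer activations (same length).
    y_a, y_b: one-hot (or soft) label vectors (same length).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(h_a) != len(h_b) or len(y_a) != len(y_b):
        raise ValueError("both clients must upload tensors of the same shape")
    h_mix = [lam * a + (1.0 - lam) * b for a, b in zip(h_a, h_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return h_mix, y_mix


if __name__ == "__main__":
    lam = sample_lambda()
    h, y = mix_smashed([1.0, 2.0], [1.0, 0.0], [3.0, 6.0], [0.0, 1.0], lam)
    print(lam, h, y)
```

The server would then feed the mixed activations through the upper model and train against the mixed labels, so a single forward pass carries information from both clients.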

Local loss with Mutual Information regularization

Taking the InfoPro loss as a reference, this work devises a local regularizer that maximizes the information about the input data that can be obtained from the given smashed data, that is, the mutual information between the smashed data and the input data. With this goal, an auxiliary network is used to minimize a local loss, the residual randomness, i.e., the uncertainty about the input that remains once the smashed data is known.
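One common surrogate for this kind of objective is a reconstruction loss: if an auxiliary decoder can rebuild the input from the smashed data with small error, the smashed data retains much of the input's information. The sketch below uses a hypothetical linear decoder and mean squared error purely for illustration. The names `decode` and `reconstruction_loss`, the linear form, and the choice of MSE are assumptions, and the paper's actual auxiliary network and loss may differ.

```python
def decode(h, W, b):
    """Hypothetical linear auxiliary network: x_hat = W h + b."""
    return [
        sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i
        for row, b_i in zip(W, b)
    ]


def reconstruction_loss(x, h, W, b):
    """Mean squared error between the input x and its reconstruction from h.

    Used here as a stand-in for the local loss: a lower value suggests
    that the smashed data h keeps more information about x.
    """
    x_hat = decode(h, W, b)
    if len(x_hat) != len(x):
        raise ValueError("decoder output must match the input dimension")
    return sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat)) / len(x)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    bias = [0.0, 0.0]
    x = [1.0, 2.0]
    print(reconstruction_loss(x, [1.0, 2.0], identity, bias))  # informative h
    print(reconstruction_loss(x, [1.0, 0.0], identity, bias))  # lossy h
```

In a full training loop this local loss would be backpropagated through the auxiliary network and the client's lower model only, so the client can update without waiting for gradients from the server.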

Local model Federated averaging

After the lower (client-side) model and the upper (server-side) model are updated through forward propagation (FP) and backward propagation (BP), an additional aggregation phase is introduced. It compensates for the fact that the lower model is updated with only local gradients, whereas the upper model is updated with global gradients.
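The aggregation phase can be pictured as standard federated averaging over the clients' lower-model weights. The following is a minimal sketch, assuming each lower model is a dictionary from layer name to a flat list of weights. The function name `federated_average` and the optional per-client sample counts are illustrative assumptions, and whether the paper weights clients uniformly or by data size is not stated here.

```python
def federated_average(client_weights, num_samples=None):
    """Average per-client lower-model weights, layer by layer.

    client_weights: list of dicts mapping layer name -> flat list of floats.
    num_samples: optional per-client sample counts used as averaging
        weights; uniform averaging is used when omitted (an assumption).
    """
    if not client_weights:
        raise ValueError("need at least one client model")
    if num_samples is None:
        num_samples = [1] * len(client_weights)
    if len(num_samples) != len(client_weights):
        raise ValueError("one sample count per client is required")
    total = float(sum(num_samples))

    averaged = {}
    for name, reference in client_weights[0].items():
        averaged[name] = [
            sum(w[name][i] * s for w, s in zip(client_weights, num_samples)) / total
            for i in range(len(reference))
        ]
    return averaged


if __name__ == "__main__":
    clients = [{"conv1": [1.0, 2.0]}, {"conv1": [3.0, 6.0]}]
    global_lower = federated_average(clients)
    # Every client would then overwrite its lower model with global_lower.
    print(global_lower)
```

Broadcasting the averaged weights back to all clients keeps their lower models from drifting apart, which is the imbalance this phase is meant to offset.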

We found that the limited scalability of parallel SL stems fundamentally from its architecture, which is inherently prone to server-client update imbalance and to client model detachment from the server model. To fix this issue, we carefully integrated local parallelism, federated learning, and mixup data augmentation techniques into parallel SL, so as to keep the FP flows and BP updates balanced. Consequently, we proposed a novel parallel SL framework, named LocFedMix-SL, and validated by simulation that it achieves high scalability, fast convergence, and low latency.

Full Paper: S. Oh, J. Park, P. Vepakomma, S. Baek, R. Raskar, M. Bennis and S. -L. Kim, "LocFedMix-SL: Localize, Federate, and Mix for Improved Scalability, Convergence, and Latency in Split Learning," accepted to TheWebConf 2022 (WWW conference), 2022.

References

[1] Chandra Thapa, Mahawaga Arachchige Pathum Chamikara, and Seyit Camtepe. 2020. Splitfed: When federated learning meets split learning. arXiv preprint arXiv:2004.12088 (2020).

[2] Dong-Jun Han, Hasnain Irshad Bhatti, Jungmoon Lee, and Jaekyun Moon. [n. d.]. Accelerating Federated Learning with Split Learning on Locally Generated Losses. ([n. d.]).

[3] Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Ioannis Mitliagkas, David Lopez-Paz, and Yoshua Bengio. 2019. Manifold mixup: Better representations by interpolating hidden states. In International Conference on Machine Learning. PMLR, 6438–6447.
