needs to be more thoroughly investigated. For instance, it has been shown that it is possible for an adversary to introduce hidden functionality into the jointly trained model [5] or to disturb the training process [16]. Detecting these adversarial behaviors becomes much more difficult under privacy constraints. Future methods for data-local training will have to jointly address the issues of efficiency, privacy and robustness.
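One simple server-side countermeasure is to bound the norm of each incoming update before averaging, which limits how far any single (possibly malicious) client can pull the global model in one round. The sketch below illustrates this; the function name and clipping threshold are illustrative choices of ours, not a method prescribed by [5] or [16].

import numpy as np

def clipped_mean_aggregate(updates, clip_norm=0.1):
    """Average client updates after bounding each update's L2 norm.

    Clipping does not detect a backdoor, but it caps the influence
    of any single over-scaled update on the global model.
    """
    clipped = []
    for u in updates:  # each u: flat numpy array of parameter deltas
        scale = min(1.0, clip_norm / (np.linalg.norm(u) + 1e-12))
        clipped.append(u * scale)
    return np.mean(clipped, axis=0)

# Toy round: nine honest clients and one that sends a heavily
# over-scaled update, as a backdooring client might.
rng = np.random.default_rng(0)
updates = [rng.normal(scale=0.01, size=10) for _ in range(9)]
updates.append(rng.normal(scale=100.0, size=10))  # outlier update
agg = clipped_mean_aggregate(updates)
print(np.linalg.norm(agg))  # stays small despite the outlier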
Synchrony: In most distributed learning schemes of Embedded ML, communication takes place at regular time intervals such that the state of the system can always be uniquely determined [13]. This has the benefit that it severely simplifies the theoretical analysis of the properties of the distributed learning system. However, synchronous schemes may suffer dramatically from delayed computation in the presence of slow workers (stragglers). While countermeasures against stragglers can usually be taken (e.g. by restricting the maximum computation time per worker), in some situations it might still be beneficial to adopt an asynchronous training strategy (e.g. [54]), where parameter updates are applied to the central model directly after they arrive at the server. This approach avoids delays when the time required by workers to compute parameter updates varies heavily. The absence of a central state, however, makes convergence analysis far more challenging (although convergence guarantees can still be given [21]) and may cause model updates to become "stale" [88]. Since the central model may be updated an arbitrary number of times while a client is computing a model update, this update will often be out of date when it arrives at the server. Staleness slows down convergence, especially during the final stages of training.
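The following minimal sketch illustrates such an asynchronous, staleness-aware update rule. Damping the step size by 1 / (1 + staleness) is a common heuristic and our own illustrative choice here, not the specific scheme of [54] or [88].

import numpy as np

def apply_async_update(weights, update, client_version, server_version,
                       base_lr=0.1):
    """Apply a client update directly when it arrives at the server.

    server_version - client_version counts how many times the global
    model changed while the client was computing, i.e. the staleness
    of the update. Stale updates get a proportionally smaller step.
    """
    staleness = server_version - client_version
    lr = base_lr / (1.0 + staleness)  # heuristic damping rule
    return weights + lr * update

weights = np.zeros(4)
update = np.ones(4)
# The client fetched version 10; the server has meanwhile reached
# version 13, so the update is 3 steps stale and is damped.
weights = apply_async_update(weights, update, client_version=10,
                             server_version=13)
print(weights)  # [0.025 0.025 0.025 0.025]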
Standards: To communicate neural data in an interoperable manner, standardized data formats and communication protocols are required. Currently, MPEG is working towards a new Part 17 of the ISO/IEC 15938 standard, defining tools for the compression of neural data for multimedia applications and for representing the resulting bitstreams for efficient transport. Further steps in this direction are needed for a large-scale implementation of embedded machine learning solutions.
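As a toy illustration of why an agreed-upon format matters, the sketch below serializes a uniformly quantized weight tensor together with the dequantization parameters a receiver needs. This ad hoc byte layout is purely our own and bears no relation to the actual ISO/IEC 15938-17 bitstream syntax.

import numpy as np

def encode_weights(w, num_bits=8):
    """Uniformly quantize a float32 weight tensor into a byte blob.

    The blob stores the dequantization range (min, max) followed by
    the quantized codes, so the receiver only needs the tensor shape
    as side information.
    """
    lo, hi = float(w.min()), float(w.max())
    levels = 2 ** num_bits - 1
    q = np.round((w - lo) / (hi - lo + 1e-12) * levels).astype(np.uint8)
    return np.array([lo, hi], dtype=np.float32).tobytes() + q.tobytes()

def decode_weights(blob, shape, num_bits=8):
    """Reconstruct the (approximate) weights from the byte blob."""
    lo, hi = np.frombuffer(blob[:8], dtype=np.float32)
    q = np.frombuffer(blob[8:], dtype=np.uint8).astype(np.float32)
    levels = 2 ** num_bits - 1
    return (q / levels * (hi - lo) + lo).reshape(shape)

w = np.random.randn(64, 32).astype(np.float32)
blob = encode_weights(w)
w_hat = decode_weights(blob, w.shape)
print(len(blob), np.abs(w - w_hat).max())  # ~4x smaller, small error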
4. CONCLUSION
We currently witness a convergence between the areas of machine learning and communication technology. Not only are today's algorithms used to enhance the design and management of networks and communication components [34], but ML models such as deep neural networks are themselves being communicated more and more in our highly connected world. The roll-out of data-intensive 5G networks and the rise of mobile and IoT applications will further accelerate this development, and it can be predicted that neural data will soon account for a sizable portion of the traffic through global communication networks.

This paper has described the four most important settings in which deep neural networks are communicated and has discussed the respective proposed compression methods and methodological challenges. Our holistic view has revealed that these four seemingly different and independently developing fields of research have a lot in common. We therefore believe that these settings should be considered in conjunction in the future.

REFERENCES

[1] M. S. H. Abad, E. Ozfatura, D. Gunduz, and O. Ercetin. Hierarchical federated learning across heterogeneous cellular networks. arXiv preprint arXiv:1909.02362, 2019.

[2] A. F. Aji and K. Heafield. Sparse communication for distributed gradient descent. arXiv preprint arXiv:1704.05021, 2017.

[3] D. Alistarh, D. Grubic, J. Li, R. Tomioka, and M. Vojnovic. QSGD: Communication-efficient SGD via gradient quantization and encoding. In Advances in Neural Information Processing Systems, pages 1707–1718, 2017.

[4] S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, and W. Samek. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE, 10(7):e0130140, 2015.

[5] E. Bagdasaryan, A. Veit, Y. Hua, D. Estrin, and V. Shmatikov. How to backdoor federated learning. arXiv preprint arXiv:1807.00459, 2018.

[6] D. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.

[7] A. Bellet, R. Guerraoui, M. Taziki, and M. Tommasi. Personalized and private peer-to-peer machine learning. arXiv preprint arXiv:1705.08435, 2017.

[8] J. Bernstein, Y.-X. Wang, K. Azizzadenesheli, and A. Anandkumar. signSGD: Compressed optimisation for non-convex problems. arXiv preprint arXiv:1802.04434, 2018.

[9] K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth. Practical secure aggregation for federated learning on user-held data. arXiv preprint arXiv:1611.04482, 2016.

[10] L. Bottou. Online learning and stochastic approximations. On-line Learning in Neural Networks, 17(9):142, 1998.

[11] S. Caldas, J. Konečný, H. B. McMahan, and A. Talwalkar. Expanding the reach of federated learning by reducing client resource requirements. arXiv preprint arXiv:1812.07210, 2018.




