Research Publications

FedML’s core technology is backed by years of cutting-edge research, represented in 50+ publications spanning ML/FL Algorithms, Security/Privacy, Systems, and Applications.

Outline

A Full Stack of Scientific Publications in ML Algorithms, Security/Privacy, Systems, Applications, and Visionary Impacts

Vision Papers for High Scientific Impact

Identifying the right problems is always the key to impactful research.

[1] Advances and Open Problems in Federated Learning (FnTML 2021)

[2] Field Guide for Federated Learning (arXiv 2021)

[3] Federated Learning for Internet of Things: Applications, Challenges, and Opportunities (arXiv 2021)

Systems for Large-Scale Distributed/Federated Training

Towards communication-, computation-, and memory-efficient, resilient, and robust distributed training and inference via ML+system co-design and real-world implementation. A small illustrative sketch of one recurring theme, bandwidth-efficient gradient aggregation, follows the list below.

[1] A Fundamental Tradeoff Between Computation and Communication in Distributed Computing (IEEE Transactions on Information Theory)

[2] FedML: A Research Library and Benchmark for Federated Machine Learning (NeurIPS 2020 FL Workshop, Best Paper Award)

[3] PipeTransformer: Automated Elastic Pipelining for Distributed Training of Transformers (ICML 2021)

[4] Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training (NeurIPS 2018)

[5] GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training (NeurIPS 2018)

[6] MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge (NeurIPS 2021)

[7] ApproxIFER: A Model-Agnostic Approach to Resilient and Robust Prediction Serving Systems (NeurIPS 2021)

[8] Lagrange Coded Computing: Optimal Design for Resiliency, Security and Privacy (AISTATS 2019)

[9] OmniLytics: A Blockchain-Based Secure Data Market for Decentralized Machine Learning (ICML 2021 FL Workshop)

[10] AsymML: An Asymmetric Decomposition Framework for Privacy-Preserving DNN Training and Inference (arXiv 2022)

[11] Communication-Aware Scheduling of Serial Tasks for Dispersed Computing (IEEE/ACM Transactions on Networking)
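As a concrete taste of the bandwidth-efficiency theme in the systems papers above, the sketch below compresses gradients with simple uniform 8-bit quantization before aggregation. This is a minimal, generic illustration assuming plain NumPy arrays; it is not the GradiVeQ scheme itself (which applies vector quantization to gradient blocks), and all names here are ours.

```python
import numpy as np

def quantize_uniform(grad: np.ndarray, num_bits: int = 8):
    """Uniformly quantize a gradient tensor to num_bits integer codes.

    Returns the codes plus the (scale, offset) needed to dequantize.
    Generic illustration of bandwidth-efficient gradient aggregation,
    not the GradiVeQ vector-quantization scheme.
    """
    g_min, g_max = grad.min(), grad.max()
    levels = 2 ** num_bits - 1
    scale = (g_max - g_min) / levels if g_max > g_min else 1.0
    codes = np.round((grad - g_min) / scale).astype(np.uint8)
    return codes, scale, g_min

def dequantize(codes, scale, g_min):
    return codes.astype(np.float32) * scale + g_min

# With 8-bit codes, each worker ships ~4x fewer bytes than raw float32
# gradients, at the cost of a bounded rounding error (at most scale / 2).
grad = np.random.randn(1024).astype(np.float32)
codes, scale, g_min = quantize_uniform(grad, num_bits=8)
recovered = dequantize(codes, scale, g_min)
print("max abs error:", np.abs(grad - recovered).max())
```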

Training Algorithms for FL

Algorithmic innovations that bring distributed training and inference to real-world edge systems, addressing challenges in efficiency, scalability, label deficiency, personalization, fairness, low latency, straggler mitigation, and more. A minimal sketch of the common aggregation baseline follows the list below.

[1] Group Knowledge Transfer: Federated Learning of Large CNNs at the Edge (NeurIPS 2020)

[2] FedNAS (neural architecture search for FL personalization) (CVPR 2020 NAS Workshop)

[3] SpreadGNN: Serverless Multi-task Federated Learning for Graph Neural Networks (AAAI 2021)

[4] SSFL: Tackling Label Deficiency in Federated Learning via Personalized Self-Supervision (FL-AAAI 2022, Best Paper Award)

[5] FairFed: Enabling Group Fairness in Federated Learning (NeurIPS 2021 FL Workshop)

[6] Accelerated Distributed Approximate Newton Method (TNNLS 2022)

[7] Partial Model Averaging in Federated Learning: Performance Guarantees and Benefits (FL-AAAI 2022)

[8] SPIDER: Searching Personalized Neural Architecture for Federated Learning (arXiv 2022)

[9] Layer-wise Adaptive Model Aggregation for Scalable Federated Learning (arXiv 2022)

[10] Achieving Small-Batch Accuracy with Large-Batch Scalability via Adaptive Learning Rate Adjustment (arXiv 2022)

[11] Coded Computing for Low-Latency Federated Learning Over Wireless Edge Networks (IEEE Journal on Selected Areas in Communications)

[12] Coded Computation over Heterogeneous Clusters (IEEE Transactions on Information Theory)

[13] Hierarchical Coded Gradient Aggregation for Learning at the Edge (ISIT 2020)

[14] Coded Computing for Federated Learning at the Edge

[15] Straggler Mitigation in Distributed Matrix Multiplication: Fundamental Limits and Optimal Coding (IEEE Transactions on Information Theory)
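Many of the aggregation methods above (e.g., partial and layer-wise model averaging) can be read as refinements of the basic FedAvg weighted average of client models. Below is a minimal PyTorch sketch of that baseline, assuming clients return ordinary state_dicts along with their local sample counts; the function name is ours, and this is not the API of any system listed.

```python
from collections import OrderedDict
import torch

def fedavg_aggregate(local_states, sample_counts):
    """FedAvg-style weighted aggregation of client model state_dicts.

    local_states: list of state_dicts returned by clients after local training.
    sample_counts: training samples per client, so clients with more data
    contribute proportionally more to the global model.
    """
    total = float(sum(sample_counts))
    global_state = OrderedDict()
    for key in local_states[0]:
        global_state[key] = sum(
            (n / total) * state[key].float()
            for state, n in zip(local_states, sample_counts)
        )
    return global_state

# Usage: two clients holding copies of the same tiny linear model.
models = [torch.nn.Linear(4, 2) for _ in range(2)]
new_state = fedavg_aggregate([m.state_dict() for m in models], [100, 300])
models[0].load_state_dict(new_state)
```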

Security/Privacy for FL

Privacy preservation, attacks, and defenses. A toy sketch of the pairwise-masking idea that underlies secure aggregation follows the list below.

[1] LightSecAgg: A Lightweight and Versatile Design for Secure Aggregation in Federated Learning (MLSys 2022)

[2] Turbo-Aggregate: Breaking the Quadratic Aggregation Barrier in Secure Federated Learning (JSAIT 2021)

[3] Securing Secure Aggregation: Mitigating Multi-Round Privacy Leakage in Federated Learning (end-to-end privacy protection in FL)

[4] A Scalable Approach for Privacy-Preserving Collaborative Machine Learning (NeurIPS 2020)

[5] Secure Aggregation for Buffered Asynchronous Federated Learning (arXiv 2021)

[6] Basil: A Fast and Byzantine-Resilient Approach for Decentralized Training

[7] CodedReduce: A Fast and Robust Framework for Gradient Aggregation in Distributed Learning (IEEE/ACM Transactions on Networking)

[8] Verifiable Coded Computing: Towards Fast, Secure and Private Distributed Machine Learning (IPDPS 2022)

[9] CodedPrivateML: A Fast and Privacy-Preserving Framework for Distributed Machine Learning (IEEE Journal on Selected Areas in Information Theory)

[10] Byzantine-Resilient Secure Federated Learning (IEEE Journal on Selected Areas in Information Theory)

[11] Mitigating Byzantine Attacks in Federated Learning

[12] Entangled Polynomial Codes for Secure, Private, and Batch Distributed Matrix Multiplication: Breaking the "Cubic" Barrier (ISIT 2020)

[13] Coded Merkle Tree: Solving Data Availability Attacks in Blockchains (International Conference on Financial Cryptography and Data Security)

[14] HeteroSAg: Secure Aggregation with Heterogeneous Quantization in Federated Learning

[15] PolyShard: Coded Sharding Achieves Linearly Scaling Efficiency and Security Simultaneously (IEEE Transactions on Information Forensics and Security)
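A building block shared by several secure-aggregation protocols above is pairwise additive masking: every pair of clients agrees on a random mask that one adds and the other subtracts, so the server sees only masked updates while the masks cancel in the sum. The toy sketch below demonstrates only that cancellation property, with an illustrative seed_fn and modulus; real protocols such as LightSecAgg additionally handle key agreement, client dropouts, and secret sharing, all omitted here.

```python
import random

P = 2**31 - 1  # illustrative public prime modulus; all arithmetic is mod P

def pairwise_masks(client_ids, dim, seed_fn):
    """Derive cancelling pairwise masks: mask(i, j) = -mask(j, i) mod P."""
    masks = {}
    for i in client_ids:
        total = [0] * dim
        for j in client_ids:
            if i == j:
                continue
            # Both parties derive the same mask from a shared seed;
            # the lower-id client adds it, the higher-id client subtracts it.
            rng = random.Random(seed_fn(min(i, j), max(i, j)))
            m = [rng.randrange(P) for _ in range(dim)]
            sign = 1 if i < j else -1
            total = [(t + sign * v) % P for t, v in zip(total, m)]
        masks[i] = total
    return masks

# Toy demo: 3 clients each hold a secret vector. The server only ever
# sees masked vectors, yet summing them reveals the true aggregate.
clients = [1, 2, 3]
secrets = {i: [random.randrange(1000) for _ in range(4)] for i in clients}
masks = pairwise_masks(clients, 4, seed_fn=lambda a, b: a * 1000 + b)

masked = {i: [(s + m) % P for s, m in zip(secrets[i], masks[i])] for i in clients}
agg = [sum(col) % P for col in zip(*masked.values())]
true = [sum(col) % P for col in zip(*secrets.values())]
assert agg == true  # the pairwise masks cancel in the sum
```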

AI Applications

Besides fundamental research in FL, we also target important applications in Natural Language Processing, Computer Vision, Data Mining, and the Internet of Things (IoT).

[1] FedNLP: Benchmarking Federated Learning Methods for Natural Language Processing Tasks (NAACL 2022)

[2] FedGraphNN: A Federated Learning Benchmark System for Graph Neural Networks (ICLR 2021 Workshop; KDD 2021 Workshop)

[3] FedCV: A Federated Learning Framework for Diverse Computer Vision Tasks (FL-AAAI 2022)

[4] Federated Learning for Internet of Things (ACM SenSys 2021)

[5] MiLeNAS: Efficient Neural Architecture Search via Mixed-Level Reformulation (CVPR 2020)

[6] AutoCTS: Automated Correlated Time Series Forecasting (VLDB 2022)

[7] Coded Computing for Distributed Graph Analytics (IEEE Transactions on Information Theory)

[8] TACC: Topology-Aware Coded Computing for Distributed Graph Processing (IEEE Transactions on Signal and Information Processing over Networks)

[9] Privacy-Aware Distributed Graph-Based Semi-Supervised Learning (IEEE MLSP 2019)

[10] Lightweight Image Super-Resolution with Hierarchical and Differentiable Neural Architecture Search (IJCV, under review)

[11] Collecting Indicators of Compromise from Unstructured Text of Cybersecurity Articles Using Neural-Based Sequence Labelling (IJCNN 2019)