Efficient Privacy-Preserving NN Inference at the Edge


Neural networks (NNs) have been widely adopted to create predictive models in applications such as healthcare, financial services, and policy-making. As their use continues to grow, so does the risk of attacks against users' data and the NNs themselves. While deep learning was traditionally constrained by computational power and off-chip memory bandwidth, such attacks impose new priorities around security and privacy. Privacy-preserving deep learning addresses these issues by leveraging cryptographic primitives, e.g., homomorphic encryption and secure multiparty computation (MPC). MPC-based solutions offer a higher degree of flexibility by allowing different parties to train an NN model on their private data without revealing any information beyond the output. Combining MPC and deep learning thus enables a variety of privacy-preserving online services. For example, to classify a picture, a customer can use an online deep learning service in which the service provider and the user engage in an MPC protocol rather than the user simply uploading the picture. The user obtains the classification result without revealing the input, while the provider keeps its model secret.

Existing work on MPC falls into two main classes: (1) MPC over Boolean circuits and (2) MPC over arithmetic circuits. The former relies on Yao's garbled circuits and achieves constant round complexity, while secret-sharing (SS)-based solutions evaluate arithmetic circuits with a round complexity linear in the multiplicative depth of the circuit. There are also MPC frameworks that combine the two classes to enjoy the benefits of both. Further optimizations can be applied in the context of private NN inference, and resource optimization is even more vital when NNs run on resource-constrained edge devices.

It has been shown in the literature that running MPC-based privacy-preserving NNs on FPGAs brings protocol execution time and power consumption within practical limits. In line with such efforts, our work presents optimization techniques that utilize the full capability of the underlying FPGA hardware. More concretely, the parallel processing and pipelining capabilities of FPGAs allow for the fast computations essential to NN tasks. This is complemented by optimized memory access, which minimizes latency and maximizes data throughput. Moreover, the flexibility to develop custom instruction sets tailored to NN operations improves protocol execution efficiency. Finally, an algorithm-hardware co-design approach ensures that the NN algorithms and the FPGA architecture are optimally aligned, making FPGAs a powerful and efficient platform for implementing secure neural network accelerators through multiparty computation.

Our results demonstrate that our approach uses significantly fewer hardware resources, up to 62.5 times fewer logic resources and 66 times less memory, than state-of-the-art privacy-preserving NN inference designs. Furthermore, in scenarios where execution time is critical, our approach is 2.5 times faster than the average execution time of existing privacy-preserving NN inferences while closely matching the performance of the fastest state-of-the-art approaches.
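To make the secret-sharing arithmetic described above concrete, the following is a minimal sketch, not the thesis's implementation, of two-party additive secret sharing over the ring Z_2^64 with one Beaver-triple multiplication, simulated in a single C program. The points where the masked values d and e are opened mark the one round of communication that each multiplicative layer of the circuit costs.

    /*
     * Illustrative sketch, not the thesis's implementation: two-party additive
     * secret sharing over Z_2^64 with one Beaver-triple multiplication. Both
     * parties are simulated in one process; the opening of d and e below is
     * where the single round of communication per multiplication would occur.
     */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Split secret x into random shares with x = x0 + x1 (mod 2^64). */
    static void share(uint64_t x, uint64_t *x0, uint64_t *x1) {
        *x0 = ((uint64_t)rand() << 32) | (uint64_t)rand();
        *x1 = x - *x0;  /* unsigned wraparound = reduction mod 2^64 */
    }

    int main(void) {
        uint64_t x = 7, y = 6;

        /* Shares of the inputs and of a Beaver triple (a, b, c = a*b);
           in a real protocol the triple comes from offline preprocessing. */
        uint64_t x0, x1, y0, y1, a0, a1, b0, b1, c0, c1;
        uint64_t a = 11, b = 13;
        share(x, &x0, &x1);  share(y, &y0, &y1);
        share(a, &a0, &a1);  share(b, &b0, &b1);  share(a * b, &c0, &c1);

        /* Online phase: each party masks its shares locally... */
        uint64_t d0 = x0 - a0, d1 = x1 - a1;  /* shares of d = x - a */
        uint64_t e0 = y0 - b0, e1 = y1 - b1;  /* shares of e = y - b */

        /* ...then d and e are opened: the one round of communication. */
        uint64_t d = d0 + d1, e = e0 + e1;

        /* Each party computes its share of z = x*y locally:
           z = c + d*b + e*a + d*e (the public d*e term added by one party). */
        uint64_t z0 = c0 + d * b0 + e * a0 + d * e;
        uint64_t z1 = c1 + d * b1 + e * a1;

        printf("x*y = %llu (expected %llu)\n",
               (unsigned long long)(z0 + z1), (unsigned long long)(x * y));
        return 0;
    }

The wraparound of unsigned 64-bit arithmetic in C plays the role of reduction modulo 2^64, which is one reason rings of this form are a popular choice for SS-based NN inference.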
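On the hardware side, a hedged illustration of the pipelining the abstract refers to: a fixed-point dot product, the inner loop of an NN layer, written in HLS-style C so that a synthesis tool can schedule one multiply-accumulate per clock cycle. The PIPELINE pragma follows the Vitis HLS convention and is an assumption here, not the thesis's actual toolchain; an ordinary C compiler ignores it and the program still runs.

    /*
     * Hedged HLS-style sketch (Vitis HLS pragma convention assumed, not the
     * thesis's actual code): a Q16.16 fixed-point dot product written so a
     * synthesis tool can pipeline it with an initiation interval of 1,
     * i.e., one multiply-accumulate per clock cycle.
     */
    #include <stdint.h>
    #include <stdio.h>

    #define FRAC_BITS 16  /* Q16.16 fixed point, a common encoding for NNs over rings */

    int64_t dot_product(const int32_t w[256], const int32_t x[256]) {
        int64_t acc = 0;
        for (int i = 0; i < 256; i++) {
    #pragma HLS PIPELINE II=1  /* ignored by ordinary C compilers */
            acc += (int64_t)w[i] * (int64_t)x[i];
        }
        /* Rescale back to Q16.16; in the MPC setting this truncation
           becomes a sub-protocol on secret shares. */
        return acc >> FRAC_BITS;
    }

    int main(void) {
        static int32_t w[256], x[256];
        for (int i = 0; i < 256; i++) {
            w[i] = 1 << FRAC_BITS;  /* w[i] = 1.0 */
            x[i] = i << FRAC_BITS;  /* x[i] = i   */
        }
        /* Expect 0 + 1 + ... + 255 = 32640. */
        printf("dot = %lld\n", (long long)(dot_product(w, x) >> FRAC_BITS));
        return 0;
    }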

Identifier
  • etd-122457
Year
  • 2024
Date created
  • 2024-05-03
Source
  • etd-122457

Permanent link to this page: https://digital.wpi.edu/show/m326m582g