
Organizing the Protection of Confidential Information Contained in Datasets when Training Neural Networks on Remote Electronic Computers and Cloud Services

Authors: Tarasenko S.S., Morkovin S.V.  Published: 27.12.2021
Published in issue: #4(137)/2021  
DOI: 10.18698/0236-3933-2021-4-109-121

 
Category: Informatics, Computer Engineering and Control | Chapter: Methods and Systems of Information Protection, Information Security  
Keywords: neural networks, data protection, cloud computing, encryption, generative adversarial networks

Companies and researchers invest significant material and time resources in collecting the datasets they need and, as a consequence, wish to keep them secret. Analyzing the collected datasets and using them to train neural networks for specific tasks requires appropriate hardware, which not every machine learning researcher can afford. To address this, many IT corporations, such as Amazon and Google, provide access, on both a paid and a free basis, to their high-performance hardware infrastructure for training neural networks. However, the semantics of the datasets on which the neural networks are trained remain open to these services. There is therefore a need to protect the semantics of the data on which neural networks are trained in cloud services or on remote computers.
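By way of illustration only, the minimal Python sketch below shows one naive way to hide the semantics of an image dataset before uploading it for remote training: a keyed pixel permutation applied locally, so the transformed samples look like noise to the service while remaining trainable and reversible by the key holder. The permutation scheme, function names, and parameters here are assumptions made for demonstration; they are not the GAN- and encryption-based approach proposed in the article.

# Illustrative sketch (not the authors' method): obscure image semantics
# with a keyed pixel permutation before sending the dataset to a remote
# computer or cloud service. The secret seed stays with the data owner.
import numpy as np

def make_permutation(num_pixels, seed):
    # Derive a fixed pixel permutation from a secret seed (the "key").
    return np.random.default_rng(seed).permutation(num_pixels)

def protect(images, perm):
    # Permute the pixels of every image; the result is visually meaningless.
    flat = images.reshape(len(images), -1)
    return flat[:, perm].reshape(images.shape)

def unprotect(images, perm):
    # Invert the permutation; only the holder of the seed can do this.
    flat = images.reshape(len(images), -1)
    return flat[:, np.argsort(perm)].reshape(images.shape)

if __name__ == "__main__":
    # Stand-in for an MNIST-like dataset: 100 grayscale 28 x 28 images.
    data = np.random.randint(0, 256, size=(100, 28, 28), dtype=np.uint8)
    perm = make_permutation(28 * 28, seed=2021)   # kept locally, never uploaded
    protected = protect(data, perm)               # safe to send for training
    assert np.array_equal(unprotect(protected, perm), data)

A permutation of this kind preserves per-pixel statistics and is not cryptographically strong; it only conveys the general idea of decoupling the training data's semantics from the computation performed remotely.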
