End-to-End Diagnosis of Cloud Systems Against Intermittent Faults
- Computer School, Beijing Information Science and Technology University
North 4th Ring Mid Road 35, 100101 Beijing, China
wangchao@bistu.edu.cn
- Computer Science & Technology Department, Harbin Institute of Technology
Xidazhi Street 92, 150001 Heilongjiang, China
fuzhongchuan@hit.edu.cn
- Beijing Advanced Innovation Center for Materials Genome Engineering
North 4th Ring Mid Road 35, 100101 Beijing, China
Abstract
Diagnosing intermittent faults is challenging because they manifest randomly through intricate mechanisms. Conventional diagnosis methods are no longer effective for these faults, especially in hierarchical environments such as cloud computing. This paper proposes a fault diagnosis method that can effectively identify and locate intermittent faults originating from (but not limited to) processors in the cloud computing environment. The method is end-to-end in that it does not rely on manual feature extraction for applied scenarios, making it more generalizable than conventional neural network-based methods. It can be implemented with no additional fault detection mechanisms, and is realized in software with almost zero hardware cost. The proposed method achieves higher fault diagnosis accuracy than a BP network, reaching 97.98% with low latency.
Key words
cloud system, intermittent fault, fault diagnosis, end-to-end, LSTM, PNN
Digital Object Identifier (DOI)
https://doi.org/10.2298/CSIS200620040W
Publication information
Volume 18, Issue 3 (June 2021)
Year of Publication: 2021
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium
Full text
Available in PDF
How to cite
Wang, C., Fu, Z., Huo, Y.: End-to-End Diagnosis of Cloud Systems Against Intermittent Faults. Computer Science and Information Systems, Vol. 18, No. 3, 771–790. (2021), https://doi.org/10.2298/CSIS200620040W