Standard

Speedup of deep neural network learning on the MIC-architecture. / Milova, E.; Sveshnikova, S.; Gankevich, I.

International Conference on High Performance Computing & Simulation (HPCS'16). Institute of Electrical and Electronics Engineers Inc., 2016. pp. 989-992.

Research output: Publications in books, reports, collections, and conference proceedings › Conference contribution › Research

Harvard

Milova, E, Sveshnikova, S & Gankevich, I 2016, Speedup of deep neural network learning on the MIC-architecture. in International Conference on High Performance Computing & Simulation (HPCS'16). Institute of Electrical and Electronics Engineers Inc., pp. 989-992. https://doi.org/10.1109/HPCSim.2016.7568443

APA

Milova, E., Sveshnikova, S., & Gankevich, I. (2016). Speedup of deep neural network learning on the MIC-architecture. In International Conference on High Performance Computing & Simulation (HPCS'16) (pp. 989-992). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/HPCSim.2016.7568443

Vancouver

Milova E, Sveshnikova S, Gankevich I. Speedup of deep neural network learning on the MIC-architecture. In International Conference on High Performance Computing & Simulation (HPCS'16). Institute of Electrical and Electronics Engineers Inc. 2016. p. 989-992. https://doi.org/10.1109/HPCSim.2016.7568443

Author

Milova, E. ; Sveshnikova, S. ; Gankevich, I. / Speedup of deep neural network learning on the MIC-architecture. International Conference on High Performance Computing & Simulation (HPCS'16). Institute of Electrical and Electronics Engineers Inc., 2016. pp. 989-992

BibTeX

@inproceedings{1b4e6569c99d4c8686db043bfd4b28bb,
title = "Speedup of deep neural network learning on the MIC-architecture",
abstract = "Deep neural networks are more accurate, but require more computational power in the learning process. Moreover, it is an iterative process. The goal of the research is to investigate efficiency of solving this problem on MIC architecture without changing baseline algorithm. Well-known code vectorization and parallelization methods are used to increase the effectiveness of the program on MIC architecture. In the course of the experiments we test two coprocessor data transfer models: explicit and implicit one. We show that implicit memory copying is more efficient than explicit one, because only modified memory blocks are copied. MIC architecture shows competitive performance compared to multi-core ×86 processor.",
keywords = "many-core architecture, DNN, optimisation, parallel computing, vectorization, offload, Xeon Phi, coprocessor",
author = "E. Milova and S. Sveshnikova and I. Gankevich",
year = "2016",
doi = "10.1109/HPCSim.2016.7568443",
language = "English",
isbn = "978-1-5090-2088-1",
pages = "989--992",
booktitle = "International Conference on High Performance Computing Simulation (HPCS'16)",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

RIS

TY - GEN

T1 - Speedup of deep neural network learning on the MIC-architecture

AU - Milova, E.

AU - Sveshnikova, S.

AU - Gankevich, I.

PY - 2016

Y1 - 2016

N2 - Deep neural networks are more accurate but require more computational power in the learning process; moreover, learning is an iterative process. The goal of the research is to investigate the efficiency of solving this problem on the MIC architecture without changing the baseline algorithm. Well-known code vectorization and parallelization methods are used to increase the effectiveness of the program on the MIC architecture. In the course of the experiments we test two coprocessor data transfer models: an explicit and an implicit one. We show that implicit memory copying is more efficient than explicit copying, because only modified memory blocks are copied. The MIC architecture shows competitive performance compared to a multi-core x86 processor.

AB - Deep neural networks are more accurate but require more computational power in the learning process; moreover, learning is an iterative process. The goal of the research is to investigate the efficiency of solving this problem on the MIC architecture without changing the baseline algorithm. Well-known code vectorization and parallelization methods are used to increase the effectiveness of the program on the MIC architecture. In the course of the experiments we test two coprocessor data transfer models: an explicit and an implicit one. We show that implicit memory copying is more efficient than explicit copying, because only modified memory blocks are copied. The MIC architecture shows competitive performance compared to a multi-core x86 processor.

KW - many-core architecture

KW - DNN

KW - optimisation

KW - parallel computing

KW - vectorization

KW - offload

KW - Xeon Phi

KW - coprocessor

UR - https://www.scopus.com/inward/record.uri?eid=2-s2.0-84991734860&doi=10.1109%2fHPCSim.2016.7568443&partnerID=40&md5=46fb21a706001b7a2fc2402d08dd0e81

U2 - 10.1109/HPCSim.2016.7568443

DO - 10.1109/HPCSim.2016.7568443

M3 - Conference contribution

SN - 978-1-5090-2088-1

SP - 989

EP - 992

BT - International Conference on High Performance Computing & Simulation (HPCS'16)

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

ID: 7632736
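
For illustration only: the sketch below shows what the two coprocessor data transfer models named in the abstract can look like on a first-generation Xeon Phi (Knights Corner) when expressed with Intel's Language Extensions for Offload, which require the classic Intel C++ compiler. The explicit model copies the whole buffer with an inout clause on every offload, while the implicit model places data in virtual shared memory (_Cilk_shared) so the runtime synchronises only the modified pages. This is not the paper's code; the names scale_explicit, scale_kernel and shared_buf, the buffer size, and the use of OpenMP for parallelization and vectorization are assumptions made for the example.

// Minimal sketch (not from the paper): explicit vs. implicit offload on Intel MIC.
// Compiles only with the classic Intel C++ compiler targeting Knights Corner coprocessors.
#include <cstdio>

enum { N = 1 << 20 };                       // illustrative buffer size

// Explicit model: data movement is written out with an inout clause,
// so the whole buffer is copied to the coprocessor and back on every call.
void scale_explicit(float *buf, float factor) {
    #pragma offload target(mic:0) inout(buf : length(N))
    {
        #pragma omp parallel for simd
        for (int i = 0; i < N; ++i)
            buf[i] *= factor;
    }
}

// Implicit model: the buffer lives in virtual shared memory (_Cilk_shared);
// the runtime synchronises only the memory pages that were actually modified.
_Cilk_shared float shared_buf[N];

_Cilk_shared void scale_kernel(float factor) {
    #pragma omp parallel for simd
    for (int i = 0; i < N; ++i)
        shared_buf[i] *= factor;
}

void scale_implicit(float factor) {
    _Cilk_offload scale_kernel(factor);     // executes on the coprocessor
}

int main() {
    static float buf[N];
    for (int i = 0; i < N; ++i) { buf[i] = 1.0f; shared_buf[i] = 1.0f; }
    scale_explicit(buf, 2.0f);              // full copy in, full copy out
    scale_implicit(2.0f);                   // only dirty pages are synchronised
    std::printf("explicit: %.1f  implicit: %.1f\n", buf[0], shared_buf[0]);
    return 0;
}

The same structure applies to any iterative kernel: with explicit clauses the full buffer crosses the PCIe bus on every iteration, whereas the shared-memory variant lets repeated _Cilk_offload calls reuse data already resident on the coprocessor, which is the behaviour the abstract attributes to the more efficient implicit model.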