Artificial Intelligence & High-Performance Computing
Performance Optimization with Intel® Software Tools

Workshop - November 13-14, 2019 - Politecnico di Torino

Department of Mathematical Sciences - MIUR Excellence Project 2018 - 2022

This two-day hands-on workshop provides a unique opportunity to learn techniques, methods, and solutions for improving code, for enabling new features of Intel processors, and for using new tools such as the Roofline model to visualize the potential benefits of an optimization process for HPC and AI applications.


On the first day of this workshop we will introduce the Intel software development tools, the new, upcoming cross-platform development concept oneAPI, and the code optimization process. Attendees will work through hands-on exercises on a many-body kernel and learn how to enable vectorization with simple pragmas as well as with more effective techniques, such as changing the data layout and aligning memory. The work is then guided by hints from the Intel® Compiler reports and by the Intel® Advisor. Furthermore, participants will get insights into Intel® VTune™ Amplifier and Intel® Application Performance Snapshot as tools for investigating and improving the performance of HPC applications.

On the second day we will describe the latest microprocessor hardware architectures and how developers can efficiently use modern HPC hardware, in particular Intel® Deep Learning Boost with the AVX-512 VNNI instructions.

We will then focus on data analytics techniques, such as Machine Learning and Deep Learning, which have become the key to gaining insight into the enormous amounts of data generated by scientific investigations (simulations and observations). An overview of the best-known machine learning algorithms for supervised and unsupervised learning will follow. With small example codes we show how to implement such algorithms using the Intel® Distribution for Python*, and what performance benefits can be obtained with minimal effort from the developer's perspective. We will implement simple algorithms, such as K-Means and PCA, using the Intel® Data Analytics Acceleration Library (Intel® DAAL) and show how to scale the workload on HPC systems using the Intel® MPI library. We also cover how to accelerate the training of deep neural networks with TensorFlow*, thanks to the highly optimized Intel® Math Kernel Library (Intel® MKL). Finally, we demonstrate techniques for training deep neural networks across multiple nodes of distributed x86 HPC systems without requiring GPUs.


Time        | Day 1 (November 13)                                                                 | Day 2 (November 14)
09:00-09:30 | Introduction to the Intel oneAPI project for heterogeneous software infrastructure  | Intel's Hardware and Software directions for Artificial Intelligence (AI)
09:30-10:00 | Latest update of the Intel Parallel Studio features                                  |
10:00-10:30 | Introduction to the HPC code optimization process                                    | Hardware Accelerated Deep Learning instructions and implementations
10:30-11:00 | Coffee break                                                                         | Coffee break
11:00-12:30 | Hands-on: compiler optimization process using an N-body code                         | Hands-on labs using the Intel Distribution for Python, with a focus on classical Machine Learning algorithms
12:30-14:00 | Lunch                                                                                | Lunch
14:00-15:00 | Identifying vectorization inefficiencies - Roofline Model and Intel Advisor, with hands-on | Hands-on using the Intel Data Analytics Acceleration Library for distributed Machine Learning algorithms
15:00-15:30 |                                                                                      | Introduction to Deep Learning using Intel Optimized TensorFlow*
15:30-16:00 | Coffee break                                                                         | Coffee break
16:00-17:00 | Identifying performance inefficiencies - Intel VTune and the Application Performance Snapshot | Distributed Deep Learning solutions for HPC systems
17:00-17:30 | Tips and tricks - enhancing performance using the Intel Math Kernel Library          |


Registration is compulsory (there is no fee), but the number of places is limited.
Registration is open until November 03.
Registration form