Overview of Parallelism
Kinds of Parallelism in Computer Applications
At a very high level, parallelism in applications is categorised as:
Data Level Parallelism (DLP)
- arises when multiple data items can be operated on at the same time
Task-Level Parallelism (TLP)
- arises when multiple tasks need to be run simultaneously and independently
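The distinction can be sketched in plain Python; the function names and data are illustrative only, and in real code each comprehension below is where parallel hardware could step in:

```python
# Data-level parallelism: the SAME operation applied to many data items.
def scale(pixel):
    return pixel * 2

pixels = [1, 2, 3, 4]
# Each element is independent, so all of them could be processed in parallel.
brightened = [scale(p) for p in pixels]

# Task-level parallelism: DIFFERENT, independent tasks that could run
# simultaneously (e.g., on separate cores).
def decode_audio():
    return "audio decoded"

def render_video():
    return "video rendered"

tasks = [decode_audio, render_video]
results = [task() for task in tasks]
```

The code runs sequentially as written; the point is only that neither loop has dependences between its iterations or tasks, which is what makes parallel execution possible.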
How does computer hardware exploit these kinds of parallelism?
Instruction-level parallelism (ILP)
- exploits data-level parallelism at modest levels with compiler help, using ideas like pipelining, and at medium levels using ideas like speculative execution
- Internal to hardware, not accessible to programmers
- Micro-architectural techniques that are used to exploit ILP include:
- Instruction pipelining
- Superscalar execution, VLIW, and the closely related explicitly parallel instruction computing
- Out-of-order execution
- Register renaming
- Speculative execution
- Branch prediction
- Simultaneous multithreading (SMT) converts thread-level parallelism to instruction-level parallelism
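ILP itself is found by the compiler and the hardware, not written by the programmer, but the opportunity it exploits is visible in ordinary code. A conceptual sketch (values are arbitrary):

```python
a, b, c, d = 1, 2, 3, 4

# Independent operations: x and y share no data dependence, so a
# superscalar or pipelined core could execute both additions in overlap.
x = a + b
y = c + d

# A dependence chain: each step needs the previous result, so these
# additions must complete in order, which limits the available ILP.
z = a + b
z = z + c
z = z + d
```

Out-of-order execution, register renaming, and speculation are all techniques for finding more of the first pattern inside programs that look like the second.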
Vector Architectures and GPUs
- exploit data-level parallelism by applying a single instruction to a collection of data in parallel
- examples of SIMD/vector ISA extensions: x86 SSE and AVX, Arm NEON, and the RISC-V "V" vector extension
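The semantics of a vector instruction can be modelled in plain Python; this is only a conceptual sketch, since real hardware performs the element-wise operation as a single instruction rather than a loop:

```python
def vector_add(va, vb):
    # One logical "instruction" applied element-wise to whole vectors.
    # On SIMD hardware this loop would collapse into a single vector add.
    return [a + b for a, b in zip(va, vb)]

v1 = [1.0, 2.0, 3.0, 4.0]
v2 = [10.0, 20.0, 30.0, 40.0]
result = vector_add(v1, v2)
```

This is exactly the shape of computation that libraries like NumPy hand off to SIMD units under the hood.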
Thread-level parallelism
- exploits either DLP or TLP
- exploited in tightly coupled hardware that supports interaction among cooperating threads (e.g., shared-memory multicores)
- using APIs such as OpenMP, or by manually writing fork/join-style programs, a programmer can create multiple threads that run simultaneously on a multicore system
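A minimal fork/join sketch using Python's standard threading module; the two-way partitioning and the function name partial_sum are illustrative:

```python
import threading

def partial_sum(chunk, out, idx):
    # Each thread works on its own slice of the data and writes to
    # its own slot in the shared results list.
    out[idx] = sum(chunk)

data = list(range(8))
chunks = [data[:4], data[4:]]
results = [0] * len(chunks)

threads = [threading.Thread(target=partial_sum, args=(c, results, i))
           for i, c in enumerate(chunks)]
for t in threads:   # fork: start all worker threads
    t.start()
for t in threads:   # join: wait for all of them to finish
    t.join()

total = sum(results)
```

Note that CPython's global interpreter lock prevents these threads from using multiple cores for pure-Python computation; for real multicore speedups one would use multiprocessing, or OpenMP in C/C++ where the same fork/join structure is expressed with pragmas.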
Request-level parallelism
- parallelism among largely decoupled tasks specified by the programmer or the operating system
- I don't consider this a low-level hardware mechanism; it operates at the granularity of whole requests or tasks
- e.g. client sending multiple independent requests to a remote server simultaneously
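The client-side of that example can be sketched with the standard concurrent.futures module; fetch() here is a stand-in for a real network call, and the URLs are made up:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Placeholder for an HTTP request to a remote server; a real client
    # would block on the network here, which is why issuing the requests
    # concurrently pays off.
    return f"response from {url}"

urls = ["server/a", "server/b", "server/c"]
with ThreadPoolExecutor(max_workers=3) as pool:
    # map() issues the three independent requests concurrently and
    # returns the responses in the original order.
    responses = list(pool.map(fetch, urls))
```

Because the requests are fully decoupled, no coordination between them is needed; this is what distinguishes request-level parallelism from the thread-level case above.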
Reference
- David A. Patterson and John L. Hennessy. 1990. Computer architecture: a quantitative approach. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.