Employing a parallel or pipelined system architecture, as in Section A.3.2, changes the throughput and the power consumption by the same factor, so the switching energy per operation remains unchanged. Where a parallel architecture is affordable (this may apply to a subsystem as small as a flip-flop), it can also be used to reduce the power consumption at constant throughput: each parallel unit may then operate more slowly, which relaxes the delay time constraint and allows much lower power values than can be achieved under a strict delay time constraint. This may be a solution where high-performance computers are limited by their thermal power dissipation capability.
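The trade-off can be illustrated with a minimal numerical sketch. It rests on assumptions that are not taken from the text: the relaxed delay constraint is exploited by lowering the supply voltage, the gate delay follows an alpha-power law with illustrative values for the threshold voltage `vt` and the exponent `alpha`, the switching energy scales with the square of the supply voltage, and leakage as well as the overhead of distributing and merging the data are ignored. The function names are hypothetical.

```python
def gate_delay(vdd, vt=0.3, alpha=1.5):
    """Relative gate delay from an alpha-power law (assumed model, vdd > vt)."""
    return vdd / (vdd - vt) ** alpha


def min_vdd_for_delay(max_delay, vdd_nom=1.0, vt=0.3, alpha=1.5):
    """Lowest supply voltage in (vt, vdd_nom] whose delay stays within max_delay.

    Uses bisection; valid because the assumed delay model decreases
    monotonically with vdd for alpha >= 1.
    """
    lo, hi = vt + 1e-9, vdd_nom
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gate_delay(mid, vt, alpha) > max_delay:
            lo = mid  # too slow, a higher voltage is needed
        else:
            hi = mid  # fast enough, try a lower voltage
    return hi


def parallel_power(n, vdd_nom=1.0, vt=0.3, alpha=1.5):
    """Supply voltage and relative power of n parallel units at constant throughput.

    Each unit may be n times slower than the single reference unit. The total
    power is n units * (f / n) * C * vdd**2, so relative to the reference it
    reduces to (vdd / vdd_nom)**2 under the stated assumptions.
    """
    allowed_delay = n * gate_delay(vdd_nom, vt, alpha)
    vdd = min_vdd_for_delay(allowed_delay, vdd_nom, vt, alpha)
    return vdd, (vdd / vdd_nom) ** 2


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        vdd, power = parallel_power(n)
        print(f"n={n}: vdd={vdd:.3f} V, relative power={power:.3f}")
```

Under these assumptions the relative power falls as units are added, while the benefit per added unit shrinks as the supply voltage approaches the threshold voltage. In a real design, leakage and the area and wiring overhead of the parallel units would limit the useful degree of parallelism.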