Each PE (Processing Element) performs a Multiply-Accumulate (MAC) operation. Data moves between neighboring PEs in lockstep, one hop per clock cycle.
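Matrix A enters from the left and moves right. Matrix B enters from the top and moves down. Inputs are skewed (typically row i of A is delayed by i cycles and column j of B by j cycles) so that each PE receives a matching pair, A[i][k] and B[k][j], in the same cycle.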
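
The skewed dataflow described above can be sketched as a small cycle-by-cycle simulation. This is a minimal illustration, not a hardware description. It assumes an output-stationary arrangement (each PE keeps its own accumulator), which these notes do not state explicitly. The function name `systolic_matmul` and the register arrays `a_reg` and `b_reg` are hypothetical names chosen for the example.

```python
def systolic_matmul(A, B):
    """Simulate an n x m grid of PEs computing A (n x k) times B (k x m).

    Assumption: output-stationary design. Each PE(i, j) holds one
    accumulator, A values move right and B values move down, one hop
    per cycle.
    """
    n, k = len(A), len(A[0])
    m = len(B[0])
    acc = [[0] * m for _ in range(n)]    # one accumulator per PE
    a_reg = [[0] * m for _ in range(n)]  # A operand currently held by each PE
    b_reg = [[0] * m for _ in range(n)]  # B operand currently held by each PE

    # The last pair (index k-1) reaches PE(n-1, m-1) at cycle k+n+m-3.
    cycles = k + n + m - 2
    for t in range(cycles):
        # Shift A one PE to the right, then feed the left edge.
        for i in range(n):
            for j in range(m - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
            s = t - i  # skew: row i starts i cycles late
            a_reg[i][0] = A[i][s] if 0 <= s < k else 0

        # Shift B one PE down, then feed the top edge.
        for j in range(m):
            for i in range(n - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
            s = t - j  # skew: column j starts j cycles late
            b_reg[0][j] = B[s][j] if 0 <= s < k else 0

        # Every PE performs one MAC on the pair it currently holds.
        for i in range(n):
            for j in range(m):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]

    return acc, cycles


if __name__ == "__main__":
    result, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(result, cycles)  # [[19, 22], [43, 50]] 4
```

With this skew, PE(i, j) sees A[i][t-i-j] and B[t-i-j][j] at cycle t, so both operands always share the same inner index. Zeros are fed outside the valid window, which leaves the accumulators unchanged during fill and drain.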