Things to be careful about in the Jacobi MPI implementation
-----------------------------------------------------------

1) Logical-to-physical mapping

   Declare the meshes X and Y: X[N+2][N+2] and Y[N+2][N+2].

   How to partition? The logical array may be distributed along the first
   dimension or the second dimension, e.g. X[N/P+2][N+2]. Each node still
   wants (its local dimension + 2) so it can store the boundary rows it
   receives from its neighbors.

   How is the mapping done? Roughly:

       local X[i][j]  <-->  global X[my_id*pernode_size + i][j]

   (something like that). You need to work out the index mapping
   carefully, especially when the matrix size is not divisible by P
   (see the partitioning sketch below).

   Which code is affected?
   - Initialization
   - Loop bounds (the loop body is unchanged as long as the boundaries
     are maintained)
   - Output

2) Managing communication

   In the main loop, each node must send its boundary rows to its
   neighbors and receive theirs in return. Be aware of the potential
   deadlock. When can this happen? If every node issues a blocking send
   first, and the sends cannot complete until matching receives are
   posted, all nodes wait on each other forever (see the halo-exchange
   sketch below).

   Communicating maxdiff: the convergence test needs to follow a
   reduce + broadcast pattern, so that every node sees the same global
   maximum (see the maxdiff sketch below).

   Output: you cannot let all processes write to the same file. The
   common approach is to send the data to process 0 and let only
   process 0 write the file. All processes can send at the same time,
   but process 0 will receive them in rank order (see the output sketch
   below).
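
A minimal sketch of the partitioning and index mapping, assuming a 1-D
row decomposition in C with MPI. The helper name partition_rows and the
value of N are illustrative, not from the notes; the first (N mod P)
ranks absorb the leftover rows when N is not divisible by P.

    #include <mpi.h>
    #include <stdlib.h>

    /* Number of interior rows this rank owns; *first_row gets the
     * 0-based global index of its first interior row. The first
     * (N % P) ranks take one extra row when N is not divisible by P. */
    static int partition_rows(int N, int P, int my_id, int *first_row)
    {
        int base  = N / P;
        int extra = N % P;
        *first_row = my_id * base + (my_id < extra ? my_id : extra);
        return base + (my_id < extra ? 1 : 0);
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int P, my_id, N = 1000;                /* N: example value */
        MPI_Comm_size(MPI_COMM_WORLD, &P);
        MPI_Comm_rank(MPI_COMM_WORLD, &my_id);

        int first_row;
        int my_rows = partition_rows(N, P, my_id, &first_row);

        /* Local mesh: my_rows interior rows plus 2 halo rows.
         * Local row i (1..my_rows) holds global row first_row + i - 1,
         * so interior loops run over 1..my_rows instead of 1..N. */
        double (*X)[N + 2] = malloc((size_t)(my_rows + 2) * sizeof *X);

        /* ... initialization, Jacobi iterations, output ... */

        free(X);
        MPI_Finalize();
        return 0;
    }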
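
One deadlock-free way to do the boundary exchange is MPI_Sendrecv, which
pairs each send with a receive so no rank blocks waiting for the other
side to act first. A sketch, assuming the row layout above (row 0 and
row my_rows+1 are the halo rows); the function name exchange_halo is
illustrative:

    #include <mpi.h>

    /* X points to (my_rows + 2) rows of (N + 2) doubles, stored
     * contiguously. Boundary ranks get MPI_PROC_NULL as a neighbor,
     * which turns the corresponding transfer into a no-op. */
    void exchange_halo(double *X, int my_rows, int N, int my_id, int P)
    {
        int up    = (my_id > 0)     ? my_id - 1 : MPI_PROC_NULL;
        int down  = (my_id < P - 1) ? my_id + 1 : MPI_PROC_NULL;
        int ncols = N + 2;

        /* Send first interior row up; receive bottom halo from below. */
        MPI_Sendrecv(&X[1 * ncols],             ncols, MPI_DOUBLE, up,   0,
                     &X[(my_rows + 1) * ncols], ncols, MPI_DOUBLE, down, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* Send last interior row down; receive top halo from above. */
        MPI_Sendrecv(&X[my_rows * ncols],       ncols, MPI_DOUBLE, down, 1,
                     &X[0 * ncols],             ncols, MPI_DOUBLE, up,   1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    /* The classic deadlock: if every rank instead did
     *     MPI_Send(to neighbor); MPI_Recv(from neighbor);
     * and the sends were not buffered, all ranks would block in
     * MPI_Send waiting for receives that are never posted. */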
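
For the maxdiff test, MPI_Allreduce implements the reduce + broadcast
pattern in a single call: every rank contributes its local maximum and
every rank receives the global maximum. A sketch; the function name
converged and the tolerance parameter are illustrative:

    #include <mpi.h>

    /* Call once per iteration, after computing local_maxdiff over this
     * rank's interior rows. Every rank gets back the same global value,
     * so all ranks exit the main loop on the same iteration. */
    int converged(double local_maxdiff, double tolerance)
    {
        double global_maxdiff;
        MPI_Allreduce(&local_maxdiff, &global_maxdiff, 1, MPI_DOUBLE,
                      MPI_MAX, MPI_COMM_WORLD);
        return global_maxdiff < tolerance;
    }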
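
Finally, a sketch of funneling output through process 0, using the same
per-rank row counts as partition_rows in the first sketch; the function
name write_mesh and the tag value are illustrative. All ranks can send
at once: every message targets rank 0, and rank 0 posts its receives in
rank order, so the file comes out top to bottom and there is no cycle
that could deadlock.

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    void write_mesh(double *X, int my_rows, int N, int my_id, int P,
                    const char *filename)
    {
        int ncols = N + 2;
        if (my_id != 0) {
            /* Send this rank's interior rows (halo excluded). */
            MPI_Send(&X[1 * ncols], my_rows * ncols, MPI_DOUBLE,
                     0, 99, MPI_COMM_WORLD);
        } else {
            FILE *f = fopen(filename, "w");
            /* Rank 0 writes its own rows first... */
            fwrite(&X[1 * ncols], sizeof(double),
                   (size_t)my_rows * ncols, f);
            /* ...then receives and writes each block in rank order. */
            int base = N / P, extra = N % P;
            double *buf = malloc((size_t)(base + 1) * ncols * sizeof *buf);
            for (int r = 1; r < P; r++) {
                int rows = base + (r < extra ? 1 : 0);
                MPI_Recv(buf, rows * ncols, MPI_DOUBLE, r, 99,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                fwrite(buf, sizeof(double), (size_t)rows * ncols, f);
            }
            free(buf);
            fclose(f);
        }
    }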