Let's say you are implementing a distributed sort algorithm running on hundreds of nodes. How does that algorithm work? Where are the bottlenecks?
Anonymous
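Each node sorts its own share of the data, then sends its sorted run up to another node, which does a "reduce" operation (merging the sorted runs it receives) and passes the result up another layer, until a single node holds the fully sorted output. The bottleneck may well be I/O, depending on the network fabric: all the data must first be distributed to the nodes, then moved up layer by layer to a single node at the end.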
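A minimal sketch of that layered merge, simulated in one process with plain lists standing in for nodes. The names `local_sort`, `merge_layer`, `tree_sort` and the `fan_in` parameter are hypothetical, chosen only for illustration; a real system would ship the runs over the network rather than pass them around in memory.

```python
import heapq


def local_sort(partitions):
    """Each simulated node sorts its own partition independently."""
    return [sorted(p) for p in partitions]


def merge_layer(runs, fan_in=2):
    """One "reduce" layer: each parent merges up to fan_in sorted runs."""
    merged = []
    for i in range(0, len(runs), fan_in):
        group = runs[i:i + fan_in]
        # heapq.merge lazily merges already-sorted inputs into one sorted stream.
        merged.append(list(heapq.merge(*group)))
    return merged


def tree_sort(partitions, fan_in=2):
    """Sort locally, then merge layer by layer until one run remains.

    Returns the sorted list and the number of merge layers used.
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    runs = local_sort(partitions)
    layers = 0
    while len(runs) > 1:
        runs = merge_layer(runs, fan_in)
        layers += 1
    return (runs[0] if runs else []), layers


if __name__ == "__main__":
    # Four simulated nodes, each holding an unsorted partition.
    partitions = [[5, 1], [4, 2], [9, 3], [8, 0]]
    result, layers = tree_sort(partitions)
    print(result, layers)
```

In this sketch every element is handed upward once per layer, and the final merge runs on a single list that holds all the data, which is where the I/O concern in the answer comes from.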