Abstract: Runtime systems usually abstract a single node. The Sequential Task Flow (STF) model has proven efficient for shared-memory applications. When harnessing a cluster of nodes, how should the nodes communicate? Through explicit MPI calls written by the user? Through a dedicated paradigm? Or can we keep the same STF paradigm and almost the same code, and let the runtime system handle data transfers? We show how such a system has been successfully implemented on top of the StarPU runtime.