dc.contributor.author: Mendes, Celso
dc.contributor.author: Meneses-Rojas, Esteban
dc.contributor.author: Ni, Xiang
dc.contributor.author: Zheng, Gengbin
dc.description.abstract: Supercomputers have seen an exponential increase in their size in the last two decades. Such a high growth rate is expected to take us to exascale in the timeframe 2018-2022. But, to bring a productive exascale environment about, it is necessary to focus on several key challenges. One of those challenges is fault tolerance. Machines at extreme scale will experience frequent failures and will require the system to avoid or overcome those failures. Various techniques have recently been developed to tolerate failures. The impact of these techniques and their scalability can be substantially enhanced by a parallel programming model called migratable objects. In this paper, we demonstrate how the migratable-objects model facilitates and improves several fault tolerance approaches. Our experimental results on thousands of cores suggest fault tolerance schemes based on migratable objects have low performance overhead and high scalability. Additionally, we present a performance model that predicts a significant benefit of using migratable objects to provide fault tolerance at extreme scale.
dc.publisher: IEEE Computer Society
dc.rights: Attribution-NonCommercial 3.0 Costa Rica
dc.source: IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 7, July 2015
dc.subject: Parallel programming, scalability, resilience
dc.subject: Parallel propagation
dc.subject: Research Subject Categories::TECHNOLOGY::Information technology::Computer science::Computer science
dc.title: Using migratable objects to enhance fault tolerance schemes in supercomputers

Except where otherwise noted, this item's license is described as Attribution-NonCommercial 3.0 Costa Rica