Using migratable objects to enhance fault tolerance schemes in supercomputers

Mendes, Celso; Meneses-Rojas, Esteban; Xiang, Ni; Gengbin, Zheng

dc.contributor.author	Mendes, Celso
dc.contributor.author	Meneses-Rojas, Esteban
dc.contributor.author	Xiang, Ni
dc.contributor.author	Gengbin, Zheng
dc.date.accessioned	2017-06-02T15:00:09Z
dc.date.available	2017-06-02T15:00:09Z
dc.date.issued	2015-07
dc.identifier	https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6862914	es
dc.identifier.uri	https://hdl.handle.net/2238/7150
dc.description.abstract	Supercomputers have seen an exponential increase in their size in the last two decades. Such a high growth rate is expected to take us to exascale in the timeframe 2018-2022. But, to bring a productive exascale environment about, it is necessary to focus on several key challenges. One of those challenges is fault tolerance. Machines at extreme scale will experience frequent failures and will require the system to avoid or overcome those failures. Various techniques have recently been developed to tolerate failures. The impact of these techniques and their scalability can be substantially enhanced by a parallel programming model called migratable objects. In this paper, we demonstrate how the migratable-objects model facilitates and improves several fault tolerance approaches. Our experimental results on thousands of cores suggest fault tolerance schemes based on migratable objects have low performance overhead and high scalability. Additionally, we present a performance model that predicts a significant benefit of using migratable objects to provide fault tolerance at extreme scale.	es
dc.language.iso	eng	es
dc.publisher	IEEE Computer Society	es
dc.rights	open access	*
dc.rights.uri	https://creativecommons.org/licenses/by-nc/3.0/cr/	*
dc.source	IEEE Transactions on Parallel and Distributed Systems, Vol. 26, no. 7, July 2015	es
dc.subject	Parallel programming, scalability, resilience	es
dc.subject	Parallel propagation	es
dc.subject	Scalability	es
dc.subject	Research Subject Categories::TECHNOLOGY::Information technology::Computer science::Computer science	es
dc.title	Using migratable objects to enhance fault tolerance schemes in supercomputers	es
dc.type	original article	es

Files in this item

Name:: Using Migratable Objects to ...
Size:: 961.0Kb
Format:: PDF

Name:: license_rdf
Size:: 1.346Kb
Format:: application/rdf+xml

This item appears in the following collection(s)

Articles [19]

Except where otherwise noted, this item's license is described as open access