• A fault-tolerance protocol for parallel applications with communication imbalance 

      Meneses-Rojas, Esteban (IEEE, 2015)
      The predicted failure rates of future supercomputers loom over the groundbreaking research that large machines are expected to foster. Therefore, resilient extreme-scale applications are an absolute necessity to effectively use ...
• Camel: collective-aware message logging

      Kalé, Laxmikant; Meneses-Rojas, Esteban (Kluwer Academic Publishers, 2015-03)
      The continuous progress in the performance of supercomputers has made possible the understanding of many fundamental problems in science. Simulation, the third scientific pillar, constantly demands more powerful machines ...