Senior Lecturer
Email: rgran@unizar.es
Address: Campus Río Ebro, University of Zaragoza
C/María de Luna 1, Ada Byron Building,
50018, Zaragoza, Spain
ABOUT ME
Rubén Gran Tejero graduated in Computer Science from the University of Zaragoza, Spain. He received his Ph.D. from the Polytechnic University of Catalonia (UPC), Spain, in 2010. Since 2010, he has been an Associate Professor at the Department of Computer Science and Systems Engineering, University of Zaragoza. His research interests include hard real-time systems, hardware for reducing worst-case execution time and energy consumption, efficient processor microarchitecture, and effective programming for parallel and heterogeneous systems. Dr. Gran Tejero is a member of the Aragon Institute of Engineering Research (I3A) and the Spanish Society of Computer Architecture (SARTECO).
PUBLICATIONS
2024
Journal Articles
Toca-Díaz, Yamilka; Tejero, Rubén Gran; Valero, Alejandro
Shift-and-Safe: Addressing permanent faults in aggressively undervolted CNN accelerators Journal Article
In: Journal of Systems Architecture, vol. 157, pp. 1-13, 2024, ISSN: 1383-7621.
@article{Toca-Díaz2024,
title = {Shift-and-Safe: Addressing permanent faults in aggressively undervolted CNN accelerators},
author = {Yamilka Toca-Díaz and Rubén Gran Tejero and Alejandro Valero},
url = {https://www.sciencedirect.com/science/article/pii/S1383762124002297},
doi = {10.1016/j.sysarc.2024.103292},
issn = {1383-7621},
year = {2024},
date = {2024-12-01},
urldate = {2024-12-01},
journal = {Journal of Systems Architecture},
volume = {157},
pages = {1-13},
abstract = {Underscaling the supply voltage (Vdd) to ultra-low levels below the safe-operation threshold voltage (Vmin) holds promise for substantial power savings in digital CMOS circuits. However, these benefits come with pronounced challenges due to the heightened risk of bitcell permanent faults stemming from process variations in current technology node sizes. This work delves into the repercussions of such faults on the accuracy of a 16-bit fixed-point Convolutional Neural Network (CNN) inference accelerator powering on-chip activation memories at ultra-low Vdd voltages. Through an in-depth examination of fault patterns, memory usage, and statistical analysis of activation values, this paper introduces Shift-and-Safe: two novel and cost-effective microarchitectural techniques exploiting the presence of outlier activation values and the underutilization of activation memories. Particularly, activation outliers enable a shift-based data representation that reduces the impact of faults on the activation values, whereas the memory underutilization is exploited to maintain a safe replica of affected activations in idle memory regions. Remarkably, these mechanisms do not add any burden to the programmer and are independent of application characteristics, rendering them easily deployable across real-world CNN accelerators. Experimental results show that Shift-and-Safe maintains the CNN accuracy even in the presence of almost a quarter of the total activations with faults. In addition, average energy savings are by 5% and 11% compared to the state-of-the-art approach and a conventional accelerator supplied at Vmin, respectively.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Toca-Díaz, Yamilka; Palacios, Reynier Hernández; Tejero, Rubén Gran; Valero, Alejandro
Flip-and-Patch: A fault-tolerant technique for on-chip memories of CNN accelerators at low supply voltage Journal Article
In: Microprocessors and Microsystems, vol. 106, pp. 1-13, 2024, ISSN: 0141-9331.
@article{Toca-Díaz2024b,
title = {Flip-and-Patch: A fault-tolerant technique for on-chip memories of CNN accelerators at low supply voltage},
author = {Yamilka Toca-Díaz and Reynier Hernández Palacios and Rubén Gran Tejero and Alejandro Valero},
url = {https://www.sciencedirect.com/science/article/pii/S0141933124000188},
doi = {10.1016/j.micpro.2024.105023},
issn = {0141-9331},
year = {2024},
date = {2024-04-01},
urldate = {2024-04-01},
journal = {Microprocessors and Microsystems},
volume = {106},
pages = {1-13},
abstract = {Aggressively reducing the supply voltage (Vdd) below the safe threshold voltage (Vmin) can effectively lead to significant energy savings in digital circuits. However, operating at such low supply voltages poses challenges due to a high occurrence of permanent faults resulting from manufacturing process variations in current technology nodes. This work addresses the impact of permanent faults on the accuracy of a Convolutional Neural Network (CNN) inference accelerator using on-chip activation memories supplied at low Vdd below Vmin. Based on a characterization study of fault patterns, this paper proposes two low-cost microarchitectural techniques, namely Flip-and-Patch, which maintain the original accuracy of CNN applications even in the presence of a high number of faults caused by operating at Vdd < Vmin. Unlike existing techniques, Flip-and-Patch remains transparent to the programmer and does not rely on application characteristics, making it easily applicable to real CNN accelerators.
Experimental results show that Flip-and-Patch ensures the original CNN accuracy with a minimal impact on system performance (less than 0.05% for every application), while achieving average energy savings of 10.5% and 46.6% in activation memories compared to a conventional accelerator operating at safe and nominal supply voltages, respectively. Compared to the state-of-the-art ThUnderVolt technique, which dynamically adjusts the supply voltage at run time and discarding any energy overhead for such an approach, the average energy savings are by 3.2%.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Proceedings Articles
Toca-Díaz, Yamilka; Tejero, Rubén Gran; Valero, Alejandro
Ensuring the Accuracy of CNN Accelerators Supplied at Ultra-Low Voltage Proceedings Article
In: Proceedings of the 42nd IEEE International Conference on Computer Design (ICCD 2024), pp. 92-95, 2024, ISBN: 979-8-3503-8040-8.
@inproceedings{Toca-Díaz2024c,
title = {Ensuring the Accuracy of CNN Accelerators Supplied at Ultra-Low Voltage},
author = {Yamilka Toca-Díaz and Rubén Gran Tejero and Alejandro Valero},
url = {https://ieeexplore.ieee.org/document/10817950},
doi = {10.1109/ICCD63220.2024.00024},
isbn = {979-8-3503-8040-8},
year = {2024},
date = {2024-11-18},
urldate = {2024-11-18},
booktitle = {Proceedings of the 42nd IEEE International Conference on Computer Design (ICCD 2024)},
pages = {92-95},
abstract = {Underscaling the supply voltage (Vdd) to ultra-low levels below the safe-operation threshold voltage (Vmin) brings significant energy savings in digital CMOS circuits but introduces reliability challenges due to increased risk of bitcell permanent faults. This work explores the impact of such faults on the accuracy of a CNN inference accelerator supplying on-chip activation memories at ultra-low Vdd. By examining fault patterns, activation values, and memory usage, this paper proposes two microarchitectural techniques exploiting activation outliers and activation memory underutilization. These approaches are cost-effective, do not require programmer intervention, and are application-independent. Experimental results show that the proposed approaches maintain the original CNN accuracy and achieve energy savings by 2.1% and 8.2% compared to the state-of-the-art technique and a conventional accelerator supplied at Vmin, respectively, with a negligible impact on the system performance (less than 0.25%).},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2023
Proceedings Articles
Toca-Díaz, Yamilka; Muñoz, Nicolás Landeros; Tejero, Rubén Gran; Valero, Alejandro
On Fault-Tolerant Microarchitectural Techniques for Voltage Underscaling in On-Chip Memories of CNN Accelerators Proceedings Article
In: Proceedings of the 26th Euromicro Conference on Digital System Design (DSD 2023), pp. 138-145, 2023, ISBN: 979-8-3503-4419-6.
@inproceedings{Toca-Díaz2023,
title = {On Fault-Tolerant Microarchitectural Techniques for Voltage Underscaling in On-Chip Memories of CNN Accelerators},
author = {Yamilka Toca-Díaz and Nicolás Landeros Muñoz and Rubén Gran Tejero and Alejandro Valero},
url = {https://ieeexplore.ieee.org/document/10456839},
doi = {10.1109/DSD60849.2023.00029},
isbn = {979-8-3503-4419-6},
year = {2023},
date = {2023-09-06},
urldate = {2023-09-06},
booktitle = {Proceedings of the 26th Euromicro Conference on Digital System Design (DSD 2023)},
pages = {138-145},
abstract = {Aggressively underscaling the supply voltage (Vdd) below the safe voltage (Vmin) margin is an effective solution to attain substantial energy savings. Unfortunately, operating at such low voltages is challenging due to the high number of permanent faults as a result of variations in the manufacturing process of current technology nodes. This work characterizes the impact of permanent faults on the accuracy of a Convolutional Neural Network (CNN) inference accelerator with on-chip activation memories supplied at low Vdd below Vmin. Based on these observations, this paper proposes a couple of low-cost microarchitectural techniques, referred to as flipping and patching, that ensure the accuracy of CNN applications despite the presence of permanent faults. Contrary to prior work, the proposed techniques are transparent to the programmer and do not depend on application characteristics. Experimental results show that the proposed techniques maintain the original CNN accuracy with a minimal impact on system performance (less than 0.05%), while reducing the energy consumption of activation memories by 11.2% and 46.7% compared to those of a conventional accelerator operating at safe and nominal supply voltages, respectively.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2022
Journal Articles
Muñoz, Nicolás Landeros; Valero, Alejandro; Tejero, Rubén Gran; Zoni, Davide
Gated-CNN: Combating NBTI and HCI aging effects in on-chip activation memories of Convolutional Neural Network accelerators Journal Article
In: Journal of Systems Architecture, vol. 128, pp. 1-13, 2022, ISSN: 1383-7621.
@article{Muñoz2022,
title = {Gated-CNN: Combating NBTI and HCI aging effects in on-chip activation memories of Convolutional Neural Network accelerators},
author = {Nicolás Landeros Muñoz and Alejandro Valero and Rubén Gran Tejero and Davide Zoni},
url = {https://www.sciencedirect.com/science/article/pii/S1383762122001072},
doi = {10.1016/j.sysarc.2022.102553},
issn = {1383-7621},
year = {2022},
date = {2022-07-01},
urldate = {2022-07-01},
journal = {Journal of Systems Architecture},
volume = {128},
pages = {1-13},
abstract = {Negative Bias Temperature Instability (NBTI) and Hot Carrier Injection (HCI) are two of the main reliability threats in current technology nodes. These aging phenomena degrade the transistor’s threshold voltage (Vth) over the lifetime of a digital circuit, resulting in slower transistors that eventually lead to a faulty operation when the critical paths become longer than the processor cycle time. Among all the transistors on a chip, the most vulnerable transistors to such wearout effects are those used to implement SRAM storage, since memory cells are continuously degrading. In particular, NBTI ages PMOS cell transistors when a given logic value is stored for a long period (i.e., a long duty cycle), whereas HCI ages NMOS cell transistors not only when the stored value flips but also when it is accessed. This work focuses on mitigating aging in the on-chip SRAM memories of Convolutional Neural Network (CNN) accelerators storing activations. This paper makes two main contributions. At the software level, we quantify the aging induced by current CNN benchmarks with a characterization study of duty cycle, flip, and access patterns in every activation memory cell. Based on the insights from this study, this work proposes a novel microarchitectural technique, Gated-CNN, that ensures a uniform aging degradation of every memory cell. To do so, Gated-CNN exploits power-gating and address rotation techniques tailored to the memory demands and temporal/spatial localities exhibited by CNN applications, as well as the memory organization and management of CNN accelerators. Experimental results show that, compared to a conventional design, the average Vth degradation savings are at least as much as 49% depending on the type of transistor.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}