CLEVER: Cross-Layer Error Verification Evaluation and Reporting

Rafael Kioji Vivas Maeda, Frank Sill Torres

Federal University of Minas Gerais (UFMG), School of Engineering

Belo Horizonte, Brazil



Page 2: Focus / Principal idea

System Health Management approach for Embedded Systems / SoCs

Page 3: Outline

1. Motivation

2. Preliminaries

3. CLEVER

4. Verification Environment

5. Conclusion

Page 4: Motivation: System Complexity

Rising complexity of Embedded Systems / Systems-on-Chip (SoC)

[Figure: system complexity over the years 2011, 2014, 2018, 2022, 2026, showing SoC Logic Size, SoC Memory Size, and # Processing Engines / SoC (axis ticks 1,000 to 7,000). Source: ITRS, 2013]

Page 5: Motivation: Faults

Due to technology scaling, considerable increase of:

– Temporary faults

– Aging and permanent faults

[Figure: FIT (failures in 10^9 h), axis from 0 to 80, for Stratix (130 nm), Stratix II (90 nm), Stratix III (65 nm), Stratix IV (40 nm), and Stratix V (25 nm). Source: Altera, Reliability Report 56, 2013]
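
The FIT unit on this slide can be made concrete with a short conversion. The sketch below assumes a constant failure rate (exponential lifetime model), which the slides do not state, and the FIT value used is a hypothetical example rather than a number read from the chart.

```python
import math

HOURS_PER_FIT_UNIT = 1e9  # FIT counts failures per 10^9 device-hours


def fit_to_failure_rate(fit: float) -> float:
    """Convert a FIT value to a failure rate in failures per hour."""
    return fit / HOURS_PER_FIT_UNIT


def mttf_hours(fit: float) -> float:
    """Mean time to failure in hours, assuming a constant failure rate."""
    return HOURS_PER_FIT_UNIT / fit


def failure_probability(fit: float, hours: float) -> float:
    """Probability of at least one failure within `hours` (exponential model)."""
    return 1.0 - math.exp(-fit_to_failure_rate(fit) * hours)


example_fit = 50.0           # hypothetical value, not taken from the Altera report
ten_years = 10 * 365 * 24    # operating hours in ten years
print(f"MTTF: {mttf_hours(example_fit):.3g} h")
print(f"P(failure within 10 years): {failure_probability(example_fit, ten_years):.4%}")
```

Under this model the failure probability over a fixed mission time grows almost linearly with the FIT value while the rate is small, which is why a rising FIT trend across technology nodes matters at system level.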

Page 6: Preliminaries: Reliability

Technique classification

– Avoidance (e.g., Triple Modular Redundancy)

– Detection and Recovery (e.g., Rollback)

– Prediction (e.g., PHM, S.M.A.R.T.)

Prognostics and Health Management (PHM)

– Runtime monitoring

– Remaining Useful Lifetime (RUL) estimation and extension
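
As an illustration of the first class, Triple Modular Redundancy runs three replicas of a computation and takes a majority vote, so a single faulty replica is outvoted. The following is a minimal software sketch; the replica functions `healthy` and `faulty` are hypothetical and stand in for hardware modules.

```python
from collections import Counter
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def majority_vote(outputs: Sequence[T]) -> T:
    """Return the value produced by a strict majority of the replicas."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise RuntimeError("no majority: replicas disagree")
    return value


def run_tmr(replicas: Sequence[Callable[[int], int]], x: int) -> int:
    """Run every replica on the same input and vote on the results."""
    return majority_vote([replica(x) for replica in replicas])


def healthy(x: int) -> int:
    return x * x


def faulty(x: int) -> int:
    return x * x + 1  # hypothetical fault that corrupts the result


# One faulty replica out of three is outvoted by the two healthy ones.
print(run_tmr([healthy, faulty, healthy], 7))
```

The voter raises an error when no strict majority exists, which is the point where a detection-and-recovery technique such as rollback would have to take over.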

Page 7: Preliminaries: Remaining Useful Lifetime (RUL)

[Figure: failure rate λ over time in operation t, with curves λ_old(t) and λ_new(t), a marked level λ_acc, and marked times t_RUL_old and t_RUL_new. Source: www.wikipedia.com]
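
One way to read the figure labels is that the RUL ends when the failure rate λ(t) reaches an acceptable level λ_acc, and that a flatter curve λ_new(t) moves this point from t_RUL_old to t_RUL_new. The sketch below follows that reading. The Weibull wear-out model, its parameters, and the threshold value are assumptions chosen for illustration and do not come from the slides.

```python
from typing import Callable


def weibull_hazard(beta: float, eta: float) -> Callable[[float], float]:
    """Wear-out failure rate lambda(t) = (beta / eta) * (t / eta) ** (beta - 1)."""
    return lambda t: (beta / eta) * (t / eta) ** (beta - 1.0)


def time_to_threshold(hazard: Callable[[float], float], lam_acc: float,
                      t_max: float, tol: float = 1e-6) -> float:
    """Bisection for the earliest t in [0, t_max] with hazard(t) >= lam_acc.

    Assumes the hazard is non-decreasing on the interval.
    """
    if hazard(t_max) < lam_acc:
        raise ValueError("threshold not reached before t_max")
    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if hazard(mid) >= lam_acc:
            hi = mid
        else:
            lo = mid
    return hi


lam_acc = 0.1                                    # hypothetical acceptable failure rate
lam_old = weibull_hazard(beta=2.0, eta=10.0)     # hypothetical curve before mitigation
lam_new = weibull_hazard(beta=2.0, eta=20.0)     # hypothetical curve after mitigation
t_rul_old = time_to_threshold(lam_old, lam_acc, t_max=100.0)
t_rul_new = time_to_threshold(lam_new, lam_acc, t_max=100.0)
print(t_rul_old, t_rul_new)
```

With these assumed parameters the two hazards are linear in t, so the crossing points can be checked by hand: 0.02 t = 0.1 and 0.005 t = 0.1.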

Page 8: CLEVER: Origination of Approach

Prediction of possible system failure is important for future SoCs

Limited effectiveness and efficiency of single-layer solutions

Straightforward system integration required

CLEVER: Cross-Layer Error Verification Evaluation and Reporting

Page 9: CLEVER: Architecture

Sensors

– Sensing Device

– Communication

Processing Unit (PU)

– Data acquisition

– Prediction

– Scheduler

Memories

Sensor Bus

System Bus

Page 10: CLEVER: Architecture - Sensor

Two principal parts

– Sensing device

– Communication with PU

Sensing on different levels:

– Physical / electrical (Temp., Voltage, …)

– Architectural (NBTI, detected faults, …)

– System (active time, load, …)
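
The three sensing levels suggest that every reading sent to the PU carries its layer alongside the measured quantity. A minimal sketch of such a record follows; the type names, field names, and sample values are hypothetical and not part of the CLEVER design as presented.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List


class Layer(Enum):
    PHYSICAL = "physical/electrical"
    ARCHITECTURAL = "architectural"
    SYSTEM = "system"


@dataclass(frozen=True)
class Reading:
    sensor_id: str   # hypothetical identifier of the sensing device
    layer: Layer     # level at which the quantity was sensed
    quantity: str    # what was measured
    value: float


def group_by_layer(readings: Iterable[Reading]) -> Dict[Layer, List[Reading]]:
    """Collect readings per layer so a cross-layer evaluation can compare them."""
    grouped: Dict[Layer, List[Reading]] = {layer: [] for layer in Layer}
    for reading in readings:
        grouped[reading.layer].append(reading)
    return grouped


samples = [
    Reading("t0", Layer.PHYSICAL, "temperature_C", 71.5),
    Reading("v0", Layer.PHYSICAL, "voltage_V", 1.08),
    Reading("f0", Layer.ARCHITECTURAL, "detected_faults", 3.0),
    Reading("l0", Layer.SYSTEM, "load", 0.82),
]
grouped = group_by_layer(samples)
for layer, items in grouped.items():
    print(layer.value, [r.quantity for r in items])
```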

Page 11: CLEVER: Architecture - Processing Unit

Sensor data acquisition

Error Prediction

Arbitration

Interface to Operating System (optional)

Memory access
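
The PU functions listed above can be sketched as one polling loop: acquire sensor data into memory, predict from the stored history, and arbitrate between competing warnings. Everything in the sketch is an assumption made for illustration. The slides leave the prediction algorithm as a next step, so a simple moving-average threshold stands in for it, and "arbitration" is read here as selecting the most urgent warning, which the slides do not specify.

```python
from statistics import mean
from typing import Callable, Dict, List, Optional, Tuple

Warning_ = Tuple[str, float]  # (sensor name, severity relative to its limit)


class ProcessingUnit:
    def __init__(self, limits: Dict[str, float], window: int = 3) -> None:
        self.limits = limits                        # hypothetical per-sensor limits
        self.window = window                        # samples used for the moving average
        self.sensors: Dict[str, Callable[[], float]] = {}
        self.memory: Dict[str, List[float]] = {}    # stored sensor history

    def attach(self, name: str, read: Callable[[], float]) -> None:
        self.sensors[name] = read
        self.memory[name] = []

    def acquire(self) -> None:
        """Sensor data acquisition: poll every sensor once and store the value."""
        for name, read in self.sensors.items():
            self.memory[name].append(read())

    def predict(self) -> List[Warning_]:
        """Placeholder prediction: flag sensors whose recent mean exceeds the limit."""
        warnings: List[Warning_] = []
        for name, history in self.memory.items():
            if not history:
                continue
            severity = mean(history[-self.window:]) / self.limits[name]
            if severity > 1.0:
                warnings.append((name, severity))
        return warnings

    def arbitrate(self, warnings: List[Warning_]) -> Optional[Warning_]:
        """Pick the warning with the highest severity, or None if there is none."""
        return max(warnings, key=lambda w: w[1], default=None)


temps = iter([70.0, 85.0, 95.0])   # hypothetical temperature samples
loads = iter([0.5, 0.6, 0.7])      # hypothetical load samples
pu = ProcessingUnit(limits={"temp_C": 80.0, "load": 0.9})
pu.attach("temp_C", lambda: next(temps))
pu.attach("load", lambda: next(loads))
for _ in range(3):
    pu.acquire()
most_urgent = pu.arbitrate(pu.predict())
print(most_urgent)
```

The optional operating-system interface would be the consumer of `most_urgent`, for example to migrate load away from a unit that is predicted to fail.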

Page 12: CLEVER: Architecture - OS Integration (optional)

Page 13: CLEVER: Verification Flow

SystemC implementation

Communication based on TLM (Transaction Level Modeling)

Verification based on Message Sequence Chart (MSC)
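
The idea of checking transaction-level communication against a message sequence chart can be illustrated independently of SystemC. The sketch below is plain Python, not the SystemC TLM-2.0 API, and it is not the authors' tool: the `Transaction` record, the module names, and the one-line arrow format are all hypothetical. It logs transactions, renders them as chart arrows, and tests whether an expected message order occurs in the log.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Transaction:
    initiator: str   # module that starts the transaction
    target: str      # module that receives it
    command: str     # e.g. "READ" or "WRITE"


def to_msc_lines(log: List[Transaction]) -> List[str]:
    """Render each transaction as one 'sender -> receiver : message' arrow."""
    return [f"{t.initiator} -> {t.target} : {t.command}" for t in log]


def matches_chart(log: List[Transaction], expected: List[str]) -> bool:
    """True if the expected arrows occur in the log in the same order.

    Unrelated transactions in between are allowed (subsequence check).
    """
    remaining = iter(to_msc_lines(log))
    # `line in remaining` advances the iterator, so order is enforced.
    return all(line in remaining for line in expected)


log = [
    Transaction("PU", "TempSensor", "READ"),
    Transaction("PU", "Memory", "WRITE"),
    Transaction("PU", "OS", "WRITE"),
]
expected = ["PU -> TempSensor : READ", "PU -> Memory : WRITE"]
print("\n".join(to_msc_lines(log)))
print(matches_chart(log, expected))
```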

Page 14: CLEVER: Verification - TLM2MSC

Page 15: Conclusion

Increasing design complexity and fault probability demand solutions

PHM solutions permit prediction of (probable) system failure

CLEVER: Cross-layer approach for Error Detection and Reporting

System architecture of CLEVER defined

Feasibility of the CLEVER architecture verified by simulation

Next steps:

– Implementation of prediction algorithm

– Test case

Page 16: Thank you!

OptMAlab / ART

www.asic-reliability.com