# SpecShield: Shielding Speculative Data from Microarchitectural Covert Channels

Kristin Barber, Anys Bacha\*, Li Zhou, Yinqian Zhang, Radu Teodorescu



THE OHIO STATE UNIVERSITY

- Department of Computer Science and Engineering
  - The Ohio State University
  - http://arch.cse.ohio-state.edu
  - \*The University of Michigan












Kristin Barber, SpecShield: Shielding Speculative Data from Microarchitectural Covert Channels





- In Jan. 2018, a new class of vulnerabilities was unveiled
  - Ability to bypass isolation boundaries and violate control-flow integrity
  - Manipulation of speculative mechanisms to perform computation of interest to attacker
- Meltdown demonstrated reading entire kernel memory from user process!
  - Indicates how powerful these attacks can be
  - Process isolation, sandboxed environments, virtualized environments are all susceptible
  - The attacks don't rely on an implementation "bug"; they use features correctly, as intended
- The past 1.5 years have seen a wide range of attacks with variations on this theme





[Figure: collage of headlines and paper front pages on attacks in this class: "'Foreshadow' attack affects Intel chips"; a news article on Spectre and Meltdown (headline truncated in the source); "NetSpectre: Read Arbitrary Memory over Network" (Michael Schwarz, Martin Schwarzl, Moritz Lipp, Daniel Gruss; Graz University of Technology); Fallout; RIDL.]

PACT 2019
# Outline

- Transient Execution Attacks
- Deep Dive Example
- SpecShield Defense
- Evaluation
- Conclusion







# Transient Execution Attacks

- Speculation allows the execution of **incorrect** instructions
- Under the right set of conditions, allows for retrieval of restricted data
- But speculative results might be discarded?
  - Exploit µarch side-effects of transient execution
  - Architectural changes are discarded, µarch changes persist
  - Attacker encodes data into µarch state -> covert channel
    - Ex., traditional cache timing attack techniques, infer memory access patterns of victim
    - Covert channel functions as medium for send/recv data outside the speculative window

## Sources of Speculation

- conditional branches
- exceptions
- speculative store bypass
- value speculation
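The point that architectural changes are discarded while µarch changes persist can be illustrated with a toy model. This is only a sketch: `ToyCore`, `run_transient`, and the 64-byte line size are hypothetical names and values chosen for illustration, not part of SpecShield or of any real simulator.

```python
from dataclasses import dataclass, field

LINE_SIZE = 64  # assumed cache-line size in bytes (illustrative)


@dataclass
class ToyCore:
    """Hypothetical core: registers are architectural, the cache is microarchitectural."""
    memory: dict
    regs: dict = field(default_factory=dict)
    cached_lines: set = field(default_factory=set)

    def load(self, dst, addr):
        # A load writes a register and, as a side effect, fills a cache line.
        self.regs[dst] = self.memory.get(addr, 0)
        self.cached_lines.add(addr // LINE_SIZE)

    def run_transient(self, ops):
        """Execute loads down a mis-speculated path, then squash them."""
        checkpoint = dict(self.regs)
        for dst, addr in ops:
            self.load(dst, addr)
        # Squash: architectural state is rolled back, the cache is not.
        self.regs = checkpoint

    def is_cached(self, addr):
        return addr // LINE_SIZE in self.cached_lines


core = ToyCore(memory={0x1000: 42})
core.run_transient([("r1", 0x1000)])
print(core.regs)               # {}
print(core.is_cached(0x1000))  # True
```

After the squash the register file shows no trace of the load, but the cache footprint remains observable. That residue is what a covert channel reads.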













- Illustrative example: Spectre-v1 (bounds-check bypass)
- Transient execution attacks require two phases:
  - (1) Speculation primitive allows access to restricted data
  - (2) Utilization of covert channel to disclose data outside of speculative window
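The two phases can be sketched as a toy Python model of the bounds-check-bypass gadget. Everything here is an assumption made for illustration: the flat `memory` list, the `mispredict` flag standing in for a mistrained branch predictor, and a `set` standing in for the cache. It models the information flow only and performs no real speculation or timing measurement.

```python
STRIDE = 256  # one probe slot per byte value, as in array2[array1[x] * 256]

array1 = [1, 2, 3, 4]               # data the victim is allowed to index
secret = [ord(c) for c in "KEY"]    # restricted data located right after array1
memory = array1 + secret            # flat toy address space
array1_size = len(array1)


def victim(x, cache, mispredict):
    """Toy model of: if (x < array1_size) y = array2[array1[x] * 256];"""
    in_bounds = x < array1_size
    if in_bounds or mispredict:
        value = memory[x]            # phase 1: (possibly transient) access
        cache.add(value * STRIDE)    # dependent load touches array2[value * 256]
    # Nothing is returned: an out-of-bounds result is never committed.


def recover_byte(cache):
    """Phase 2: probe array2 and report the single 'fast' (cached) slot, if any."""
    hot = [v for v in range(256) if v * STRIDE in cache]
    return hot[0] if len(hot) == 1 else None


def leak(offset):
    cache = set()                    # attacker starts with the probe array flushed
    victim(array1_size + offset, cache, mispredict=True)
    return recover_byte(cache)


leaked = "".join(chr(leak(i)) for i in range(len(secret)))
print(leaked)  # KEY
```

With `mispredict=False` an out-of-bounds `x` leaves the cache untouched and `recover_byte` returns `None`, which corresponds to the bounds check doing its job architecturally.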













## Deep Dive Example: Spectre-v1

[Figure: animated walkthrough of the attack. An attacker and a victim share main memory (attacker region, shared library, victim region); the attacker uses an offset to the restricted data that is known a priori.]





















































- Software-only mitigation solutions
  - Generally very high overhead for good coverage
    - Attack exploits traces left in µarch, opaque to programmer by design
    - Coarse-grain, rely on serializing instructions
  - Ad hoc and exploit-specific: they rely on manual insertion, and static analysis has been shown to miss corner cases
- Existing hardware solutions
  - Lower-overhead, better coverage
  - Mostly focused on closing specific covert channels



## SpecShield: Our Proposed Solution

**SpecShield** is a family of µarch mitigation solutions with different isolation properties, allowing isolation to be traded off against performance

**Goal:** Isolate transient data from covert channel transmission

### **Threat Model**

- Sensitive data resides anywhere in the memory hierarchy
  - Accessed through a transient misspeculated instruction
- Any covert channel can be used to exfiltrate secret data
  - Caches, SIMD units, TLBs, etc.

A more **general** solution that prevents covert channel formation

**Key Observation**: The leakage source by definition has a dependence on the secret data

- Prevent covert channel formation by policing speculative data use by dependent instructions
  - Speculative status is determined by the producing instruction
  - Monitor the speculative status of loads
- Delay forwarding until the window of speculation is closed
- Traditionally, instructions are considered non-speculative when they reach the ROB head

[Figure: OOO processor block diagram. Front end (fetch/decode, iTLB); OOO execution engine (rename, ROB, retire, scheduler/reservation stations, register file, Int/FP/SIMD execution cluster); memory subsystem (DTLB, L1 data, L2, main memory). The secret is marked in the register file, the execution cluster, L1 data, L2, and main memory.]

Running example:

    if (x < array1_size)
        y = array2[array1[x] * 256];











- Wait until load reaches ROB head before forwarding to dependent instruction
- When data returns from memory (cache)
  - Register file is updated
  - Delay forwarding data to dependent instructions
- All data guaranteed to be non-speculative before use
- Downside: relatively large performance impact
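The forwarding rule described above can be sketched as a small predicate over a toy reorder buffer. This is a minimal model under stated assumptions: `RobEntry`, `data_ready`, and `may_forward_baseline` are hypothetical names, and real hardware would implement the check in the wakeup/bypass logic rather than as a function call.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class RobEntry:
    name: str
    is_load: bool = False
    data_ready: bool = False  # the load's value has returned from the cache/memory


def may_forward_baseline(rob, entry):
    """Forward a load's value to dependents only once the load is at the ROB head."""
    return entry.is_load and entry.data_ready and rob[0] is entry


rob = deque([
    RobEntry("br"),                                 # older branch, not yet retired
    RobEntry("ld", is_load=True, data_ready=True),  # value is back, still speculative
    RobEntry("use"),                                # dependent instruction
])
ld = rob[1]
before = may_forward_baseline(rob, ld)
rob.popleft()  # the branch resolves correctly and retires
after = may_forward_baseline(rob, ld)
print(before, after)  # False True
```

In this model the dependent instruction stalls for as long as any older instruction remains in the ROB, which is where the performance cost noted above comes from.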









- Goal: Relax constraints on allowable forwarding
- **Observation:** Most loads are safe earlier than retirement
- Define Early Resolution Point (ERP), instruction in the ROB where:
  - All older branch instructions have resolved
  - All older loads and stores have had addresses computed
  - No branch mispredictions or memory-access exceptions











- Goal: Relax constraints on allowable forwarding
- Observation: Most loads are safe earlier than retirement
- Define Early Resolution Point (ERP), instruction in the ROB where:
  - All older branch instructions have resolved
  - All older loads and stores have had addresses computed
  - No branch mispredictions or memory-access exceptions



### **Reorder Buffer**







- Goal: Relax constraints on allowable forwarding
- **Observation:** Most loads are safe earlier than retirement
- Define Early Resolution Point (ERP), instruction in the ROB where:
  - All older branch instructions have resolved
  - All older loads and stores have had addresses computed
  - No branch mispredictions or memory-access exceptions
- Loads behind ERP can be considered safe and allowed to forward data



### **Reorder Buffer**

Figure: ROB contents with per-entry FP (Forward Pending) bits. From tail to head (commit), the labeled entries are `SUB ...,r2,...`, `ADD ...,r1,...`, `BR <c2>,target1`, `LD r2, mem(B)`, `BR <c1>,target1`, and `LD r1, mem(A)`, with the ERP marker between the tail and the head.





- Goal: Relax constraints on allowable forwarding
- **Observation:** Most loads are safe earlier than retirement
- Define the Early Resolution Point (ERP): the instruction in the ROB where:
  - All older branch instructions have resolved
  - All older loads and stores have had addresses computed
  - No older branch mispredictions or memory-access exceptions
- Loads older than the ERP can be considered safe and are allowed to forward data
- Much lower performance impact, equivalent security
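The three ERP conditions can be read as a scan from the ROB head that stops at the first entry violating any of them. The sketch below is a simplified model, not the hardware design: `Entry`, `blocks_erp`, `erp_index`, and `load_may_forward` are hypothetical names, and index 0 is assumed to be the head.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    kind: str                  # "branch", "load", "store", or "alu"
    resolved: bool = False     # branches: outcome known
    mispredicted: bool = False
    addr_ready: bool = False   # loads/stores: address computed
    faulted: bool = False      # memory-access exception raised


def blocks_erp(e):
    """True if this entry prevents the ERP from moving past it."""
    if e.kind == "branch":
        return (not e.resolved) or e.mispredicted
    if e.kind in ("load", "store"):
        return (not e.addr_ready) or e.faulted
    return False


def erp_index(rob):
    """Index of the first blocking entry, or len(rob) if none blocks."""
    for i, e in enumerate(rob):
        if blocks_erp(e):
            return i
    return len(rob)


def load_may_forward(rob, idx):
    """A load may forward once every entry up to and including it is non-blocking."""
    return rob[idx].kind == "load" and idx < erp_index(rob)


rob = [
    Entry("load", addr_ready=True),    # head
    Entry("branch", resolved=True),
    Entry("load", addr_ready=True),
    Entry("branch"),                   # unresolved: the ERP stops here
    Entry("alu"),
    Entry("load", addr_ready=True),
]
print(erp_index(rob), [load_may_forward(rob, i) for i in range(len(rob))])
```

Compared with waiting for the ROB head, the load at index 2 can forward here even though it has not reached the head, while the load behind the unresolved branch still waits.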













| ROB             | CCR | FP | Taint |               |
|-----------------|-----|----|-------|---------------|
| MUL r4,r3,      | 0   | 0  | 0     | tail          |
| SUB r3,r2,      | 0   | 0  | 0     |               |
| LD, addr(r2)    | 1   | 0  | 0     |               |
|                 | 0   | 0  | 0     |               |
| ADD r2,r1,      | 0   | 1  | 0     |               |
|                 | 0   | 0  | 0     |               |
| BNEZ r1,target1 | 1   | 0  | 0     |               |
|                 | 0   | 0  | 0     |               |
|                 | 0   | 0  | 0     |               |
| LD r1, mem(B)   | 1   | 1  | 1     |               |
|                 | 0   | 0  | 0     | ERP           |
| AND,r0,         | 0   | 0  | 0     |               |
| LD r0, mem(A)   | 1   | 0  | 0     |               |
|                 | 0   | 0  | 0     | head (commit) |

| High CCR | LDs, Branches |
|----------|---------------|
| Low CCR  | Rest          |



- A covert channel-specific optimization
- Hypothesis: not all instructions form covert channels, so having loads delay forwarding to all dependents may still be too conservative
  - Some classes of instructions may pose a low leakage risk (maybe arithmetic ops)
- Idea: classify instructions as high/low Covert Channel Risk (CCR)
- Speculative data forwarded:
  - Immediately to low leakage risk instructions
    - Taint used to indicate if instruction computed with speculative data
    - Speculative data propagates along dependency chain until reaching high CCR instruction
  - At the ERP to high leakage risk instructions, when the data comes from tainted instructions
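The forwarding rule above can be sketched as a single pass over instructions in program order. This is a minimal model under stated assumptions: `erp_plus`, the tuple layout, the instruction names, and the destination registers `r5` and `r9` are hypothetical, and only the high/low split (loads and branches versus the rest) comes from the slides.

```python
HIGH_CCR = {"load", "branch"}  # classification from the slide: LDs and branches


def erp_plus(instrs, safe_loads):
    """Decide which instructions may execute now.

    instrs: (name, kind, dest, srcs) tuples, oldest first.
    safe_loads: names of loads that are already older than the ERP.
    Returns a dict mapping name to "execute" or "wait".
    """
    tainted = set()  # registers derived from still-speculative load data
    blocked = set()  # registers whose producer is itself waiting
    decision = {}
    for name, kind, dest, srcs in instrs:
        if any(s in blocked for s in srcs):
            decision[name] = "wait"       # operand has not been produced yet
            produces = "blocked"
        elif kind in HIGH_CCR and any(s in tainted for s in srcs):
            decision[name] = "wait"       # held until its sources are non-speculative
            produces = "blocked"
        else:
            decision[name] = "execute"
            speculative_load = kind == "load" and name not in safe_loads
            derived = any(s in tainted for s in srcs)
            produces = "tainted" if (speculative_load or derived) else "clean"
        if dest is not None:
            tainted.discard(dest)
            blocked.discard(dest)
            if produces == "tainted":
                tainted.add(dest)
            elif produces == "blocked":
                blocked.add(dest)
    return decision


program = [
    ("LD_A", "load", "r0", []),
    ("AND", "alu", "r9", ["r0"]),
    ("LD_B", "load", "r1", []),          # younger than the ERP: speculative
    ("BNEZ", "branch", None, ["r1"]),
    ("ADD", "alu", "r2", ["r1"]),
    ("LD_r2", "load", "r5", ["r2"]),
    ("SUB", "alu", "r3", ["r2"]),
    ("MUL", "alu", "r4", ["r3"]),
]
print(erp_plus(program, safe_loads={"LD_A"}))
```

In this model the arithmetic chain (`ADD`, `SUB`, `MUL`) proceeds on tainted data, while the dependent branch and the dependent load are held back.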








SpecShield Changes/Additions















## **Evaluation Methodology**

- Experimental Platform:
  - Simulator: gem5, full-system mode, Ubuntu 14.04 OS
  - Benchmarks: spec2006, reference input set
  - Simpoints: Used to select 10 most representative regions of 1B instructions







| CPU Architecture        |              |               |      |
|-------------------------|--------------|---------------|------|
| CPU Clock               | 2GHz         | LSQ Entries   | 32   |
| L1 ICache               | 32KB (4-way) | IQ Entries    | 64   |
| L1 DCache               | 32KB (8-way) | BTB Entries   | 4096 |
| L2 Cache                | 2MB (16-way) | dTLB Entries  | 64   |
| Issue Width             | 8            | iTLB Entries  | 64   |
| ROB Entries             | 192          | FP Registers  | 256  |
| Branch Predictor        | LTAGE        | Int Registers | 256  |









- Configurations evaluated: SpecShield STL, ERP, ERP+
- Compared against: LFENCE\* serialization after every branch

\*Intel, Speculative execution side channel mitigations. Intel, 2018. https://software.intel.com/security-software-guidance/api-app/sites/default/files/336996-Speculative-Execution-Side-Channel-Mitigations.pdf

























15 PACT 2019






- LFENCE: > 2X slowdown
- SpecShieldSTL: 73% overhead
- SpecShieldERP: 21% overhead
- SpecShieldERP+: 10% overhead
- Benchmarks with low miss rates most impacted














## Loads Delaying in SpecShield STL and ERP





- STL forces most loads to delay (84%)
- ERP cuts that to < 40%
















## Percentage Instructions Delayed












- Spectre-v1 attack, using cache as covert channel

`if (x < array1_size) y = array2[array1[x] * 256];`

- Baseline: exfiltrated value visible in access latency
- With SpecShield: secret value no longer appears in the cache channel
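To make the two outcomes concrete, here is a toy model of the gadget's cache footprint. It is a deliberately simplified sketch: `run_gadget` and `probe` are hypothetical helpers, the cache is modeled as a set of line indices, and no real timing or speculation takes place.

```python
def run_gadget(secret, shield):
    """Model the gadget while the bounds check is mispredicted.

    Returns the set of array2 line indices brought into the cache.
    """
    cached_lines = set()
    # First load: array1[x] runs speculatively and returns the secret byte.
    value = secret
    # Second load: its address depends on the speculative value.
    if not shield:
        cached_lines.add(value)  # array2[value * 256] touches one distinct line
    # With shielding, the dependent load does not issue before the squash.
    return cached_lines


def probe(cached_lines):
    """Attacker view: which of the 256 candidate lines would be fast to access."""
    return [i for i in range(256) if i in cached_lines]


print(probe(run_gadget(secret=42, shield=False)))
print(probe(run_gadget(secret=42, shield=True)))
```

Without shielding the probe singles out the line indexed by the secret; with shielding the probe finds no fast line, which matches the second bullet above.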









#### Conclusions

- Microarchitectural framework for preventing transient execution attacks on arbitrary memory
- SpecShield is more general
  - Unlike prior work that has focused on closing specific covert channels, SpecShield controls all speculative data-flow within the pipeline, preventing channel formation.
- SpecShield is easier to implement
  - No changes to the memory hierarchy, coherence protocol, consistency guarantees, etc.
- Performance-security tradeoff possible by only restricting select covert channels







# Questions?




# Thank You!











- Impact on wakeup/select logic
- Baseline: load dependents speculatively woken up on select
  - Speculating that the load will hit
- SpecShield: wakeup delayed until retirement

#### Baseline Wakeup/Select/Execute/Retire Pipeline

Figure: `LD r1, mem(D)` and its dependents `ADD ...,r1,...` and `SUB ...,r1,...`, each passing through the Wakeup, Select, Execute, and Retire stages.

#### SpecShield Wakeup/Select/Execute/Retire Pipeline

#### **Benefits of Early Resolution**





- Average distance between ERP and ROB head
  - 1-55 entries, 9 on average





#### Comparison with other solutions

| Type | Defense           | Overhead | Benchmarks | Channels Protected |
|------|-------------------|----------|------------|--------------------|
| SW   | LFENCE [35]       | 144%     | SPEC2006   | All                |
| SW   | SLH [37]          | 108%     | SPEC2006   | Cache              |
| HW   | InvisiSpec [8]    | 22-78%   | SPEC2006   | Cache              |
| HW   | SafeSpec [9]      | -3%      | SPEC2017   | Cache, TLB         |
| HW   | DAWG [10]         | 1-15%    | PARSEC     | Cache              |
| HW   | CS Fencing [38]   | 8-48%    | SPEC2006   | Cache              |
| HW   | Cond. Spec. [11]  | 7-53%    | SPEC2006   | Cache              |
| HW   | Select Delay [16] | 11-46%   | SPEC2006   | Cache              |
| HW   | SpecShieldSTL     | 73%      | SPEC2006   | All                |
| HW   | SpecShieldERP     | 21%      | SPEC2006   | All                |
| HW   | SpecShieldERP+    | 10%      | SPEC2006   | Flexible           |


