Timing Analysis of Software Executing on Undocumented Multicore Processors

Bjorn Andersson

Software Engineering Institute
Carnegie Mellon University
Pittsburgh, PA 15213
Systems Interact with Their Physical Environment
Systems Include Software
Systems Include Software That Interacts with the Physical Environment
Systems Include Software That Has Real-Time Requirements
Satisfying Real-Time Requirements is a Challenge for These Systems in General
Satisfying Real-Time Requirements is a Challenge for …

“The trick there, when you’re processing flight critical information, it has to be a deterministic environment, meaning we know exactly where a piece of data is going to be exactly when we need to — no room for error,” Langhout says. “On a multi-core processor there’s a lot of sharing going on across the cores, so right now we’re not able to do that.”

- Jeff Langhout, Acting Director, U.S. Army Aviation and Missile Research Development and Engineering Center (AMRDEC)

Satisfying Real-Time Requirements is a Challenge for …

“A majority of avionics today are running on single-core processors, or multicore processors with all but one core disabled.”

“The root of the problem is shared resources, which most of the time creates some kind of interference.”

Satisfying Real-Time Requirements is a Challenge for …

“In safety-critical domains such as avionics, the multicore “predictability problem” is currently dealt with by turning off all but one core if highly-critical system components exist.”

Satisfying Real-Time Requirements is a Challenge for …

“Currently, avionics manufacturers resolve the multicore “predictability problem” by turning off all but one core if highly critical system components exist.”

Commonality of these Systems
Commonality of these Systems

Computer

Program

Sensor

Physical Environment

Actuator

Time

Read
Sensor

Actuate
Command
Commonality of these Systems

Computer

Program

Physical Environment

Sensor

Actuator

Read Sensor

Actuate Command

Read Sensor

Actuate Command

Time
Commonality of these Systems

- Computer
- Program
- Sensor
- Actuator
- Physical Environment

Period

Read Sensor
Actuate Command

Deadline

Time

Read Sensor
Actuate Command
Commonality of these Systems
Commonality of these Systems
What Makes it Challenging to Satisfy Real-Time Requirements?
What Causes Delay of Software?
What Causes Delay of Software?
What Causes Delay of Software?

Time when one thread in the software system arrives

Deadline Time
What Causes Delay of Software?

Thread executes one path

Time when one thread in the software system arrives

Deadline

Time
What Causes Delay of Software?

- Thread executes another path
- Time when one thread in the software system arrives
- Deadline
What Causes Delay of Software?

- **Preemption:** Another thread uses the processor.

  ![Diagram showing preemption and time](image)

- Time when one thread in the software system arrives

- Deadline

- Time
What Causes Delay of Software?

Time when one thread in the software system arrives

Interrupt service routine executes

Deadline

Time
What Causes Delay of Software?

Time when one thread in the software system arrives

Deadline

Thread requests a critical section held by another thread

Time
What Causes Delay of Software?

- Time when one thread in the software system arrives
- Thread experiences extra cache misses because of preemptions
- Deadline

Time
What Causes Delay of Software?

- Time when one thread in the software system arrives

Thread experiences extra cache misses because of preemptions

The response time in this scenario
What Causes Delay of Software?

How large can the response time be?

- Time when one thread in the software system arrives
- Thread experiences extra cache misses because of preemptions
- The response time in this scenario
What Causes Delay of Software?

Does it hold for all scenarios that the response time is at most the deadline?

- Time when one thread in the software system arrives
- Thread experiences extra cache misses because of preemptions
- The response time in this scenario
What Causes Delay of Software?

Does it hold for all scenarios that the response time is at most the deadline?

Thread executes one path

Time when one thread in the software system arrives

Deadline

The response time in this scenario
What Causes Delay of Software?

Does it hold for all scenarios that the response time is at most the deadline?

Thread experiences extra cache misses because of preemptions

The response time in this scenario
What Causes Delay of Software?

Does it hold for all threads, that for all arrivals of the thread, that for all scenarios, that the finishing time is at most the deadline?

Thread experiences extra cache misses because of preemptions

Time when one thread in the software system arrives

Deadline

The response time in this scenario

Time
What Causes Delay of Software?

Does it hold for all threads, that for all arrivals of the thread, that for all scenarios, that the finishing time is at most the deadline?

If “yes,” we say the set of threads is \textit{schedulable}. Otherwise, the set of threads is \textit{unschedulable}.

Thread experiences extra cache misses because of preemptions

Time when one thread in the software system arrives

Deadline

Time

The response time in this scenario
What Causes Delay of Software?

Does it hold for all threads, that for all arrivals of the thread, that for all scenarios, that the finishing time is at most the deadline?

If “yes,” we say the set of threads is *schedulable*. Otherwise, the set of threads is *unschedulable*.

The process of determining whether a set of threads is schedulable is called *schedulability analysis*.

Thread experiences extra cache misses because of preemptions

Time when one thread in the software system arrives

The response time in this scenario

Deadline

Time
Conclusions so far

Many systems interact with the physical world

This interaction requires correct timing

Correct timing depends on whether the delay of the software is at most a certain bound

There are many causes of the delay of software (even on a computer with a single core)

Many systems today disable all processor cores except one in order to be confident about timing
Real-Time Requirements of Software Executing on a Multicore Processor

Hardware Trends

- All computers are multicores.
Real-Time Requirements of Software Executing on a Multicore Processor

Hardware Trends

- All computers are multicores.
- Most chip makers do not offer single core.
Real-Time Requirements of Software Executing on a Multicore Processor

Hardware Trends

- All computers are multicores.
- Most chip makers do not offer single core.
- Most multicores have shared memory.
**Problem:** For each process, compute an upper bound on its response time.
How Co-Runners Impact Speed of Execution

Core 1
L1/L2

Core 2
L1/L2

Core 3
L1/L2

Speed=1

Arrives | Finishes | Time
How Co-Runners Impact Speed of Execution

Core 1
L1/L2

Core 2
L1/L2

Core 3
L1/L2

Arrives  |  Finishes  |  Time
**Problem:** For each process, compute an upper bound on its response time.
Issues

• Shared hardware resources impact timing.
How Bad?

Slowdown

<table>
<thead>
<tr>
<th></th>
<th>Pelli10</th>
<th>Nowo12</th>
<th>Sha16</th>
<th>Kim14</th>
<th>Nowo14</th>
<th>Yun15</th>
</tr>
</thead>
<tbody>
<tr>
<td>Slowdown</td>
<td>2.98</td>
<td>5.1</td>
<td>6</td>
<td>14</td>
<td>15</td>
<td>103</td>
</tr>
</tbody>
</table>
Issues

- **Shared hardware resources impact timing.**
- **103 times slowdown has been observed.**

Issues

• *Shared hardware resources impact timing.*

• *103 times slowdown has been observed.*

This slowdown is caused by Miss-Status Holding Register (MSHR).

Issues

• **Shared hardware resources impact timing.**
• **103 times slowdown has been observed.***

This slowdown is caused by Miss-Status Holding Register (MSHR). At that time, most researchers in real-time systems community were not aware of the MSHR.

Issues

• *Shared hardware resources impact timing.*
• *103 times slowdown has been observed.*

This slowdown is caused by Miss-Status Holding Register (MSHR). At that time, most researchers in real-time systems community were not aware of the MSHR. There was no schedulability analysis that incorporated MSHR.

Issues

- **Shared hardware resources impact timing.**
- **103 times slowdown has been observed.**

This slowdown is caused by Miss-Status Holding Register (MSHR). At that time, most researchers in real-time systems community were not aware of the MSHR. There was no schedulability analysis that incorporated MSHR. Even today, there is no schedulability analysis that incorporates the timing effects of MSHR.

Issues

- *Shared hardware resources impact timing.*
- *103 times slowdown has been observed.*


MSHR was an unknown unknown.
Issues

• *Shared hardware resources impact timing.*
• *103 times slowdown has been observed.*

The resource that cause the worst slowdown is a resource that the real-time systems research computing community can neither analyze nor manage.

Issues

- **Shared hardware resources impact timing.**
- 103 times slowdown has been observed [Yun15].
- Current methods cannot deal with undocumented resources.
Issues

- **Shared hardware resources impact timing.**
- **103 times slowdown has been observed** [Yun15].
- **Current methods cannot deal with undocumented resources.**
- **Even for the case that resources are documented, current methods can only analyze/manage a small set of them.**
Issues

• Shared hardware resources impact timing.
• 103 times slowdown has been observed [Yun15].
• Current methods cannot deal with undocumented resources.
• Even when resources are documented, current methods can only analyze/manage a small set of them.
• The problem is getting worse:
  * Slowdown increasing
  * More undocumented h/w
**Problem:** For each process, compute its response time
Problem: For each process, compute an upper bound on its response time.
**Problem:** For each process, compute an upper bound on its response time considering contention for resources in the memory system and that resources in the memory system are undocumented.
Documented resources
• DRAM memory timing tends to be specified according to JDEC standard.
Documented resources
• DRAM memory timing tends to be specified according to JDEC standard.

Undocumented resources
• Memory controller is often undocumented.
• Interconnection network from cores to Last-Level cache is often undocumented.
• Miss-Status Holding Register is often undocumented.
Different reasons for treating resources as undocumented

Processes

Core 1
L1/L2

Core 2
L1/L2

Core 3
L1/L2

Last-Level Cache (L3)

Memory Bus (and Mem Controller)

DRAM Bank 0

DRAM Bank 1

DRAM Bank 2

DRAM Bank 3

... DRAM Bank B
Different reasons for treating resources as undocumented

- Resources are undocumented
Different reasons for treating resources as undocumented

- Resources are undocumented
- Resources are documented but documentation is incorrect
Different reasons for treating resources as undocumented

• Resources are undocumented
• Resources are documented but documentation is incorrect
  • Heule:PLDI16:
    50 out of 1795 x86 instructions have incorrect documentation
Different reasons for treating resources as undocumented

- Resources are undocumented
- Resources are documented but documentation is incorrect
  - Heule:PLDI16: 50 out of 1795 x86 instructions have incorrect documentation
  - Dasgupta:PLDI19: Found incorrect documentation of instructions for x86.
Different reasons for treating resources as undocumented:

- Resources are undocumented
- Resources are documented but documentation is incorrect
  - Heule:PLDI16: 50 out of 1795 x86 instructions have incorrect documentation
  - Dasgupta:PLDI19: Found incorrect documentation of instructions for x86.
  - Fog19: There are discrepancies between measured latencies and latencies in data sheets.
Different reasons for treating resources as undocumented

- Resources are undocumented
- Resources are documented but documentation is incorrect
- Resources are documented but one believes the documentation to be incorrect
Different reasons for treating resources as undocumented

- Resources are undocumented
- Resources are documented but documentation is incorrect
- Resources are documented but one believes the documentation to be incorrect
- Resources are documented and one believes documentation to be correct but it is laborious to create a timing model for schedulability analysis
Different reasons for treating resources as undocumented

- Resources are undocumented
- Resources are documented but documentation is incorrect
- Resources are documented but one believes the documentation to be incorrect
- Resources are documented and one believes documentation to be correct but it is laborious to create a timing model for schedulability analysis (and it needs to be changed when one buys a new chip anyway)

Processes

- Core 1
  - L1/L2
- Core 2
  - L1/L2
- Core 3
  - L1/L2

Last-Level Cache (L3)

Memory Bus (and Mem Controller)

- DRAM Bank 0
- DRAM Bank 1
- DRAM Bank 2
- DRAM Bank 3
- DRAM Bank B
The consequences of analyzing timing of software executing on multicore using documentation
The consequences of analyzing timing of software executing on multicores using documentation
The consequences of analyzing timing of software executing on multicores using documentation

Project starts:
We want to develop product X
The consequences of analyzing timing of software executing on multicores using documentation

Project starts:
We want to develop product X
Buy multi-core processor Y

time
The consequences of analyzing timing of software executing on multicores using documentation

Project starts: We want to develop product X
Buy multi-core processor Y
Read data sheets of multicore processor Y
The consequences of analyzing timing of software executing on multicores using documentation

Project starts:
We want to develop product X

Buy multi-core processor Y
Read data sheets of multicore processor Y
Create model of hardware and develop schedulability analysis equations
The consequences of analyzing timing of software executing on multicores using documentation

Project starts: We want to develop product X
Buy multicore processor Y
Read data sheets of multicore processor Y
Create model of hardware and develop schedulability analysis equations
Deliver product X
The consequences of analyzing timing of software executing on multicores using documentation

Voice of the customer:
We need product X to use the new multicore processor Y’

Project starts:
We want to develop product X

Buy multicore processor Y
Read data sheets of multicore processor Y
Create model of hardware and develop schedulability analysis equations
Deliver product X
The consequences of analyzing timing of software executing on multicores using documentation

Voice of the customer:
We need product X to use the new multicore processor Y’

Project starts:
We want to develop product X

Buy multicore processor Y
Read data sheets of multicore processor Y
Create model of hardware and develop schedulability analysis equations
Deliver product X

Buy multicore processor Y’
Read data sheets of multicore processor Y’
Create model of hardware and develop schedulability analysis equations
Deliver product X
The consequences of analyzing timing of software executing on multicores using documentation

This has to be repeated for each hardware upgrade.
Takes a long time.
Requires PhD level skills.

Voice of the customer:
We need product X to use the new multicore processor Y'.

Project starts:
We want to develop product X

Buy multicore processor Y
Read data sheets of multicore processor Y
Create model of hardware and develop schedulability analysis equations
Deliver product X

Buy multicore processor Y'
Read data sheets of multicore processor Y'
Create model of hardware and develop schedulability analysis equations
Deliver product X
The consequences of analyzing timing of software executing on multicores using documentation

New multicore processor Y becomes available
The consequences of analyzing timing of software executing on multicores using documentation

New multicore processor Y becomes available

Academic researcher creates a model of hardware and develops schedulability analysis equations for Y
The consequences of analyzing timing of software executing on multicores using documentation

New multicore processor Y becomes available

New multicore processor Y' becomes available

Academic researcher create model of hardware and develop schedulability analysis equations for Y

time
The consequences of analyzing timing of software executing on multicores using documentation

New multicore processor Y becomes available

New multicore processor Y’ becomes available

<table>
<thead>
<tr>
<th>Time</th>
<th>New multicore processor Y</th>
<th>Academic researcher create model of hardware and develop schedulability analysis equations for Y</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
<td>Academic researcher create model of hardware and develop schedulability analysis equations for Y’</td>
</tr>
</tbody>
</table>
The consequences of analyzing timing of software executing on multicores using documentation

<table>
<thead>
<tr>
<th>New multicore processor Y becomes available</th>
<th>New multicore processor Y’ becomes available</th>
<th>New multicore processor Y” becomes available</th>
</tr>
</thead>
<tbody>
<tr>
<td>Academic researcher</td>
<td>Academic researcher</td>
<td>Academic researcher</td>
</tr>
<tr>
<td>create</td>
<td>create</td>
<td>create</td>
</tr>
<tr>
<td>model of hardware and develop</td>
<td>model of hardware and develop</td>
<td>model of hardware and develop</td>
</tr>
<tr>
<td>schedulability analysis</td>
<td>schedulability analysis</td>
<td>schedulability analysis</td>
</tr>
<tr>
<td>equations for Y</td>
<td>equations for Y’</td>
<td>equations for Y’</td>
</tr>
</tbody>
</table>

Academic researcher create a model of hardware and develop schedulability analysis equations for Y, Y’, and Y” as new multicore processors become available.
The consequences of analyzing timing of software executing on multicores using documentation

<table>
<thead>
<tr>
<th>Time</th>
<th>New multicore processor Y becomes available</th>
<th>New multicore processor Y' becomes available</th>
<th>New multicore processor Y'' becomes available</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>Academic researcher create model of hardware and develop schedulability analysis equations for Y</td>
<td>Academic researcher create model of hardware and develop schedulability analysis equations for Y'</td>
<td>Academic researcher create model of hardware and develop schedulability analysis equations for Y''</td>
</tr>
</tbody>
</table>
The consequences of analyzing timing of software executing on multicores using documentation
The consequences of analyzing timing of software executing on multicores using documentation

Treating multicore processors as undocumented hardware has the potential to create a model that can be used for processors in the future even for processors that we currently do not know about.
**Problem:** For each process, compute its response time

Diagram:
- Processes
  - Core 1
    - L1/L2
  - Core 2
    - L1/L2
  - Core 3
    - L1/L2
- Last-Level Cache (L3)
- Memory Bus (and Mem Controller)
  - DRAM Bank 0
  - DRAM Bank 1
  - DRAM Bank 2
  - DRAM Bank 3
  - DRAM Bank B
**Problem:** For each process, compute an upper bound on its response time.
Problem: For each process, compute an upper bound on its response time considering contention for resources in the memory system and that resources in the memory system are undocumented.
Problem: For each process, compute an upper bound on its response time considering contention for resources in the memory system and that resources in the memory system are undocumented.

Q: How can we analyze timing of a system when we do not know how the system works?
**Problem:** For each process, compute an upper bound on its response time considering contention for resources in the memory system and that resources in the memory system are undocumented.

**Q:** How can we analyze timing of a system when we do not know how the system works?

**A:** Create abstraction that describes effect of undocumented h/w.
Big Research Questions
Big Research Questions

**Q1:** What is a good abstraction that describes the effect of undocumented hardware?

**Q2:** How to create a schedulability analysis that uses this abstraction?
Big Research Questions

**Q1:** What is a good abstraction that describes the effect of undocumented hardware?

**Q2:** How to create a schedulability analysis that uses this abstraction?

Before discussing them, let us discuss:

1. Other approaches
2. General ideas for abstractions in other disciplines
Other approaches
Other approaches

Ignore timing requirements
Other approaches

- Ignore timing requirements
- Ignore timing aspects of memory system
Other approaches

- Ignore timing requirements
- Ignore timing aspects of memory system
- Disable all processor cores except one
Other approaches

- Ignore timing requirements
- Ignore timing aspects of memory system
- Disable all processor cores except one
- Use simple processor and run heavy computations on GPU
Other approaches

- Ignore timing requirements
- Ignore timing aspects of memory system
- Disable all processor cores except one
- Use simple processor and run heavy computations on GPU
- Build your own processor (ASIC)
Other approaches

- Ignore timing requirements
- Ignore timing aspects of memory system
- Disable all processor cores except one
- Use simple processor and run heavy computations on GPU
- Build your own processor (FPGA)
- Build your own processor (ASIC)
Other approaches

- Ignore timing requirements
- Ignore timing aspects of memory system
- Disable all processor cores except one
- Use simple processor and run heavy computations on GPU
- Build your own processor (ASIC)
- Build your own processor (FPGA)
- Use well-documented hardware (e.g., RISC-V) and model all resources and create analysis
Other approaches

- Ignore timing requirements
- Ignore timing aspects of memory system
- Disable all processor cores except one
- Use simple processor and run heavy computations on GPU
- Build your own processor (ASIC)
- Build your own processor (FPGA)
- Use well-documented hardware (e.g., RISC-V) and model all resources and create analysis
- Model some resources and create analysis
- Ignore timing aspects of memory system
Other approaches

- Ignore timing requirements
- Ignore timing aspects of memory system
- Disable all processor cores except one
- Use simple processor and run heavy computations on GPU
- Build your own processor (ASIC)

- Build your own processor (FPGA)
- Use well-documented hardware (e.g., RISC-V) and model all resources and create analysis
- Model some resources and create analysis
- Software-based mechanism to improve predictability/isolation
### Other approaches

<table>
<thead>
<tr>
<th>Ignore timing requirements</th>
<th>Ignore timing aspects of memory system</th>
<th>Disable all processor cores except one</th>
<th>Use simple processor and run heavy computations on GPU</th>
<th>Build your own processor (ASIC)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Build your own processor (FPGA)</td>
<td>Use well-documented hardware (e.g., RISC-V) and model all resources and create analysis</td>
<td>Model some resources and create analysis</td>
<td>Software-based mechanism to improve predictability/isolation</td>
<td>Reorganize program so that at most one program accesses memory at a time</td>
</tr>
</tbody>
</table>
Other approaches

- Ignore timing requirements
- Ignore timing aspects of memory system
- Disable all processor cores except one
- Use simple processor and run heavy computations on GPU
- Build your own processor (ASIC)

**Pro:** Easy

**Con:** Potentially unsafe
Other approaches

- Ignore timing requirements
- Ignore timing aspects of memory system
- Disable all processor cores except one
- Use simple processor and run heavy computations on GPU
- Build your own processor (ASIC)

**Pro:** Easy
**Con:** Potentially unsafe
Other approaches

- Ignore timing requirements
- Ignore timing aspects of memory system
- Disable all processor cores except one
- Use simple processor and run heavy computations on GPU
- Build your own processor (ASIC)

**Pro:** Easy

**Con:** Lose lots of processing capacity
Other approaches

- Ignore timing requirements
- Ignore timing aspects of memory system
- Disable all processor cores except one
- Use simple processor and run heavy computations on GPU
- Build your own processor (ASIC)

**Con:** Analyzing timing of a single “task” executing on a GPU is still hard.
Other approaches

- Ignore timing requirements
- Ignore timing aspects of memory system
- Disable all processor cores except one
- Use simple processor and run heavy computations on GPU
- Build your own processor (ASIC)

**Con:** High fixed cost.
Other approaches

Con: Low clock frequency

- Build your own processor (FPGA)
- Use a well-documented hardware (e.g., RISC-V) and model all resources and create analysis
- Model some resources and create analysis
- Software-based mechanism to improve predictability/isolation
- Reorganize program so that at most one program accesses memory at a time

Build your own processor (FPGA)

Ignore timing aspects of memory system

Con: Low clock frequency
Other approaches

**Con:** Limits #suppliers. Laborious.

- Build your own processor (FPGA)
- Use well-documented hardware (e.g., RISC-V) and model all resources and create analysis
- Model some resources and create analysis
- Software-based mechanism to improve predictability/isolation
- Reorganize program so that at most one program accesses memory at a time
Other approaches

**Con:** Laborious. There may be many resources for which documentation does not exist.

- **Build your own processor (FPGA)**
- **Use well-documented hardware (e.g., RISC-V) and model all resources and create analysis**
- **Model some resources and create analysis**
- **Software-based mechanism to improve predictability/isolation**
- **Reorganize program so that at most one program accesses memory at a time**

**Con:** Laborious. There may be many resources for which documentation does not exist.
Other approaches

**Examples:** Cache coloring, Cache locking, Bank coloring, MemGuard, TLB coloring.

**Con:** Requires changes to operating system. Only work for some resources; there are many resources for which there is no isolation mechanism.

- Build your own processor (FPGA)
- Use well-documented hardware (e.g., RISC-V) and model all resources and create analysis
- Model some resources and create analysis
- Software-based mechanism to improve predictability/isolation
- Reorganize program so that at most one program accesses memory at a time
- Ignore timing requirements
Other approaches

Examples: PREM, Linkoping@RTSS07.
Con: Laborious. Requires a local memory that is large enough to store working set. Difficult to prove that no memory accesses occurs in certain phases.

- Build your own processor (FPGA)
- Use well-documented hardware (e.g., RISC-V) and model all resources and create analysis
- Model some resources and create analysis
- Software-based mechanism to improve predictability/isolation
- Reorganize program so that at most one program accesses memory at a time
Big Research Questions

Q1: What is a good abstraction that describes the effect of undocumented hardware?

Q2: How to create a schedulability analysis that uses this abstraction?

Before discussing them, let us discuss:
1. Other approaches (DONE)
2. General ideas for abstractions in other disciplines
General ideas for abstractions in other disciplines
General ideas for abstractions in other disciplines
General ideas for abstractions in other disciplines

Not drawn to scale
General ideas for abstractions in other disciplines

This body has many atoms
General ideas for abstractions in other disciplines

This body has many atoms  This body has many atoms
General ideas for abstractions in other disciplines

This body also has many atoms

This body has many atoms

This body has many atoms
General ideas for abstractions in other disciplines

This body also has many atoms

We describe each body as a single point mass
General ideas for abstractions in other disciplines

This body also has many atoms  We describe each body as a single point mass

Describe each body with only as many parameters that we need for the analysis that we want to do.
General ideas for abstractions in other disciplines

+ resistor

-
General ideas for abstractions in other disciplines

How many electrons per time unit are flowing here?
General ideas for abstractions in other disciplines

+ How much current is flowing here?

-
General ideas for abstractions in other disciplines

+ How much current is flowing here?

- Ask questions about the aggregate that you care about.
General ideas for abstractions in other disciplines

For each of these $10^{24}$ water molecules, how fast does this water molecule move?
General ideas for abstractions in other disciplines

What is the water temperature?

Ask questions about the aggregate that you care about.
General ideas for abstractions in other disciplines

What is the water temperature?

If possible: Describe a system with quantities that you can measure.
General ideas for abstractions in other disciplines

1. Describe a system with parts

2. Describe each part with an abstraction

3. Obtain specific values of the abstraction (e.g., using measurements) for each part

4. Ask questions: calculate the answer to the questions
General ideas for abstractions

1. Describe a system with parts

2. Describe each part with an abstraction

3. Obtain specific values of the abstraction (e.g., using measurements) for each part

4. Ask questions: calculate the answer to the questions
General ideas for abstractions

1. Describe a system with parts
   A software system comprises a set of tasks.

2. Describe each part with an abstraction
   Each task is described with a T (period), D (deadline), and C (execution time).

3. Obtain specific values of the abstraction (e.g., using measurements) for each part
   Obtain T from source code. Obtain D from software requirement specification. Obtain C by measuring the execution of the program; then add margin. (or run WCET tool)

4. Ask questions: calculate the answer to the questions
   Will all deadlines be met if tasks are scheduled by Rate-Monotonic on a single processor?
General ideas for abstractions

1. Describe a system with parts
   A software system comprises a set of tasks.

2. Describe each part with an abstraction
   Each task is described with a T (period), D (deadline), and C (execution time).
   Describe the effect of the memory system on execution speed of a task.

3. Obtain specific values of the abstraction (e.g., using measurements) for each part
   Obtain T from source code. Obtain D from software requirement specification.
   Obtain C by measuring the execution of the program; then add margin.
   (or run WCET tool)
   Obtain parameters (e.g., using measurements)

4. Ask questions: calculate the answer to the questions
   Will all deadlines be met if tasks are scheduled by Rate-Monotonic on a single processor?
Big Research Questions

Q1: What is a good abstraction that describes the effect of undocumented hardware?

Q2: How to create a schedulability analysis that uses this abstraction?

Before discussing them, let us discuss:
1. Other approaches (DONE)
2. General ideas for abstractions in other disciplines (DONE)
Big Research Questions

Q1: What is a good abstraction that describes the effect of undocumented hardware?

Q2: How to create a schedulability analysis that uses this abstraction?

Let us discuss Q1.
Big Research Questions

Q1: What is a good abstraction that describes the effect of undocumented hardware?

Q2: How to create a schedulability analysis that uses this abstraction?

What are the requirements of a good abstraction?
Big Research Questions

Q1: What is a good abstraction that describes the effect of undocumented hardware?

Q2: How to create a schedulability analysis that uses this abstraction?

What are the requirements of a good abstraction?
1. It should be as small as possible (few numbers; few bits)
Big Research Questions

Q1: What is a good abstraction that describes the effect of undocumented hardware?

Q2: How to create a schedulability analysis that uses this abstraction?

What are the requirements of a good abstraction?

1. It should be as small as possible (few numbers; few bits)
2. It should allow us to do prediction/analysis that we care about
Big Research Questions

Q1: What is a good abstraction that describes the effect of undocumented hardware?

Q2: How to create a schedulability analysis that uses this abstraction?

What are the requirements of a good abstraction?

1. It should be as small as possible (few numbers; few bits)
2. It should allow us to do prediction/analysis that we care about
3. Given a system, it should be possible to find the abstraction (through measurements or lower-level analysis)
Considerations in creating an abstraction for multicore

Let $speed(i,t)$ denote the speed of execution of task $i$ at time $t$. Let $cor(i,t)$ denote the set of tasks, other than task $i$, that executes at time $t$. 

![Diagram of multicore processor layers](image-url)
Considerations in creating an abstraction for multicore

Let \( \text{speed}(i, t) \) denote the speed of execution of task \( i \) at time \( t \). Let \( \text{cor}(i, t) \) denote the set of tasks, other than task \( i \), that executes at time \( t \).

One abstraction could be:

\[
\frac{1}{1 + |\text{cor}(i, t)|} \leq \text{speed}(i, t) \leq 1
\]
Considerations in creating an abstraction for multicore

Let $speed(i, t)$ denote the speed of execution of task $i$ at time $t$. Let $cor(i, t)$ denote the set of tasks, other than task $i$, that executes at time $t$.

One abstraction could be:

$$\frac{1}{1 + |cor(i, t)|} \leq speed(i, t) \leq 1$$

Drawback of abstraction:
- Does not reflect that different co-runners can have different effect on task $i$. 
Considerations in creating an abstraction for multicore

Let \( speed(i, t) \) denote the speed of execution of task \( i \) at time \( t \). Let \( cor(i, t) \) denote the set of tasks, other than task \( i \), that executes at time \( t \).

One abstraction could be:

\[ 1 - k \times |cor(i, t)| \leq speed(i, t) \leq 1 \]

Drawback of abstraction:
- Does not reflect that different co-runners can have different effect on task \( i \).
Considerations in creating an abstraction for multicore

Let \( \text{speed}(i,t) \) denote the speed of execution of task \( i \) at time \( t \). Let \( \text{cor}(i,t) \) denote the set of tasks, other than task \( i \), that executes at time \( t \).

One abstraction could be:

\[
\frac{w_i}{1 + \sum_{j|j \in \text{cor}(i,t)} w_{i,j}} \leq \text{speed}(i,t) \leq 1
\]

Drawback of abstraction:
- In reality, the speed can be super-additive or sub-additive (not reflected in the above).
Considerations in creating an abstraction for multicore

Let $speed(i,t)$ denote the speed of execution of task $i$ at time $t$. Let $cor(i,t)$ denote the set of tasks, other than task $i$, that executes at time $t$.

One abstraction could be:

$$
\frac{w_i}{1 + \prod_{j \in cor(i,t)} w_{i,j}} \leq speed(i,t) \leq 1
$$

Drawback of abstraction:
- In reality, the speed can be super-additive or sub-additive (not reflected in the above).
Considerations in creating an abstraction for multicore

Let $speed(i,t)$ denote the speed of execution of task $i$ at time $t$. Let $cor(i,t)$ denote the set of tasks, other than task $i$, that executes at time $t$.

One abstraction could be:

$$\frac{w_i}{1 + e^{\sum_{j \in cor(i,t)} w_{i,j}}} \leq speed(i,t) \leq 1$$

Drawback of abstraction:
- In reality, the speed can be super-additive or sub-additive (not reflected in the above).
Considerations in creating an abstraction for multicore

Why is speed sub-additive?

![Diagram of multicore architecture]

- Core 1
  - L1/L2
- Core 2
  - L1/L2
- Core 3
  - L1/L2

- Last-Level Cache (L3)
- Memory Bus (and Mem Controller)
- DRAM Bank 0
- DRAM Bank 1
- DRAM Bank 2
- DRAM Bank 3
- DRAM Bank B
Considerations in creating an abstraction for multicore

Why is speed sub-additive?

Consider DRAM bank B and its row buffer.
Considerations in creating an abstraction for multicore

Why is speed sub-additive?
Example:
Red task (victim):
1. load address x to register y
2. load address x’ to register z

x and x’ are in the same row in DRAM bank B’
But x and x’ are in different columns (different addresses)
Considerations in creating an abstraction for multicore

Why is speed sub-additive?
Example:
Red task:
1. load address x to register y
2. load address x’ to register z
Considerations in creating an abstraction for multicore

Why is speed sub-additive?
Example:
Red task:
1. load address $x$ to register $y$
2. load address $x'$ to register $z$
Green task:
1. load address $a$ (where $a$ is in same bank as $x$ but in different row)
Considerations in creating an abstraction for multicore

Why is speed sub-additive?

Example:

Red task:
1. load address x to register y
2. load address x’ to register z

Green task:
1. load address a (where a is in same bank as x but in different row)

Blue task:
1. load address b (where b is in same bank as x but in different row)
Considerations in creating an abstraction for multicore

Why is speed sub-additive?
Example:

Red task:
1. load address x to register y
2. load address x’ to register z

Green task:
1. load address a (where a is in same bank as x but in different row)

Blue task:
1. load address b (where b is in same bank as x but in different row)
Considerations in creating an abstraction for multicore

Why is speed sub-additive?

Example:

Red task:
1. load address x to register y
2. load address x’ to register z

Green task:
1. load address a (where a is in same bank as x but in different row)

Blue task:
1. load address b (where b is in same bank as x but in different row)

Observation: Green can evict Red’s data in the row buffer.
Considerations in creating an abstraction for multicore

Why is speed sub-additive?

Example:

Red task:
1. load address x to register y
2. load address x’ to register z

Green task:
1. load address a (where a is in same bank as x but in different row)

Blue task:
1. load address b (where b is in same bank as x but in different row)

Observation: Blue can evict Red’s data in the row buffer.
Considerations in creating an abstraction for multicore

Why is speed sub-additive?
Example:

Red task:
1. load address x to register y
2. load address x’ to register z

Green task:
1. load address a (where a is in same bank as x but in different row)

Blue task:
1. load address b (where b is in same bank as x but in different row)

Observation: If Blue has evicted Red’s data in the row buffer, then Green cannot evict it more.
Considerations in creating an abstraction for multicore

Why is speed sub-additive?

Example:
Red task:
1. load address x to register y
2. load address x’ to register z

Green task:
1. load address a (where a is in same bank as x but in different row)

Blue task:
1. load address b (where b is in same bank as x but in different row)

Observation: If Green has evicted Red’s data in the row buffer, then Blue cannot evict it more.
Considerations in creating an abstraction for multicore

Why is speed sub-additive?
Example:

Red task:
1. load address x to register y
2. load address x’ to register z

Green task:
1. load address a (where a is in same bank as x but in different row)

Blue task:
1. load address b (where b is in same bank as x but in different row)

Red experiences a slowdown due to Blue or Green but the slowdown is not greater by both.
Considerations in creating an abstraction for multicore

Why is speed super-additive?

Consider Last-Level Cache (L3). Assume associativity = 2. Consider one specific cache set.
Considerations in creating an abstraction for multicore

Why is speed super-additive?

Example:
Red task (victim):
1. load address x to register y
2. load address x to register z
Considerations in creating an abstraction for multicore

Why is speed super-additive?

Example:
Red task (victim):
1. load address x to register y
2. load address x to register z
Green task:
1. load address a (where a is in same cache set x)
Considerations in creating an abstraction for multicore

Why is speed super-additive?

Example:
Red task (victim):
1. load address x to register y
2. load address x to register z

Green task:
1. load address a (where a is in same cache set x)

Observation: Green’s data and Red’s data fit in the cache. No capacity miss.
Considerations in creating an abstraction for multicore

Why is speed super-additive?

Example:
Red task (victim):
1. load address x to register y
2. load address x to register z

Blue task:
1. load address b (where b is in same cache set x)

Observation: Blue’s data and Red’s data fit in the cache. No capacity miss.
Considerations in creating an abstraction for multicore

Why is speed super-additive?

Example:
Red task (victim):
1. load address x to register y
2. load address x to register z
Green task:
1. load address a (where a is in same cache set x)
Blue task:
1. load address b (where b is in same cache set x)
Observation: Blue’s data, Green’s data, and Red’s data do not fit in the cache. Capacity miss.
Considerations in creating an abstraction for multicore

Why is speed super-additive?

Example:
Red task (victim):
1. load address x to register y
2. load address x to register z

Green task:
1. load address a (where a is in same cache set x)

Blue task:
1. load address b (where b is in same cache set x)

Observation: Green alone does not cause slowdown of Red.
Considerations in creating an abstraction for multicore

Why is speed super-additive?

Example:
   Red task (victim):
      1. load address x to register y
      2. load address x to register z
   Green task:
      1. load address a (where a is in same cache set x)
   Blue task:
      1. load address b (where b is in same cache set x)

Observation: Blue alone does not cause slowdown of Red.
Considerations in creating an abstraction for multicore

Why is speed super-additive?

Example:
Red task (victim):
1. load address x to register y
2. load address x to register z
Green task:
1. load address a (where a is in same cache set x)
Blue task:
1. load address b (where b is in same cache set x)
Observation: But Green and Blue cause slowdown of Red.
Considerations in creating an abstraction for multicore

Speed of execution of a task as a function of co-runners can be either
(i) Additive
(ii) Sub-additive
(iii) Super-additive
Considerations in creating an abstraction for multicore

Speed of execution of a task as a function of co-runners can be either
(i) Additive
(ii) Sub-additive
(iii) Super-additive

If speed is not additive, how can we describe speed as a function of co-runners?
Considerations in creating an abstraction for multicore

Speed of execution of a task as a function of co-runners can be either
(i) Additive
(ii) Sub-additive
(iii) Super-additive

For each task: Enumerate the set of possible co-runner set.
Considerations in creating an abstraction for multicore

Main idea:
For each task $i$, for each co-runner set of task $i$ do

\[ \text{lowerbound speed}_{i}^{\text{cor}} \leq \text{speed}(i, t) \leq 1 \]

For each task: Enumerate the set of possible co-runner set.
Considerations in creating an abstraction for multicore

Describing resource consumption of Red task $C_{red}=4$

<table>
<thead>
<tr>
<th>Co-runner set</th>
<th>Speed</th>
</tr>
</thead>
<tbody>
<tr>
<td>{}</td>
<td>1</td>
</tr>
</tbody>
</table>

![Diagram of multicore system with memory hierarchy and resource consumption](image)
Considerations in creating an abstraction for multicore

Describing resource consumption of Red task $C_{red}=4$

<table>
<thead>
<tr>
<th>Co-runner set</th>
<th>Speed</th>
</tr>
</thead>
<tbody>
<tr>
<td>{}</td>
<td>1</td>
</tr>
<tr>
<td>{cyan}</td>
<td>0.5</td>
</tr>
</tbody>
</table>

Core 1
- L1/L2

Core 2
- L1/L2

Core 3
- L1/L2

Last-Level Cache (L3)

Memory Bus (and Mem Controller)

DRAM Bank 0

DRAM Bank 1

DRAM Bank 2

DRAM Bank 3

…

DRAM Bank B
Considerations in creating an abstraction for multicore

Describing resource consumption of Red task

\[ C_{\text{red}} = 4 \]

<table>
<thead>
<tr>
<th>Co-runner set</th>
<th>Speed</th>
</tr>
</thead>
<tbody>
<tr>
<td>{}</td>
<td>1</td>
</tr>
<tr>
<td>{green}</td>
<td>0.45</td>
</tr>
</tbody>
</table>
Considerations in creating an abstraction for multicore

Describing resource consumption of Red task $C_{\text{red}}=4$

<table>
<thead>
<tr>
<th>Co-runner set</th>
<th>Speed</th>
</tr>
</thead>
<tbody>
<tr>
<td>{}</td>
<td>1</td>
</tr>
<tr>
<td>{brown}</td>
<td>0.45</td>
</tr>
</tbody>
</table>
Considerations in creating an abstraction for multicore

Describing resource consumption of Red task $C_{red}=4$

<table>
<thead>
<tr>
<th>Co-runner set</th>
<th>Speed</th>
</tr>
</thead>
<tbody>
<tr>
<td>{}</td>
<td>1</td>
</tr>
<tr>
<td>{blue}</td>
<td>0.25</td>
</tr>
</tbody>
</table>
Considerations in creating an abstraction for multicore

Describing resource consumption of Red task

\[ C_{\text{red}} = 4 \]

<table>
<thead>
<tr>
<th>Co-runner set</th>
<th>Speed</th>
</tr>
</thead>
<tbody>
<tr>
<td>{}</td>
<td>1</td>
</tr>
<tr>
<td>{cyan, brown}</td>
<td>0.18</td>
</tr>
</tbody>
</table>
Considerations in creating an abstraction for multicore

Describing resource consumption of Red task $C_{\text{red}}=4$

<table>
<thead>
<tr>
<th>Co-runner set</th>
<th>Speed</th>
</tr>
</thead>
<tbody>
<tr>
<td>{}</td>
<td>1</td>
</tr>
<tr>
<td>{cyan, blue}</td>
<td>0.12</td>
</tr>
</tbody>
</table>

Core 1

Core 2

Core 3

L1/L2

L1/L2

L1/L2

Last-Level Cache (L3)

Memory Bus (and Mem Controller)

DRAM

Bank 0

DRAM

Bank 1

DRAM

Bank 2

DRAM

Bank 3

⋯

DRAM

Bank B
Considerations in creating an abstraction for multicore

Describing resource consumption of Red task $C_{red}=4$

<table>
<thead>
<tr>
<th>Co-runner set</th>
<th>Speed</th>
</tr>
</thead>
<tbody>
<tr>
<td>{}</td>
<td>1</td>
</tr>
<tr>
<td>{green,brown}</td>
<td>0.13</td>
</tr>
</tbody>
</table>

Core 1
L1/L2

Core 2
L1/L2

Core 3
L1/L2

Last-Level Cache (L3)

Memory Bus (and Mem Controller)

DRAM Bank 0

DRAM Bank 1

DRAM Bank 2

DRAM Bank 3

... DRAM Bank B
Considerations in creating an abstraction for multicore

Describing resource consumption of Red task
\[ C_{red} = 4 \]

<table>
<thead>
<tr>
<th>Co-runner set</th>
<th>Speed</th>
</tr>
</thead>
<tbody>
<tr>
<td>{}</td>
<td>1</td>
</tr>
<tr>
<td>{green, blue}</td>
<td>0.19</td>
</tr>
</tbody>
</table>

![Diagram of a multicore processor with caches and memories](image)
Considerations in creating an abstraction for multicore

Describing resource consumption of Red task

$C_{red}=4$

<table>
<thead>
<tr>
<th>Co-runner set</th>
<th>Speed</th>
</tr>
</thead>
<tbody>
<tr>
<td>{}</td>
<td>1</td>
</tr>
<tr>
<td>{cyan}</td>
<td>0.5</td>
</tr>
<tr>
<td>{green}</td>
<td>0.45</td>
</tr>
<tr>
<td>{brown}</td>
<td>0.45</td>
</tr>
<tr>
<td>{blue}</td>
<td>0.25</td>
</tr>
<tr>
<td>{cyan, brown}</td>
<td>0.18</td>
</tr>
<tr>
<td>{cyan, blue}</td>
<td>0.12</td>
</tr>
<tr>
<td>{green, brown}</td>
<td>0.13</td>
</tr>
<tr>
<td>{green, blue}</td>
<td>0.19</td>
</tr>
</tbody>
</table>
Considerations in creating an abstraction for multicore

Describing resource consumption of Red task $C_{\text{red}}=4$

<table>
<thead>
<tr>
<th>Co-runner set</th>
<th>Speed</th>
</tr>
</thead>
<tbody>
<tr>
<td>{}</td>
<td>1</td>
</tr>
<tr>
<td>{cyan}</td>
<td>0.5</td>
</tr>
<tr>
<td>{green}</td>
<td>0.45</td>
</tr>
<tr>
<td>{brown}</td>
<td>0.45</td>
</tr>
<tr>
<td>{blue}</td>
<td>0.25</td>
</tr>
<tr>
<td>{cyan, brown}</td>
<td>0.18</td>
</tr>
<tr>
<td>{cyan, blue}</td>
<td>0.12</td>
</tr>
<tr>
<td>{green, brown}</td>
<td>0.13</td>
</tr>
<tr>
<td>{green, blue}</td>
<td>0.19</td>
</tr>
</tbody>
</table>

Describe resource consumption of other tasks analogously.
Considerations in creating an abstraction for multicore

Main idea:
For each task $i$, for each co-runner set of task $i$ do
\[
\text{lowerboundspeed}_{i}^{\text{cor}} \leq \text{speed}(i,t) \leq 1
\]

For each task: Enumerate the set of possible co-runner set.

Isn’t that exponential? Couldn’t this be bad?
Considerations in creating an abstraction for multicore

Main idea:
For each task \(i\), for each co-runner set of task \(i\) do
\[
\text{lowerboundspeed}_{t}^{\text{cor}} \leq \text{speed}(i, t) \leq 1
\]

Given a task \(i\): #co-runner sets of task \(i\) ≤ \[
\left(\frac{n\text{tasks} - 1}{n\text{processors} - 1} + 1\right)^{n\text{processors} - 1}
\]
Considerations in creating an abstraction for multicore

Main idea:
For each task $i$,
for each co-runner set of task $i$ do

\[ \text{lowerboundspeed}^\text{cor}_i \leq \text{speed}(i, t) \leq 1 \]

Given a task $i$: #co-runner sets of task $i$ $\leq$ polynomial

If number of processors is fixed.
Considerations in creating an abstraction for multicore

Main idea:
For each task $i$, for each co-runner set of task $i$ do

\[
\text{lowerbound speed}^{\text{cor}}_i \leq \text{speed}(i, t) \leq 1
\]

Given a task $i$: #co-runner sets of task $i \leq n_{tasks}$

If number of processors = 2.
Considerations in creating an abstraction for multicore

Main idea:
For each task \( i \),
for each co-runner set of task \( i \) do
\[
\text{lowerboundspeed}_{\text{cor}}^i \leq \text{speed}(i, t) \leq 1
\]

Given a task \( i \): #co-runner sets of task \( i \) \( \leq \left( \frac{\text{ntasks} - 1}{2} + 1 \right)^2 \)

If number of processors = 3.
Considerations in creating an abstraction for multicore

Main idea:
For each task $i$, for each co-runner set of task $i$ do

\[ \text{lowerbound speed}_i^{cor} \leq \text{speed}(i, t) \leq 1 \]

Given a task $i$: #co-runner sets of task $i \leq \left( \frac{\text{ntasks} - 1}{3} + 1 \right)^3$
Considerations in creating an abstraction for multicore

Given a task $i$: A job of task $i$ can experience different co-runner set at different times.
How Co-Runners Impact Speed of Execution

Core 1
L1/L2

Core 2
L1/L2

Core 3
L1/L2

Arrives | Finishes | Time

Speed=1
How Co-Runners Impact Speed of Execution

Core 1
L1/L2

Core 2
L1/L2

Core 3
L1/L2

Arrives | Finishes | Time
Considerations in creating an abstraction for multicore

Given a task $i$: A job of task $i$ can experience different co-runner set at different times.
Considerations in creating an abstraction for multicore

In some computer systems, there are some resources (e.g., memory bus) where the arbitration depends on the processor id of the requestor.
unschedulable

Core 1
L1/L2

Core 2
L1/L2

Core 3
L1/L2

Last-Level Cache (L3)

Memory Bus (and Mem Controller)

DRAM Bank 0

DRAM Bank 1

DRAM Bank 2

DRAM Bank 3

***DRAM Bank B

schedulable

Core 1
L1/L2

Core 2
L1/L2

Core 3
L1/L2

Last-Level Cache (L3)

Memory Bus (and Mem Controller)

DRAM Bank 0

DRAM Bank 1

DRAM Bank 2

DRAM Bank 3

***DRAM Bank B
Considerations in creating an abstraction for multicore

In some computer systems, there are some resources (e.g., memory bus) where the arbitration depends on the processor id of the requestor.
Considerations in creating an abstraction for multicore

The program behavior may change over time.
Considerations in creating an abstraction for multicore

The program behavior may change over time.

```c
while (1) {
    s = wait_until_next_sample()
    update_datastructures(s)
    a = compute_actuation_command()
    actuate_command(a)
}
```
Considerations in creating an abstraction for multicore

The program behavior may change over time.

while (1) {
  s = wait_until_next_sample()
  update_datastructures(s)
  a = compute_actuation_command()
  actuate_command(a)
}

The program behavior here
Considerations in creating an abstraction for multicore

The program behavior may change over time.

```c
while (1) {
    s = wait_until_next_sample()
    update_datastructures(s)
    a = compute_actuation_command()
    actuate_command(a)
}
```

The program behavior here is different from the program behavior here.
Considerations in creating an abstraction for multicore

The program behavior may change over time.

```
while (1) {
    s = wait_until_next_sample()
    update_datastructures(s)
    a = compute_actuation_command()
    actuate_command(a)
}
```

Describe a task as a sequence of segments where each segment may have different description of lower bound on speed as a function of co-runners.
Taskset example

\[ |\tau| = 5 \quad \text{procs}(\Pi) = \{P_1, P_2\} \]

\[
T_1 = 1.500 \quad D_1 = 1.500 \quad V_1 = \{v_1^1\} \quad \text{prio}_1 = 3 \quad \text{proc}_1 = 1
\]

\[
C_1^1 = 0.250 \quad \text{pd}^1_1 = 0.500 \quad \text{CO}^1_1 = \{(\{v_3^1\}, 1.0), (\{v_4^1\}, 0.5), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0)\}
\]

\[
T_2 = 2.000 \quad D_2 = 2.000 \quad V_2 = \{v_2^1\} \quad \text{prio}_2 = 2 \quad \text{proc}_2 = 1
\]

\[
C_2^1 = 0.250 \quad \text{pd}^1_2 = 0.500 \quad \text{CO}^1_2 = \{(\{v_3^1\}, 0.5), (\{v_4^1\}, 1.0), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0)\}
\]

\[
T_3 = 2.000 \quad D_3 = 2.000 \quad V_3 = \{v_3^1\} \quad \text{prio}_3 = 3 \quad \text{proc}_3 = 2
\]

\[
C_3^1 = 0.250 \quad \text{pd}^1_3 = 0.500 \quad \text{CO}^1_3 = \{(\{v_1^1\}, 1.0), (\{v_2^1\}, 0.5)\}
\]

\[
T_4 = 2.000 \quad D_4 = 2.000 \quad V_4 = \{v_4^1\} \quad \text{prio}_4 = 2 \quad \text{proc}_4 = 2
\]

\[
C_4^1 = 0.250 \quad \text{pd}^1_4 = 0.500 \quad \text{CO}^1_4 = \{(\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0)\}
\]

\[
T_5 = 2.250 \quad D_5 = 2.250 \quad V_5 = \{v_5^1, v_5^2\} \quad \text{prio}_5 = 1 \quad \text{proc}_5 = 2
\]

\[
C_5^1 = 0.500 \quad \text{pd}^1_5 = 1.000 \quad \text{CO}^1_5 = \{(\{v_1^1\}, 1.0), (\{v_2^1\}, 1.0)\}
\]

\[
C_5^2 = 0.125 \quad \text{pd}^2_5 = 0.500 \quad \text{CO}^2_5 = \{(\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0)\}
\]
### Taskset example

| \(|\tau| = 5\) | \(\text{procs}(\Pi) = \{P_1, P_2\}\) |
|-----------------|---------------------------------|
| \(T_1\) = 1.500 | \(D_1\) = 1.500 | \(V_1 = \{v_1^1\}\) | \(\text{prio}_1 = 3\) | \(\text{proc}_1 = 1\) |
| \(C_1\) = 0.250 | \(\text{pd}_1\) = 0.500 | \(\text{CO}_1 = \{(\{v_3^1\}, 1.0), (\{v_4^1\}, 0.5), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0)\}\) |
| \(T_2\) = 2.000 | \(D_2\) = 2.000 | \(V_2 = \{v_2^1\}\) | \(\text{prio}_2 = 2\) | \(\text{proc}_2 = 1\) |
| \(C_2\) = 0.250 | \(\text{pd}_2\) = 0.500 | \(\text{CO}_2 = \{(\{v_3^1\}, 0.5), (\{v_4^1\}, 1.0), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0)\}\) |
| \(T_3\) = 2.000 | \(D_3\) = 2.000 | \(V_3 = \{v_3^1\}\) | \(\text{prio}_3 = 3\) | \(\text{proc}_3 = 2\) |
| \(C_3\) = 0.250 | \(\text{pd}_3\) = 0.500 | \(\text{CO}_3 = \{(\{v_1^1\}, 1.0), (\{v_2^1\}, 0.5)\}\) |
| \(T_4\) = 2.000 | \(D_4\) = 2.000 | \(V_4 = \{v_4^1\}\) | \(\text{prio}_4 = 2\) | \(\text{proc}_4 = 2\) |
| \(C_4\) = 0.250 | \(\text{pd}_4\) = 0.500 | \(\text{CO}_4 = \{(\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0)\}\) |
| \(T_5\) = 2.250 | \(D_5\) = 2.250 | \(V_5 = \{v_5^1, v_5^2\}\) | \(\text{prio}_5 = 1\) | \(\text{proc}_5 = 2\) |
| \(C_5\) = 0.500 | \(\text{pd}_5\) = 1.000 | \(\text{CO}_5 = \{(\{v_1^1\}, 1.0), (\{v_2^1\}, 1.0)\}\) |
| \(C_5^2\) = 0.125 | \(\text{pd}_5^2\) = 0.500 | \(\text{CO}_5^2 = \{(\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0)\}\) |
### Taskset example


\[
|\tau| = 5 \quad \text{procs}(\tau) = \{P_1, P_2\}
\]

\[
\begin{align*}
T_1 &= 1.500 & D_1 &= 1.500 & V_1 = \{v^1_1\} & \text{prio}_1 = 3 & \text{proc}_1 = 1 \\
C^1_1 &= 0.250 & \text{pd}^1_1 &= 0.500 & \text{CO}_1^1 &= \{\{v^3_3, 1.0\}, \{v^4_4, 0.5\}, \{v^5_5, 1.0\}, \{v^2_5, 1.0\}\} \\
T_2 &= 2.000 & D_2 &= 2.000 & V_2 = \{v^1_2\} & \text{prio}_2 = 2 & \text{proc}_2 = 1 \\
C^1_2 &= 0.250 & \text{pd}^1_2 &= 0.500 & \text{CO}_2^1 &= \{\{v^3_3, 0.5\}, \{v^4_4, 1.0\}, \{v^5_5, 1.0\}, \{v^2_5, 1.0\}\} \\
T_3 &= 2.000 & D_3 &= 2.000 & V_3 = \{v^1_3\} & \text{prio}_3 = 3 & \text{proc}_3 = 2 \\
C^1_3 &= 0.250 & \text{pd}^1_3 &= 0.500 & \text{CO}_3^1 &= \{\{v^1_1, 1.0\}, \{v^2_2, 0.5\}\} \\
T_4 &= 2.000 & D_4 &= 2.000 & V_4 = \{v^1_4\} & \text{prio}_4 = 2 & \text{proc}_4 = 2 \\
C^1_4 &= 0.250 & \text{pd}^1_4 &= 0.500 & \text{CO}_4^1 &= \{\{v^1_1, 0.5\}, \{v^2_2, 1.0\}\} \\
T_5 &= 2.250 & D_5 &= 2.250 & V_5 = \{v^1_5, v^2_5\} & \text{prio}_5 = 1 & \text{proc}_5 = 2 \\
C^1_5 &= 0.500 & \text{pd}^1_5 &= 1.000 & \text{CO}_5^1 &= \{\{v^1_1, 1.0\}, \{v^2_2, 1.0\}\} \\
C^2_5 &= 0.125 & \text{pd}^2_5 &= 0.500 & \text{CO}_5^2 &= \{\{v^1_1, 0.5\}, \{v^2_2, 1.0\}\}
\end{align*}
\]
### Taskset example

| $|\tau| = 5$ | $\text{procs}(\Pi) = \{P_1, P_2\}$ |
|---|---|
| \begin{align*}
T_1 & = 1.500 \\
D_1 & = 1.500 \\
V_1 & = \{v_1^1\} \\
C_1 & = 0.250 \\
pd_1 & = 0.500 \\
CO_1 & = \{(\{v_3^1\}, 1.0), (\{v_4^1\}, 0.5), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0)\}
\end{align*} | \begin{align*}
prio_1 & = 3 \\
proc_1 & = 1
\end{align*} |

| \begin{align*}
T_2 & = 2.000 \\
D_2 & = 2.000 \\
V_2 & = \{v_2^1\} \\
C_2 & = 0.250 \\
pd_2 & = 0.500 \\
CO_2 & = \{(\{v_3^1\}, 0.5), (\{v_4^1\}, 1.0), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0)\}
\end{align*} | \begin{align*}
prio_2 & = 2 \\
proc_2 & = 1
\end{align*} |

| \begin{align*}
T_3 & = 2.000 \\
D_3 & = 2.000 \\
V_3 & = \{v_3^1\} \\
C_3 & = 0.250 \\
pd_3 & = 0.500 \\
CO_3 & = \{(\{v_1^1\}, 1.0), (\{v_2^1\}, 0.5)\}
\end{align*} | \begin{align*}
prio_3 & = 3 \\
proc_3 & = 2
\end{align*} |

| \begin{align*}
T_4 & = 2.000 \\
D_4 & = 2.000 \\
V_4 & = \{v_4^1\} \\
C_4 & = 0.250 \\
pd_4 & = 0.500 \\
CO_4 & = \{(\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0)\}
\end{align*} | \begin{align*}
prio_4 & = 2 \\
proc_4 & = 2
\end{align*} |

| \begin{align*}
T_5 & = 2.250 \\
D_5 & = 2.250 \\
V_5 & = \{v_1^1, v_5^2\} \\
C_5 & = 0.500 \\
pd_5 & = 1.000 \\
CO_5 & = \{(\{v_1^1\}, 1.0), (\{v_2^1\}, 1.0)\}
\end{align*} | \begin{align*}
prio_5 & = 1 \\
proc_5 & = 2
\end{align*} |
Taskset example

\[|\tau| = 5 \quad \text{procs}(\Pi) = \{P_1, P_2\}\]

\[
\begin{align*}
T_1 &= 1.500 & \text{prio}_1 &= 3 & \text{proc}_1 &= 1 \\
D_1 &= 1.500 & V_1 &= \{v_1^1\} & \text{CO}_1 &= \{(\{v_3^1\}, 1.0), (\{v_4^1\}, 0.5), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0)\} \\
C_1 &= 0.250 & \text{pd}_1 &= 0.500
\end{align*}
\]

\[
\begin{align*}
T_2 &= 2.000 & \text{prio}_2 &= 2 & \text{proc}_2 &= 1 \\
D_2 &= 2.000 & V_2 &= \{v_2^1\} & \text{CO}_2 &= \{(\{v_3^1\}, 0.5), (\{v_4^1\}, 1.0), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0)\} \\
C_2 &= 0.250 & \text{pd}_2 &= 0.500
\end{align*}
\]

\[
\begin{align*}
T_3 &= 2.000 & \text{prio}_3 &= 3 & \text{proc}_3 &= 2 \\
D_3 &= 2.000 & V_3 &= \{v_3^1\} & \text{CO}_3 &= \{(\{v_1^1\}, 1.0), (\{v_2^1\}, 0.5)\} \\
C_3 &= 0.250 & \text{pd}_3 &= 0.500
\end{align*}
\]

\[
\begin{align*}
T_4 &= 2.000 & \text{prio}_4 &= 2 & \text{proc}_4 &= 2 \\
D_4 &= 2.000 & V_4 &= \{v_4^1\} & \text{CO}_4 &= \{(\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0)\} \\
C_4 &= 0.250 & \text{pd}_4 &= 0.500
\end{align*}
\]

\[
\begin{align*}
T_5 &= 2.250 & \text{prio}_5 &= 1 & \text{proc}_5 &= 2 \\
D_5 &= 2.250 & V_5 &= \{v_5^1, v_5^2\} & \text{CO}_5 &= \{(\{v_1^1\}, 1.0), (\{v_2^1\}, 1.0)\} \\
C_5 &= 0.500 & \text{pd}_5 &= 1.000
\end{align*}
\]

\[
\begin{align*}
C_5^2 &= 0.125 & \text{pd}_5^2 &= 0.500 & \text{CO}_5^2 &= \{(\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0)\}
\end{align*}
\]
Taskset example

|\tau| = 5 \quad \text{procs}(\Pi) = \{P_1, P_2\}

\begin{align*}
T_1 & = 1.500 & D_1 & = 1.500 & V_1 = \{\nu_1^1\} & \text{prio}_1 = 3 & \text{proc}_1 = 1 \\
C_1 & = 0.250 & p_{d1} & = 0.500 & \text{CO}_1 = \{\{\nu_3^1, 1.0\}, \{\nu_4^1, 0.5\}, \{\nu_5^1, 1.0\}, \{\nu_5^2, 1.0\}\}
\end{align*}

\begin{align*}
T_2 & = 2.000 & D_2 & = 2.000 & V_2 = \{\nu_2^1\} & \text{prio}_2 = 2 & \text{proc}_2 = 1 \\
C_2 & = 0.250 & p_{d2} & = 0.500 & \text{CO}_2 = \{\{\nu_3^1, 0.5\}, \{\nu_4^1, 1.0\}, \{\nu_5^1, 1.0\}, \{\nu_5^2, 1.0\}\}
\end{align*}

\begin{align*}
T_3 & = 2.000 & D_3 & = 2.000 & V_3 = \{\nu_3^1\} & \text{prio}_3 = 3 & \text{proc}_3 = 2 \\
C_3 & = 0.250 & p_{d3} & = 0.500 & \text{CO}_3 = \{\{\nu_1^1, 1.0\}, \{\nu_2^1, 0.5\}\}
\end{align*}

\begin{align*}
T_4 & = 2.000 & D_4 & = 2.000 & V_4 = \{\nu_4^1\} & \text{prio}_4 = 2 & \text{proc}_4 = 2 \\
C_4 & = 0.250 & p_{d4} & = 0.500 & \text{CO}_4 = \{\{\nu_1^1, 0.5\}, \{\nu_2^1, 1.0\}\}
\end{align*}

\begin{align*}
T_5 & = 2.250 & D_5 & = 2.250 & V_5 = \{\nu_5^1, \nu_5^2\} & \text{prio}_5 = 1 & \text{proc}_5 = 2 \\
C_5 & = 0.500 & p_{d5} & = 1.000 & \text{CO}_5 = \{\{\nu_1^1, 1.0\}, \{\nu_2^1, 1.0\}\}
\end{align*}

\begin{align*}
C_5 & = 0.125 & p_{d5} & = 0.500 & \text{CO}_5 = \{\{\nu_1^1, 0.5\}, \{\nu_2^1, 1.0\}\}
\end{align*}

Task 3
Taskset example

\[ |\tau| = 5 \quad \text{procs}(\Pi) = \{P_1, P_2\} \]

\[
\begin{align*}
T_1 &= 1.500 & D_1 &= 1.500 & V_1 &= \{v^1_1\} & \text{prio}_1 = 3 & \text{proc}_1 = 1 \\
C_1 &= 0.250 & \text{pd}_1 &= 0.500 & \text{CO}_1 &= \{(\{v^1_3\}, 1.0), (\{v^1_4\}, 0.5), (\{v^1_5\}, 1.0), (\{v^2_5\}, 1.0)\}
\end{align*}
\]

\[
\begin{align*}
T_2 &= 2.000 & D_2 &= 2.000 & V_2 &= \{v^1_2\} & \text{prio}_2 = 2 & \text{proc}_2 = 1 \\
C_2 &= 0.250 & \text{pd}_2 &= 0.500 & \text{CO}_2 &= \{(\{v^1_3\}, 0.5), (\{v^1_4\}, 1.0), (\{v^1_5\}, 1.0), (\{v^2_5\}, 1.0)\}
\end{align*}
\]

\[
\begin{align*}
T_3 &= 2.000 & D_3 &= 2.000 & V_3 &= \{v^1_3\} & \text{prio}_3 = 3 & \text{proc}_3 = 2 \\
C_3 &= 0.250 & \text{pd}_3 &= 0.500 & \text{CO}_3 &= \{(\{v^1_1\}, 1.0), (\{v^1_2\}, 0.5)\}
\end{align*}
\]

\[
\begin{align*}
T_4 &= 2.000 & D_4 &= 2.000 & V_4 &= \{v^1_4\} & \text{prio}_4 = 2 & \text{proc}_4 = 2 \\
C_4 &= 0.250 & \text{pd}_4 &= 0.500 & \text{CO}_4 &= \{(\{v^1_1\}, 0.5), (\{v^1_2\}, 1.0)\}
\end{align*}
\]

\[
\begin{align*}
T_5 &= 2.250 & D_5 &= 2.250 & V_5 &= \{v^1_5, v^2_5\} & \text{prio}_5 = 1 & \text{proc}_5 = 2 \\
C_5 &= 0.500 & \text{pd}_5 &= 1.000 & \text{CO}_5 &= \{(\{v^1_1\}, 1.0), (\{v^1_2\}, 1.0)\}
\end{align*}
\]

\[
\begin{align*}
C_5 &= 0.125 & \text{pd}_5 &= 0.500 & \text{CO}_5 &= \{(\{v^1_1\}, 0.5), (\{v^1_2\}, 1.0)\}
\end{align*}
\]

Task 4
## Taskset example

\[ |\tau| = 5 \quad \text{procs}(\Pi) = \{P_1, P_2\} \]

<table>
<thead>
<tr>
<th>Task</th>
<th>( T )</th>
<th>( D )</th>
<th>( \phi )</th>
<th>( V )</th>
<th>( \pi )</th>
<th>( \text{proc} )</th>
</tr>
</thead>
<tbody>
<tr>
<td>( T_1 )</td>
<td>1.500</td>
<td>1.500</td>
<td>0.250</td>
<td>( {v_1^1} )</td>
<td>3</td>
<td>1</td>
</tr>
<tr>
<td>( C_1 )</td>
<td>0.250</td>
<td>0.500</td>
<td>( {v_3^1, 1.0}, {v_4^1, 0.5}, {v_5^1, 1.0}, {v_5^2, 1.0} )</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>( T_2 )</td>
<td>2.000</td>
<td>2.000</td>
<td>0.250</td>
<td>( {v_2^1} )</td>
<td>2</td>
<td>1</td>
</tr>
<tr>
<td>( C_2 )</td>
<td>0.250</td>
<td>0.500</td>
<td>( {v_3^1, 0.5}, {v_4^1, 1.0}, {v_5^1, 1.0}, {v_5^2, 1.0} )</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>( T_3 )</td>
<td>2.000</td>
<td>2.000</td>
<td>0.250</td>
<td>( {v_3^1} )</td>
<td>3</td>
<td>2</td>
</tr>
<tr>
<td>( C_3 )</td>
<td>0.250</td>
<td>0.500</td>
<td>( {v_1^1, 1.0}, {v_2^1, 0.5} )</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>( T_4 )</td>
<td>2.000</td>
<td>2.000</td>
<td>0.250</td>
<td>( {v_4^1} )</td>
<td>2</td>
<td>2</td>
</tr>
<tr>
<td>( C_4 )</td>
<td>0.250</td>
<td>0.500</td>
<td>( {v_1^1, 0.5}, {v_2^1, 1.0} )</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>( T_5 )</td>
<td>2.250</td>
<td>2.250</td>
<td>0.500</td>
<td>( {v_5^1, v_5^2} )</td>
<td>1</td>
<td>2</td>
</tr>
<tr>
<td>( C_5 )</td>
<td>0.500</td>
<td>1.000</td>
<td>( {v_1^1, 1.0}, {v_2^1, 1.0} )</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>( C_5^2 )</td>
<td>0.125</td>
<td>0.500</td>
<td>( {v_1^1, 0.5}, {v_2^1, 1.0} )</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
Taskset example

Minimum inter-arrival times of tasks

\[
\tau = 5 \quad \text{procs}(\Pi) = \{P_1, P_2\}
\]

\[
\begin{array}{llllll}
T_1 & = 1.500 & D_1 & = 1.500 & V_1 = \{v_1^1\} & \text{prio}_1 = 3 & \text{proc}_1 = 1 \\
C_1 & = 0.250 & \text{pd}_1 & = 0.500 & \text{CO}_1 = \{(\{v_3^1\}, 1.0), (\{v_4^1\}, 0.5), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0)\} \\
\end{array}
\]

\[
\begin{array}{llllll}
T_2 & = 2.000 & D_2 & = 2.000 & V_2 = \{v_2^1\} & \text{prio}_2 = 2 & \text{proc}_2 = 1 \\
C_2 & = 0.250 & \text{pd}_2 & = 0.500 & \text{CO}_2 = \{(\{v_3^1\}, 0.5), (\{v_4^1\}, 1.0), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0)\} \\
\end{array}
\]

\[
\begin{array}{llllll}
T_3 & = 2.000 & D_3 & = 2.000 & V_3 = \{v_3^1\} & \text{prio}_3 = 3 & \text{proc}_3 = 2 \\
C_3 & = 0.250 & \text{pd}_3 & = 0.500 & \text{CO}_3 = \{(\{v_1^1\}, 1.0), (\{v_2^1\}, 0.5)\} \\
\end{array}
\]

\[
\begin{array}{llllll}
T_4 & = 2.000 & D_4 & = 2.000 & V_4 = \{v_4^1\} & \text{prio}_4 = 2 & \text{proc}_4 = 2 \\
C_4 & = 0.250 & \text{pd}_4 & = 0.500 & \text{CO}_4 = \{(\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0)\} \\
\end{array}
\]

\[
\begin{array}{llllll}
T_5 & = 2.250 & D_5 & = 2.250 & V_5 = \{v_1^1, v_5^2\} & \text{prio}_5 = 1 & \text{proc}_5 = 2 \\
C_5 & = 0.500 & \text{pd}_5 & = 1.000 & \text{CO}_5 = \{(\{v_1^1\}, 1.0), (\{v_2^1\}, 1.0)\} \\
C_5 & = 0.125 & \text{pd}_5 & = 0.500 & \text{CO}_5 = \{(\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0)\} \\
\end{array}
\]
Taskset example

|τ| = 5  \quad \text{procs}(\Pi) = \{P_1, P_2\}

| \begin{align*}
  T_1 &= 1.500 \quad [D_1 = 1.500] \\
  C_1 &= 0.250 \quad p_{d_1} = 0.500 \\
\end{align*} |

| \begin{align*}
  V_1 &= \{v_1^1\} \quad \text{prio}_1 = 3 \quad \text{proc}_1 = 1 \\
  \text{CO}_1 &= \{((\{v_3^1\}, 1.0), (\{v_4^1\}, 0.5), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0))\} \\
\end{align*} |

| \begin{align*}
  T_2 &= 2.000 \quad [D_2 = 2.000] \\
  C_2 &= 0.250 \quad p_{d_2} = 0.500 \\
\end{align*} |

| \begin{align*}
  V_2 &= \{v_2^1\} \quad \text{prio}_2 = 2 \quad \text{proc}_2 = 1 \\
  \text{CO}_2 &= \{((\{v_3^1\}, 0.5), (\{v_4^1\}, 1.0), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0))\} \\
\end{align*} |

| \begin{align*}
  T_3 &= 2.000 \quad [D_3 = 2.000] \\
  C_3 &= 0.250 \quad p_{d_3} = 0.500 \\
\end{align*} |

| \begin{align*}
  V_3 &= \{v_3^1\} \quad \text{prio}_3 = 3 \quad \text{proc}_3 = 2 \\
  \text{CO}_3 &= \{((\{v_1^1\}, 1.0), (\{v_2^1\}, 0.5))\} \\
\end{align*} |

| \begin{align*}
  T_4 &= 2.000 \quad [D_4 = 2.000] \\
  C_4 &= 0.250 \quad p_{d_4} = 0.500 \\
\end{align*} |

| \begin{align*}
  V_4 &= \{v_4^1\} \quad \text{prio}_4 = 2 \quad \text{proc}_4 = 2 \\
  \text{CO}_4 &= \{((\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0))\} \\
\end{align*} |

| \begin{align*}
  T_5 &= 2.250 \quad [D_5 = 2.250] \\
  C_5 &= 0.500 \quad p_{d_5} = 1.000 \\
\end{align*} |

| \begin{align*}
  V_5 &= \{v_5^1, v_5^2\} \quad \text{prio}_5 = 1 \quad \text{proc}_5 = 2 \\
  \text{CO}_5 &= \{((\{v_1^1\}, 1.0), (\{v_2^1\}, 1.0))\} \\
\end{align*} |
### Taskset example

\[ |\tau| = 5 \]
\[ \text{procs}(\Pi) = \{P_1, P_2\} \]

<table>
<thead>
<tr>
<th>Task</th>
<th>Speed</th>
<th>Sets of segments for each task</th>
</tr>
</thead>
<tbody>
<tr>
<td>(T_1)</td>
<td>1.500</td>
<td>(D_1 = V_1 = {v_1^1})</td>
</tr>
<tr>
<td>(C_1)</td>
<td>0.250</td>
<td>(D_1 = V_1 = {v_1^1})</td>
</tr>
<tr>
<td>(T_2)</td>
<td>2.000</td>
<td>(D_2 = V_2 = {v_2^1})</td>
</tr>
<tr>
<td>(C_2)</td>
<td>0.250</td>
<td>(D_2 = V_2 = {v_2^1})</td>
</tr>
<tr>
<td>(T_3)</td>
<td>2.000</td>
<td>(D_3 = V_3 = {v_3^1})</td>
</tr>
<tr>
<td>(C_3)</td>
<td>0.250</td>
<td>(D_3 = V_3 = {v_3^1})</td>
</tr>
<tr>
<td>(T_4)</td>
<td>2.000</td>
<td>(D_4 = V_4 = {v_4^1})</td>
</tr>
<tr>
<td>(C_4)</td>
<td>0.250</td>
<td>(D_4 = V_4 = {v_4^1})</td>
</tr>
<tr>
<td>(T_5)</td>
<td>2.250</td>
<td>(D_5 = V_5 = {v_5^1, v_5^2})</td>
</tr>
<tr>
<td>(C_5)</td>
<td>0.500</td>
<td>(D_5 = V_5 = {v_5^1, v_5^2})</td>
</tr>
<tr>
<td>(C_5)</td>
<td>0.125</td>
<td>(D_5 = V_5 = {v_5^1, v_5^2})</td>
</tr>
</tbody>
</table>
Taskset example

\[ |\tau| = 5 \quad \text{procs}(\Pi) = \{P_1, P_2\} \]

\[
\begin{align*}
T_1 &= 1.500 & D_1 &= 1.500 & V_1 = \{v^1_1\} & \text{prio}_1 = 3 & \text{proc}_1 = 1 \\
C^1_1 &= 0.250 & \text{pd}^1_1 &= 0.500 & \text{CO}^1_1 &= \{(\{v^1_3\}, 1.0), (\{v^1_4\}, 0.5), (\{v^1_5\}, 1.0), (\{v^2_5\}, 1.0)\} \\

T_2 &= 2.000 & D_2 &= 2.000 & V_2 = \{v^1_2\} & \text{prio}_2 = 2 & \text{proc}_2 = 1 \\
C^1_2 &= 0.250 & \text{pd}^1_2 &= 0.500 & \text{CO}^1_2 &= \{(\{v^1_3\}, 0.5), (\{v^1_4\}, 1.0), (\{v^1_5\}, 1.0), (\{v^2_5\}, 1.0)\} \\

T_3 &= 2.000 & D_3 &= 2.000 & V_3 = \{v^1_3\} & \text{prio}_3 = 3 & \text{proc}_3 = 2 \\
C^1_3 &= 0.250 & \text{pd}^1_3 &= 0.500 & \text{CO}^1_3 &= \{(\{v^1_1\}, 1.0), (\{v^1_2\}, 0.5)\} \\

T_4 &= 2.000 & D_4 &= 2.000 & V_4 = \{v^1_4\} & \text{prio}_4 = 2 & \text{proc}_4 = 2 \\
C^1_4 &= 0.250 & \text{pd}^1_4 &= 0.500 & \text{CO}^1_4 &= \{(\{v^1_1\}, 0.5), (\{v^1_2\}, 1.0)\} \\

T_5 &= 2.250 & D_5 &= 2.250 & V_5 = \{v^1_5, v^2_5\} & \text{prio}_5 = 1 & \text{proc}_5 = 2 \\
C^1_5 &= 0.500 & \text{pd}^1_5 &= 1.000 & \text{CO}^1_5 &= \{(\{v^1_1\}, 1.0), (\{v^1_2\}, 1.0)\} \\
C^2_5 &= 0.125 & \text{pd}^2_5 &= 0.500 & \text{CO}^2_5 &= \{(\{v^1_1\}, 0.5), (\{v^1_2\}, 1.0)\}
\end{align*}
\]

Task 1 has 1 segment.
Taskset example

\[ |\tau| = 5 \quad \text{procs}(\Pi) = \{P_1, P_2\} \]

\[
\begin{align*}
T_1 &= 1.500 \quad D_1 = 1.500 \quad V_1 = \{v_1^1\} \quad \text{prio}_1 = 3 \quad \text{proc}_1 = 1 \\
C_1^1 &= 0.250 \quad \text{pd}_1^1 = 0.500 \quad \text{CO}_1^1 = \{(\{v_3^1\}, 1.0), (\{v_4^1\}, 0.5), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0)\}
\end{align*}
\]

\[
\begin{align*}
T_2 &= 2.000 \quad D_2 = 2.000 \quad V_2 = \{v_2^1\} \quad \text{prio}_2 = 2 \quad \text{proc}_2 = 1 \\
C_2^1 &= 0.250 \quad \text{pd}_2^1 = 0.500 \quad \text{CO}_2^1 = \{(\{v_3^1\}, 0.5), (\{v_4^1\}, 1.0), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0)\}
\end{align*}
\]

\[
\begin{align*}
T_3 &= 2.000 \quad D_3 = 2.000 \quad V_3 = \{v_3^1\} \quad \text{prio}_3 = 3 \quad \text{proc}_3 = 2 \\
C_3^1 &= 0.250 \quad \text{pd}_3^1 = 0.500 \quad \text{CO}_3^1 = \{(\{v_1^1\}, 1.0), (\{v_2^1\}, 0.5)\}
\end{align*}
\]

\[
\begin{align*}
T_4 &= 2.000 \quad D_4 = 2.000 \quad V_4 = \{v_4^1\} \quad \text{prio}_4 = 2 \quad \text{proc}_4 = 2 \\
C_4^1 &= 0.250 \quad \text{pd}_4^1 = 0.500 \quad \text{CO}_4^1 = \{(\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0)\}
\end{align*}
\]

\[
\begin{align*}
T_5 &= 2.250 \quad D_5 = 2.250 \quad V_5 = \{v_5^1, v_5^2\} \quad \text{prio}_5 = 1 \quad \text{proc}_5 = 2 \\
C_5^1 &= 0.500 \quad \text{pd}_5^1 = 1.000 \quad \text{CO}_5^1 = \{(\{v_1^1\}, 1.0), (\{v_2^1\}, 1.0)\}
\end{align*}
\]

\[
\begin{align*}
C_5^2 &= 0.125 \quad \text{pd}_5^2 = 0.500 \quad \text{CO}_5^2 = \{(\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0)\}
\end{align*}
\]

Task 5 has 2 segments.
Taskset example

Priorities for each task

<table>
<thead>
<tr>
<th>Task</th>
<th>Priority</th>
<th>Precedence</th>
<th>( \tau )</th>
<th>( \text{procs}(\Pi) = {P_1, P_2} )</th>
</tr>
</thead>
<tbody>
<tr>
<td>( T_1 )</td>
<td>1</td>
<td>( D_1 )</td>
<td>1.500</td>
<td>( {v_1^1} )</td>
</tr>
<tr>
<td>( C_1 )</td>
<td>( D_1 )</td>
<td>1.500</td>
<td>( {v_1^1} )</td>
<td>( \text{proc}_1 = 1 )</td>
</tr>
<tr>
<td>( V_1 )</td>
<td>( {v_1^1} )</td>
<td>( \text{pd}_1 )</td>
<td>0.500</td>
<td>( \text{CO}_1 = {({v_3^1}, 1.0), ({v_4^1}, 0.5), ({v_5^1}, 1.0), ({v_5^2}, 1.0)} )</td>
</tr>
<tr>
<td>( T_2 )</td>
<td>2</td>
<td>( D_2 )</td>
<td>2.000</td>
<td>( {v_2^1} )</td>
</tr>
<tr>
<td>( C_2 )</td>
<td>( D_2 )</td>
<td>2.000</td>
<td>( {v_2^1} )</td>
<td>( \text{proc}_2 = 1 )</td>
</tr>
<tr>
<td>( V_2 )</td>
<td>( {v_2^1} )</td>
<td>( \text{pd}_2 )</td>
<td>0.500</td>
<td>( \text{CO}_2 = {({v_3^1}, 0.5), ({v_4^1}, 1.0), ({v_5^1}, 1.0), ({v_5^2}, 1.0)} )</td>
</tr>
<tr>
<td>( T_3 )</td>
<td>2</td>
<td>( D_3 )</td>
<td>2.000</td>
<td>( {v_3^1} )</td>
</tr>
<tr>
<td>( C_3 )</td>
<td>( D_3 )</td>
<td>2.000</td>
<td>( {v_3^1} )</td>
<td>( \text{proc}_3 = 2 )</td>
</tr>
<tr>
<td>( V_3 )</td>
<td>( {v_3^1} )</td>
<td>( \text{pd}_3 )</td>
<td>0.500</td>
<td>( \text{CO}_3 = {({v_1^1}, 1.0), ({v_2^1}, 0.5)} )</td>
</tr>
<tr>
<td>( T_4 )</td>
<td>2</td>
<td>( D_4 )</td>
<td>2.000</td>
<td>( {v_4^1} )</td>
</tr>
<tr>
<td>( C_4 )</td>
<td>( D_4 )</td>
<td>2.000</td>
<td>( {v_4^1} )</td>
<td>( \text{proc}_4 = 2 )</td>
</tr>
<tr>
<td>( V_4 )</td>
<td>( {v_4^1} )</td>
<td>( \text{pd}_4 )</td>
<td>0.500</td>
<td>( \text{CO}_4 = {({v_1^1}, 0.5), ({v_2^1}, 1.0)} )</td>
</tr>
<tr>
<td>( T_5 )</td>
<td>2.250</td>
<td>( D_5 )</td>
<td>2.250</td>
<td>( {v_5^1, v_5^2} )</td>
</tr>
<tr>
<td>( C_5 )</td>
<td>( D_5 )</td>
<td>2.250</td>
<td>( {v_5^1, v_5^2} )</td>
<td>( \text{proc}_5 = 2 )</td>
</tr>
<tr>
<td>( V_5 )</td>
<td>( {v_5^1, v_5^2} )</td>
<td>( \text{pd}_5 )</td>
<td>1.000</td>
<td>( \text{CO}_5 = {({v_1^1}, 1.0), ({v_2^1}, 1.0)} )</td>
</tr>
<tr>
<td>( \text{pd}_5 )</td>
<td>( {v_5^1} )</td>
<td>0.500</td>
<td>( \text{CO}_5 )</td>
<td>( {({v_1^1}, 0.5), ({v_2^1}, 1.0)} )</td>
</tr>
</tbody>
</table>
Taskset example

<table>
<thead>
<tr>
<th>task</th>
<th>( T )</th>
<th>( D )</th>
<th>( V )</th>
<th>( \text{prio} )</th>
<th>( \text{proc} )</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1.500</td>
<td>1.500</td>
<td>{v_1^1}</td>
<td>3</td>
<td>1</td>
</tr>
<tr>
<td>2</td>
<td>2.000</td>
<td>2.000</td>
<td>{v_2^1}</td>
<td>2</td>
<td>1</td>
</tr>
<tr>
<td>3</td>
<td>2.000</td>
<td>2.000</td>
<td>{v_3^1}</td>
<td>3</td>
<td>2</td>
</tr>
<tr>
<td>4</td>
<td>2.000</td>
<td>2.000</td>
<td>{v_4^1}</td>
<td>2</td>
<td>2</td>
</tr>
<tr>
<td>5</td>
<td>2.250</td>
<td>2.250</td>
<td>{v_5^1, v_5^2}</td>
<td>1</td>
<td>2</td>
</tr>
</tbody>
</table>

Processor to each task is assigned

procs(\( \Pi \)) = \{P_1, P_2\}

\( |\tau| = 5 \)
### Taskset example

| $|\tau| = 5$ | $\text{procs}(\Pi) = \{P_1, P_2\}$ |
|----------------|-------------------|
| $T_1 = 1.500$ | $D_1 = 1.500$    |
| $C_1 = 0.250$ | $V_1 = \{v_1^1\}$ |
| $\text{pd}_1 = 0.500$ | $\text{prio}_1 = 3$ |
| $\text{proc}_1 = 1$ | $\text{CO}_1 = \{(\{v_3\}, 1.0), (\{v_4\}, 0.5), (\{v_5\}, 1.0), (\{v_5^2\}, 1.0)\}$ |

| $T_2 = 2.000$ | $D_2 = 2.000$ |
| $C_2 = 0.250$ | $V_2 = \{v_2^1\}$ |
| $\text{pd}_2 = 0.500$ | $\text{prio}_2 = 2$ |
| $\text{proc}_2 = 1$ | $\text{CO}_2 = \{(\{v_3\}, 0.5), (\{v_4\}, 1.0), (\{v_5\}, 1.0), (\{v_5^2\}, 1.0)\}$ |

| $T_3 = 2.000$ | $D_3 = 2.000$ |
| $C_3 = 0.250$ | $V_3 = \{v_3^1\}$ |
| $\text{pd}_3 = 0.500$ | $\text{prio}_3 = 3$ |
| $\text{proc}_3 = 2$ | $\text{CO}_3 = \{(\{v_1\}, 1.0), (\{v_2\}, 0.5)\}$ |

| $T_4 = 2.000$ | $D_4 = 2.000$ |
| $C_4 = 0.250$ | $V_4 = \{v_4^1\}$ |
| $\text{pd}_4 = 0.500$ | $\text{prio}_4 = 2$ |
| $\text{proc}_4 = 2$ | $\text{CO}_4 = \{(\{v_1\}, 0.5), (\{v_2\}, 1.0)\}$ |

| $T_5 = 2.250$ | $D_5 = 2.250$ |
| $C_5 = 0.500$ | $V_5 = \{v_5^1, v_5^2\}$ |
| $\text{pd}_5 = 1.000$ | $\text{prio}_5 = 1$ |
| $\text{proc}_5 = 2$ | $\text{CO}_5 = \{(\{v_1\}, 1.0), (\{v_2\}, 1.0)\}$ |

| $C_5 = 0.125$ | $\text{pd}_5 = 0.500$ |
| $\text{CO}_5 = \{(\{v_1\}, 0.5), (\{v_2\}, 1.0)\}$ |
### Taskset Example

<table>
<thead>
<tr>
<th>Task</th>
<th>Speed (s)</th>
<th>Duration (s)</th>
<th>Prioritization</th>
<th>Co-runners</th>
</tr>
</thead>
<tbody>
<tr>
<td>Task 1</td>
<td>$1.500$</td>
<td>$1.500$</td>
<td>$3$</td>
<td>${v_1^1}$</td>
</tr>
<tr>
<td>Task 2</td>
<td>$2.000$</td>
<td>$2.000$</td>
<td>$2$</td>
<td>${v_2^1}$</td>
</tr>
<tr>
<td>Task 3</td>
<td>$2.000$</td>
<td>$2.000$</td>
<td>$3$</td>
<td>${v_3^1}$</td>
</tr>
<tr>
<td>Task 4</td>
<td>$2.000$</td>
<td>$2.000$</td>
<td>$2$</td>
<td>${v_4^1}$</td>
</tr>
<tr>
<td>Task 5</td>
<td>$2.250$</td>
<td>$2.250$</td>
<td>$1$</td>
<td>${v_5^1, v_5^2}$</td>
</tr>
</tbody>
</table>

**Speed as function of co-runners**

*Example Co-runners:*

- $CO_1^1 = \{(\{v_3^1\}, 1.0), (\{v_4^1\}, 0.5), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0)\}$
- $CO_2^1 = \{(\{v_3^1\}, 0.5), (\{v_4^1\}, 1.0), (\{v_5^1\}, 1.0), (\{v_5^2\}, 1.0)\}$
- $CO_3^1 = \{(\{v_3^1\}, 1.0), (\{v_2^1\}, 0.5)\}$
- $CO_4^1 = \{(\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0)\}$
- $CO_5^1 = \{(\{v_1^1\}, 1.0), (\{v_2^1\}, 1.0)\}$
- $CO_5^2 = \{(\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0)\}$
Taskset example

If segment 1 of task 1 executes in parallel with segment 1 of task 4, then Segment 1 of task 1 executes with speed 0.5.

\[ |\tau| = 5 \quad \text{procs}(\Pi) = \{P_1, P_2\} \]

\begin{align*}
T_1 &= 1.500 \quad D_1 = 1.500 \quad V_1 = \{v_1^1\} \quad \text{prio}_1 = 3 \quad \text{proc}_1 = 1 \\
C_1^1 &= 0.250 \quad \text{pd}_1^1 = 0.500 \quad \text{CO}_1^1 = \{(\{v_3^1\}, 1.0), (\{v_4^1, 0.5\}, (\{v_5^1\}, 1.0), (\{v_2^2\}, 1.0))
\end{align*}

\begin{align*}
T_2 &= 2.000 \quad D_2 = 2.000 \quad V_2 = \{v_2^1\} \quad \text{prio}_2 = 2 \quad \text{proc}_2 = 1 \\
C_2^1 &= 0.250 \quad \text{pd}_2^1 = 0.500 \quad \text{CO}_2^1 = \{(\{v_3^1\}, 0.5), (\{v_4^1\}, 1.0), (\{v_5^1\}, 1.0), (\{v_2^2\}, 1.0))
\end{align*}

\begin{align*}
T_3 &= 2.000 \quad D_3 = 2.000 \quad V_3 = \{v_3^1\} \quad \text{prio}_3 = 3 \quad \text{proc}_3 = 2 \\
C_3^1 &= 0.250 \quad \text{pd}_3^1 = 0.500 \quad \text{CO}_3^1 = \{(\{v_1^1\}, 1.0), (\{v_2^1\}, 0.5)\}
\end{align*}

\begin{align*}
T_4 &= 2.000 \quad D_4 = 2.000 \quad V_4 = \{v_4^1\} \quad \text{prio}_4 = 2 \quad \text{proc}_4 = 2 \\
C_4^1 &= 0.250 \quad \text{pd}_4^1 = 0.500 \quad \text{CO}_4^1 = \{(\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0)\}
\end{align*}

\begin{align*}
T_5 &= 2.250 \quad D_5 = 2.250 \quad V_5 = \{v_5^1, v_5^2\} \quad \text{prio}_5 = 1 \quad \text{proc}_5 = 2 \\
C_5^1 &= 0.500 \quad \text{pd}_5^1 = 1.000 \quad \text{CO}_5^1 = \{(\{v_1^1\}, 1.0), (\{v_2^1\}, 1.0))
\end{align*}

\begin{align*}
C_5^2 &= 0.125 \quad \text{pd}_5^2 = 0.500 \quad \text{CO}_5^2 = \{(\{v_1^1\}, 0.5), (\{v_2^1\}, 1.0)\}
\end{align*}
Taskset example

If segment 1 of task 1 executes in parallel with a segmentset for which information in CO is not available, then segment 1 of task 1 executes with speed 0.5.

\[ |\tau| = 5 \quad \text{procs}(\Pi) = \{P_1, P_2\} \]

\[
\begin{align*}
T_1 &= 1.500 \quad D_1 = 1.500 \quad V_1 = \{v_1^1\} \quad \text{prio}_1 = 3 \quad \text{proc}_1 = 1 \\
C_1 &= 0.250 \quad \text{pd}_1 = 0.500 \\
\text{CO}_1 &= \{(v_3^3, 1.0), (v_4^1, 0.5), (v_5^1, 1.0), (v_5^2, 1.0)\}
\end{align*}
\]

\[
\begin{align*}
T_2 &= 2.000 \quad D_2 = 2.000 \quad V_2 = \{v_2^1\} \quad \text{prio}_2 = 2 \quad \text{proc}_2 = 1 \\
C_2 &= 0.250 \quad \text{pd}_2 = 0.500 \\
\text{CO}_2 &= \{(v_3^3, 0.5), (v_4^1, 1.0), (v_5^1, 1.0), (v_5^2, 1.0)\}
\end{align*}
\]

\[
\begin{align*}
T_3 &= 2.000 \quad D_3 = 2.000 \quad V_3 = \{v_3^1\} \quad \text{prio}_3 = 3 \quad \text{proc}_3 = 2 \\
C_3 &= 0.250 \quad \text{pd}_3 = 0.500 \\
\text{CO}_3 &= \{(v_1^1, 1.0), (v_2^1, 0.5)\}
\end{align*}
\]

\[
\begin{align*}
T_4 &= 2.000 \quad D_4 = 2.000 \quad V_4 = \{v_4^1\} \quad \text{prio}_4 = 2 \quad \text{proc}_4 = 2 \\
C_4 &= 0.250 \quad \text{pd}_4 = 0.500 \\
\text{CO}_4 &= \{(v_1^1, 0.5), (v_2^1, 1.0)\}
\end{align*}
\]

\[
\begin{align*}
T_5 &= 2.250 \quad D_5 = 2.250 \quad V_5 = \{v_5^1, v_5^2\} \quad \text{prio}_5 = 1 \quad \text{proc}_5 = 2 \\
C_5 &= 0.500 \quad \text{pd}_5 = 1.000 \\
\text{CO}_5 &= \{(v_1^1, 1.0), (v_2^1, 1.0)\}
\end{align*}
\]

\[
\begin{align*}
C_5 &= 0.125 \quad \text{pd}_5 = 0.500 \\
\text{CO}_5 &= \{(v_1^1, 0.5), (v_2^1, 1.0)\}
\end{align*}
\]
**Big Research Questions**

Q1: What is a good abstraction that describes the effect of undocumented hardware? (DONE)

Q2: How to create a schedulability analysis that uses this abstraction?

Before discussing them, let us discuss:
1. Other approaches (DONE)
2. General ideas for abstractions in other disciplines (DONE)
Big Research Questions

Q1: What is a good abstraction that describes the effect of undocumented hardware?

Q2: How to create a schedulability analysis that uses this abstraction?

Let us discuss Q2.
Big Research Questions

**Q1:** What is a good abstraction that describes the effect of undocumented hardware?

**Q2:** How to create a schedulability analysis that uses this abstraction?

In order to simplify our discussion initially, let us consider a taskset with only Red and Brown task. Also, let us consider that the minimum inter-arrival time of each of these tasks is infinity (i.e., generates just a single job).
Let $\text{dur}_{\{\text{red}\}}$ denote the cumulative duration that the set of tasks that execute is the set $\{\text{red}\}$.
Let $\text{dur}_{\{\text{red}\}}$ denote the cumulative duration that the set of tasks that execute is the set $\{\text{red}\}$.

Let $\text{dur}_{\{\text{red,brown}\}}$ denote the cumulative duration that the set of tasks that execute is the set $\{\text{red,brown}\}$. 
Maximize

\[
\text{dur}_{\{\text{red,brown}\}} + \text{dur}_{\{\text{red}\}}
\]

subject to

\[
0.45 \cdot \text{dur}_{\{\text{red,brown}\}} + 1.00 \cdot \text{dur}_{\{\text{red}\}} \leq C_{\text{red}}
\]

\[
0.6 \cdot \text{dur}_{\{\text{red,brown}\}} + 1.0 \cdot \text{dur}_{\{\text{brown}\}} \leq C_{\text{brown}}
\]
Maximize
\[ \text{dur}_{\{\text{red,brown}\}} + \text{dur}_{\{\text{red}\}} \]
subject to
\[ 0.45 \times \text{dur}_{\{\text{red,brown}\}} + 1.00 \times \text{dur}_{\{\text{red}\}} \leq C_{\text{red}} \]
\[ 0.6 \times \text{dur}_{\{\text{red,brown}\}} + 1.0 \times \text{dur}_{\{\text{brown}\}} \leq C_{\text{brown}} \]

Lower bound on the speed of execution of Red when it executes in parallel with Brown.
Maximize $\text{dur}_{\{\text{red, brown}\}} + \text{dur}_{\{\text{red}\}}$
subject to

$0.45\times\text{dur}_{\{\text{red, brown}\}} + 1.00\times\text{dur}_{\{\text{red}\}} \leq C_{\text{red}}$

$0.6\times\text{dur}_{\{\text{red, brown}\}} + 1.0\times\text{dur}_{\{\text{brown}\}} \leq C_{\text{brown}}$

Lower bound on the speed of execution of Brown when it executes in parallel with Red.
Maximize
\[ \text{dur}_{\text{red,brown}} + \text{dur}_{\text{red}} \]
subject to
\[ 0.45 \times \text{dur}_{\text{red,brown}} + 1.00 \times \text{dur}_{\text{red}} \leq C_{\text{red}} \]
\[ 0.6 \times \text{dur}_{\text{red,brown}} + 1.0 \times \text{dur}_{\text{brown}} \leq C_{\text{brown}} \]
\[ 0 \leq \text{dur}_{\text{red,brown}} \]
\[ 0 \leq \text{dur}_{\text{red}} \]
\[ 0 \leq \text{dur}_{\text{brown}} \]
Timing Analysis of Software Executing on Undocumented Multicore Processors

Maximize
\[ \text{dur}_{\{\text{red,brown}\}} + \text{dur}_{\{\text{red}\}} \]
subject to
\[ 0.45 \times \text{dur}_{\{\text{red,brown}\}} + 1.00 \times \text{dur}_{\{\text{red}\}} \leq C_{\text{red}} \]
\[ 0.6 \times \text{dur}_{\{\text{red,brown}\}} + 1.0 \times \text{dur}_{\{\text{brown}\}} \leq C_{\text{brown}} \]
\[ 0 \leq \text{dur}_{\{\text{red,brown}\}} \]
\[ 0 \leq \text{dur}_{\{\text{red}\}} \]
\[ 0 \leq \text{dur}_{\{\text{brown}\}} \]

Solving this optimization problem yields an upper bound on the response time of Red.
Maximize
\[ \text{dur}_{\text{red, brown}} + \text{dur}_{\text{red}} \]
subject to
\[ 0.45 \times \text{dur}_{\text{red, brown}} + 1.00 \times \text{dur}_{\text{red}} \leq C_{\text{red}} \]
\[ 0.6 \times \text{dur}_{\text{red, brown}} + 1.0 \times \text{dur}_{\text{brown}} \leq C_{\text{brown}} \]
\[ 0 \leq \text{dur}_{\text{red, brown}} \]
\[ 0 \leq \text{dur}_{\text{red}} \]
\[ 0 \leq \text{dur}_{\text{brown}} \]
Let us formulate this in general.
Let us consider a time interval of duration $t$. Task $i$ (victim) Arrives.
Let $\tau$ denote the taskset. Let $\Pi$ denote the computer platform.

Let $\tau$ denote the taskset. Let $\Pi$ denote the computer platform.

Task $i$ (victim) Arrives

Busy executing task $i$ or higher priority

Let us consider a time interval of duration $t$
Let \( \tau \) denote the taskset.
Let \( \Pi \) denote the computer platform.

Given task \( i \), find a value of \( t \) such that an upper bound on the duration of time during which the processor (Processor 3) is busy executing task \( i \) or higher priority tasks is equal to \( t \).

Let us consider a time interval of duration \( t \).

Task \( i \) Arrives (victim)
Let $\tau$ denote the taskset.
Let $\Pi$ denote the computer platform.

Given task $i$, find a value of $t$ such that an upper bound on the duration of time during which the processor (Processor 3) is busy executing task $i$ or higher priority tasks is equal to $t$.

Let us consider a time interval of duration $t$.
Define \( \text{reqlp}(\tau, \Pi, i, t) \) as the optimal value of the objective function of the following:

\[
\begin{align*}
\text{maximize} & \quad \sum_{i' \in \text{hep}(\tau, \Pi, i)} \sum_{v_{i'}^k \in V_{i'}} \sum_{s \in S(\tau, \Pi, i', k')} \text{du}_s \\
\text{subject to} & \quad \forall i' \in \text{hep}(\tau, \Pi, i) \land \forall v_{i'}^k \in V_{i'}, \quad \sum_{s \in S(\tau, \Pi, i', k')} \text{pw}_{i', s}^k \times \text{du}_s \leq \left[ \frac{t}{T_{i'}} \right] \times C_{i'}^{k'} \\
& \quad \forall i' \in \text{top}(\tau, \Pi, i) \land \forall v_{i'}^k \in V_{i'}, \quad \sum_{s \in S(\tau, \Pi, i', k')} \text{pw}_{i', s}^k \times \text{du}_s \leq x UB(\tau, \Pi, i', k', t) \\
& \quad \forall s \in S(\tau, \Pi) \land (\exists (i'', k'') \text{ s.t. } (i'' \in \text{hep}(\tau, \Pi, i)) \land ((i'', k'') \in s)) \\
& \quad \forall s \in S(\tau, \Pi) \land (\exists (i'', k''') \text{ s.t. } (i'' \in \text{hep}(\tau, \Pi, i)) \land ((i'', k''') \in s)) \\
& \quad \text{du}_s \in \mathbb{R}_{\geq 0}
\end{align*}
\]

Schedulability analysis (timing verification) is done as follows:

**Theorem 4.2.** \((\forall \tau \in \tau, (\exists t \in [0, D_i], \text{reqlp}(\tau, \Pi, i, t) \leq t)) \Rightarrow \tau \text{ is schedulable on } \Pi\)**
Define $\text{reqlp}(\tau, \Pi, i, t)$ as the optimal value of the objective function of the following:

\[
\text{maximize} \quad \sum_{i' \in \text{hep}(	au, \Pi, i)} \sum_{V_i, s' \in S(\tau, \Pi, i', k')} \sum_{s \in S(\tau, \Pi, i', k')} d_{u_s} \\
\text{subject to} \\
\forall i' \in \text{hep}(\tau, \Pi, i), \forall v_{i'}^{k'} \in V_i, \sum_{s \in S(\tau, \Pi, i', k')} p_{w_{i'}^{k'}, s \{i', k'} \times d_{u_s} \leq \left[ \frac{t}{T_{i'}} \right] \times C_{i'}^{k'} \\
\forall i' \in \text{top}(\tau, \Pi, i), \forall v_{i'}^{k'} \in V_i, \sum_{s \in S(\tau, \Pi, i', k')} p_{w_{i'}^{k'}, s \{i', k'} \times d_{u_s} \leq x_{UB}(\tau, \Pi, i', k', t) \\
(\exists (i'', k'') \text{ s.t. } (i'' \in \text{hep}(\tau, \Pi, i)) \land ((i'', k'') \in s)) \\
\forall s \in S(\tau, \Pi) \text{ s.t. } (\exists (i'', k'') \text{ s.t. } (i'' \in \text{hep}(\tau, \Pi, i)) \land ((i'', k'') \in s)) \Rightarrow d_{u_s} \in \mathbb{R}_{\geq 0}
\]

Schedulability analysis (timing verification) is done as follows:

**Theorem 4.2.** $\left( \forall i \in \tau, \exists t \in [0, D_i], \text{reqlp}(\tau, \Pi, i, t) \leq t \right) \Rightarrow \tau$ is schedulable on $\Pi$
Define $reqlp(\tau, \Pi, i, t)$ as the optimal value of the objective function of the following:

$$\max \sum_{i' \in \text{hep}(\tau, \Pi, i)} \sum_{s' \in \text{V}_{i'}} \sum_{s \in S(\tau, \Pi, i', k') \times \text{du}_s \leq \left[\frac{t}{T_{i'}}\right] \times C_{i'}^{k'}$$

subject to

$$\forall i' \in \text{hep}(\tau, \Pi, i), \forall v_{i'}^{k'} \in \text{V}_{i'}, \sum_{s \in S(\tau, \Pi, i', k') \times \text{du}_s \leq \left[\frac{t}{T_{i'}}\right] \times C_{i'}^{k'}$$

$$\forall i' \in \text{top}(\tau, \Pi, i), \forall v_{i'}^{k'} \in \text{V}_{i'}, \sum_{s \in S(\tau, \Pi, i', k') \times \text{du}_s \leq \left[\frac{t}{T_{i'}}\right] \times C_{i'}^{k'}$$

$$\forall s \in S(\tau, \Pi) \text{ s.t. } (\exists (i'', k'') \text{ s.t. } (i'' \in \text{hep}(\tau, \Pi, i) \land ((i'', k'') \in s))) \land (\exists (i', k') \text{ s.t. } ((i', k') \in s), \text{du}_s \in \mathbb{R}_{\geq 0})$$

Schedulability analysis (timing verification) is done as follows:

**Theorem 4.2.** $(\forall \tau_i \in \tau, (\exists t \in [0, D_i], \text{reqlp}(\tau, \Pi, i, t) \leq t)) \Rightarrow \tau$ is schedulable on $\Pi$
Define $\text{reqlp}(\tau, \Pi, i, t)$ as the optimal value of the objective function of the following:

$$\text{maximize} \quad \sum_{i' \in \text{hep}(\tau, \Pi, i)} \sum_{v_{i'} \in V_{i'}} \sum_{s \in S(\tau, \Pi, i', k')} d_{u_s}$$

subject to

$$\forall i' \in \text{hep}(\tau, \Pi, i), \forall v_{i'} \in V_{i'}, \sum_{s \in S(\tau, \Pi, i', k')} p_{w_{i'}} \times d_{u_s} \leq \left[ \frac{t}{T_i'} \right] \times C_{i'}^{k'}$$

$$\forall i' \in \text{top}(\tau, \Pi, i), \forall v_{i'} \in V_{i'}, \sum_{s \in S(\tau, \Pi, i', k')} p_{w_{i'}} \times d_{u_s} \leq x_{UB}(\tau, \Pi, i', k', t)$$

$$(\exists i'' \in \text{hep}(\tau, \Pi, i)) \land ((i'', k'') \in s))$$

$$\forall s \in S(\tau, \Pi) \text{ s.t. } (\exists i'' \in \text{hep}(\tau, \Pi, i)) \land ((i'', k'') \in s), d_{u_s} \in \mathbb{R}_{\geq 0}$$

Schedulability analysis (timing verification) is done as follows:

**Theorem 4.2.** \( (\forall \tau_i \in \tau, (\exists t \in [0, D_i], \text{reqlp}(\tau, \Pi, i, t) \leq t)) \Rightarrow \tau \text{ is schedulable on } \Pi \)
Define $\text{reqlp}(\tau, \Pi, i, t)$ as the optimal value of the objective function of the following:

\[
\begin{align*}
\text{maximize} & \sum_{i' \in \text{hep}(\tau, \Pi, i)} \sum_{v_{i'}^{k'} \in V_{i'}} \sum_{s \in S(\tau, \Pi, i', k')} du_s \\
\text{subject to} & \forall i' \in \text{hep}(\tau, \Pi, i), \forall v_{i'}^{k'} \in V_{i'}, \sum_{s \in S(\tau, \Pi, i', k')} \text{pw}_{i', s \setminus \{i', k'\}}^{k'} \times du_s \leq \left\lfloor \frac{t}{T_{i'}} \right\rfloor \times C_{i'}^{k'} \\
& \forall i' \in \text{top}(\tau, \Pi, i), \forall v_{i'}^{k'} \in V_{i'}, \sum_{s \in S(\tau, \Pi, i', k')} \text{pw}_{i', s \setminus \{i', k'\}}^{k'} \times du_s \leq \text{xUB}(\tau, \Pi, i', k', t) \\
& \forall s \in S(\tau, \Pi) \text{ s.t. } (\exists i'', k'') \text{ s.t. } (i'' \in \text{hep}(\tau, \Pi, i)) \land ((i'', k'') \in s) \\
& \forall s \in S(\tau, \Pi) \text{ s.t. } (\exists i'', k'') \text{ s.t. } (i'' \in \text{hep}(\tau, \Pi, i)) \land ((i'', k'') \in s), du_s \in \mathbb{R}_{\geq 0}
\end{align*}
\]

Schedulability analysis (timing verification) is done as follows:

**Theorem 4.2.** $(\forall \tau_i \in \tau, (\exists t \in [0, D_i], \text{reqlp}(\tau, \Pi, i, t) \leq t)) \Rightarrow \tau$ is schedulable on $\Pi$

Given $\tau, \Pi, i, t$, evaluating $\text{reqlp}(\tau, \Pi, i, t)$ can be done in polynomial time.
Define \( \text{reqlp}(\tau, \Pi, i, t) \) as the optimal value of the objective function of the following:

\[
\begin{align*}
\text{maximize} & \sum_{i' \in \text{hep}(\tau, \Pi, i)} \sum_{v_{i'} \in V_{i'}} \sum_{s \in S(\tau, \Pi, i', k')} d_u_s \\
\text{subject to} & \forall i' \in \text{hep}(\tau, \Pi, i), \forall v_{i'}^{k'} \in V_{i'}, \sum_{s \in S(\tau, \Pi, i', k')} p_w_{i', s \backslash \{i', k'\}} \times d_u_s \leq \left\lceil \frac{t}{T_{i'}} \right\rceil \times C_{i'}^{k'} \\
& \forall i' \in \text{top}(\tau, \Pi, i), \forall v_{i'}^{k'} \in V_{i'}, \sum_{s \in S(\tau, \Pi, i', k')} p_w_{i', s \backslash \{i', k'\}} \times d_u_s \leq \text{xUB}(\tau, \Pi, i', k', t) \\
& \exists \langle i'', k'' \rangle \text{ s.t. } (i'' \in \text{hep}(\tau, \Pi, i)) \land ((i'', k'') \in s) \\
& \forall s \in S(\tau, \Pi) \text{ s.t. } (\exists \langle i'', k'' \rangle \text{ s.t. } (i'' \in \text{hep}(\tau, \Pi, i)) \land ((i'', k'') \in s)), d_u_s \in \mathbb{R}_{\geq 0}
\end{align*}
\]

Schedulability analysis (timing verification) is done as follows:

**Theorem 4.2.** \( (\forall \tau_i \in \tau, (\exists t \in [0, D_i], \text{reqlp}(\tau, \Pi, i, t) \leq t)) \Rightarrow \tau \) is schedulable on \( \Pi \)

Evaluating this can be done in pseudo-polynomial time.
Define $\text{reqlp}(\tau, \Pi, i, t)$ as the optimal value of the objective function of the following:

$$\text{maximize} \quad \sum_{i' \in \text{hep}(\tau, \Pi, i)} \sum_{v_{i'}^k \in V_{i'}} \sum_{s \in S(\tau, \Pi, i', k')} du_s$$

subject to

- $\forall i' \in \text{hep}(\tau, \Pi, i), \forall v_{i'}^k \in V_{i'}, \sum_{s \in S(\tau, \Pi, i', k')} \text{pw}_{i', s \setminus \{i', k'\}}^k \times du_s \leq \left[ \frac{t}{T_{i'}} \right] \times C_{i'}^{k'}$
- $\forall i' \in \text{top}(\tau, \Pi, i), \forall v_{i'}^k \in V_{i'}, \sum_{s \in S(\tau, \Pi, i', k') \land (\exists (i'', k'') \text{ s.t. } (i'' \in \text{hep}(\tau, \Pi, i)) \land (\langle i'', k'' \rangle \in s))} \text{du}_s \leq xUB(\tau, \Pi, i', k', t)$
- $\forall s \in S(\tau, \Pi) \text{ s.t. } (\exists (i'', k'') \text{ s.t. } (i'' \in \text{hep}(\tau, \Pi, i)) \land (\langle i'', k'' \rangle \in s))$, $\text{du}_s \in \mathbb{R}_{\geq 0}$

Schedulability analysis (timing verification) is done as follows:

**Theorem 4.2.** $(\forall \tau_i \in \tau, (\exists t \in [0, D_i], \text{reqlp}(\tau, \Pi, i, t) \leq t)) \Rightarrow \tau$ is schedulable on $\Pi$

This is a generalization of the classic response-time analysis for Rate-Monotonic.
Define \( \text{reqlp}(\tau, \Pi, i, t) \) as the optimal value of the objective function of the following:

\[
\begin{align*}
\text{maximize} & \quad \sum_{i' \in \text{hep}(\tau, \Pi, i)} \sum_{v_{i'}^{k'} \in V_{i'}} \sum_{s \in S(\tau, \Pi, i', k')} du_s \\
\text{subject to} & \quad \forall i' \in \text{hep}(\tau, \Pi, i), \forall v_{i'}^{k'} \in V_{i'}, \sum_{s \in S(\tau, \Pi, i', k')} \left \{ \text{pw}_{i', s}^{k'} \} \right \} \times du_s \leq \left \lceil \frac{t}{T_{i'}} \right \rceil \times C_{i'}^{k'} \\
& \quad \forall i' \in \text{top}(\tau, \Pi, i), \forall v_{i'}^{k'} \in V_{i'}, \sum_{s \in S(\tau, \Pi, i', k')} \left \{ \text{pw}_{i', s}^{k'} \} \right \} \times du_s \leq xUB(\tau, \Pi, i', k', t) \\
& \quad \forall s \in S(\tau, \Pi) \quad (\exists (i'', k'') \text{ s.t. } (i'' \in \text{hep}(\tau, \Pi, i)) \wedge ((i'', k'') \in s)) \\
& \quad \forall s \in S(\tau, \Pi) \text{ s.t. } (\exists (i'', k'') \text{ s.t. } (i'' \in \text{hep}(\tau, \Pi, i)) \wedge ((i'', k'') \in s)), du_s \in \mathbb{R}_{\geq 0}
\end{align*}
\]

Schedulability analysis (timing verification) is done as follows:

**Theorem 4.2.** \((\forall \tau_i \in \tau, (\exists t \in [0, D_i], \text{reqlp}(\tau, \Pi, i, t) \leq t)) \Rightarrow \tau \text{ is schedulable on } \Pi\)

For details (about this theorem and other results about the model), see B. Andersson et al., "Schedulability Analysis of Tasks with Co-Runner-Dependent Execution Times," ACM TECS, 2018.
Define \( \text{reqlp}(\tau, \Pi, i, t) \) as the optimal value of the objective function of the following:

\[
\begin{align*}
\text{maximize} & \quad \sum_{i' \in \text{hep}(\tau, \Pi, i)} \sum_{k' \in V_{i'}} \sum_{s \in S(\tau, \Pi, i', k')} \text{du}_{s} \\
\text{subject to} & \quad \forall i' \in \text{hep}(\tau, \Pi, i), \forall v_{i'}^{k'} \in V_{i'}, \sum_{s \in S(\tau, \Pi, i', k')} \text{pw}_{i', s}^{k'} \times \text{du}_{s} \leq \left[ \frac{t}{T_{i'}} \right] \times C_{i'}^{k'} \\
& \quad \forall i' \in \text{top}(\tau, \Pi, i), \forall v_{i'}^{k'} \in V_{i'}, \sum_{s \in S(\tau, \Pi, i', k')} \text{pw}_{i', s}^{k'} \times \text{du}_{s} \leq \text{xUB}(\tau, \Pi, i', k', t) \\
& \quad \forall s \in S(\tau, \Pi) \text{ s.t. } (\exists (i'', k'') \text{ s.t. } (i'' \in \text{hep}(\tau, \Pi, i')) \wedge ((i'', k'') \in s)) \\
& \quad \forall s \in S(\tau, \Pi) \text{ s.t. } (\exists (i'', k'') \text{ s.t. } (i'' \in \text{hep}(\tau, \Pi, i')) \wedge ((i'', k'') \in s)), \text{du}_{s} \in \mathbb{R}_{\geq 0}
\end{align*}
\]

Schedulability analysis (timing verification) is done as follows:

**Theorem 4.2.** \((\forall \tau_i \in \tau, (\exists t \in [0, D_i], \text{reqlp}(\tau, \Pi, i, t) \leq t)) \Rightarrow \tau \text{ is schedulable on } \Pi\)

We have a schedulability analysis for tasks with co-runner dependent execution times.
Comparing this result to classic Rate-Monotonic Analysis

<table>
<thead>
<tr>
<th>Work</th>
<th>Model (Execution Time Depends on Core Runners)</th>
<th>Time Complexity of Exact Schedulability Analysis</th>
<th>Formulation of Schedulability Test</th>
<th>Time Complexity of Schedulability Test</th>
</tr>
</thead>
<tbody>
<tr>
<td>Previous work</td>
<td>No</td>
<td>( \leq ) Pseudo-polynomial</td>
<td>((\forall \tau_i \in \tau, (\exists t \in [0, D_i], \sum_{j \in h_{ep}(\tau, \Pi, i)} \left\lfloor \frac{t}{T_j} \right\rfloor C_j \leq t)) \Rightarrow \text{schedulable})</td>
<td>Pseudo-polynomial</td>
</tr>
<tr>
<td>This article</td>
<td>Yes</td>
<td>Co-NP-hard in the strong sense</td>
<td>((\forall \tau_i \in \tau, (\exists t \in [0, D_i], rqlp(\tau, \Pi, i, t) \leq t)) \Rightarrow \text{schedulable})</td>
<td>Pseudo-polynomial if the number of processors in (\Pi) is fixed</td>
</tr>
</tbody>
</table>
This result generalizes classic Rate-Monotonic Analysis to undocumented multicore

<table>
<thead>
<tr>
<th>Work</th>
<th>Model (Execution Time Depends on Corunners)</th>
<th>Time Complexity of Exact Schedulability Analysis</th>
<th>Formulation of Schedulability Test</th>
<th>Time Complexity of Schedulability Test</th>
</tr>
</thead>
<tbody>
<tr>
<td>Previous work</td>
<td>No</td>
<td>$\leq$ Pseudo-polynomial</td>
<td>$(\forall \tau_i \in \tau, (\exists t \in [0, D_i], \sum_{j \in hep(\tau, \Pi, i)} \left\lfloor \frac{t}{T_j} \right\rfloor C_j \leq t)) \Leftrightarrow$ schedulable</td>
<td>Pseudo-polynomial</td>
</tr>
<tr>
<td>This article</td>
<td>Yes</td>
<td>Co-NP-hard in the strong sense</td>
<td>$(\forall \tau_i \in \tau, (\exists t \in [0, D_i], reqlp(\tau, \Pi, i, t) \leq t)) \Rightarrow$ schedulable</td>
<td>Pseudo-polynomial if the number of processors in $\Pi$ is fixed</td>
</tr>
</tbody>
</table>
This looks very theoretical. Does this really work in reality?
I understand that for each task, the set of co-runner sets is polynomial in the number of tasks assuming that the number of processors is fixed.
But I still wonder how one can obtain the lower bound of speed of execution of one task given co-runners.
5.2 Obtaining Task Set Parameters for a Given Software System

Every scheduling theory that provides 100% guarantees on timing relies on a model. Traditional scheduling theory relies on knowledge of an upper bound on the execution time of a program, and therefore, the research community has developed methods for finding an upper bound on the execution requirement of a program when the program runs in isolation on a single-core processor (Wilhelm et al. 2003). One can distinguish between (1) static methods that take the program as input and compute an upper bound without running the program, (2) hybrid methods that measure execution times of parts of a program and then use static methods to compute an upper bound for the entire program, and (3) dynamic methods where a software practitioner uses domain knowledge to identify a set of worst-case inputs and then measures the execution time of the program for these inputs. Sometimes the program is run multiple times with variations of inputs and the maximum is added based on engineering judgment. The dynamic method is common in industry today.

The model used in this article, however, also requires that a task/segment is described with a lower bound on its speed as a function of components. Unfortunately, the current state of the art does not exist any method for obtaining such a lower bound while considering all the complexities of modern memory systems. Therefore, in order to use our scheduling theory (as in practice and also in our case study/validation in Section 5.3), we need to develop a method for obtaining such a lower bound. We will do so by using method (3) mentioned above but extending it so that it also provides a lower bound on its speed as a function of components. The new method is shown below as pseudo-code:

1. Let SS be an integer indicating the sample size used in our experimental method. Assign a value to SS (for example $SS = 100$).
2. For each task $\tau_i \in \mathcal{V}$, for each $C^i_k \in \mathcal{V}_k$, do:
   a. Run a job of task $\tau_i$ on the computer platform such that no other tasks execute on this platform.
   b. Repeat the measurements $SS$ times and let $CMB^i_k$ denote the set of these measurements (execution times from measurements of a task executed in isolation).
3. For each task $\tau_i \in \mathcal{V}$, for each $C^i_k \in \mathcal{V}_k$, do:
   a. Find an upper bound on the number of memory accesses performed by a segment of a job of task $\tau_i$. (This can be obtained using performance monitoring counters. Run these measurements $SS$ times. Let $H^i_k$ denote the largest measurement.
   b. Find an upper bound on the time for a single memory access assuming that we do not know the conditions. Let $MA$ denote this (e.g., MA could be $500$ nanoseconds).
   c. $pd^i_k = \max_{C^i_k \in CMB^i_k} \frac{MA}{H^i_k}$.
4. For each task $\tau_i \in \mathcal{V}$, for each $C^i_k \in \mathcal{V}_k$, do:
   a. For each task $\tau_j \in \mathcal{V}$, for each $C^j_l \in \mathcal{V}_l$, do:
      b. For those segments in co, keep running all of them continuously (i.e., if segment $C^i_k$ is in co, then when $C^i_k$ has finished execution, let this segment $C^i_k$ start execution again immediately).
   c. While the segments in (b) execute continuously, execute a single job of segment $C^i_k$ of task $\tau_i$ and measure the execution time of $C^i_k$. Repeat this measurement $SS$ times and let $CMB^i_k$ denote the set of measurements (execution times from measurements of a task executed with co-runners).
OK, I buy that you can obtain these lower bounds on the speed of execution through measurements. But I wonder how trustworthy these numbers are. Have you done any validation?
OK, I buy that you can obtain these lower bounds on the speed of execution through measurements. But I wonder how trustworthy these numbers are. Have you done any validation?

You could do a case study. Choose a set of tasks and use your measurement-based approach to obtain speed as a function of co-runners. Now, you have a taskset. Then run your schedulability analysis on this taskset.
OK, I buy that you can obtain these lower bounds on the speed of execution through measurements. But I wonder how trustworthy these numbers are. Have you done any validation?

Then run the system and measure response times of each task. If you observe one case where the observed response time > calculated upper bound on response time, then you have falsified your theory; otherwise, you have corroborated your theory.
5.3 Validation of Model and Schedulability Analysis

We want to compare the computed upper bound on the response time of a task against the measured response times of jobs of this task in order to find out if there is any case in practice where the measured response time of this task exceeds the upper bound on the response time of this task. If so, the schedulability analysis would be unsafe. To get insight into this question, we pursue a case study with a specific taskset and a real embedded platform (ARM Cortex-A9 quad-core processor).

To avoid cache-related preemption delay, we used the software cache partitioning implementation of Linux/RK (Kim et al. 2013) and assigned a private cache partition to each task.

We constructed a taskset as follows. First, we created six synthetic programs with different intensities of memory accesses on the target platform. We call each such program a building block. There is no shared variable between building blocks. We characterize the execution time and consumer interference using the method in the previous section with SS = 100. The consumer description is complete; that is, the building block is characterized for each possible set of consumers.

Then we create tasks from these building blocks as follows. Task τ₁ consists of executing building block 1. Hence, we model task τ₁ as having a single segment. Task τ₂ consists of executing building block 2 and then executing building block 3. Hence, we model task τ₂ as having two segments. Task τ₃ consists of executing building block 4. Hence, we model task τ₃ as having a single segment. Task τ₄ consists of executing building block 5. Hence, we model task τ₄ as having a single segment. Task τ₅ consists of executing building block 6. Hence, we model task τ₅ as having a single segment. Task τ₆ consists of executing building block 2 and then executing building block 6.

Hence, we model task τ₆ as having two segments.

We assign tasks to processors and assign priorities to tasks and assign T and D parameters and obtain the execution time and consumer interference parameters from the building blocks. This gives us the taskset specified in Table 3. The CO parameters are not shown because they consume lots of space; these parameters are available at the end of the source code of the tool that performs schedulability testing.

Our tool yields the following upper bounds on response times of task: 0.113 for τ₁, 0.554 for τ₂, 0.823 for τ₃, 0.155 for τ₄, 0.146 for τ₅, and 0.007 for τ₆. (units in seconds).

We run the system for 1,000 seconds and measure the response times of jobs. For task τ₁, we record the maximum response time of a job of this task and let τ₁ denote it. From the experiment, we obtain τ₁ = 0.107, τ₂ = 0.483, τ₃ = 0.127, τ₄ = 0.137, τ₅ = 0.142, τ₆ = 0.305. (units in seconds).

With these figures, we observe that the response time overestimates the observed response times by less than 15%. From this experiment, we see that (1) no deadline was missed and (2) the observed response times were close to the computed upper bound.
From B. Andersson et al., "Schedulability Analysis of Tasks with Co-Runner-Dependent Execution Times," ACM TECS, 2018:

With these figures, we obtain that the response time overestimates the observed response times by less than 15%. From this experiment, we see that (1) no deadline was missed and (3) the observed response times were close to the computed upper bound.
Conclusion: It is possible to analyze timing of software executing on undocumented multicore.
Thanks!