Soft Real-Time Systems

Principles: All principles from both general-purpose and hard real-time systems apply, except:
Failure to meet a deadline is considered neither an application failure nor a system failure
A missed deadline just makes the result less "good"
What "less good" means is poorly defined and varies from system to system

Missing deadlines

Number of missed deadlines

Amount by which deadlines are missed

"Cost" of missing the deadlines is equalized (weight, cost function)

Frequency of missed deadlines

Fairness

Stability
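The first few measures above (number, amount, weighted cost, and frequency of misses) can be made concrete with a small sketch. This is only an illustration: the `Job` record, its `weight` field, and the metric names are assumptions made here, not definitions taken from any particular system.

```python
from dataclasses import dataclass

@dataclass
class Job:
    deadline: float        # absolute deadline, in seconds
    finish: float          # actual completion time, in seconds
    weight: float = 1.0    # hypothetical per-job cost weight

def miss_metrics(jobs):
    """Summarize missed deadlines for a list of completed jobs."""
    late = [j for j in jobs if j.finish > j.deadline]
    tardiness = [j.finish - j.deadline for j in late]
    return {
        "missed": len(late),                                   # number of misses
        "miss_rate": len(late) / len(jobs) if jobs else 0.0,   # frequency of misses
        "total_tardiness": sum(tardiness),                     # amount missed by, summed
        "max_tardiness": max(tardiness, default=0.0),          # worst single miss
        "weighted_cost": sum(j.weight * (j.finish - j.deadline) for j in late),
    }

jobs = [Job(1.0, 0.9), Job(2.0, 2.5, weight=2.0), Job(3.0, 3.25)]
metrics = miss_metrics(jobs)
```

Fairness and stability are left out because they depend on how a given system defines them.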

Missed deadlines trigger algorithmic changes: Period
Compute time
Qualitative changes in the output

Example: MPEG Player (24 frames/second): Change frame rate
Change size of image
Change color depth
Change resolution
Drop frames
Drop parts of frames (change focus)
Change dithering/anti-aliasing
Change other processing

Resources: Primarily CPU and network
Everything else, to a lesser degree

API
1. Applications try to run; unimportant ones are shed (collaborative load shedding), or applications simply skip their non-critical parts (SPRING)

2. Applications try to run and get a proportional share of the resources (SMART)

3. Applications explicitly reserve the resources they need on a first-come, first-served (FCFS) basis (RT-Mach with Reserves, Rialto)

4. Applications specify a range of acceptable resource allocations, and the system dynamically determines how much they get (MMOSS)

5. Applications specify cost-benefit functions, and the system uses them to determine how many resources each application should get, and when (Jensen, CMU, DQM)
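As an illustration of the fourth model (applications give a range, the system picks a value within it), here is a minimal sketch. It is not the algorithm of MMOSS or any other named system; the policy shown (grant every minimum, then split the spare capacity in proportion to each application's headroom) and the function name `allocate` are assumptions made for the example.

```python
def allocate(ranges, capacity=1.0):
    """ranges: {app: (min_share, max_share)}.
    Returns {app: share}, or None if the minimums alone do not fit."""
    total_min = sum(lo for lo, _ in ranges.values())
    if total_min > capacity:
        return None                       # cannot satisfy every minimum
    spare = capacity - total_min
    headroom = {app: hi - lo for app, (lo, hi) in ranges.items()}
    total_headroom = sum(headroom.values())
    # Fraction of each app's headroom that can be granted (never more than all of it).
    scale = min(1.0, spare / total_headroom) if total_headroom > 0 else 0.0
    return {app: lo + scale * headroom[app] for app, (lo, _) in ranges.items()}

shares = allocate({
    "video": (0.25, 0.75),
    "audio": (0.125, 0.125),
    "net":   (0.125, 0.625),
})
```

Returning `None` when the minimums do not fit stands in for whatever the system does on overload (rejection, renegotiation, or load shedding).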

In general, the application may or may not get the resources it needs

It may or may not be informed when it has missed a deadline

The application may or may not negotiate for the resources

Approaches: Flexible scheduling algorithms
First-come, first-served reservations
Renegotiation
Cooperation vs. enforcement
Low-level system, high-level system, middleware, application
All or nothing vs. partial allocations
Static vs. dynamic allocation schemes
Job admission or not
Incorporate real-time and non-real-time (spectrum?)
...

Examples: Mach, Rialto (Microsoft), many other experimental systems
RealAudio/Video (application solution)

Applications: Desktop audio and video
Virtual reality
Internet telephony
Any non-critical real-time systems
Any system with timeliness concerns

Plusses: Allows for some sort of guarantees in time-sensitive applications
Allows for less-than-worst-case resource allocations

Issues: Everything
How soft are the guarantees
What kind of guarantees are offered
How to mix in non-real-time apps
How to have a spectrum of softness
API and development overhead (hard real-time requires much more knowledgeable developers)
How to do it
How to measure performance
What to do when there are too many applications
What constitutes adequate performance
How do you decide how to parcel out the resources
What happens when a deadline is missed
...

My Research

Goals: Build a flexible and general soft real-time system
Develop an unintrusive SRT API
Determine what OS services are really needed to support SRT processing
Determine what API elements are really needed
Implement it and test it with real applications

Most SRT systems start with assumptions about what is needed, then derive a system

Methodology: Start with a middleware system
Push it until we discover what can't be done
Implement those things in the kernel

The System: Middleware only, on Unix (Solaris and Linux)
QoS Levels
Minimal API - three function calls
Centralized resource management
Best-effort scheduling
Cooperative resource usage by applications

QoS Levels: Each application has a number of modes (levels) in which it can operate
The levels are sorted by resource usage and benefit provided
The set of levels specifies a discrete function of benefit vs. resource usage
Separate SRT policy from SRT mechanism
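A set of QoS levels can be pictured as a short table of (resource usage, benefit) pairs. The sketch below is a hypothetical representation, not the system's actual data structure: the `Level` type, the use of CPU fraction as the only resource, and the helper names are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Level:
    cpu: float       # fraction of the CPU this level needs
    benefit: float   # relative benefit delivered at this level

def is_sorted(levels):
    """Levels are listed highest first; CPU use and benefit must both decrease."""
    return all(a.cpu > b.cpu and a.benefit > b.benefit
               for a, b in zip(levels, levels[1:]))

def best_level(levels, cpu_budget):
    """Evaluate the discrete benefit function: best level that fits the budget."""
    fitting = [lv for lv in levels if lv.cpu <= cpu_budget]
    return max(fitting, key=lambda lv: lv.benefit, default=None)

video = [Level(0.75, 1.0), Level(0.5, 0.75), Level(0.25, 0.5)]
chosen = best_level(video, cpu_budget=0.6)
```

The application only declares the table (mechanism); which level it runs at is decided elsewhere (policy).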

Level selection: DQM dynamically chooses appropriate levels for running applications
Levels chosen to maximize global benefit across applications
NP-complete problem (a multiple-choice variant of the knapsack problem)
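To show why level selection gets expensive, here is a brute-force sketch that tries every combination of levels and keeps the feasible one with the highest total benefit. It is an illustration under stated assumptions, not the DQM's algorithm: a single resource (CPU fraction), every application must run at one of its levels, and the names and numbers are made up.

```python
from itertools import product

def select_levels(apps, capacity=1.0):
    """apps: {name: [(cpu, benefit), ...]}.
    Returns (total_benefit, {name: level_index}) or None if nothing fits."""
    names = list(apps)
    best = None
    # One level index per application; the number of combinations is the
    # product of the level counts, so it grows exponentially with the app count.
    for choice in product(*(range(len(apps[n])) for n in names)):
        cpu = sum(apps[n][i][0] for n, i in zip(names, choice))
        if cpu > capacity:
            continue                      # this combination over-commits the CPU
        benefit = sum(apps[n][i][1] for n, i in zip(names, choice))
        if best is None or benefit > best[0]:
            best = (benefit, dict(zip(names, choice)))
    return best

apps = {
    "video": [(0.75, 8), (0.5, 6), (0.25, 3)],
    "audio": [(0.25, 4), (0.125, 3)],
    "game":  [(0.5, 4), (0.25, 2)],
}
result = select_levels(apps)
```

With three applications this checks 12 combinations; the count multiplies with every added application, which is why heuristics are attractive.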

Algorithms: Fair
Proportional
Optimal (with variations)
Greedy (with variations)
Hybrids
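A greedy variant might look like the following sketch: start every application at its lowest level, then repeatedly apply the upgrade with the best benefit gain per unit of extra CPU that still fits. This is one plausible reading of "greedy", not necessarily the variant used in the system; the single-resource model and all names are assumptions, and greedy selection is not guaranteed to find the optimal assignment.

```python
def greedy_levels(apps, capacity=1.0):
    """apps: {name: [(cpu, benefit), ...]}, highest level first, strictly decreasing.
    Returns {name: level_index}, or None if even the lowest levels do not fit."""
    choice = {n: len(levels) - 1 for n, levels in apps.items()}   # lowest levels
    used = sum(apps[n][i][0] for n, i in choice.items())
    if used > capacity:
        return None
    while True:
        best = None
        for n, i in choice.items():
            if i == 0:
                continue                              # already at the top level
            d_cpu = apps[n][i - 1][0] - apps[n][i][0]  # extra CPU for one step up
            d_ben = apps[n][i - 1][1] - apps[n][i][1]  # extra benefit for one step up
            if used + d_cpu > capacity:
                continue                              # upgrade does not fit
            ratio = d_ben / d_cpu
            if best is None or ratio > best[0]:
                best = (ratio, n, d_cpu)
        if best is None:
            return choice                             # no further upgrade fits
        _, n, d_cpu = best
        choice[n] -= 1
        used += d_cpu

apps = {
    "video": [(0.75, 8), (0.5, 6), (0.25, 3)],
    "audio": [(0.25, 4), (0.125, 3)],
    "game":  [(0.5, 4), (0.25, 2)],
}
picked = greedy_levels(apps)
```

Each pass costs time linear in the number of applications, in contrast to an exhaustive search over all level combinations.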

Results: It works
>90% utilization with no missed deadlines
Stable execution

Future work: Into the kernel
Application development
Metric development and analysis
Algorithm development and analysis
Fairness, stability specification, overhead analysis
Multiple resources
User Interface work
Networked QoS Levels