Soft real-time systems

Principles
  • All principles from both general-purpose and hard real-time systems apply, with one exception:
  • Failure to meet a deadline is considered neither application nor system failure
  • A missed deadline just makes the result less "good"
  • What that means is poorly defined and varies from system to system
  • Missing deadlines
  • Number of missed deadlines
  • Amount by which deadlines are missed
  • "Cost" of missing the deadlines is equalized (weight, cost function)
  • Frequency of missed deadlines
  • Fairness
  • Stability
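
The deadline-based measures listed above (number of misses, amount by which deadlines are missed, frequency of misses) can be computed from a simple job log. The following is a minimal sketch, assuming each job records an absolute deadline and a completion time; the `Job` type and the metric names are hypothetical and not taken from any particular system.

```python
from dataclasses import dataclass


@dataclass
class Job:
    deadline: float    # absolute deadline, in seconds (assumed unit)
    completion: float  # actual completion time, in seconds


def miss_metrics(jobs):
    """Summarize how badly a set of completed jobs missed their deadlines."""
    # Tardiness is how far past its deadline a job finished (0 if on time).
    tardiness = [max(0.0, j.completion - j.deadline) for j in jobs]
    missed = [t for t in tardiness if t > 0.0]
    return {
        "missed": len(missed),                                 # number of misses
        "miss_rate": len(missed) / len(jobs) if jobs else 0.0,  # frequency
        "total_tardiness": sum(missed),                        # amount missed by
        "max_tardiness": max(missed, default=0.0),
    }


jobs = [Job(1.0, 0.9), Job(2.0, 2.5), Job(3.0, 3.0), Job(4.0, 4.25)]
metrics = miss_metrics(jobs)
print(metrics)
```

A weighted cost function could be layered on top by multiplying each job's tardiness by a per-application weight before summing.
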
  • Missed deadlines trigger algorithmic changes
  • Period
  • Compute time
  • Qualitative changes in the output
  • Example: MPEG Player (24 frames/second)
  • Change frame rate
  • Change size of image
  • Change color depth
  • Change resolution
  • Drop frames
  • Drop parts of frames (change focus)
  • Change dithering/anti-aliasing
  • Change other processing
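
As an illustration of a missed deadline triggering a qualitative change in output, here is a minimal sketch of a player that steps between quality levels based on how long the last frame took to decode. The level labels, the `headroom` parameter, and the step-up/step-down rule are assumptions made for this example, not the behavior of any real MPEG player.

```python
FRAME_PERIOD = 1.0 / 24.0  # seconds per frame at 24 frames/second

# Hypothetical quality levels, best first.
LEVELS = [
    "full resolution, full color depth",
    "full resolution, reduced color depth",
    "half resolution",
    "half resolution, every other frame dropped",
]


def next_level(level, decode_time, headroom=0.5):
    """Pick the quality level for the next frame from the last decode time.

    Step down one level after a missed deadline; step back up only when the
    frame finished with plenty of slack, which limits oscillation.
    """
    if decode_time > FRAME_PERIOD and level < len(LEVELS) - 1:
        return level + 1  # missed the deadline: degrade the output
    if decode_time < headroom * FRAME_PERIOD and level > 0:
        return level - 1  # lots of slack: try a better level
    return level


level = 0
history = []
for decode_time in [0.030, 0.050, 0.045, 0.035, 0.015, 0.040]:
    level = next_level(level, decode_time)
    history.append(level)
print([LEVELS[i] for i in history])
```

The gap between the step-down and step-up thresholds is one simple way to address the stability concern: without it, the player could flip between two levels on every frame.
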
  • Resources
  • Primarily CPU and network
  • Everything else, to a lesser degree
  • API
    1. Applications try to run; unimportant ones are dumped (collaborative load shedding) or simply don't run their non-critical parts (SPRING)

    2. Applications try to run and get a proportional share of the resources (SMART)

    3. Applications explicitly reserve the resources they need on a first-come, first-served (FCFS) basis (RT-Mach w/Reserves, Rialto)

    4. Applications specify a range of acceptable resource allocations, and the system dynamically determines how much they get (MMOSS)

    5. Applications specify cost-benefit functions, and the system uses these functions to determine how much of each resource they should get, and when (Jensen, CMU, DQM).

  • In general, the application may or may not get the resources it needs
  • It may or may not be informed when it has missed a deadline
  • The application may or may not negotiate for the resources
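
To make API style 3 (explicit reservations, first come, first served) concrete, here is a minimal sketch of an admission test for CPU reservations. The `ReservationManager` class and its method names are hypothetical; this is not the RT-Mach or Rialto interface, only an illustration of the idea that a request is granted if and only if enough unreserved capacity remains when it arrives.

```python
class ReservationManager:
    """Grant CPU reservations first come, first served, up to a capacity."""

    def __init__(self, capacity=1.0):
        self.capacity = capacity  # fraction of the CPU that may be reserved
        self.reservations = {}    # application name -> reserved fraction

    def reserve(self, app, compute_time, period):
        """Try to reserve compute_time out of every period for app."""
        utilization = compute_time / period
        # Ignore any reservation the app already holds; it is being replaced.
        others = sum(u for a, u in self.reservations.items() if a != app)
        if others + utilization > self.capacity:
            return False          # rejected; the app may renegotiate
        self.reservations[app] = utilization
        return True

    def release(self, app):
        self.reservations.pop(app, None)


mgr = ReservationManager(capacity=0.9)
audio_ok = mgr.reserve("audio", compute_time=2, period=10)   # 20% of the CPU
video_ok = mgr.reserve("video", compute_time=15, period=30)  # 50% of the CPU
vr_ok = mgr.reserve("vr", compute_time=10, period=30)        # would exceed 90%
mgr.release("audio")
vr_retry_ok = mgr.reserve("vr", compute_time=10, period=30)  # fits now
```

Under this scheme, whether an application runs depends on arrival order rather than importance, which is one reason the other API styles exist.
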
  • Approaches
  • Flexible scheduling algorithms
  • First-come, first-served reservations
  • Renegotiation
  • Cooperation vs. enforcement
  • Low-level system, high-level system, middleware, application
  • All or nothing vs. partial allocations
  • Static vs. dynamic allocation schemes
  • Job admission or not
  • Incorporate real-time and non-real-time (spectrum?)
  • ...
  • Examples
  • Mach, Rialto (MS), lots of other experimental systems
  • RealAudio/Video (application solution)
  • Applications
  • Desktop audio and video
  • Virtual reality
  • Internet telephony
  • Any non-critical real-time systems
  • Any system with timeliness concerns
  • Plusses
  • Allows for some sort of guarantees in time-sensitive applications
  • Allows for less-than-worst-case resource allocations
  • Issues
  • Everything
  • How soft are the guarantees
  • What kind of guarantees are offered
  • How to mix in non-real-time apps
  • How to have a spectrum of softness
  • API and development overhead (hard real-time requires much more knowledgeable developers)
  • How to do it
  • How to measure performance
  • What to do when there are too many applications
  • What constitutes adequate performance
  • How do you decide how to parcel out the resources
  • What happens when a deadline is missed
  • ...

  • My Research

  • Goals
  • Build a flexible and general soft real-time (SRT) system
  • Develop an unintrusive SRT API
  • Determine what OS services are really needed to support SRT processing
  • Determine what API elements are really needed
  • Implement it and test it with real applications
  • Most SRT systems start with assumptions about what is needed, then derive a system
  • Methodology
  • Start with middleware system
  • Push it until we discover what can't be done
  • Implement those things in the kernel
  • The System
  • Middleware only, on Unix (Solaris and Linux)
  • QoS Levels
  • Minimal API - three function calls
  • Centralized resource management
  • Best-effort scheduling
  • Cooperative resource usage by applications
  • QoS Levels
  • Each application has a number of modes (levels) in which it can operate
  • The levels are sorted by resource usage and benefit provided
  • The set of levels specify a discrete function of benefit vs. resource usage
  • Separate SRT policy from SRT mechanism
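
A set of QoS levels can be represented as a small table per application. The sketch below is one possible encoding, assuming a single resource (CPU fraction) and a scalar benefit; the `Level` and `Application` types and the sample numbers are hypothetical. The application only declares its levels (mechanism); deciding which one to run at is left to whoever holds the budget (policy).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    cpu: float      # fraction of the CPU the application needs in this mode
    benefit: float  # relative value delivered to the user in this mode


@dataclass
class Application:
    name: str
    levels: list    # Level objects, sorted from highest to lowest CPU usage

    def best_level_within(self, cpu_budget):
        """Return the index of the highest-benefit level that fits the budget."""
        # Levels are sorted by usage and benefit, so the first fit is the best.
        for i, level in enumerate(self.levels):
            if level.cpu <= cpu_budget:
                return i
        return None  # not even the lowest level fits


player = Application("video player", [
    Level(cpu=0.60, benefit=1.00),
    Level(cpu=0.40, benefit=0.80),
    Level(cpu=0.20, benefit=0.50),
])
print(player.best_level_within(0.5))
```
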
  • Level selection
  • DQM dynamically chooses appropriate levels for running applications
  • Levels chosen to maximize global benefit across applications
  • NP-hard in general (it is a form of the multiple-choice knapsack problem)
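
Level selection can be stated as: pick exactly one level per application so that total resource usage stays within capacity and total benefit is maximized. The sketch below solves this by exhaustive search, which is exact but exponential in the number of applications. The data layout (CPU as an integer percentage, benefit as an integer score) and the sample numbers are assumptions for illustration, not measurements from any system.

```python
from itertools import product


def optimal_levels(apps, capacity):
    """Exhaustively pick one level per application to maximize total benefit.

    apps maps an application name to a list of (cpu, benefit) levels.
    Returns (total_benefit, {name: level_index}), or (0, None) if no
    combination of levels fits within capacity.
    """
    names = list(apps)
    best_benefit, best_choice = 0, None
    # Try every combination of levels: exponential in the number of apps.
    for choice in product(*(range(len(apps[n])) for n in names)):
        cpu = sum(apps[n][i][0] for n, i in zip(names, choice))
        benefit = sum(apps[n][i][1] for n, i in zip(names, choice))
        if cpu <= capacity and (best_choice is None or benefit > best_benefit):
            best_benefit, best_choice = benefit, dict(zip(names, choice))
    return best_benefit, best_choice


# (cpu percent, benefit) per level, highest usage first; values are made up.
apps = {
    "video": [(60, 10), (40, 8), (20, 5)],
    "audio": [(30, 6), (10, 4)],
    "vr":    [(50, 9), (25, 6)],
}
benefit, choice = optimal_levels(apps, capacity=100)
print(benefit, choice)
```

With three applications there are only 12 combinations to check; the search space grows multiplicatively with each added application, which is why cheaper heuristics are attractive at run time.
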
  • Algorithms
  • Fair
  • Proportional
  • Optimal (with variations)
  • Greedy (with variations)
  • Hybrids
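
As an example of what a greedy level-selection algorithm might look like, the sketch below starts every application at its highest level and, while the system is overloaded, lowers whichever application gives up the least benefit per unit of CPU freed. This is one plausible greedy rule chosen for illustration; it is not claimed to be the specific variant used in the system described here, and the data layout and numbers are hypothetical.

```python
def greedy_levels(apps, capacity):
    """Start every application at its best level and degrade until all fit.

    apps maps a name to a list of (cpu, benefit) levels, sorted by strictly
    decreasing cpu. Returns {name: level_index}, or None if even the lowest
    levels do not fit within capacity.
    """
    choice = {name: 0 for name in apps}

    def used():
        return sum(apps[n][i][0] for n, i in choice.items())

    while used() > capacity:
        best_name, best_ratio = None, None
        for name, i in choice.items():
            if i + 1 >= len(apps[name]):
                continue  # already at its lowest level
            freed = apps[name][i][0] - apps[name][i + 1][0]
            lost = apps[name][i][1] - apps[name][i + 1][1]
            ratio = lost / freed  # benefit given up per unit of CPU freed
            if best_ratio is None or ratio < best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            return None  # overloaded even with every app at its lowest level
        choice[best_name] += 1
    return choice


# (cpu percent, benefit) per level, highest usage first; values are made up.
apps = {
    "video": [(60, 10), (40, 8), (20, 5)],
    "audio": [(30, 6), (10, 4)],
    "vr":    [(50, 9), (25, 6)],
}
choice = greedy_levels(apps, capacity=100)
print(choice)
```

Each pass lowers one application by one level, so the loop runs at most once per level in the system. The price is that the result is not guaranteed to be the benefit-maximizing assignment.
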
  • Results
  • It works
  • >90% utilization with no missed deadlines
  • Stable execution
  • Future work
  • Into the kernel
  • Application development
  • Metric development and analysis
  • Algorithm development and analysis
  • Fairness, stability specification, overhead analysis
  • Multiple resources
  • User Interface work
  • Networked QoS Levels