AUDIO Extension
for the X Window System

This project aims to develop an extension to the X Window System that provides audio services. Like the X Window System itself, the extension is designed to support network-transparent operation; however, great care has been taken to support local clients just as well as if they had direct access to the audio hardware.

Project status

  • PCM audio support is complete; clients can play back and/or capture data through the X server; local playback/capture achieves a latency of less than 5 ms, making the extension suitable for even the most demanding applications
  • An (experimental) ALSA plugin is provided; in principle, it lets any application that accesses the audio device through ALSA use the X audio extension transparently; many applications are known to work, including xmms, vlc and all ALSA sample programs. Some programs that are known not to work (e.g. Adobe Flash Player) are still under investigation
  • The extension introduces the concept of an "audio manager" that plays a role for audio similar to the one the window/compositing manager plays for images; currently only a very minimal audio manager has been implemented (it simply mixes multiple client audio streams without any weighting); a considerably more powerful audio manager is in preparation
  • Hardware mixer controls are not yet supported, but will be introduced along with the new audio manager
  • Currently only ALSA is supported as a server-side audio backend; a backend using the OSS API is in preparation, but is held back by a lack of testers
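
To make "mixing without any weighting" concrete, the following is a minimal sketch of what such a mixer does. It is not taken from the extension's source: the function name mix_unweighted is hypothetical, the sample format (signed 16-bit PCM) is an assumption, and clipping the sum to the sample range is one possible way to handle overflow that this page does not specify.

```python
from array import array

# Assumed sample format: signed 16-bit PCM (not stated by the project page).
INT16_MIN, INT16_MAX = -32768, 32767


def mix_unweighted(streams):
    """Mix PCM streams by plain summation, sample by sample.

    Every stream contributes with the same gain (no weighting). Streams
    shorter than the longest one are treated as silent once they end.
    The sum is clipped to the 16-bit range (an assumption of this sketch).
    """
    if not streams:
        return array("h")
    length = max(len(s) for s in streams)
    mixed = array("h", [0]) * length
    for i in range(length):
        total = sum(s[i] for s in streams if i < len(s))
        mixed[i] = max(INT16_MIN, min(INT16_MAX, total))
    return mixed


client_a = array("h", [1000, -2000, 30000])
client_b = array("h", [500, -500, 10000, 7])
print(list(mix_unweighted([client_a, client_b])))
```

The third sample shows the drawback of unweighted mixing: two loud streams overflow the sample range and have to be clipped, which is one reason a more capable audio manager would want per-stream weights.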

If you are interested and would like to try it out, you are encouraged to proceed to the download section; be sure to read the installation instructions!

Why not esd/arts/PulseAudio/...?

Esd/arts/PulseAudio provide stand-alone audio servers; if applications want to do audio in addition to presenting a graphical user interface, they have to connect to two separate services. While this works fine in many cases from an end user's perspective, it is rather undesirable from a system integration perspective.

Any stand-alone audio server inevitably ends up duplicating a lot of functionality already present in the X server: this includes communication (marshalling, sending messages to and receiving messages from clients), resource management, authentication and access control. In contrast, the audio extension developed in this project reuses the infrastructure provided by the X Window System. This results in cleaner system integration overall (e.g. just enable XSELinux and the audio services immediately become subject to mandatory access control like all other X services), and in a comparatively small code base (only about 7000 lines of server-side code).

Stand-alone audio services also have a hard time synchronizing audio and video for multimedia applications. If the X server, the audio server and the application are running on the same machine, the application can do the synchronization "naively" (IPC latencies are negligibly low). For network-transparent operation, on the other hand, communication delay jitter cannot be neglected and synchronization has to be performed on the target system. This means that the X server and the audio server must be tightly integrated and be able to refer to objects within each other for purposes of synchronization.
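
The effect of delay jitter can be illustrated with a toy model. The sketch below is only an illustration of the argument, not a description of the extension's protocol: both function names are hypothetical, the delay values are made up, and the model assumes that audio and video share a single clock on the target system.

```python
def naive_skew_ms(video_delays_ms, audio_delays_ms):
    """Audio/video skew when the client sends both at the same instant and
    each is presented as soon as it arrives on the target system."""
    return [a - v for v, a in zip(video_delays_ms, audio_delays_ms)]


def target_side_skew_ms(video_delays_ms, audio_delays_ms, lead_ms):
    """Skew when both carry a presentation time on the target's clock,
    lead_ms after sending. Data that arrives too late is presented on
    arrival, so skew reappears only if the lead is too short."""
    skew = []
    for v, a in zip(video_delays_ms, audio_delays_ms):
        video_at = max(lead_ms, v)
        audio_at = max(lead_ms, a)
        skew.append(audio_at - video_at)
    return skew


# Made-up per-frame network delays in milliseconds.
video = [20, 35, 22]
audio = [25, 21, 40]
print(naive_skew_ms(video, audio))            # [5, -14, 18]
print(target_side_skew_ms(video, audio, 50))  # [0, 0, 0]
```

In the naive case the skew is exactly the difference between the two delays, so it jitters with the network. When the target system performs the synchronization, the skew vanishes as long as the scheduling lead exceeds the worst-case delay.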

Using the X protocol as the transport mechanism also means that audio services become network-transparent in cases where they previously were not (e.g. ssh tunneling, packet filters). A common myth is that X is not well suited for audio due to tight latency requirements; however, this project demonstrates that the obstacles are surmountable if the implementation is done properly.

Helge Bahmann