Theatre Sound Design, Show Control & Virtual Sound System Software

SoundMan™-Server Virtual Sound System API Command Language (SoundMan-Script)



Updated: 2008-11-30; SoundMan-Server V1.0.36

Introduction to SoundMan-Script commands:

In this document, currently available features are shown in normal upright text. Features planned for the future are shown in italic text, like this.

SoundMan-Server is an audio playback and matrixing engine. All command control and monitoring is performed by textual commands, called SoundMan-Script commands. Some real time control can also be performed by MIDI messages communicated via a MIDI interface and Open Sound Control messages communicated via the network. SoundMan-Script must be used to configure the MIDI and OSC interface.

SoundMan-Script commands are normally sent to the engine using a Telnet client on port 20000 at the host where SoundMan-Server is running. Currently only one Server can be running on any single PC. At a future time we may allow additional Servers; then the port number will become configurable. SoundMan-Script commands may also be entered (usually for testing purposes) in the Server monitoring dialog. There is a command line window and a command history window. These exist primarily for Server and client debug purposes; they are not intended for normal show operations.

It is anticipated that most users will not directly use a Telnet program to communicate with the Server. Instead, various other programs containing graphic representations of Server functions will open TCP connections to the Server on port 20000, sending commands and receiving responses as their own internal state changes.

Multiple sessions can be open at the same time to a single Server. Currently up to 20 connections may be made at the same time. Each connection currently sees the same view of the Server, and can control any aspect.

In the next version, SoundMan-Server will be able to host a large number of VST plugins and they will be utilized to create patch points where external and internal devices can be inserted into internal digital signal 'paths'.

In a future release access controls will be available to limit the features and sections of the Server that can be controlled and monitored from any given session. This will allow a clean separation of control when a single Server controls multiple simultaneous shows. It will also be possible to synchronize audio playback with and to have it chase and lock to MIDI Time Code.

We will also develop a high level SoundMan-Script language which defines and controls the user's three-dimensional space, called SoundMan-Space, and the multichannel live and playback sound objects, called SoundMan-Objects, as they move, diverge, converge, diffuse, expand and contract in this space.


Summary Developer Notes

We invite all third party developers interested in using SoundMan-Server as their audio engine API to contact us for individual support and for quotations on versions specific to their needs. We can provide you with an OEM/ISV version that does exactly what you want at a cost that is unbeatable.

At the moment, SoundMan-Server is a highly refined audio engine with enormous capabilities but we are only just beginning to exercise the SoundMan-Script command structure so that the largest number of options and features can be accessed by third party applications.

We will of course be increasing SoundMan-Script commands and capabilities and will keep you informed of these developments. If you have any questions at all please don't hesitate to communicate with us directly at any time. On request we will forward to you all currently proposed and implemented commands for comment and implementation, pending finalization of the command set.

SoundMan-Server can be controlled by making a TCP/IP socket connection to port 20000. SoundMan-Server acts as a server on this port, listening for connections. Sixteen connections can easily be made simultaneously. Once a connection is established, commands are sent via an easily readable ASCII, line based protocol. Telnet is a suitable tool for testing connections and sending simple commands.

Assuming the application is on the same machine:

  • Start - Run "telnet 127.0.0.1 20000"

or, if not, (for example):

  • telnet <address of the SoundMan-Server host> 20000

SoundMan-Server (SM-S) is controlled through port 20000 on the target system. Note that you can easily telnet in from another system; the control program does NOT have to run on the main system. Further, you can have up to 16 sessions going at once. So you could have an operator interface, a couple of designer interfaces, an automatic actor tracking interface and a loudspeaker processing interface all running at the same time, using the same SM-S host system.

Multiple SM-S host machines can also run on the same network, allowing a 'SoundMan-Server Processor Farm' of virtual audio engines, networked together and controlling different segments of the same or even different audio systems simultaneously, coordinated by the user interface software.

All commands to the engine are text. All responses from the engine are text. They are simple, regular, and allow a lot of abbreviation to make things simple and fast.
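The connection described above can be sketched in a few lines of Python. This is only an illustration, not part of SoundMan-Server: the function names `encode_command` and `send_command` are hypothetical, and the assumption that a reply arrives in a single read is a simplification.

```python
import socket

DEFAULT_PORT = 20000  # the port named in this document


def encode_command(text: str) -> bytes:
    """Return one bare command (no trailing comment) as an ASCII line.

    The command is ended with a semicolon and a CR LF line ending.
    """
    command = text.strip()
    if not command.endswith(";"):
        command += ";"
    return (command + "\r\n").encode("ascii")


def send_command(host: str, text: str, port: int = DEFAULT_PORT,
                 timeout: float = 2.0) -> str:
    """Send one command and return whatever text arrives first.

    Assumption: the reply fits in one recv() call. A real client would
    buffer the stream and split it on newlines.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(encode_command(text))
        return conn.recv(4096).decode("ascii", errors="replace")
```

With a Server running on the same machine, `send_command("127.0.0.1", "CONFIG GET INTERFACES")` would be the scripted counterpart of typing that command into a Telnet session.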

The machine is in one of two states: idle with no interface open, or operational with an interface open. (Currently you can only have one interface open, but that will change at some point. There will be a handful of rather complex configuration rules for dealing with multiple interfaces, because they aren't necessarily time-locked, among other issues.)

The engine works only with ASIO interfaces. ASIO interfaces have names (as defined by the manufacturer of the interface) and the interfaces are in a list, so have index numbers. You can open or close an interface by name or number. You can also specify a few parameters when you are setting the interface. These can control the number of input and output channels you want, and the number of playback channels. There is also an operating mode parameter that controls assorted features and limitations.

There are some ASIO drivers that provide a standard ASIO interface layer for other audio driver models, but this adds one more interpretive level and therefore increases latency.

  • CONFIG GET INTERFACES

Returns a numbered list of the available interfaces

  • CONFIG SET INTERFACE <"name" or number> <parameters>

Selects a particular interface.

  • CONFIG GET INTERFACE (note singular, not plural)

Gets information about the current interface.

There is also a command to close the current interface:

  • CONFIG CLEAR INTERFACE

The parameters are somewhat important. SoundMan-Server defaults to AudioBox AB1616 emulation. So it will have no more than 16 inputs, 16 outputs, and 16 playback channels, regardless of the number of channels on the physical interface. You might prefer instead something like

  • CONFIG SET INTERFACE "MOTU 324" MODE ABEMULATION ON INPUTS 0 OUTPUTS 40 PLAYBACKS 200;

This will configure the matrix to vaguely work like an extended AB. It will have no live input channels, 40 output channels (if the interface is at least that big), and 200 playback channels.

  • CONFIG CLEAR LOG

Clears the history buffer. This can be useful for applications that leave the server up for extended periods of time. For instance, a command could be sent to clear the log at midnight.

SoundMan-Server's inherent latency is extremely small, usually only a few samples. To keep overall latency to a minimum, choose the audio interface card carefully, since it will determine the majority of the latency.

For manual testing there is a command line interface where commands can be typed directly. This is an easy way to experiment with commands and see what the results are.


Command Format Overview

All commands are intended to be human-readable, but also easy to generate mechanically. As a result, various key elements can be expressed in several different ways.

Some of these ways are easier for humans to enter and comprehend; others are easier for a machine to generate. Regardless of the exact format chosen for expression, the results will usually be exactly the same, as long as all of the elements are entered in a single command line.

All commands consist of "words". These words are separated by "whitespace", that is, spaces and tab characters. Where whitespace is required to separate words, at least one space or tab character is required; however multiple whitespace characters can be used. Whitespace can precede the first word of a command and can follow the last word of the command.

Sometimes the whitespace separator characters are optional. This is generally the case when two items can be easily differentiated without intervening space characters. For instance, if a semicolon is used to end a command, whitespace is not required before the semicolon -- a semicolon can easily be told from any other command term without any need of spaces separating it from the other terms.

Generally speaking, all commands are of one of two forms: Action requests beginning with the word SET, and inquiry commands beginning with the word GET. (There are a few other special terms also.)

GET or SET by itself is not meaningful. You have to say which kind of item within the Server environment that you want to interrogate or change, and which specific item you are interested in. Then you have to say which specific property of that item you want to set or get.

So a GET or SET command has the general format:

GET <item type> <item identifier> <item property>

SET <item type> <item identifiers> <item property> <property value>

Note that with a GET command you should only identify a single item in one command. For instance, only inquire about the current gain setting for a single channel at once. This makes it easier to sort out the inquiry responses and be sure that you always get the response you expect.

On the other hand, with a SET command it is often very handy (and sometimes necessary) to be able to set the same property for multiple items at the same time. For instance, you might want to mute a range of channels at the same time.

When you set a property for multiple items, you must be sure that all of the items have that property. If not, the command may return an error without acting on any of the items, or some of the items may be changed before an error is returned. It will in general not be possible to determine what if anything was changed in these conditions except by using GET commands to see what the current values are, or by re-issuing a corrected command.

Delimiter usage and meanings:

{ word | other word } either 'word' or 'other word' is *required*
[ word | other word ] either 'word' or 'other word' or <nothing>
<descriptive name> 'descriptive name' describes something that goes in this place.

Generally in the <descriptive name> case there should be an earlier definition of exactly what <descriptive name> turns into. For instance you might have <channel list> that is defined to be

<channel list> ::= { <channel number> | <channel range> } ...
<channel range> ::= <channel number> { - | TO } <channel number>

This would say a <channel list> is one or more instances of either a <channel number> or a <channel range>, and a <channel range> is a <channel number> followed by a dash or TO followed by another channel number. Then there would have to be some additional description to say that the second number in the range has to be not less than the first number and a few other things.
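As a rough illustration of how a client might apply these two rules, the sketch below expands a list of bare channel numbers and ranges into individual numbers and enforces the "second number not less than the first" condition. The function name `expand_channel_numbers` is hypothetical, and the sketch deliberately ignores channel type prefixes.

```python
import re

# One list item: a number, optionally followed by "-" or "TO" and a second number.
_ITEM = re.compile(r"\s*(\d+)(?:\s*(?:-|TO)\s*(\d+))?", re.IGNORECASE)


def expand_channel_numbers(text: str) -> list[int]:
    """Expand e.g. '4-7 9 12 TO 14' into individual channel numbers."""
    text = text.rstrip()
    if not text.strip():
        raise ValueError("a channel list needs at least one item")
    channels = []
    pos = 0
    while pos < len(text):
        match = _ITEM.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected text at position {pos}: {text[pos:]!r}")
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) is not None else first
        if last < first:
            raise ValueError(f"range {first}-{last} runs backwards")
        channels.extend(range(first, last + 1))
        pos = match.end()
    return channels
```

Scanning with an explicit position, rather than searching for matches anywhere in the string, means stray text between items is reported instead of silently skipped.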

A definition can also be spread over several lines, for instance:

  • START
      { SAMPLES <number> |
        TIME <number> |
        TC <timecode> |
        <number> }

Which is another way of saying

START { SAMPLES <number> | TIME <number> | TC <timecode> | <number> }


Overview of Command Structure

SoundMan-Server can be controlled via another program or the command line interface, which is similar to a DOS command line interface.

Simplistically, there are exactly two commands: GET and SET. SET changes things, and GET returns their current values.

Now we get to the question of "things". What is it you can SET or GET? We can SET or GET attributes of various things. The most common thing is a channel.

This simplicity is reflected in the basic command formats:

  • SET CHANNEL <channel list> <attribute-value pairs for all channels in the list>
  • GET CHANNEL <channel list> <attribute>
  • CLOSE CHANNEL <channel list>

(You can abbreviate CHANNEL as CHAN.)

Well, what is a channel list? This would seem to be fairly central to the interface. And it is.

Fortunately a channel list is pretty simple. Anyone that has programmed a lighting console in the last 20 years will recognize the basic "channel list" format immediately. The only slight difference is that we not only have channel numbers, but we also have channel types.

Playbacks are completely separate from live inputs. You can think of the box as two complete matrices summing to the same outputs.

Channel capability: *EVERY* live input, playback input, crosspoint, and output, has a "control lump" that provides the following functionality:

  • mute
  • solo
  • gain fader
  • delay fader
  • phase invert
  • delay enable
  • EQ enable
  • 7 bands of parametric EQ, each of which has:
    • EQ band enable
    • band biquad parameters - or -
    • shape freq Q gain
      • 6db/octave lowpass
      • 12db/octave lowpass
      • 6db/octave highpass
      • 12db/octave highpass
      • adjustable Q bandpass or notch (depending on the gain setting)
      • 6db/octave low-shelf
      • 12db/octave low shelf
      • 6db/octave high shelf
      • 12db/octave high shelf
      • flat frequency response w/ adjustable gain
  • 31 band graphic EQ

Every "analog" control can be faded over time. This includes gain, delay, EQ frequency, EQ band gain, etc.

You have all of the above controls at every summing node in the box.

Playback inputs have additional parameters:

  • track file name
  • channel number in the file to play (in case file is stereo or multi-channel)
  • sample rate
  • playback speed in percentage, 100 = normal
  • playback speed as pitch, adjustable in fractional cents.
    • (speed and pitch multiply: a 50% playback rate at a pitch of +1 octave gives normal playback speed)
  • track start/stop/pause/resume commands and states
  • track start point
  • track stop point
  • track resume point
  • track loop-from point
  • track loop-to point
  • loop count

The start point is where to start playing initially, normally 0. The stop point is where to stop playing, normally the end of the track. The resume point is the point to loop back to when you hit the end of the track. It doesn't have to be the same as the start point. It is normally disabled and set to zero.

The loop-from and loop-to points set up an internal loop. It could be used the same as the end-resume point. You can set the number of times to loop. Normally this is a very large number that is effectively infinite. (In reality there can be any number of loops in a track, and they can overlap in strange and confusing ways. However, the interface only lets you specify one loop point pair currently.)
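The way the speed and pitch parameters listed above combine can be checked with a little arithmetic. The sketch below assumes the standard musical definition of a cent (1200 cents per octave, one octave doubling the rate); the function names are hypothetical and nothing here is taken from the Server's internals.

```python
def pitch_ratio(cents: float) -> float:
    """Rate multiplier for a pitch offset in cents (1200 cents = 1 octave)."""
    return 2.0 ** (cents / 1200.0)


def effective_rate(speed_percent: float, pitch_cents: float) -> float:
    """Combined playback rate, where 1.0 means normal speed.

    Speed (as a percentage) and pitch (in cents) multiply.
    """
    return (speed_percent / 100.0) * pitch_ratio(pitch_cents)
```

For example, `effective_rate(50, 1200)` gives 1.0: half speed combined with a pitch one octave up cancels out to normal playback.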


Channel Groups

You often want to specify more than one channel in a command. Channel specifications are a list of channels and channel ranges, somewhat in the way that a light board specifies channels. If you give more than one channel to a command, exactly the same thing happens to all of the channels you specify. This is especially important with the PLAY, PAUSE, STOP and RESUME commands. If you specify a list of channels in a single command they WILL start or stop synchronously and stay that way. If you use a series of individual play or stop commands you have no assurance of synchronization.

Groups allow you to lock any number of playback channels to timecode. You can also lock the timecode generators to incoming timecode from a reader, so you can either regenerate timecode or generate timecode of a different form locked to the input timecode (for instance, generate 24 fps timecode from 30 fps timecode).

SMPTE readers are "fake output channels". There are two of them, on channels O1000 and O1001. You can route any input channel, or for that matter any playback channel, to either of the readers. You should never route more than ONE source to a given reader if you want it to decode correctly.

SMPTE generators are "fake playback channels". This lets you start and stop them in synchronism with other playback channels, and set speed, pitch and starting time using "track" parameters, just as you might do if you were playing back a timecode track recording.

There are two SMPTE generators, on channels P1000 and P1001.

Playing back multiple tracks in sync to timecode requires the use of a GROUP. There are a number of group channels with names starting from GROUP 0 or G0. The group is the ringmaster that receives incoming timecode from a timecode reader and makes sure that all of the playback channels in the group stay in sync.

To make this work you first have to declare a group that has a number of playback channels in the group, and which also has a SMPTE reader. You also specify the timecode lock mode and starting timecode value for the tracks. All tracks in a group need to start at the same timecode value, and tracks locked to timecode position cannot be set to loop. If the track looped there would be multiple possible timecodes for each position in the track. This is not currently allowed.

There are two ways you can lock tracks to timecode. You can lock them by pitch/speed, or you can lock them by time.

Sometimes all you need is to make sure that your playback channels will play at the same speed as some external device, like a video projector, but you can start and stop the sound manually. If this is what you want, you want to lock to timecode speed. This lets you start and stop tracks manually, and the tracks can loop. However, they will faithfully follow the incoming SMPTE playback rate.

More commonly you want to lock the playback tracks to a constant time position. When you do this you can set all of the tracks to be locked (in the group declaration) and start the tracks. If timecode is not present, or it is before the track starting time, the tracks will not play. Once the timecode appears and is stable and within the track range the tracks will play to timecode. You can manually stop the tracks even if timecode is still playing, and they will stop. When you start them again there will be a short delay while they seek to the current position for the current timecode value.

Audio will only follow timecode that is running forward. If the timecode stops, or starts running backward, or runs at a constant frame number, the audio will not play. Once the timecode is running forward and is within the track time range the tracks will play IF they have previously received a PLAY command.

When playing tracks to timecode, you do not send the Play command to the individual tracks as you would normally do. Instead, you tell the group to play, for instance, PLAY G0.

The commands necessary to play to timecode and all specific new commands are described under SET GROUP, below.

Also, if you are setting delay fades for a group of playing channels and don't want phase errors, you should probably send a single delay change command to all the channels.

You can specify channel groups, which allows groups to be controlled by master channels and other groups:
G | GROUP <nn> syntax can be added to the channel list syntax so that you can mix normal channels and groups of channels in the same command.

Future versions of SoundMan-Server will include additional tools for synchronizing multiple servers on multiple interfaces. If you can manage to loop a spare input/output pair between all of the servers, they will be able to lock with close to sample accuracy across all interfaces.

Channels are numbered from 0 to numchannels-1. Channels are classified by their type: input, output, xpoint, etc. So a complete channel specification would be one of:

  • INPUT 4
  • OUTPUT 12
  • XPOINT 4.12
  • PLAYBACK 9
  • PX 9.5

You can run these together:

  • I4
  • O12
  • X4.12

You can do channel lists:

  • I4 I5 I6 INPUT 17

You can do channel ranges:

  • I4-7
  • INPUT 4-7
  • INPUT 4 - INPUT 7
  • XPOINT 4-7.12-14

This last one specifies all crosspoints between inputs 4/5/6/7 and outputs 12/13/14
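To see exactly which crosspoints a range like this covers, the following sketch expands the numeric part of a crosspoint spec into (input, output) pairs. It is an illustration only: `expand_crosspoints` is a hypothetical helper, and it handles just the dash form of a range, not TO.

```python
from itertools import product


def _span(text: str) -> range:
    """Turn '4' or '4-7' into an inclusive range of channel numbers."""
    first, _, last = text.partition("-")
    low = int(first)
    high = int(last) if last else low
    if high < low:
        raise ValueError(f"range {text!r} runs backwards")
    return range(low, high + 1)


def expand_crosspoints(spec: str) -> list[tuple[int, int]]:
    """Expand '4-7.12-14' into every (input, output) pair it names."""
    inputs, dot, outputs = spec.partition(".")
    if not dot:
        raise ValueError("a crosspoint needs both an input and an output part")
    return list(product(_span(inputs), _span(outputs)))
```

`expand_crosspoints("4-7.12-14")` yields twelve pairs, from (4, 12) through (7, 14), which is why a single ranged command can replace a dozen individual ones.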

You can mix-and-match in a single command:

  • SET CHANNEL I0-9 O0-9 X0-9.0-9 GAINDB -144;

Assuming a 10x10 matrix that would be equivalent to

  • SET MATRIX FULL OFF;

Channel ranges are very powerful tools.

Basic commands:

  • STOP <playback channels>
  • PLAY <playback channels>
  • PAUSE <playback channels>
  • RESUME (or PLAY) <playback channels>
  • SET MATRIX
  • SET CHANNEL


Detailed Command Structure

A channel spec for a single channel has the form:

  • <channel type> <optional spaces> <channel number>

There are several values for channel type:

  • <channel type> ::=
    • { INPUT | IN | I |
    • OUTPUT | OUT | O |
    • CROSSPOINT | XPOINT | X |
    • PLAYBACK | PB | P |
    • PLAYBACKCROSSPOINT | PBXPOINT | PX }

In the above lines the vertical bar "|" means "or". So you can talk about an input channel as an INPUT or as an IN or simply as an I. They all mean exactly the same thing. You can use one form in one command and another form in another command. In fact you can use all of the forms in the same command if you want to.

A <channel number> is a fairly simple concept. A channel number is simply an integer from 0 to one less than the number of channels. That is to say, channel numbers are zero-based, just like they were in the commands to the AudioBox.

Since channel numbers are zero-based, we subtract one from the instance number of the channel to get the channel number.

For instance, the first input channel would be called INPUT 0. If we have 16 input channels, the last input channel would be called INPUT 15.

So, skipping ahead a bit, if we wanted to un-mute input channel 7 (the eighth input channel), we would say:

  • SET CHANNEL INPUT 7 MUTE OFF or
  • SET CHAN I7 MUTE OFF

Above I mentioned a channel list, but I've only described how to talk about a single channel.

The formal syntax for channel list is a little convoluted, but the concepts are simple. A channel list is simply a list of channels and channel ranges.

  • <channel range> ::= <first channel spec> <dash or TO> <higher channel number or spec>

Channel ranges refer to a sequential group of channels of a single type; for instance, a range of input channels or a range of output channels. You cannot have a single range that includes both input and output channels.

People like to do things simply. Machines, though simpler, may want to do things the long way. A channel range lets you have it both ways.

Most people would do a channel range for input channels as something like:

  • INPUT 3 - 5 or I3-5

A machine on the other hand might end up generating:

  • INPUT 03 TO INPUT 05

Those channel range specifications are exactly equivalent and will talk about the same three input channels: 3, 4, 5.
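The equivalence of the human and machine forms can be demonstrated with a small normalizer. This is a sketch under stated assumptions: `parse_range` is a hypothetical name, only the input, output and playback type names are handled (crosspoints are left out), and the real Server parser may accept more than this does.

```python
import re

# Spellings of the non-crosspoint channel types, mapped to one short form.
ALIASES = {
    "INPUT": "I", "IN": "I", "I": "I",
    "OUTPUT": "O", "OUT": "O", "O": "O",
    "PLAYBACK": "P", "PB": "P", "P": "P",
}
# Longest names first so that "INPUT" is not read as "IN" or "I".
_TYPE = "|".join(sorted(ALIASES, key=len, reverse=True))
_RANGE = re.compile(
    rf"\s*({_TYPE})\s*(\d+)(?:\s*(?:-|TO)\s*(?:({_TYPE})\s*)?(\d+))?\s*$",
    re.IGNORECASE,
)


def parse_range(text: str) -> tuple[str, int, int]:
    """Return (type, first, last) for a single channel or channel range."""
    match = _RANGE.match(text)
    if match is None:
        raise ValueError(f"not a channel range: {text!r}")
    kind = ALIASES[match.group(1).upper()]
    first = int(match.group(2))
    last = int(match.group(4)) if match.group(4) is not None else first
    if match.group(3) is not None and ALIASES[match.group(3).upper()] != kind:
        raise ValueError("a range cannot mix channel types")
    if last < first:
        raise ValueError("the second channel must not be less than the first")
    return kind, first, last
```

All three spellings, `"INPUT 3 - 5"`, `"I3-5"` and `"INPUT 03 TO INPUT 05"`, come out as `("I", 3, 5)`.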

Finally we get to channel list. A channel list is a list of channels and ranges separated by spaces where necessary to make the words clear. (Usually spaces aren't required). Following are examples of channel lists as a human might be inclined to enter them:

  • I1
  • I1-4
  • I0-3 O4-7
  • I0-3 I7 IN 17 OUT16-31
  • I1 O1 X1.1
  • I0-15 O0-15 X0-15.0-15
  • I 1 3 5 7 9 O6 TO11 X1.6-11 3.6-11 5.6-11 X 9. 6 TO 11

In those last three examples we've introduced the syntax for crosspoint channel specifications. Note that a crosspoint has to talk about both an input and an output channel, since it is the junction in the routing matrix between an input and an output. A crosspoint channel spec must therefore have both an input and an output channel number, or possibly input and/or output channel ranges.

After the channel spec is the channel attribute. There are lots of channel attributes, because there are lots of things you can do with a channel. The more common ones are:

MUTE SOLO GAIN GAINDB PHASEREVERSE DELAY EQ

Some of those, like MUTE, are "on/off" attributes. The only values they can have are ON or OFF.

So you can say:

  • SET CHAN I1 MUTE OFF
  • SET CHAN I2 SOLO ON

or

  • SET CHAN I1 MUTE OFF CHAN I2 SOLO ON

This example shows that you can "loop back" from the end of an attribute list to the front of a channel spec, all in the same command. It depends on how lazy you are whether you want to do that.

You can also do several things to a channel or channel group in the same command:

  • SET CHAN I1 MUTE OFF SOLO ON GAINDB -20.3
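A client program would typically assemble such a command from its own state. The sketch below shows one way to do it; `set_channel_command` is a hypothetical helper, and it simply writes attribute names and values in the order given, without checking that the attributes exist on the Server.

```python
def set_channel_command(channels: str, **attributes) -> str:
    """Build a SET CHAN command from a channel list and attribute values.

    Booleans become ON/OFF; every other value is written as-is.
    """
    words = ["SET", "CHAN", channels]
    for name, value in attributes.items():
        words.append(name.upper())
        if isinstance(value, bool):
            words.append("ON" if value else "OFF")
        else:
            words.append(str(value))
    return " ".join(words)
```

For example, `set_channel_command("I1", mute=False, solo=True, gaindb=-20.3)` produces `SET CHAN I1 MUTE OFF SOLO ON GAINDB -20.3`.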

Some channel attributes are "analog" in nature. For instance, gain and delay can both have values other than ON and OFF. You use numbers to describe the values for an analog attribute. An example above is the "GAINDB -20.3". You could also do "DELAY .015" which would set the delay for some channel to 15 milliseconds.

When you set an on/off parameter, the change is instant. It goes from the current value or state to the new value or state instantly, with no intermediate values, because there are no intermediate values.

With an "analog" parameter, you can either set the value instantly (as shown above) or you can fade to the new level over time. Depending on the parameter you are setting you can sometimes specify a curve that will be used for the fade. In other cases the curve is implicit.

When you specify an analog value, you are specifying the target value or final value. You give the value you want to be there at the end of the change. You do NOT say anything about the current value. So you can't talk about the concept of specifying a "fade up" or "fade down", because the same target level could be either up or down, depending on the current level.

Suppose I want to "fade out" the volume on a channel over 3.1 seconds. I can say the following:

  • SET CHAN I1 GAIN 0 FADETIME 3.1

Of course, maybe I'm a bit more comfortable thinking in terms of timecode. In that case:

  • SET CHAN I1 GAIN 0 FADETC 00:00:03:03.0 or
  • SET CHAN I1 GAINDB -144 FADETC 3:3 or
  • SET CHAN I1 GAIN 0 FADETC 93

Note that timecode is currently assumed to be 30fps non-drop. Fade times are in seconds, and fade timecodes are in frames. So the last example above shows a fade of 93 frames, which is 3 seconds plus 3 frames at 30fps.
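The frame arithmetic can be sketched as follows, assuming 30 fps non-drop timecode as stated above. The helper names are hypothetical, and the sketch accepts only whole-number fields separated by colons; it does not handle a fractional suffix such as the ".0" in the first example.

```python
FPS = 30  # 30 fps non-drop, as assumed in this document


def fade_tc_to_frames(timecode: str) -> int:
    """Convert 'hh:mm:ss:ff', 'ss:ff' or a bare frame count to frames.

    Fields are read from the right: frames, seconds, minutes, hours.
    """
    fields = [int(field) for field in timecode.split(":")]
    if not 1 <= len(fields) <= 4:
        raise ValueError(f"too many fields in timecode {timecode!r}")
    weights = [1, FPS, 60 * FPS, 3600 * FPS]
    return sum(value * weight for value, weight in zip(reversed(fields), weights))


def frames_to_seconds(frames: int) -> float:
    """Length of a frame count in seconds at 30 fps."""
    return frames / FPS
```

`fade_tc_to_frames("3:3")`, `fade_tc_to_frames("00:00:03:03")` and `fade_tc_to_frames("93")` all give 93 frames, and `frames_to_seconds(93)` gives 3.1, matching the FADETIME 3.1 example.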

The SET MATRIX command is modeled on the AB command of that name, but extended some:

  • SET MATRIX { FULL | DIAGONAL } { ON | OFF | GAIN <gain> | GAINDB <gain in db> }

Examples:

  • SET MATRIX FULL OFF;
  • SET MATRIX DIAGONAL GAINDB -10;

The SET CHANNEL command is the workhorse of the thing:

  • SET CHANNEL <channels> <options>

Options include:

GAIN GAINDB MUTE SOLO DELAY EQ PHASEREVERSE TRACK

  • GAIN <0 to 10> [ FADETIME <seconds> | FADETC <timecode> ]

GAIN 1 is full gain. GAIN .5 is -6 dB. GAIN 2 is double normal, and will clip unless the signal levels are low.

  • GAINDB <20 to -150> [ FADETIME <seconds> | FADETC <timecode> ]

Does exactly the same thing as GAIN, but lets you do it in dB.
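The relationship between the two scales is the usual amplitude formula, dB = 20 log10(gain). The sketch below is an illustration with hypothetical helper names; treating -144 dB as the "off" floor follows this document's use of GAINDB -144 as the equivalent of GAIN 0, and is an assumption about how a client might choose to clamp values.

```python
import math

OFF_DB = -144.0  # this document treats GAINDB -144 as equivalent to GAIN 0


def gain_to_db(gain: float) -> float:
    """Convert a linear GAIN value to dB, clamped at the 'off' level."""
    if gain <= 0:
        return OFF_DB
    return max(OFF_DB, 20.0 * math.log10(gain))


def db_to_gain(db: float) -> float:
    """Convert a GAINDB value to a linear GAIN value."""
    if db <= OFF_DB:
        return 0.0
    return 10.0 ** (db / 20.0)
```

With these, GAIN .5 comes out at about -6.02 dB, GAIN 2 at about +6.02 dB, and the top of the GAIN range, 10, at +20 dB, which lines up with the top of the GAINDB range.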

  • MUTE | SOLO | PHASEREVERSE { ON | OFF }

These are binary options, you can turn them on or off.

  • DELAY OFF
  • DELAY <ON> [ TIME <seconds> | TC <timecode> ]

Lets you turn the delay off (remembering the current delay value) or turn the delay on, or set a delay value. In some cases you can set a negative delay. (The sum of the input and crosspoint delay for all crosspoints that are on has to be greater than or equal to zero.)
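A client that offers negative delays may want to check this rule before sending a command. The sketch below reads the rule as applying to each active crosspoint separately, that is, input delay plus crosspoint delay must not be negative for any crosspoint that is on. That reading, the function name and the data layout are all assumptions.

```python
def negative_delay_violations(input_delay: float,
                              crosspoints: dict[int, tuple[float, bool]]) -> list[int]:
    """Return the outputs whose active crosspoint would need a negative total delay.

    crosspoints maps an output number to (crosspoint delay in seconds, is_on).
    """
    return [
        output
        for output, (delay, is_on) in sorted(crosspoints.items())
        if is_on and input_delay + delay < 0
    ]
```

For an input delayed by 10 ms, a crosspoint delay of -5 ms is acceptable, while an active crosspoint at -20 ms is flagged; a crosspoint that is off is ignored.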

  • EQ { ON | OFF }
  • EQ BAND <n> [ ON | OFF ]
    • SHAPE <shape>
    • FREQ <freq> [ FADETIME | FADETC ]
    • BANDWIDTH | BW | Q <n> [ FADETIME | FADETC ]
    • GAIN | GAINDB <n> [ FADETIME | FADETC ]
    • PARAMS <5 numbers>

Note that Q and bandwidth are not the same thing, but they control the same internal parameter.

  • TRACK FILE <path-to-file> [ CHANNEL <n> ]
    • START <seconds> | STARTTC <timecode>
    • STOP <seconds> | STOPTC <timecode>
    • RESUME <seconds> | RESUMETC <timecode>
    • LOOP FROM { TIME | TIMECODE | TC } <n> TO { TIME | TIMECODE | TC } <n>
    • SAMPLERATE <rate>
    • PITCH <n> [ FADETIME | FADETC ]
    • SPEED <n> [ FADETIME | FADETC ]

Example of setting up from scratch and doing a simple multitrack playback:

  • CONFIG SET INTERFACE 1;
  • SET MATRIX DIAGONAL ON;
  • SET CHAN I0-15 GAIN 0;
  • SET CHAN P0 GAIN 1 TRACK FILE "C:\TRACKS\TRACK1.WAV";
  • SET CHAN P1 GAIN 1 TRACK FILE "C:\TRACKS\TRACK2.WAV";
  • SET CHAN P2 GAIN 1 TRACK FILE "C:\TRACKS\TRACK3.WAV";
  • SET CHAN P3 GAIN 1 TRACK FILE "C:\TRACKS\TRACK4.WAV";
  • SET CHAN O0-3 GAINDB -15;
  • PLAY P0-3;


Examples of Real World Commands

With the above information you should have a pretty good chance of making commands that would set up many of the features of the routing matrix. There are many other commands (there are things other than channels, and there are some special commands other than SET). There are also specialized attributes that exist only for playback channels. These are needed to be able to play back audio files and will be detailed in the next release of this document.

I'll give one final example here. This is the complete setup required to play a seven-channel synchronized fountain show with appropriate static routing to the outputs. The first 5 channels are audio, the last two are a SMPTE track for the lasers and lighting, and an FSK track for the fountain controller.

CONFIG SET INTERFACE "MOTU 324" MODE ABEMULATION ON INPUTS 0 PLAYBACKS 16;
SET MATRIX FULL OFF;
SET CHAN P1 TRACK FILE "T:\AudioBox001.WAV" START TIME 120; Skip the preroll
SET CHAN P2 TRACK FILE "T:\AudioBox002.WAV" START TIME 120;
SET CHAN P3 TRACK FILE "T:\AudioBox003.WAV" START TIME 120;
SET CHAN P4 TRACK FILE "T:\AudioBox004.WAV" START TIME 120;
SET CHAN P5 TRACK FILE "T:\AudioBox005.WAV" START TIME 120;
SET CHAN P6 TRACK FILE "T:\AudioBox006.WAV" START TIME 120;
SET CHAN P7 TRACK FILE "T:\AudioBox007.WAV" START TIME 120;
SET CHAN PX 1.0 2.1 3.0 4.1 5.0 5.1 GAIN 1; everything to 1 & 2
SET CHAN PX 6.6 7.7 GAIN 1; smpte and fountain
SET CHAN PX 3.2 4.3 5.4 GAIN 1; spread surround around
PLAY P 1-7;

The first line attaches the server to a MOTU 324 ASIO driver. It also says to not use any input channels, have 16 playback channels (the default for ABEMULATION mode) and use however many output channels (up to 16 for ABEMULATION mode) that are on the interface.

The second line sets the gain for all channels and crosspoints to "off", which is GAIN 0 or GAINDB -144.

The next 7 lines assign audio files to playback channels and set the starting point 120 seconds into the tracks. This skips the first 2 minutes of SMPTE preroll on these tracks. The tracks will not loop, and will stop when they reach the end of the various wave files. (The files are in different lengths, so will stop at different times.)
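Because the seven track assignments differ only in their channel number, a client can generate them. The sketch below is illustrative: `track_commands` is a hypothetical helper, and the numbered file name pattern is taken from the example above.

```python
def track_commands(first_channel: int, count: int,
                   path_pattern: str, start_seconds: float) -> list[str]:
    """Build one SET CHAN ... TRACK FILE command per playback channel.

    path_pattern is a str.format pattern that receives the channel number.
    """
    return [
        f'SET CHAN P{channel} TRACK FILE "{path_pattern.format(channel)}" '
        f"START TIME {start_seconds};"
        for channel in range(first_channel, first_channel + count)
    ]
```

Calling `track_commands(1, 7, r"T:\AudioBox{:03d}.WAV", 120)` reproduces the seven lines for P1 through P7, minus the trailing comment on the first one.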

The next three lines are a little confusing. They are setting up the matrix in a test configuration for stereo playback. The tracks actually are:

  • P1 left main
  • P2 right main
  • P3 center
  • P4 left rear surround
  • P5 right rear surround
  • P6 SMPTE
  • P7 Fountain FSK

In a real configuration we might actually set up the matrix something like this:

  • P1 to channel 0, gain -10dB
  • P2 to channel 1, gain -10dB
  • P4 to channel 2, gain -10dB
  • P5 to channel 3, gain -10dB
  • P3 to channel 4, gain -3dB
  • P6 to channel 7, gain full
  • P7 to channel 8, gain full
  • P1-5 to channel 15, gain -22dB for booth monitor

The commands to do that might be

  • SET CHAN P1-7 GAINDB 0; set all playback tracks to "passthrough" gain
  • SET CHAN OUT 0-3 GAINDB -10 CHAN O4 GAINDB -3; set main output levels
  • SET CHAN OUTPUT 7 OUTPUT 8 GAINDB 0; smpte/fsk outputs to full
  • SET CHAN OUT 15 GAINDB -22 MUTE ON
  • SET CHAN PX 1.0 2.1 4.2 5.3 6.7 7.8 GAINDB 0; All main xpoints to full
  • set chan px1-5.15 gaindb 0; set monitor crosspoints to full

Two things worth noting in the above:

  1. We don't have to start with playback channel 0 if we don't want to.
  2. Commands don't have to be in upper case.


Commands and Responses

All commands are sent on a single line per command. The command line is ended with a normal "newline" character or combination, normally the Enter or Return key on a keyboard, or in C, the sequences "\n" or "\r\n".

Command text may be terminated before the end of the line with a semicolon ";" character. This allows comments on the end of the command lines, if desired. (Also, when entering commands in the command line of the Server monitor window, the semicolon is mandatory to end the command.)

Commands are entered as human-readable text strings. The commands consist of combinations of certain key words and parameters of various types. Many of the key words and parameter identifiers can be spelled in various ways, catering to both how humans and machines might want to enter commands. For instance, the key word "channel" can also be entered as "chan".

Commands may be entered in upper case or lower case, or any combination.

There are two basic kinds of commands: control commands and inquiry commands. Control commands have a strictly defined single-line response format. Inquiry commands have a more loosely defined response format that may be a single line, or often multiple lines.


Control Command Responses

The purpose of a control command response is to indicate that the command was received, and whether it succeeded or failed. Control command responses immediately follow the command and are never returned out of order. Every control command returns exactly one single-line response.

Control command ("SET" command) responses are human-readable text, and are always returned in upper case. These responses always begin in the first character of the response message (no leading spaces) and always end with an "\r\n" line ending combination.

To make machine parsing of responses easier, every control command response begins with the word "OK" if the command succeeded or the word "ERROR" and a four digit response code if the command failed. Following this response code will be a description of the error.
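
A client might split a control response into its parts along the following lines. This is a minimal Python sketch, not a reference parser: the exact spacing between "ERROR", the four digit code, and the description is an assumption, and the names ControlResponse and parse_control_response are illustrative only.

```python
import re
from typing import NamedTuple, Optional


class ControlResponse(NamedTuple):
    ok: bool              # True for "OK", False for "ERROR"
    code: Optional[int]   # four digit error code, if one was found
    text: str             # any remaining text on the line


# Assumed layout: "ERROR", whitespace, four digits, then the description.
_ERROR_RE = re.compile(r"^ERROR\s+(\d{4})\b\s*(.*)$")


def parse_control_response(line: str) -> ControlResponse:
    """Parse one control command response line."""
    line = line.rstrip("\r\n")
    if line == "OK" or line.startswith("OK "):
        return ControlResponse(True, None, line[2:].strip())
    match = _ERROR_RE.match(line)
    if match:
        return ControlResponse(False, int(match.group(1)), match.group(2))
    if line.startswith("ERROR"):
        # An error without a recognisable code is still an error.
        return ControlResponse(False, None, line[5:].strip())
    raise ValueError("not a control command response: %r" % line)
```

Because the error numbers are not yet standardized, a client would probably log the code and description rather than branch on specific values.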

The error response numbers are intended to indicate the general category of the error cause. These numbers are not completely standardized yet, and will be documented in a future command set revision. A response starting with "ERROR" is the only signifier of an error having occurred.

SoundMan-Server does not require a response to errors from client applications and will in all other respects continue operating normally. We recommend the client application create a log of responses and errors so shows can be debugged. Of course you can also obtain logs from SoundMan-Server. Some client applications should deal with errors that are critical to their operation. The general principle behind SoundMan-Server is that there should be nothing that can stop the show, and that errors can be rectified by subsequently sending the correct commands.

However, you should pay special attention to these two error types:

  1. If any command other than an initial query or two returns "interface not open" you will only get errors until the interface is opened.
  2. If you send multiple channels in a command and get an error such as "channel NN doesn't exist," the ENTIRE command on all channels will fail. Note there are times you MUST send multiple channels in a command, such as sync play and sync stop.

We recommend querying the interface on startup to see how many inputs, outputs, and playback channels there are. If there are fewer than the show requires you may want to bring up a message letting the operator know.

We also recommend storing the real number of in/out/pb channels, and have a routine that builds channel lists for commands. This routine would check the requested channels against the real channels and remove any channels that don't exist. You might want to generate internal error messages if you end up cutting channels from commands.
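
The channel-list routine suggested above could look something like this Python sketch. The function name, its arguments, and the choice to return the dropped channels alongside the list are all assumptions made for illustration; the channel counts would come from querying the interface at startup.

```python
from typing import Iterable, List, Tuple


def build_channel_list(kind: str, requested: Iterable[int],
                       available: int) -> Tuple[str, List[int]]:
    """Build a channel list such as "P 1 2 7", dropping channels that don't exist.

    kind      -- channel type keyword, e.g. "P", "IN" or "OUT"
    requested -- channel numbers the show wants to address
    available -- number of real channels of that type (numbered from 0)
    Returns the channel list text and the channel numbers that were removed.
    """
    kept = []
    dropped = []
    for number in requested:
        if 0 <= number < available:
            kept.append(number)
        else:
            dropped.append(number)
    if not kept:
        return "", dropped
    return kind + " " + " ".join(str(n) for n in kept), dropped
```

A caller would typically log anything that appears in the dropped list, and skip sending the command altogether when the returned channel list is empty.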

After these error mitigation procedures, simply logging commands and errors (possibly showing the errors in a special colour) and responses (with timestamps) should be sufficient.


Inquiry Response Format

An inquiry command (a "GET" command) can return a single line error response as described above for control commands. This would most often happen when there is something wrong with the syntax or parameters of the inquiry command.

An inquiry command normally returns either a single line response or a multiple-line response. Any specific command will always return the same type of response, either single-line or multiple-line. The number of lines in a multi-line response is not fixed, but it is easy to determine mechanically. A single-line response obviously only returns a single line.

Unlike control command and error responses, inquiry command responses are not limited to upper case.

Some inquiry commands can request multiple responses at periodic intervals. Each of these periodic responses can be single-line or multiple-line, depending on the normal response format for the inquiry command in question. An example of this is the GET VU command, which can request that VU updates be sent at periodic intervals until requested otherwise.

Asynchronous responses (ones that are not the direct result of a preceding command being issued) are not yet implemented. However, when they are, the first character of the first line of an async response will be an "@" character to flag the line as async. If it is a multi-line response, only the first line will be flagged with an @ sign. It is expected that all async responses will be single-line responses, but the program receiving one should check the rules given under Single-Line Response Format to determine if the response is a single-line async response or the first line of a multi-line async response. All multi-line responses are guaranteed to be transmitted as a group. No other single-line or multi-line response will ever be mixed with another multi-line response.


Single-Line Response Format

A single-line inquiry response begins with an indication of the response purpose. This is not as formalized as the OK or ERROR response described above, but does indicate the purpose of the response. For instance, a GET VU response begins with "VU". The response code for each inquiry command is documented with the command.

All single-line responses except OK or ERROR are terminated by a semicolon to make parsing easy. The caller receiving a response can tell if it is single-line or multi-line by making some simple checks:
a) If the first word is OK or ERROR it is a single-line response.
b) If the response line ends with a semicolon it is a single-line response.
c) In all other cases it is the first line of a multi-line response.

When a program first connects to the SoundMan-Server socket it will receive a Welcome message. This welcome message is primarily intended for the unusual case where a human operator has connected to SoundMan-Server. This welcome message isn't a 'response' because no command has been sent yet.

However, the program connecting may be expecting a response of some sort. In order to have the maximum chance that the Welcome message will be ignored by the connecting program, the message is set up to look like either three single-line responses, or a three-line multi-line response.

Thus the entire welcome message consists of a three-line 'banner,' with each line starting with OK. This is followed by a period on a line by itself, which is the terminator for a multi-line response.
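
One simple way for a client to dispose of the welcome message is to discard everything up to and including the lone period. The Python sketch below assumes the client already has an iterator that yields received lines one at a time; the name skip_welcome and the banner text used with it are hypothetical.

```python
from typing import Iterator


def skip_welcome(lines: Iterator[str]) -> int:
    """Consume the welcome banner, including its terminating "." line.

    Returns the number of banner lines that were skipped.
    """
    skipped = 0
    for raw in lines:
        if raw.rstrip("\r\n") == ".":
            return skipped
        skipped += 1
    raise ConnectionError("connection closed before the end of the welcome message")
```

After this returns, the next line read from the iterator should belong to the response for the first command the client sends.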


Multi-Line Response Format

Any response whose first line neither begins with the word OK or ERROR nor ends with a semicolon is the first line of a multi-line response. The complete multi-line response ends with a line consisting of a single period in the first character position (".\r\n").

A multi-line response begins with a header line indicating the type of the response. Following that will be zero or more lines of actual response text. A final line consisting of a lone period (technically the sequence ".\r\n") indicates the end of the response. No other responses can be interspersed with this multi-line response, and the order of the lines in the response will never be jumbled (unless caused by network problems and failure of correct packet sequence reassembly.)

Note that a program sending an inquiry command that returns a multi-line response must expect the possibility of getting a single-line error response instead. Thus, it is not sufficient to go into a loop looking for an ".\r\n" response line to end the response. It is also necessary to check for the first line of the response beginning with "ERROR". No multi-line response will ever begin with this string.
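
The single-line and multi-line rules can be combined into one reading routine. The following Python sketch assumes the client has an iterator that yields received lines one at a time (for example a file object wrapped around the socket); the function name read_response and the sample lines used with it are illustrative, not actual server output.

```python
from typing import Iterator, List


def read_response(lines: Iterator[str]) -> List[str]:
    """Read one complete response and return its lines without line endings.

    A single-line response (OK, ERROR, or a line ending in ";") comes back
    as a one-element list. For a multi-line response the header and body
    lines are returned and the terminating "." line is discarded.
    """
    first = next(lines).rstrip("\r\n")
    words = first.split()
    if words and words[0] in ("OK", "ERROR"):
        return [first]          # rule (a): also covers errors from inquiries
    if first.endswith(";"):
        return [first]          # rule (b)
    response = [first]          # rule (c): header of a multi-line response
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == ".":
            return response
        response.append(line)
    raise ConnectionError("stream ended before the '.' terminator")
```

Checking for OK or ERROR before looping is what keeps the reader from waiting indefinitely for a terminator after a failed inquiry.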


Queued Responses

SoundMan-Server always sends a command or inquiry response immediately on completion of the command or inquiry processing. The Server only processes one command at a time (on each control port) and never overlaps command processing or processes commands out of order.

Thus the Server will never queue up responses, nor will it send them out of order. However, in the Telnet TCP protocol, the send and receive streams are asynchronous. Multiple commands may be sent by a client before attempting to receive responses; there is no requirement that a receive be done between every command. (Most clients will be able to have a read "pending" at all times on the TCP port, and thus will always be able to receive responses from the Server.)

If a client sends multiple commands before looking for responses, there will be multiple responses queued up in the response queue. Doing this is generally bad practice; if one command fails and this is not noted until after a number of other commands are sent, the subsequent commands may not have the intended effect. Even though the responses have been queued by the Telnet protocol, there should be the same number of responses as commands issued.


Missing Responses

The Server should always send a response for every command. However, network protocols are not 100% reliable, and there is the slight chance that either a command to the Server or a response from the Server can get lost.

Since the Server is always a "responder" it does not need any sort of timeout on receiving the next command -- it is prepared to wait forever for another command.

However, most client software will care about getting responses to its commands and inquiries. It should therefore have code to protect against the possibility that an expected response will not arrive. How this is coded must be determined by each individual application. However, processing responses immediately after each command is sent can make detection and recovery from missing responses (and the possibility of commands that got lost before getting to the Server) much easier.
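
As one possible approach, a client can send a command and wait a bounded time for its response before moving on. The Python sketch below uses only the standard socket module and handles single-line control responses only; the function name send_command and the two second default timeout are arbitrary choices for illustration.

```python
import socket


def send_command(sock: socket.socket, command: str, timeout: float = 2.0) -> str:
    """Send one command and wait for its single-line response.

    Raises socket.timeout if no complete response line arrives in time,
    which the caller can treat as a missing response.
    """
    sock.settimeout(timeout)
    sock.sendall(command.encode("ascii") + b"\r\n")
    received = b""
    while not received.endswith(b"\r\n"):
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("server closed the connection")
        received += chunk
    return received.decode("ascii").rstrip("\r\n")
```

What to do after a timeout (resend, query the current state, or alert the operator) remains an application decision.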


Setting Multiple Properties in One Command

SET commands can also be used to set multiple properties for the same items. For instance, you could set both gain and delay for multiple channels with a single command. Thus, the overall SET format expands to:

SET <item type> <item identifiers> [<item property> <value>]…

In the above line, the "[]…" syntax indicates that the items in the square brackets are repeated one or more times.

A single SET command can also set properties for multiple types of items, or more commonly, for multiple items of the same or similar type. For instance you might want to mute one or more channels while at the same time unmuting other channels. You can do this by repeating the contents of the SET command beginning with the <item type>. So we can expand the above definition of SET to:

SET [<item type> <item identifiers> [<item property> <value>]…]…

This is probably a little confusing, so an example is in order. Say you want to mute input channel 1 and unmute input channel 2. You can use the following command to do this:

SET CHANNEL INPUT 1 MUTE ON CHANNEL INPUT 2 MUTE OFF

You can look at this as being two separate commands combined into one:

SET CHANNEL INPUT 1 MUTE ON

SET CHANNEL INPUT 2 MUTE OFF

Of course when you combine multiples like this you don't have to set the same property in both. The following is perfectly legal:

SET CHAN IN 1 GAINDB -20 CHANNEL OUT 5 PHASEREVERSE ON

Note the above example uses the abbreviations CHAN, IN, and OUT instead of spelling out the full words CHANNEL, INPUT, and OUTPUT. You can use abbreviations any place that you would use the full word.
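
A program generating commands might assemble the repeated groups mechanically. This Python sketch follows the expanded SET format shown above; the function name build_set and the tuple layout of its arguments are assumptions made for illustration.

```python
from typing import Dict, Tuple

# (item type, item identifiers, {property: value, ...})
Group = Tuple[str, str, Dict[str, object]]


def build_set(*groups: Group) -> str:
    """Join one or more item/property groups into a single SET command."""
    if not groups:
        raise ValueError("at least one group is required")
    parts = ["SET"]
    for item_type, identifiers, properties in groups:
        if not properties:
            raise ValueError("each group needs at least one property")
        parts.extend([item_type, identifiers])
        for name, value in properties.items():
            parts.extend([name, str(value)])
    return " ".join(parts)
```

For instance, build_set(("CHAN", "IN 1", {"GAINDB": -20}), ("CHANNEL", "OUT 5", {"PHASEREVERSE": "ON"})) reproduces the last example command above.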


SoundMan-Server Items and Properties

Before proceeding with a description of the command format, it is necessary to introduce some idea of the "things" that exist inside SoundMan-Server, and what properties those things have. The various commands all refer to these items and properties, so it is important to understand what they are talking about.

SoundMan-Server is basically a two-dimensional audio matrix. Audio input appears on input channels, where it is controlled in various ways. This controlled audio stream from an input is then routed to a row of crosspoints where it can be directed to output channels where it can be further controlled. Each input channel has its own row of crosspoints, with one crosspoint control for each available output channel. Each output channel sums the signal at its crosspoints for all of the input channels. Thus, the signal from any input channel can be directed to any or all of the output channels.

There are various types of input channels. One type of input channel is connected to an ASIO audio "live input". Another type of input channel is a "playback channel" and is connected to a single audio channel in an audio file being played. (If the audio file has multiple channels, each channel in the audio file must be connected to a separate matrix input in order for the sound in that channel to be heard.) Another type of input channel is a signal generator. There can be other specialized types of input channels.

In contrast to the multiple types of input channels, there is essentially only one type of output channel. This is connected to the live output channel on the ASIO interface, and results in audio output to a speaker or other device.

As mentioned above, the input channels are connected to the output channels through "crosspoint channels", or more simply just "crosspoints". A crosspoint never connects directly to an input source or an output destination -- it is always an intermediary between the input channel and the output channel.

Thus, SoundMan-Server has the following basic types of "items" that you might want to control:

  • The ASIO interface
  • Input channels
  • Playback channels (a form of Input channel)
  • Output channels
  • Crosspoint channels
  • Signal Generators

There are a few additional items that can exist under certain conditions and configurations, but they are not as generally useful as the items above. They are discussed later in separate sections.

The ASIO interface and the signal and timecode generators will be discussed in separate sections later. That leaves the various forms of channels to be discussed here.


Channel Syntax

The general form of the command set has been introduced in preceding sections. This section begins detailing the actual syntax of the various commands and command options available.

As mentioned previously, all commands begin with a key word. The available command key words are:

  • CONFIG SET
  • CONFIG GET
  • SET
  • GET
  • PLAY
  • PAUSE
  • RESUME
  • STOP
  • LOOP
  • REPEAT

Except for the CONFIG commands, almost every action refers to one or more channels. A channel can be an input channel, an output channel, a playback channel, a crosspoint channel, or a playback crosspoint channel.

It is very common to want to talk about more than one channel in a single command. Therefore, there is specific syntax to talk about single or multiple channels in a simple and easy manner. Channels can also be named.

Since channels are used in virtually every command, the forms are detailed in the next section before the actual commands are described. In general any channel of any type can appear in <channels> for any command. Some few commands or parameters are only applicable to certain channel types. For instance, the PLAY command only accepts playback channels, and specifying an output channel would be an error. These limitations are described in the individual commands.


Channels

Individual or multiple channels are used in virtually every command. This section describes the syntax used to describe channels in a command.

There are several ways that the same thing can be accomplished. There is no preference on one way over another, they can all be used interchangeably. Multiple methods can be used in the same command without problems. It is felt that certain forms are generally more reasonable when entered by hand, while other forms may be preferred when a program is generating the commands. There is no requirement that one form or another be used.

At the lowest level is a Channel Specifier. This is the way to specify one single channel. A Channel Specifier must give the type of the channel and the number of the channel. Each channel type can be spelled in several ways. The following are equivalent and all mean the same thing:

  • INPUT
  • IN
  • I

As are the names within each of these groups:

  • CROSSPOINT, XPOINT, X
  • PLAYBACK, PB, P
  • PLAYBACKCROSSPOINT, PBXPOINT, PX
  • OUTPUT, OUT, O

A CROSSPOINT goes between an INPUT and an OUTPUT.

A PLAYBACKCROSSPOINT goes between a PLAYBACK and an OUTPUT.

While INPUT and PLAYBACK channels are very similar, it helps to give them separate names. This allows each type to use the same channel number range, while still referring to separate channels.

Each channel must have a channel number. Channel numbers begin with 0 and work upward. The maximum input or output channel number is determined by the ASIO interface in use, plus any limitations on the number of channels that might have been given when the interface was configured. For a typical 2x8 interface such as an Echo Gina, the maximum input channel number is 1 and the maximum output channel number is 7. This gives an input channel range of 0 to 1 and an output channel range of 0 to 7.

So a complete Channel Specifier consists of the channel type and the channel number. The following are some examples of valid input and output channel specifiers:

  • INPUT 5
  • OUT 3
  • PLAYBACK 7
  • I2
  • O4
  • P7

Note that for convenience the space between the channel type and the channel number is optional. You can put it in or leave it out; it doesn't matter.
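
For client code that needs to read specifiers back (from a show file, say), a small normalizing parser is one option. The Python sketch below covers only the single-number channel types (input, output, playback); the names ALIASES and parse_channel are hypothetical, and it illustrates the syntax described here rather than the Server's own parser.

```python
import re
from typing import Tuple

# Every accepted spelling mapped to one canonical channel type.
ALIASES = {
    "INPUT": "INPUT", "IN": "INPUT", "I": "INPUT",
    "OUTPUT": "OUTPUT", "OUT": "OUTPUT", "O": "OUTPUT",
    "PLAYBACK": "PLAYBACK", "PB": "PLAYBACK", "P": "PLAYBACK",
}

# Channel type, optional space, channel number.
_SPEC_RE = re.compile(r"^([A-Za-z]+)\s*(\d+)$")


def parse_channel(spec: str) -> Tuple[str, int]:
    """Parse a specifier such as "INPUT 5" or "o4" into (type, number)."""
    match = _SPEC_RE.match(spec.strip())
    if match is None:
        raise ValueError("not a channel specifier: %r" % spec)
    kind = ALIASES.get(match.group(1).upper())
    if kind is None:
        raise ValueError("unknown channel type in %r" % spec)
    return kind, int(match.group(2))
```

Upper-casing the type before the lookup reflects the rule that commands may be entered in any mix of case.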

Channel Specifiers for Crosspoints and Playback Crosspoints are more complex. There is one crosspoint for each input/output channel combination. So to specify the crosspoint of interest, you must specify both an input and an output channel number, rather than just a single number.

This is done using the form:

Input-channel . Output-channel

So a complete Crosspoint Specifier might be of the form:

  • CROSSPOINT 1.0
  • X0.0
  • PX5.5

In each case, the first number is the input channel number and the second number is the output channel number. Jumping ahead a bit, consider the command:

SET CHAN X3.2 GAINDB -90

This would set the gain from input channel 3 to output channel 2 to -90dB, or nearly silent. Note that this is setting the crosspoint gain, not the input or output gain itself! Even though this sets a very low gain between input channel 3 and output channel 2, it is quite possible to have a high gain between input channel 3 and output channel 5 at the same time.

Individual channels can also be named and their names used in all commands instead of the channel number.

A crosspoint can be named, and referred to by name in commands that take a crosspoint identifier. The crosspoint can also be referred to by input_name.output_name, or by the usual number.number.


Channel Ranges

Often you want to do the same thing to sequentially numbered channels. You could do this by specifying each channel individually: IN1 IN2 IN 3 IN4 for instance. But this gets tedious. So you can do the same thing with a channel range.

A channel range consists of the channel type specifier followed by the lowest numbered channel in the range, a dash or the word TO, and the highest numbered channel in the range. So the example given above can also be done as

  • IN1-4
  • IN 1-4
  • INPUT 1 TO 4
  • INPUT1 TO INPUT4

This illustrates that spaces are optional around the dash, and between the channel type and the first channel number. Note that spaces are only optional around TO when it is next to a number. It also shows that if you want you can repeat the channel type after the dash. However, if you do this, it must be the same channel type as the initial type. The following examples are invalid channel ranges:

  • 1.
    • no type given
  • IN 4-3
    • first channel is higher than second channel
  • IN 3 5
    • this is a channel list, not a channel range
  • INPUT5-OUTPUT7
    • cannot have different channel types in a range

Crosspoint channels can also have channel ranges. Just as the basic crosspoint channel specification is more complex than an input or output channel, the crosspoint channel ranges are more complex.

In a crosspoint channel range you can have an input channel range, an output channel range, or both an input and output channel range. The ranges are specified just like the input and output ranges described above, but are combined into the crosspoint format:

  • X 1.2-4
    • input 1 to outputs 2 to 4
  • PX 0-7.9
    • playback inputs 0 to 7 to output 9
  • X 0-15.0-15
    • all inputs 0 to 15 to all outputs 0 to 15
  • X0to15.0to15
    • same as the previous line

Ranges with a dash work as you would expect. You can do an input channel range, an output channel range, or both in the same channel spec. It is important to realize that doing this makes a rectangle, so

x 0-1.0-1

is channels

x 0.0 0.1 1.0 1.1
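
The rectangle rule can be made concrete with a short Python sketch that expands the numeric part of a crosspoint range into individual crosspoints. The function name expand_crosspoint_range is illustrative, and the parsing is a simplified reading of the syntax described in this section.

```python
import re
from typing import List

# One number, optionally followed by "-" or "TO" and a second number.
_RANGE = r"(\d+)(?:\s*(?:-|TO)\s*(\d+))?"
_XPOINT_RE = re.compile(rf"^\s*{_RANGE}\s*\.\s*{_RANGE}\s*$", re.IGNORECASE)


def expand_crosspoint_range(numbers: str) -> List[str]:
    """Expand e.g. "0-1.0-1" into ["0.0", "0.1", "1.0", "1.1"]."""
    match = _XPOINT_RE.match(numbers)
    if match is None:
        raise ValueError("not a crosspoint range: %r" % numbers)
    in_lo, in_hi, out_lo, out_hi = match.groups()
    in_lo, out_lo = int(in_lo), int(out_lo)
    in_hi = in_lo if in_hi is None else int(in_hi)
    out_hi = out_lo if out_hi is None else int(out_hi)
    if in_lo > in_hi or out_lo > out_hi:
        raise ValueError("range must run from low to high: %r" % numbers)
    # Every input in the range is paired with every output in the range.
    return [f"{i}.{o}"
            for i in range(in_lo, in_hi + 1)
            for o in range(out_lo, out_hi + 1)]
```

A 16 by 16 range such as "0to15.0to15" therefore addresses 256 crosspoints in one specification.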

A comma separator is always equivalent to a space in a channel list. So you can do things like

IN 3-5,9

and it works fine. It is exactly equivalent to

IN 3-5 9

In crosspoints, a space after the end of a valid crosspoint spec (that is, after a valid input-range . output-range) goes back to expecting an input-range, which in turn expects a dot and an output-range.

So you CAN'T do things like

X 1,3,5.0-4

or

X 0-3.1,5,9

In the first case the first 1 is an input channel, and since there is no dash the next thing needs to be a dot and then the output channel. It isn't, so it is an error.
In the second case the 1 is an output channel, and since the next thing is a space (a comma is the same as a space) the next thing has to be an input channel. So we see 5 9 and the 5 needs to be followed by a dot.


Channel Lists

Sometimes you need to specify a collection of channels and you cannot use a range because you have to skip channels. For instance, in a multi-pair stereo setup you might want to set the level for all of the left channels: 0 2 4 6. You can do this easily with a channel list.

A channel list is nothing more than a list of channel specifiers and channel ranges. Unlike channel ranges, there are no requirements that the channel numbers be in increasing order, nor that all channels must be the same type.

In a channel list, you can have a channel type followed by a collection of channel numbers separated by spaces. For instance:

OUTPUT 3 4 5 6

You could also do that as

O3 O4 O5 O6

Of course this is equivalent to the channel range

O3-6

Or to the combination of a list and a range:

  • OUT 3 4-5 6
  • OUT 4-5 6 3
    • this works as well
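
To illustrate how lists and ranges combine, the Python sketch below expands the numeric part of a channel list (everything after the channel type) into individual channel numbers, preserving the order given. The function name expand_numbers is hypothetical, and the handling of separators is an interpretation of the rules in this and the preceding section.

```python
import re
from typing import List


def expand_numbers(text: str) -> List[int]:
    """Expand e.g. "3 4-5 6" or "3-5,9" into a list of channel numbers."""
    # Normalise "3 TO 5" and "3 - 5" to "3-5" so each range is one token.
    text = re.sub(r"\s*(?:-|TO)\s*", "-", text.strip(), flags=re.IGNORECASE)
    numbers: List[int] = []
    # A comma is equivalent to a space in a channel list.
    for token in re.split(r"[,\s]+", text):
        low, dash, high = token.partition("-")
        if not low.isdigit() or (dash and not high.isdigit()):
            raise ValueError("bad channel number or range: %r" % token)
        first = int(low)
        last = int(high) if dash else first
        if first > last:
            raise ValueError("range must run from low to high: %r" % token)
        numbers.extend(range(first, last + 1))
    return numbers
```

Note that only ranges must be in increasing order; the list as a whole may name channels in any order.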

A channel list doesn't require that all of the channel types be the same:

SET CHANNEL I 0-15 O 0-15 X 0-15 . 0-15 GAIN 0

This would set every input, output, and crosspoint gain to zero (minus infinity in dB) on a 16 by 16 channel matrix with a single command.

This should give you a feel for the channel specification format. More formally it is specified as:

// Channel:
// I | IN | INPUT number
// O | OUT | OUTPUT number
// P | PB | PLAYBACK number
// X | XPOINT | CROSSPOINT number.number
// PX | PBXPOINT | PLAYBACKCROSSPOINT number.number
//
// Range:
// number { TO | - } number
//
// Sequence:
// number,number


Common Channel Properties

All channels, whether input, output, crosspoint, playback, signal generator, timecode generator or any other type, have certain control features in common. Certain channels (notably playback channels) have some additional features or properties not shared by other channel types. However, they still have the basic features shared by all channels regardless of type.

Every channel has the following control abilities:

  • Gain
  • Delay Time
  • Seven Bands of EQ
  • Mute
  • Solo
  • Phase (Polarity) Reverse
  • Delay Enable
  • Overall EQ Enable

Channel properties can be divided into continuous and discrete settings. The first three parameters above are continuous, and the remaining ones are discrete.

Any continuous parameter can be faded over time. The fade can be linear, logarithmic, table lookup, or some other shape. Sometimes the fade shape is determined by the type of the parameter. For instance, delay fades are always linear. Other times there is a default fade shape (which may be different under different conditions) and this default fade shape can be overridden to other fade shapes. In general the fade shape has been chosen to produce the most pleasing results under most circumstances; but there will always be cases where another shape will produce better results.


Fade by Time

The command option is in the form: FADETIME time

A fade time is the time required to "fade" some continuous control value from its current value to the target value. The time is expressed in seconds, with a possible fractional part. The fade time refers to the previous parameter setting in the command line. So the term "FADETIME 2.3" might be controlling a gain fade in 2.3 seconds, or a change in the delay time over 2.3 seconds, or some other value.


Fade by Timecode

The command option is in the form: FADETC timecode

A fade timecode is the same concept as a fade time, but the fade time is expressed as a timecode. The default framerate for the timecode is 30 frames per second, but this can be modified with a framerate specifier in the timecode value. This timecode value is converted internally to a time in seconds, and this is further converted to a time in samples at the current interface sample rate. You can specify either FADETIME or FADETC for a single item, but not both. However, it is perfectly legal to use a fade time on one term (possibly gain) while using a fade timecode on another term (possibly delay).

The timecode value has the form:

[[[HOURS:]MINUTES:]SECONDS:]FRAMES[.FRACTIONS][F24|F25|F29|F30]

This long line can be divided into two parts: a time value and a framerate specifier. The time value will be discussed first, and includes everything up to the "fractions" part.

If you just enter a number with no colons, it represents a time in frames, not in seconds. In other words, the statement "FADETC 15.5" will do a fade over 15.5 frames, or just over half a second at 30 frames per second. Note that it is not necessary to specify zeros for the hours, minutes, and seconds. "FADETC 15:2" is a fade of 15 seconds and 2 frames. So is "FADETC 00:0:15:2.0".

Note that you don't need exactly two digits in each field. It is also not required to keep the values for hours, minutes, seconds and frames in the natural ranges. If you give a value that is larger than the natural range for a field (greater than 59 for minutes and seconds, greater than or equal to the framerate value for frames) the value will be converted correctly. It is perfectly valid to say "FADETC 402.17" to fade over 402 and a fraction frames, or to say "FADETC 105:15" to fade over 105 seconds and 15 frames. The only limit is that the maximum legal timecode value is 24 hours.

If you want to give a timecode at some framerate other than 30 frames per second, you will need to specify the framerate on the end of the timecode value. You do this with a "framerate modifier" that appears on the end of the timecode. There is no space between the end of the timecode and the start of the framerate modifier. The framerate modifiers and their meanings are:

  • F24
    • 24 frames per second
  • F25
    • 25 frames per second
  • F29
    • 29.97 frames per second drop frame
  • F30
    • 30 frames per second (the default value)

Examples of valid timecode values are:

  • 13
    • 13 frames
  • 13F24
    • 13 frames at 24 frames per second
  • 13.22
    • 13.22 frames
  • 457.6
    • 457.6 frames
  • 1:2.2F29
    • 1 second 2.2 frames at 29.97 frames per second dropframe
  • 45:0
    • 45 seconds 0 frames
  • 1:45:0
    • 1 minute 45 seconds 0 frames
  • 1:2:3:4.5
    • 1 hour 2 minutes 3 seconds 4.5 frames
  • 0.5
    • half a frame
  • 1:0:0:0
    • one hour
  • 357:0
    • 357 seconds
  • 01:00:00:00F24
    • one hour at 24 frames per second

The following are examples of invalid timecode values:

  • .5
    • invalid, no leading digit
  • 1:::0
    • invalid, no digits between colons
  • :5.1
    • invalid, no digit before the colon
  • 14.52F22
    • invalid, 22 frames per second is not valid
  • 7000:0:0:0.1
    • invalid, the maximum time is 24 hours, not 7000
  • 01:00:00:00 F24
    • invalid, there is a space before the "F24"
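
A client that wants to validate timecode values before sending them could use something like the following Python sketch, which accepts and rejects the examples listed above and converts valid values to seconds. It is an approximation only: F29 is treated as a plain division by 29.97 with no drop-frame compensation, exactly 24 hours is accepted as the maximum, and the name timecode_to_seconds is illustrative. How the Server itself performs the conversion is not specified here.

```python
import re

_RATES = {"F24": 24.0, "F25": 25.0, "F29": 29.97, "F30": 30.0}

# Optional hours, minutes and seconds (each followed by a colon), then
# frames with an optional fraction, then an optional framerate modifier.
_TC_RE = re.compile(
    r"^(?:(?:(?:(\d+):)?(\d+):)?(\d+):)?(\d+(?:\.\d+)?)(F24|F25|F29|F30)?$",
    re.IGNORECASE,
)

MAX_SECONDS = 24 * 3600  # maximum legal timecode value is 24 hours


def timecode_to_seconds(timecode: str) -> float:
    """Convert a timecode value to seconds, or raise ValueError if invalid."""
    match = _TC_RE.match(timecode)
    if match is None:
        raise ValueError("invalid timecode: %r" % timecode)
    hours, minutes, seconds, frames, rate = match.groups()
    fps = _RATES[(rate or "F30").upper()]
    total = (int(hours or 0) * 3600
             + int(minutes or 0) * 60
             + int(seconds or 0)
             + float(frames) / fps)
    if total > MAX_SECONDS:
        raise ValueError("timecode exceeds 24 hours: %r" % timecode)
    return total
```

Fields outside their natural ranges need no special handling, since each field is simply scaled and added to the total.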


Fade Profiles

The option is in the form: { LOG | LIN | LINEAR | EXP | EXPONENTIAL }

Most gain fades will let you control the technique used to perform the fade. The gain change can be a straight linear level change from the current level to the final level, or a linear lookup in a gain table. Or it can be a log or exponential fade. The fade type chosen by default is different for different gain terms. By default most fades will be log or exponential fades. (Currently log and exponential fades are the same type. This may change in the future.) The main exception is MIDI fades, which are linear fades using table lookup. Specifying a log fade type for a MIDI fade will cause it to calculate a smooth fade rather than interpolating from table values.


Parametric Equalizer Properties

Each of the seven bands of EQ can be controlled individually. Each band has the following properties:

  • Filter Shape
    • Flat
    • 6dB/Octave Low Pass
    • 12dB/Octave Low Pass
    • 6dB/Octave High Pass
    • 12dB/Octave High Pass
    • 6dB/Octave Low Shelf
    • 12dB/Octave Low Shelf
    • 6dB/Octave High Shelf
    • 12dB/Octave High Shelf
    • Bandpass
  • Filter Band Enable
  • Corner (or Center) Frequency
  • Gain (positive or negative)
  • Bandwidth (for bandpass filter)
  • Biquadratic Parameters

As with the normal channel properties, the filter properties can also be divided into continuous and discrete parameters, and all continuous parameters can be faded over time. The continuous filter properties are the frequency, gain, and bandwidth.

Because of the mathematical implementation of each filter section (a biquad section) a simple linear change in some parameter such as frequency will not necessarily result in simple linear changes to the parameter values to the biquad over the fade range. As a result, it is quite possible that the sound of a filter fade may be unacceptable due to aliasing or other faults. In that case a small modification of the filter or fade properties might produce a perfectly acceptable fade. The user should audition any filter fades under actual conditions to determine if the results are acceptable, and be prepared to do some experimentation if not.

A Flat filter is not really a filter as it does not control the gain relative to frequency. It can be used as an additional controllable gain element in some cases. It can sometimes be useful to switch a Shelf filter to a Flat filter when disabling the filter section, as it will retain the shelf gain of the section rather than switching the section to unity gain. In general though filter sections should be disabled whenever possible to reduce processing overhead.

With a Low Pass filter, the low frequencies below the corner frequency are passed through. The signal will be 3dB down at the corner frequency. The shape is essentially flat below the corner frequency, and then rolls off at 6dB or 12dB per octave above the corner frequency. The gain value sets the level of the flat portion of the frequency band. A filter gain of 0dB passes the lowest frequencies through at unchanged level. A positive gain in dB will boost the low frequencies while rolling off the higher frequencies, and a negative gain in dB will reduce all of the signal by at least the amount of the gain setting.

With a High Pass filter, the high frequencies above the corner frequency are passed through. The signal will be 3dB down at the corner frequency. The shape is essentially flat above the corner frequency, and then rolls off at 6dB or 12dB per octave below the corner frequency. The gain value sets the level of the flat portion of the frequency band. A filter gain of 0dB passes the highest frequencies through at unchanged level. A positive gain in dB will boost the high frequencies while rolling off the lower frequencies, and a negative gain in dB will reduce all of the signal by at least the amount of the gain setting.

A Low Shelf filter is really the opposite of a Low Pass filter. A low pass filter rolls off high frequencies and passes low frequencies. A Low Shelf filter controls low frequencies while passing high frequencies unchanged. A Low Shelf filter is thus more closely related to a High Pass filter.

In a Low Shelf filter, the gain controls the attenuation of the low frequencies below the corner frequency. The low frequencies can be decreased with a negative gain in dB, or can be boosted by a positive gain in dB. The amount of boost or cut is controlled by the gain. If the gain is negative, the upper frequencies will be unchanged. The gain will be down by 3dB at the corner frequency, and then will decrease at 6dB or 12dB per octave until the specified reduction level is achieved. At that point the gain will transition back to a flat portion for the remainder of the low frequency spectrum.

A High Shelf filter is the opposite of a low shelf filter, and thus is more akin to a Low Pass filter. In a High Shelf filter the frequencies above the corner frequency are cut or boosted by the specified gain amount, and the gain below the corner frequency is unchanged.

Note that a high or low pass filter set to 0dB gain will pass some of the frequency spectrum through at unchanged level, while rolling off part of the spectrum. On the other hand, a high or low shelf filter set to 0dB will have no effect on the signal at all.

Finally the bandpass filter controls the gain at a center frequency. The gain can be boosted at this frequency with a positive gain in dB, and can be cut at the center frequency with a negative gain in dB. Thus, the bandpass filter can be used both to gently boost the signal level over several octaves, or to cut out a very narrow and deep chunk of frequencies, for instance to eliminate some power line hum.

The bandpass filter has one additional control that the other filter shapes do not have: the Bandwidth value. A large bandwidth value makes the filter very wide, affecting multiple octaves. A very small bandwidth value affects a very narrow frequency range. The exact bandwidth in Hz depends on the center frequency. For instance, with a center frequency of 1000Hz and a bandwidth of .5, the filter will have half as much effect at 500Hz and 1500Hz as it does at 1000Hz. (.5 * 1000Hz = 500Hz. 1000Hz-500Hz = 500Hz lower side; 1000Hz+500Hz = 1500Hz upper side.)
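The arithmetic in the example above can be written as a small helper. This is only a sketch of the worked example; the function name is hypothetical and is not part of SoundMan-Script.

```python
def bandpass_edges(center_hz, bandwidth):
    """Return the (lower, upper) frequencies where the filter has half its effect.

    Follows the worked example in the text: the offset on each side of the
    center is bandwidth * center frequency.
    """
    offset_hz = bandwidth * center_hz
    return center_hz - offset_hz, center_hz + offset_hz


lower_hz, upper_hz = bandpass_edges(1000.0, 0.5)
print(lower_hz, upper_hz)  # 500.0 1500.0
```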

Each filter section is implemented internally with a filter type known as a "biquad". This is a mathematical representation of a general filter. How the biquad filters the waveshape presented to it depends on the five parameter values it is given. SoundMan-Server contains routines to convert a number of standard filter shape requests into the appropriate biquad parameters and apply them to the filter band. However, external control programs may have their own biquad parameter computation algorithms that will produce additional filter shapes, or variations on filter shapes that they prefer. So it is possible to set the shape of a filter by setting the biquad parameter values directly.
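For readers who have not met a biquad before, the following is a minimal sketch of the general technique: a second-order filter driven by five parameter values. The coefficient names (b0, b1, b2, a1, a2), their sign convention, and the Direct Form I arrangement are assumptions made for illustration; the text does not state how SoundMan-Server names, orders, or normalizes its five parameters.

```python
def biquad_process(samples, b0, b1, b2, a1, a2):
    """Run samples through one biquad section (Direct Form I).

    y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
    The coefficient naming is an assumed convention, not the Server's.
    """
    x1 = x2 = y1 = y2 = 0.0  # previous two inputs and outputs
    out = []
    for x0 in samples:
        y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


# With b0 = 1 and every other parameter 0 the section passes audio unchanged.
print(biquad_process([1.0, 0.5, -0.25], 1.0, 0.0, 0.0, 0.0, 0.0))
```

With only b0 set, the section acts as a plain gain stage, which is one way to picture the Flat shape described earlier.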

If a filter is set using biquad parameters and then interrogated for settings, the internal algorithms will analyze the parameters based on the known algorithms. The result may be a misrepresentation of the actual filter shape, depending on the values set in the parameters. If a program is setting biquad parameters directly, it should interrogate the current parameters directly rather than performing the normal GET requests for parameters such as filter shape and frequency. The program should also avoid any attempt to "fade" filter parameters such as gain, frequency, or bandwidth. Instead, the actual biquad parameter values should be updated rapidly by the program to implement the fade. (Although it is debatable how successful this fade will be audibly.)

A note on the range of possible filter settings: not every seemingly logical filter setting is in fact possible to realize in the biquad filters being used. This is especially the case in filters with positive gains, and especially with higher-frequency filters. The problem is that the biquad parameter values are mathematically limited to certain number ranges. If the numbers exceed the valid range, the filter will not do what the user requested, and will very often produce feedback, distortion, or no output signal at all.

The Server checks the parameter values generated by the requested change and will refuse to make the change if the parameters will exceed the valid range. If the invalid values result from an "immediate" (no fade time specified) change, an ERROR reply will be returned to the setting request. However, if the filter parameters are being faded, and the initial settings at the start of the fade are valid, then no error message will be generated. Instead, the fade will simply "freeze" at some point when the next fade step would have generated invalid biquad filter parameters.


Graphic Equalizer Properties

A 31 band graphic equalizer is available at the channels and crosspoints. This can be used as an alternative to the 7-band parametric EQ described above.

The graphic equalizer is implemented using an FFT algorithm. While this allows good control over the overall waveshape and is somewhat less processor overhead than 31 individual parametric filters would be, it has some potential drawbacks, and should be used only when absolutely necessary. The graphic EQ is quite processor hungry and memory hungry, and a small number of them can make quite a dent in the amount of processor time needed to process the audio.

The graphic EQ has a QUALITY setting of 1 to 8, where 1 is the lowest audio quality and 8 is the highest. The default is 3, which should be sufficient for most uses. The quality setting affects three things:

  1. The amount of processor time required for the EQ operation
  2. The amount of latency in the audio processing
  3. The overall amount of audio distortion that can be caused.

A lower quality uses less processor power. It can also result in aliasing or chorusing distortion, especially at lower frequencies. This distortion is a natural result of how an FFT works: it divides the frequency spectrum into a number of "bins", where each bin contains the same number of frequencies. For instance, a 4-bin FFT would divide the range of 0-48KHz into 0-12000, 12000-24000, 24000-36000, 36000-48000. This FFT could produce severe aliasing and distortion, because all simultaneous tones in the range of 0-12000Hz would be merged into a single output tone. The same would happen in the range of 12000-24000Hz.

This FFT size would be quite low overhead, but the chances of it being very usable are slim. You might think it would be unusable in all cases, but that might not be the case. Consider a single person speaking or singing, with no background sound in the same channel. The human voice generally makes only a single frequency at any instant. So there would be no two frequencies to merge in either of the two important frequency bins, and the resulting output would presumably be the same as the input: no distortion would result.

SoundMan-Server does not use a 4-point FFT as the quality would only rarely be usable. The usable sizes are in the range of 128 to 4096 points. The QUALITY figure is used to select the number of points in the FFT, and also changes some other internal parameters that affect the overall sound quality. The higher the number, the more FFT bins, and so the less chance of aliasing distortion.
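To get a feel for why the low frequencies suffer first, the spacing between bins can be estimated with the usual relationship of sample rate divided by FFT size. This is a rough sketch: the sizes used are the ones listed in the latency table in this section, and the Server's other internal parameters are not modelled.

```python
def fft_bin_width_hz(fft_size, sample_rate_hz):
    """Frequency spacing between adjacent FFT bins, in Hz."""
    return sample_rate_hz / fft_size


for fft_size in (512, 1024, 2048, 4096):
    width_hz = fft_bin_width_hz(fft_size, 48000)
    print(fft_size, "points:", round(width_hz, 2), "Hz per bin")
```

Even at 4096 points each bin is more than 11Hz wide at 48KHz, while the two lowest graphic EQ band centers (20Hz and 25Hz) are only 5Hz apart.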

Before you arbitrarily decide to just set the quality setting to 8 any time you use a graphic EQ, there are two other important things to know:

  1. A setting of 8 uses about 64 times as much processor as a setting of 1.
  2. The more points in the FFT the longer the latency!

The FFT size can be related directly to the amount of latency in milliseconds it will add to the channel:
FFT size   Quality   Sample rate (Hz)   Latency
512        1-2       48000              10.67 ms
1024       3-4       48000              21.33 ms
2048       5-6       48000              42.67 ms
4096       7-8       48000              85.33 ms
512        1-2       44100              11.61 ms
1024       3-4       44100              23.22 ms
2048       5-6       44100              46.44 ms
4096       7-8       44100              92.88 ms
As you can easily see, even a small "QUALITY 1" FFT will add more than 10ms to the audio path. The highest quality FFT processing will add almost a tenth of a second to the audio path. This would not be a good choice to EQ the foldback for a singer or musician. In fact, it would be unusable for any form of live audio, but it could be used on recorded audio. (But then, why didn't you EQ the recording before using it?)
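The latency figures follow directly from the FFT size divided by the sample rate. A short sketch of that calculation (the function name is hypothetical):

```python
def fft_latency_ms(fft_size, sample_rate_hz):
    """Time needed to collect one FFT block of samples, in milliseconds."""
    return fft_size / sample_rate_hz * 1000.0


for sample_rate_hz in (48000, 44100):
    for fft_size in (512, 1024, 2048, 4096):
        latency = fft_latency_ms(fft_size, sample_rate_hz)
        print(fft_size, sample_rate_hz, round(latency, 2), "ms")
```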

The graphic EQ can be used along with the parametric EQ on the same channel. This should be avoided unless absolutely required, as both the graphic and parametric EQs use a fair amount of processor time. Use the minimum number of parametric bands, as each band added adds more processing time.

By default, when first turned on, the graphic eq for a channel is flat. All gains are 1, or 0dB, whichever way you want to look at it.

You can set graphics eq bands by frequency. You don't have to get the frequencies exact, the nearest band will be set to the gain specified. For instance, band 16 has a center at 630Hz, but here I've specified 650Hz and it worked:

SET CHAN I1 GEQ ON FREQ 400 = 1 FREQ 500 = 1.5 FREQ 650 = 2 FREQ 800 = 1;
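A control program that wants to predict which band a frequency will land on could do something like the sketch below. Two assumptions are made here: that the 31 band centers are the standard nominal 1/3-octave frequencies from 20Hz to 20KHz (the text only confirms bands 14 to 17), and that "nearest" is measured on a logarithmic scale. The Server may choose differently for frequencies close to the midpoint between two bands.

```python
import math

# Assumed band centers: standard nominal 1/3-octave frequencies, band 1 = 20Hz.
GEQ_CENTERS_HZ = [
    20, 25, 31.5, 40, 50, 63, 80, 100, 125, 160,
    200, 250, 315, 400, 500, 630, 800, 1000, 1250, 1600,
    2000, 2500, 3150, 4000, 5000, 6300, 8000, 10000, 12500, 16000,
    20000,
]


def nearest_band(freq_hz):
    """Return the 1-based number of the band whose center is closest to freq_hz."""
    distances = [abs(math.log(freq_hz / center)) for center in GEQ_CENTERS_HZ]
    return distances.index(min(distances)) + 1


print(nearest_band(650))  # 16, the 630Hz band
```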

You can also set the bands by band number, if you happen to know the numbers. This would probably be easier for a program than a person. This is exactly the same as the command above:

SET CHAN I2 GEQ ON BAND 14 GAINDB 1 BAND 15 GAINDB 1.5 BAND 16 GAINDB 2 BAND 17 GAINDB 1;
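Since the band-number form is aimed at programs, here is a minimal sketch of how a client might assemble such a command string. The helper name and its arguments are hypothetical; only the command syntax itself is taken from the example above, and sending the string to the Server is left out.

```python
def geq_band_command(channel, band_gains_db):
    """Build a 'SET CHAN ... GEQ ON BAND n GAINDB x ...;' command string.

    channel is a channel id such as "I2"; band_gains_db maps band numbers
    (1 to 31) to gains in dB.
    """
    parts = [f"SET CHAN {channel} GEQ ON"]
    for band in sorted(band_gains_db):
        if not 1 <= band <= 31:
            raise ValueError(f"band {band} is outside 1 to 31")
        parts.append(f"BAND {band} GAINDB {band_gains_db[band]:g}")
    return " ".join(parts) + ";"


print(geq_band_command("I2", {14: 1, 15: 1.5, 16: 2, 17: 1}))
```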

If you are lazy and are setting a range of bands (as we are here) you can use the cryptic but compact form to set multiple bands at once. Again, this is exactly the same as above:

SET CHAN I3 GEQ ON BANDS 14 TO 17 = 1 1.5 2 1; COMPACT FORM

You can get the gain for individual bands or for all 31 bands. Probably a program would be more interested in getting all bands at once than a person would, but by doing this you can kinda see the "shape" of the graphics eq. Here we get the gain for all bands in dB. (Note that it doesn't matter if the command is in upper case or lower case):

get chan i3 geq bands gaindb;
GrEqGaindB I3=0,0,0,0,0,0,0,0,0,0,0,0,0,1,1.5,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0;

You can also get the gain as an absolute number, where 1 = flat, > 1 is a bump, and < 1 is a cut. This can be easier for a program to deal with than a human:

get chan i3 geq bands gain;
GrEqGain I3=1,1,1,1,1,1,1,1,1,1,1,1,1,1.12,1.19,1.26,1.12,1,1,1,1,1,1,1,1,1,1,1,1,1,1;
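A client program can split either reply into its parts and convert between the two forms itself. The sketch below assumes the reply layout is exactly as shown in the two examples above (keyword, channel id, "=", comma-separated values, closing ";"); the function names are hypothetical.

```python
def parse_geq_reply(line):
    """Split a reply such as 'GrEqGaindB I3=0,0,1.5;' into (keyword, channel, values)."""
    head, _, tail = line.strip().rstrip(";").partition("=")
    keyword, channel = head.split()
    values = [float(value) for value in tail.split(",")]
    return keyword, channel, values


def db_to_gain(gain_db):
    """Convert a gain in dB to an absolute gain, where 0dB becomes 1."""
    return 10.0 ** (gain_db / 20.0)


reply = "GrEqGaindB I3=0,0,0,0,0,0,0,0,0,0,0,0,0,1,1.5,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0;"
keyword, channel, gains_db = parse_geq_reply(reply)
print(keyword, channel, [round(db_to_gain(g), 2) for g in gains_db[13:17]])
```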

Here is the documentation in small cryptic form of what the commands are. Ignore the "//" on the front of the lines, I clipped this from comments in the source file. The indenting here implies that the indented stuff follows the stuff on the less-indented line above. Thus the first three lines say:

SET CHAN <channels> GEQ QUALITY <number, 1 to 8>
Or
SET CHANNEL <channels> GRAPHICEQ QUALITY <number, 1 to 8>

The term "fpt" means a floating-point number, that is, a number that can have a decimal point rather than just an integer. So taking a selection of the leading lines, we can build the command:

SET CHAN I14 GEQ BAND 2 GAINDB -6.37;

There are commands to get the response for both individual parametric EQ bands, and for all EQ enabled on a channel, including both parametric and graphic EQ. The response values are given in dB, where 0dB indicates a "flat" response for the channel. The eq response values do NOT include overall channel gains, only the gain contributions from the eq for the channel.

    EQ # Eq chanid=ON|OFF
    • RESPONSE # EqResp chanid=fpt,fpt,fpt... 31 times
      • AT <freq> # EqFreqResp chanid=freq=fpt
    • BAND number
      • RESPONSE # EqBandResp chanid=band=fpt,fpt,fpt... 31 times
        • AT <freq> # EqBandFreqResp chanid=band=freq=fpt
The command

GET CHANNEL <chans> EQ RESPONSE

Will get the response of all enabled parametric eq bands for the channel, plus the response of the graphic eq if it is enabled. If no eq is enabled, the response will show as flat at all frequencies.

The response is returned as 31 numbers representing dB values for specific frequencies. The frequencies are the same as the 31 band centers for the graphic eq bands. These are logarithmically spaced across the audio frequency spectrum, so will give a good overall impression of the frequency response, but it will not show fine detail from very narrow filter bands.

If it is known that there are very narrow parametric bands in use, the user can get a more accurate overall response by issuing a few commands of the form:

GET CHANNEL <chans> EQ RESPONSE AT <freq>

Where the frequencies are picked to cover the area of the narrow eq band. For instance, this could be used to request the overall response at the center frequency of a bandpass filter to get an accurate value for the bottom of a notch or the top of a bump. Since the 31 band responses are in 1/3rd octave increments they will provide good overall frequency response coverage, and requesting individual responses should only be needed for the center frequency of bandpass filters, or at most a few frequencies around the corner frequencies of the various enabled filters.

Because it can be desirable to show a graphic response of the individual eq bands as well as the overall eq curve, you can get the response for each band individually. In this case the response will show the eq band response even if the band is disabled. It is up to the caller to realize that disabled eq bands do not contribute to the overall channel eq value. These individual band responses do not include any contribution from the graphic eq subsystem. However, you can get the graphic eq response in dB with

GET CHAN <chans> GEQ GAINDB

To get the overall response for an individual band, use a command in the form:

GET CHAN <chans> EQ BAND <band> RESPONSE

This will return the response as 31 numbers, which are the gains in dB at the standard 31 band centers, as described above. The response format (as shown above to the right of the command) is slightly different than described previously, since the band number is included at the front of the response.

Since the eq band might be a narrow notch or bump, you can get the exact response at a specific frequency with

GET CHAN <chans> EQ BAND <band> RESPONSE AT <freq>

This is again very similar to the command described above, except that it returns the response for only the single band requested, and it will show the band response whether or not the band is enabled.


Playback Channel Properties

In addition to the common channel properties described above, playback channels have a number of additional properties that are required to connect them to the appropriate section and channel of an audio file. These additional properties are generally:

  • The audio file name
  • The channel number in the audio file
  • The starting time offset
  • The ending time offset
  • The beginning loop point time
  • The "loop-to" time
  • The number of loops
  • The file sample rate (if not encoded in the file itself)
  • The playback Speed Factor
  • The playback Pitch Shift

A single Playback Channel can only play a single audio channel from an audio file. If the file is a stereo file or a multitrack file, one playback channel must be dedicated to each channel in the wave file. To mix a stereo audio file to mono, both channels in the file must be assigned to playback channels, and then the outputs from those playback channels must be assigned equally to the same output channels, where they will be summed to mono.

It will be common for multiple audio channels to be assigned to the same audio file, and generally they will be playing from the same area of the file simultaneously. While the file may be specified many times in different playback channels, the file will only be opened a single time, and all of the playback channels will generally be reading data from the same buffer. SoundMan-Server contains a "file server" that is responsible for keeping track of the open audio files and ensuring that the audio data is available in common buffers for use by the various playback channels.

Note that while it will be common to play multiple channels from the same audio file in exact sync, this is not a requirement. One channel can be playing audio from one section of the file, and simultaneously another channel can be playing sound from a completely different section of the file. Or two channels can be playing audio from the same section (or different sections), but at different speeds due to a pitch shift on only one playback channel.

As with other channel properties, the "continuous" values in a playback channel can be faded. This would be the speed factor and/or pitch. This permits various "broken record" or "drunken" effects where a pitch change fades over time. The various offset times can not be faded in a playback channel, as doing so would generally make little sense. They can however be changed frequently by external command if there is some reason to do so.

The file name is given as a quoted string, and must contain the complete path to the file; for instance "C:\wavefiles\Track14.wav". Currently only wave files and raw 16-bit mono files are accepted as valid playback files. Other file types will be added in the future.

If the file being played is a mono file, or if no file channel number is given, the first or only channel in the file will be played. This is channel number 1. The channel number must be specified to play any channel other than the first channel in the file.

The starting and/or ending time offsets can be used to only play a part of the sound in the file. The times can be specified as either a time in seconds (including fractional seconds) or as a timecode at 30fps. The track will play from the starting time offset up to the ending time, and then will generally stop (but can be made to loop back to the start time continuously).

The loop-from and loop-to times can be used to create an internal loop in the playback. The loop-from time must be after the start time and before or equal to the stop time. The loop-to time can be anywhere in the file; it does not need to be within the start-stop range.

If the loop-to time is before the loop-from time the audio will loop back, and eventually the loop-from time will again be reached. Each time the audio loops the loop count is decremented. If there are loops remaining when the loop-from point is reached, it will again loop back to the loop-to point. If there are no loops remaining, the loop-from point will be disabled and the file will play through until it reaches the ending point or the end of the file.

If the loop-to point is after the loop-from point the effect will be to skip some of the wave file. In this case the loop-from point will not be seen again, and regardless of the loop count the file will play to the stop point or the end of the file. If the loop-to point is past the stop point, the file will play to the end of the file.
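The looping rules above can be modelled as a list of the file segments that end up being played. This is a simplified sketch rather than the Server's implementation: it assumes the loop count is the number of jumps from the loop-from point to the loop-to point, that all times are in seconds, and that nothing is changed while the track plays. All names are hypothetical.

```python
def playback_segments(start, stop, loop_from, loop_to, loops, file_end):
    """Return the (from, to) segments of the file played, in order."""
    segments = []
    position = start
    remaining = loops
    # Jump while loops remain and the loop-from point is still ahead of us.
    while remaining > 0 and position <= loop_from:
        segments.append((position, loop_from))
        position = loop_to
        remaining -= 1
    # After the last jump, play to the stop point, or to the end of the file
    # if the jump landed past the stop point.
    end = stop if position <= stop else file_end
    segments.append((position, end))
    return segments


# Loop back twice from 6s to 2s, then play on to the stop point at 10s.
print(playback_segments(0, 10, 6, 2, 2, 20))  # [(0, 6), (2, 6), (2, 10)]
```

A loop-to point after the loop-from point gives a single skip, for example `playback_segments(0, 10, 6, 8, 3, 20)` yields `[(0, 6), (8, 10)]` regardless of the loop count.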

Note that the start, stop, loop-from and loop-to points and loop count can be changed while the file is playing. Thus, a loop can be used to produce any number of unusual effects if the control program is on its toes about changing the various loop points as the track plays.

The Sample Rate, Playback Speed and Playback Pitch settings all interact with each other. The most processor-efficient playback is when the file sample rate is the same as the ASIO interface sample rate, as no conversion is required. However, the ASIO interface will often be running at 48KHz, and many playback tracks are likely to be recorded at 44.1KHz. Thus, a sample rate conversion will be required to make the track play at the correct speed.

The Server uses a simple and efficient interpolation algorithm to determine the sample levels when performing a pitch shift or sample rate conversion. This will sound fine on a very large selection of material. There will be some material where this algorithm will be unacceptable. In those cases there are two possibilities.

The first is to set the ASIO interface to the same rate as the media sample rate, if possible. This is the most efficient for playback anyway, and should be selected if all of the media is played at 44.1KHz and there is little or no use of live ASIO inputs.

The second method is to use an off line wave file editor to convert the sample rate of the audio file, and then play the converted audio file. Wave editors can perform the conversion off line and take many minutes (or hours) to perform the conversion. Thus they have available many algorithms that cannot be used by a real time playback system such as SoundMan-Server.

The sample rate conversion (if used) can be regarded as setting the "base playback rate" for the file. The Speed and Pitch settings then modify that playback rate.
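To illustrate what an interpolating sample rate conversion involves, here is a sketch using plain linear interpolation between neighboring samples. Linear interpolation is an assumption chosen for simplicity; the text does not say which interpolation the Server actually uses, and the function name is hypothetical.

```python
def resample_linear(samples, file_rate_hz, device_rate_hz):
    """Convert samples recorded at file_rate_hz for playback at device_rate_hz."""
    step = file_rate_hz / device_rate_hz  # file samples consumed per output sample
    out = []
    n = 0
    while n * step <= len(samples) - 1:
        position = n * step
        index = int(position)
        fraction = position - index
        # At the very last sample there is no neighbor to interpolate toward.
        following = samples[index + 1] if index + 1 < len(samples) else samples[index]
        out.append(samples[index] * (1.0 - fraction) + following * fraction)
        n += 1
    return out


# A 24KHz file played through a 48KHz interface gains one value between each pair.
print(resample_linear([0.0, 1.0, 2.0, 3.0], 24000, 48000))
```

For a 44.1KHz file on a 48KHz interface the step is 44100/48000 = 0.91875 file samples per output sample.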

The Speed setting is a multiplier. If Speed is set to 2, the channel will play back at twice the base playback rate. If it is set to 0.5 it will play back at half the base playback rate. Note that a Speed setting of 2 is equivalent to a pitch shift of one octave up, and a Speed setting of 0.5 is equivalent to one octave down.

The Pitch setting is also a multiplier on the base playback rate. It is specified in semitones, so a Pitch change of 12 will be one octave upward, and is equivalent to a Speed setting of 2. A Pitch change of -12 is one octave downward, and is equivalent to a Speed setting of 0.5. The Pitch value can be fractional: a Pitch of 0.01 is a one cent pitch increase.

The Speed and Pitch values multiply together before adjusting the base playback rate. If you use a Speed of 2 and a Pitch of 12 the file will play back 4 times faster than normal. If you use a Speed of 0.5 and a Pitch of 12 the file will play back at the normal playback rate, as the speed and pitch changes will cancel out. While certain values like 0.5 and 12 easily correspond to the same things, a Speed of 1.5 and a Pitch of -6 will not cancel out. A Pitch of -6 will only reduce the playback rate by a factor of 1.414, not by 1.5. A Pitch change of -7.02 would nearly cancel a Speed change of 1.5, but the result would not be exact.
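The interaction can be checked numerically. This sketch uses the standard equal-tempered relationship, in which each semitone multiplies the rate by the twelfth root of 2; the function names are hypothetical.

```python
import math


def playback_rate(speed=1.0, pitch_semitones=0.0):
    """Combined multiplier applied to the base playback rate."""
    return speed * 2.0 ** (pitch_semitones / 12.0)


def semitones_for_ratio(ratio):
    """Pitch change in semitones equivalent to a given rate ratio."""
    return 12.0 * math.log2(ratio)


print(playback_rate(2.0, 12))                # 4.0
print(playback_rate(0.5, 12))                # 1.0
print(round(playback_rate(1.5, -6), 3))      # 1.061, so the two do not cancel
print(round(semitones_for_ratio(1.5), 2))    # 7.02
```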


Signal Generator Properties

The signal generators appear on input channel numbers 1000 and 1001.

The signal generators can generate various fixed-pitch waveshapes, frequency sweeps, and white and pink noise envelopes. By default each generator produces a 1000Hz sine wave.

You can control the following generator properties:

  • Frequency
  • Shape
    • Sine
    • Square
    • Triangle
    • White
    • Pink
  • Sweep
    • Pause <