.TL
PROCOM - INTERPROCESS SAMPLE DATA COMMUNICATIONS FACILITY FOR UNIX
.AU
Gareth Loy
.AI
CARL-CME, UCSD, La Jolla, CA 92093
dgl@sdcarl.ucsd.edu
.AB
PROCOM is an interprocess communication scheme for 
.UX
that facilitates
sharing of properties about data streams
on pipes between cooperating processes.
.AE
.NH
INTRODUCTION
.PP
PROCOM is a process
intercommunication scheme for processes sharing sample data via pipes.
It is an outgrowth of the CARL system of passing floating point binary
data between cooperating processes for signal processing.  
Floating point data in general is referred to as 
.I floatsams.
A 
.I floatsam
is a single precision floating point binary sample.
Short integer samples are also handled by this software package;
they are called
.I shortsams.
Treatment of floatsams and shortsams is equivalent.  The only restriction
is that a data stream must not mix types.  In this document, we
will use the term floatsam generically for both floatsams and shortsams.
.PP
In the past, no information about the floatsams on a pipe 
between processes was available from
one process to the next, and any required information
had to be supplied by the user who was joining
the programs together at the command line of a shell.  Naturally, it is
a mental strain on users to have to remember what each program needs to
know about the data, and how that program needs to have it said.
To relieve users of this burden, a straightforward extension of the routines available for
passing floatsams and shortsams between programs was developed.
.NH
SPECIFICATION
.PP
PROCOM has been written to satisfy the following criteria:
.IP \(bu
backward compatibility with existing versions of
.B getfloat()
and
.B putfloat(),
.IP \(bu
maximum flexibility in specifying properties of a floatsam stream,
.IP \(bu
no degradation of performance speed,
.IP \(bu
conceptual simplicity at the high level,
.IP \(bu
ease of access to low functional levels,
.IP \(bu
addition of fast buffered block mode i/o, and
.IP \(bu
isomorphic facilities for handling shortsam streams.
.NH
STATIC AND DYNAMIC FILES
.PP
We invent the notion of 
.I static
files to mean files stored on permanent media.  
\fIDynamic\fP files are those that are on pipes.
.PP
Experience has shown that headers on static files are unworkable.
Briefly,
a header on static data cannot easily be grown or shrunk without rewriting
the whole file.  Various kludges to get around this are just that: kludges.
Furthermore, headers on static files add non-trivial complexity to random
access i/o routines, which must step around the header.  In addition, special
facilities must be invented to edit headers on static files, since the
whole file is usually not in a format to be swallowed by a regular editor.
People must then learn to use this funny editor.
.PP
Dynamic files on the other hand have only one of these liabilities.
There is no random access for a dynamic file, 
.B (lseek(2)
does not work on pipes)
so i/o routines remain simple.
There is no point in editing a header on a dynamic file, only in reading
and writing it.  Once read, its in-core representation can be 
easily modified.  The only remaining problem is that of stepping around
a header when an i/o subroutine needs to return the first floatsam of the 
file.  This can be done transparently if the header protocol is obeyed.
.PP
Although this scheme is meant to be used only on pipes, one can of course
redirect the output of a program writing headers to a file, creating a
header on a static file.  This is discouraged since one then has all
the problems about headers on static files mentioned above.  The alternative
is to save the header in a separate file from the data, such as is done
by the 
.I csound
file system developed at CARL.
.PP
With these arguments in mind, the following standard for headers on dynamic
files has been constructed.
.NH
FLOATSAM I/O
.PP
Two routines, 
.B fgetfloat() 
and 
.B fputfloat()*
.FS
*The routine 
.B fputfloat() 
comes into being with the advent of PROCOM;
only 
.B putfloat() 
existed before.  Read on.
.FE
exist which allow fetching and writing single
floatsams to an open file descriptor, such as 
.B stdin 
and 
.B stdout,
or a file opened with
.B fopen(3).
.DS
fgetfloat(fp, iop)
	float *fp;
	FILE *iop;
.sp
fputfloat(fp, iop)
	float *fp;
	FILE *iop;
.DE
.B fgetfloat() 
returns the next floatsam from the file descriptor 
in 
.B iop, 
and deposits it at the address in 
.B fp.
.B fputfloat() 
takes the floatsam at the address in 
.B fp 
and writes it on the
file open in
.B iop.
.PP
Two macros exist to apply this pair conveniently
to the standard input and standard
output:
.DS
#define getfloat(fptr) fgetfloat(fptr, stdin)
#define putfloat(fptr) fputfloat(fptr, stdout)
.DE
.PP
If 
.B fgetfloat() 
and 
.B fputfloat()
are successful, they return a value > 0.
The value 0 is returned if there is no more input or output available,
and the value -1 is returned on errors.
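.PP
The return convention above can be illustrated with a minimal sketch.
The names sketch_fgetfloat(), sketch_fputfloat() and sketch_copy() are
hypothetical stand-ins written for this example; they are not the PROCOM
sources, they do no header handling, and they use ANSI prototypes rather
than the older declaration style shown above.
.DS
```c
#include <stdio.h>

/*
 * Hypothetical stand-in for fgetfloat(): NOT the PROCOM source.
 * Returns 1 when a sample was stored at *fp, 0 at end of input,
 * and -1 on an i/o error.  Header detection is left out.
 */
int sketch_fgetfloat(float *fp, FILE *iop)
{
	if (fread(fp, sizeof(float), 1, iop) == 1)
		return 1;
	return ferror(iop) ? -1 : 0;
}

/* Hypothetical stand-in for fputfloat(): 1 on success, -1 on error. */
int sketch_fputfloat(float *fp, FILE *iop)
{
	if (fwrite(fp, sizeof(float), 1, iop) == 1)
		return 1;
	return -1;
}

/* Copy samples until input runs dry; returns samples copied, or -1. */
long sketch_copy(FILE *in, FILE *out)
{
	float x;
	long n = 0;
	int r;

	while ((r = sketch_fgetfloat(&x, in)) > 0) {
		if (sketch_fputfloat(&x, out) <= 0)
			return -1;
		n++;
	}
	return r < 0 ? -1 : n;
}
```
.DE
.PP
Testing for a result greater than zero, as the loop does, stops the
copy on both end of input and error; the two cases are then told apart
by the sign of the last result.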
.PP
The i/o is fully buffered.  Because the buffers are not flushed
automatically by 
.B exit(3),
every open file must be flushed by a call to
.B fflushfloat()
before the file is closed or the program exits.
.DS
fflushfloat(iop)
	FILE *iop;
.DE
A macro exists to apply this to the standard output:
.DS
#define flushfloat() fflushfloat(stdout)
.DE
.B fflushfloat()
frees the buffer for that iop.
.PP
The usual constraints on regular UNIX files,
such as the limit on the number of simultaneously open files,
apply to the use of
.B getfloat() 
and 
.B putfloat().
.PP
Two vector routines,
.B fgetfbuf()
and
.B fputfbuf()
behave like their scalar counterparts.
.DS
fgetfbuf(fp, n, iop)
	float *fp;
	short n;
	FILE *iop;

fputfbuf(fp, n, iop)
	float *fp;
	short n;
	FILE *iop;
.DE
where fp is a pointer to an array of length n which is to be
copied to/from the file open on iop.  
The vector subroutines are considerably faster than the scalar ones.
.NH
SHORTSAM I/O
.PP
Isomorphic facilities exist for reading and writing shortsam streams.
Their behavior is identical to those above.
.DS
fgetshort(sp, iop)
	short *sp;
	FILE *iop;
.sp
fputshort(sp, iop)
	short *sp;
	FILE *iop;
.DE
.PP
.DS
#define getshort(fptr) fgetshort(fptr, stdin)
#define putshort(fptr) fputshort(fptr, stdout)
.DE
.PP
.DS
fflushshort(iop)
	FILE *iop;
.DE
.DS
#define flushshort() fflushshort(stdout)
.DE
.PP
.DS
fgetsbuf(sp, n, iop)
	short *sp;
	short n;
	FILE *iop;

fputsbuf(sp, n, iop)
	short *sp;
	short n;
	FILE *iop;
.DE
.NH
MISCELLANEOUS I/O ROUTINES
.PP
The following routine returns a floating point number regardless of
the input format of the data stream.
.DS
fgetsample(xp, iop)
	float *xp;
	FILE *iop;

#define getsample(fptr) fgetsample(fptr, stdin)
.DE
Note that getsample() determines the data format by reading the
header.  If there is no header, it defaults to expecting floatsams.
.NH
PROCESS INTERCOMMUNICATION
.PP
All the above routines
have the option to communicate 
across processes via headers attached to the beginning of sample streams.
This facility is backward compatible with existing programs that use
.B fgetfloat() 
and 
.B fputfloat()*.
.FS
* N.B., old programs must be reloaded with the new versions of the
subroutines.
.FE
Routines exist to create a header, add elements to it, delete elements from
it, pretty-print it, write it on an open file, and read it from an open file.
If these actions are not specifically invoked, 
.B fgetfloat() 
knows how to identify headers, which it silently reads and saves away
for possible future reference.  This allows backward compatibility,
or innocent use.
.NH
AUTOMATIC HEADER COPYING
.PP
None of the sample output routines described above
writes a header automatically, with one exception.
If a header has been read on the
.B stdin
file descriptor, and if
.B fputfloat() 
or
.B fputshort()
or their vector relatives
is then called for the first time to write a floatsam on the 
.B stdout
file descriptor, the property list associated with
.B stdin
will be copied and written to
.B stdout.
This means that in the simple case where a program merely calls
.B getfloat()
and
.B putfloat()
(and friends)
without reference to headers (i.e., it is innocent of the notion)
then any header passed to that program will be copied through unchanged.
.PP
Simple mechanisms for explicitly
copying an input header to an output header are described below.
.PP
If the automatic copy of a header from 
.B stdin 
to 
.B stdout
is not desired, it can be disabled;
see below.
.NH
HEADERS
.PP
A header consists of a stream of ASCII character codes.  A sentinel at the
beginning of a floatsam stream (consisting of the character string "HEAD")
signals the beginning of a header.  Header elements are called
.I properties
of the floatsam stream, and they 
consist of pairs of
NULL-terminated (ASCII 0) strings.
The first of a pair is taken as a
.I name
and the second as a 
.I value.
This is fashioned after a similar usage in LISP with regard to variables.
These name/value property pairs can be any strings, including blank-separated
words, etc., so long as they are 
NULL-terminated.  Even the sentinel is a name/value pair, where the name is
"HEAD" and the value is not fixed, but can be any string.  A suggested
convention is that the value part of the sentinel pair should be 
a revision level number for the header format, starting with this
initial release as 1.0.
.PP
So a floatsam stream header consists of property lists, and looks as
follows:
.DS
HEAD revision_level
name1 value1
name2 value2
.
.
.
nameN valueN
TAIL revision_level
.DE
where one NULL exists at the end of each string.
After the final NULL of the last string
of the header, enough additional NULLs must be appended to
align the stream so that the next sample is read or written correctly,
that is, to a multiple of
.B sizeof(float)
or
.B sizeof(short),
as appropriate.
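.PP
The layout and padding rule can be sketched as follows.
This is an illustration under stated assumptions, not the PROCOM code:
header_pad(), put_string() and build_header() are hypothetical names,
the header carries a single property, and alignment is taken to be
measured from the first byte of the header.
.DS
```c
#include <stddef.h>
#include <string.h>

/* Zero bytes needed after len bytes to reach a multiple of sample_size. */
size_t header_pad(size_t len, size_t sample_size)
{
	size_t rem = len % sample_size;

	return rem == 0 ? 0 : sample_size - rem;
}

/* Append a string and its terminating zero byte; returns the new length. */
static size_t put_string(char *buf, size_t len, const char *s)
{
	size_t n = strlen(s) + 1;

	memcpy(buf + len, s, n);
	return len + n;
}

/*
 * Lay out a one-property header in buf (the caller supplies enough
 * room).  Returns the total length, padding included.
 */
size_t build_header(char *buf, const char *rev, const char *name,
    const char *value, size_t sample_size)
{
	size_t len = 0, pad;

	len = put_string(buf, len, "HEAD");
	len = put_string(buf, len, rev);
	len = put_string(buf, len, name);
	len = put_string(buf, len, value);
	len = put_string(buf, len, "TAIL");
	len = put_string(buf, len, rev);
	pad = header_pad(len, sample_size);
	memset(buf + len, 0, pad);
	return len + pad;
}
```
.DE
.PP
With revision 1.0 and the single property SRATE 16384, the strings
occupy 30 bytes, so two padding bytes are needed for a 4-byte sample
size and none for a 2-byte sample size.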
.PP
The routines described below implement this scheme, and manage correct
header alignment, etc.
.NH
VOCABULARY NAMES FOR STANDARD PROPERTIES
.PP
A vocabulary for standard properties exists, which must be promulgated
wherever needed.  The list is
in flux as of this writing, and suggestions are welcome.  For starters, the
following is recommended.
.PP
Only truly general properties should be in the system vocabulary list,
kept in an include file, e.g.,  /usr/include/carl/defaults.h.
They are things like sampling rate, number of channels,
data length, sample data format, blocking factor, etc.  They should
be all capitalized to distinguish them from local vocabulary that might be
invented between a subset of cooperating processes which wish to
communicate special information, which should be lower-case, or mixed case.
Such special vocabulary terms should possibly include a key word or letter to
indicate the special vocabulary to which it belongs.
.PP
So far, the following global vocabulary names have been defined.
.DS
.ta 2i
/* PROCOM global header name vocabulary */
# define H_HEAD	"HEAD"
# define H_TAIL	"TAIL"
# define H_SRATE	"SRATE"
# define H_NCHANS	"NCHANS"
# define H_FORMAT	"FORMAT"
# define H_SAMPLES	"SAMPLES"
# define H_REMARK	"REMARK"
# define H_XMAX	"XMAX"
# define H_XMIN	"XMIN"
# define H_YMAX	"YMAX"
# define H_YMIN	"YMIN"
# define H_WINDOWSIZE	"WINDOWSIZE"
# define H_FILENAME	"FILENAME"
.DE
Values to be associated with these names follow:
.DS
/* PROCOM global header value vocabulary */
# define H_FLOATSAM	"FLOATSAM"
# define H_SHORTSAM	"SHORTSAM"
# define H_DFORMAT	H_FLOATSAM
# define H_DNCHANS	"1"
.DE
At CARL, the sampling rate values are defined as:
.DS
# define DHISR	"49152"	/* string fast sampling rate */
# define DLOSR	"16384"	/* string slow sampling rate */
# define DEFSR	DLOSR	/* string default sampling rate */
.DE
.NH
DATA FORMATS FOR MANAGING HEADERS AND BUFFERS
.PP
.RT
The term
.I header
refers to a stream of characters on a floatsam file.
A
.I property 
list
is the digested version of a header which has been read in and removed from 
a floatsam file.
The following definition is used to refer to properties stored on a
property list.
.DS
struct proplist {
	char 	*propname, 
		*propval;
	struct proplist 
		*nextprop, 
		*lastprop;
};

typedef struct proplist PROP;
.DE
.B propname
is the NULL-terminated string name of the property and
.B propval
is the same for the value.  Note the typedef for PROP.
.PP
There is an array of structures that manage the headers and the
buffers for files.  The correct index into this
array is derived from the UNIX file descriptor of the stream passed to the routines
that manage it.
The declaration for an element of this array is:
.DS
struct fltbuf {
	float	*fbuf;		/* sample buffer */
	int	bufsiz;		/* buffer size */
	int	pos;		/* sample index */
	int	cpos;		/* char index for header */
	int	n;		/* count from last read/write call */
	char	prop;		/* property list read/written */
	struct	proplist *p;	/* bi-directional linked list of properties */
};
.DE
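.PP
The indexing scheme just described might look like the sketch below.
The table size, the trimmed record and the names sketch_buf,
sketch_table and sketch_slot() are all assumptions made for the
example; only the idea of indexing by file descriptor comes from the
text above.
.DS
```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for fileno() */
#endif
#include <stdio.h>
#include <stddef.h>

#define SKETCH_NFILES 20	/* hypothetical table size */

/* Trimmed, hypothetical version of the per-file record. */
struct sketch_buf {
	int	pos;		/* sample index */
	char	prop;		/* nonzero once a header was seen */
};

static struct sketch_buf sketch_table[SKETCH_NFILES];

/* Map a stdio stream to its table slot, or NULL if out of range. */
struct sketch_buf *sketch_slot(FILE *iop)
{
	int fd = fileno(iop);

	if (fd < 0 || fd >= SKETCH_NFILES)
		return NULL;
	return &sketch_table[fd];
}
```
.DE
.PP
Because the slot depends only on the descriptor, every routine handed
the same stream reaches the same record without any searching.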
.NH
PROCEDURES FOR MANAGING PROPERTY LISTS AND HEADERS
.LP
.DS
char *
getprop(iop, name)
	FILE *iop; 
	char *name;
.DE
.B getprop() 
returns the 
.I value
of the property named
.B name,
from the property list associated with file descriptor 
.B iop
if both the file descriptor and the name exist.
The value returned is a pointer to the string on the header.
It is not a copy of the string.
It returns NULL on failure.
.B N.B.:
.B getheader()
must be called on the i/o stream before 
.B getprop()
can be called.
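.PP
The lookup that
.B getprop()
performs can be pictured with the following sketch.
It is not the library routine: sketch_lookup() is a hypothetical name,
and it takes the head of a property list directly instead of a FILE
pointer so that the example stands alone.
.DS
```c
#include <stddef.h>
#include <string.h>

struct proplist {
	char *propname, *propval;
	struct proplist *nextprop, *lastprop;
};
typedef struct proplist PROP;

/*
 * Hypothetical lookup in the spirit of getprop().  Returns a
 * pointer to the stored value (not a copy), or NULL if no
 * property on the list has the given name.
 */
char *sketch_lookup(PROP *head, const char *name)
{
	PROP *p;

	for (p = head; p != NULL; p = p->nextprop)
		if (strcmp(p->propname, name) == 0)
			return p->propval;
	return NULL;
}
```
.DE
.PP
Since the result points into the list itself, a caller that wants to
keep or alter the string should copy it first.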
.LP
.DS
putprop(iop, name, value)
	FILE *iop; 
	char *name, *value;
.DE
.B putprop()
creates a property structure and
puts it at the absolute end of the property list for the file named in
.B iop.
It returns 0 if it succeeded, -1 if it did not.
N.B. it does nothing special about keeping the property inside of
the HEAD and TAIL sentinel properties, or creating HEAD and TAIL
properties for a list if they do not exist.  
Its use is not recommended
unless you are building a totally custom property list.  Use
.B addprop() 
instead.
.LP
.DS
addprop(iop, name, value)
	FILE *iop; 
	char *name, *value;
.DE
.B addprop()
creates a property structure and inserts it just before the TAIL
sentinel property, inventing the HEAD and TAIL sentinel properties
if they do not yet exist.  This is the method of choice for adding
properties to a list.
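.PP
A minimal sketch of the insertion performed by
.B addprop()
follows.
It is an illustration, not the PROCOM source: sketch_addprop() and its
helpers are hypothetical, the list head is passed directly instead of
a FILE pointer, an existing list is assumed to begin with HEAD and end
with TAIL, and the revision string 1.0 given to invented sentinels is
an assumption.
.DS
```c
#include <stdlib.h>
#include <string.h>

struct proplist {
	char *propname, *propval;
	struct proplist *nextprop, *lastprop;
};
typedef struct proplist PROP;

/* Release one unlinked property and its strings; NULL is allowed. */
static void sketch_free(PROP *p)
{
	if (p != NULL) {
		free(p->propname);
		free(p->propval);
		free(p);
	}
}

/* Duplicate a string with malloc(); returns NULL on failure. */
static char *sketch_dup(const char *s)
{
	char *d = malloc(strlen(s) + 1);

	if (d != NULL)
		strcpy(d, s);
	return d;
}

/* Allocate an unlinked property holding copies of name and value. */
static PROP *sketch_new(const char *name, const char *value)
{
	PROP *p = calloc(1, sizeof(PROP));

	if (p == NULL)
		return NULL;
	p->propname = sketch_dup(name);
	p->propval = sketch_dup(value);
	if (p->propname == NULL || p->propval == NULL) {
		sketch_free(p);
		return NULL;
	}
	return p;
}

/* Link a new property just before TAIL; 0 on success, -1 on failure. */
int sketch_addprop(PROP **head, const char *name, const char *value)
{
	PROP *tail, *p;

	if (*head == NULL) {	/* invent the HEAD and TAIL sentinels */
		PROP *h = sketch_new("HEAD", "1.0");
		PROP *t = sketch_new("TAIL", "1.0");

		if (h == NULL || t == NULL) {
			sketch_free(h);
			sketch_free(t);
			return -1;
		}
		h->nextprop = t;
		t->lastprop = h;
		*head = h;
	}
	for (tail = *head; tail->nextprop != NULL; tail = tail->nextprop)
		;
	if (tail->lastprop == NULL)
		return -1;	/* malformed: no HEAD before TAIL */
	if ((p = sketch_new(name, value)) == NULL)
		return -1;
	p->nextprop = tail;
	p->lastprop = tail->lastprop;
	tail->lastprop->nextprop = p;
	tail->lastprop = p;
	return 0;
}
```
.DE
.PP
Successive calls therefore keep properties in the order they were
added, always bracketed by the two sentinels.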
.LP
.DS
rmprop(iop, n)
	FILE *iop; 
	char *n;
.DE
.B rmprop()
removes the named property from the property list of the 
.B iop.
It swallows the link, freeing all memory.  If the removed property
is the HEAD, then the fltbuf structure pointing to this list of
properties is updated to point to the new head of the list.
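.PP
The unlinking step can be sketched as shown below.
Again this is hypothetical code, not the library routine:
sketch_rmprop() works on a list head passed by address, assumes every
node and both of its strings came from malloc(), and its return value
is an assumption, since none is documented above.
.DS
```c
#include <stdlib.h>
#include <string.h>

struct proplist {
	char *propname, *propval;
	struct proplist *nextprop, *lastprop;
};
typedef struct proplist PROP;

/*
 * Hypothetical rmprop() on a list head.  Unlinks the first
 * property with the given name and frees it.  If the first node
 * is the one removed, *head moves to its successor.
 * Returns 0 if a property was removed, -1 if none matched.
 */
int sketch_rmprop(PROP **head, const char *name)
{
	PROP *p;

	for (p = *head; p != NULL; p = p->nextprop)
		if (strcmp(p->propname, name) == 0)
			break;
	if (p == NULL)
		return -1;
	if (p->lastprop != NULL)
		p->lastprop->nextprop = p->nextprop;
	else
		*head = p->nextprop;
	if (p->nextprop != NULL)
		p->nextprop->lastprop = p->lastprop;
	free(p->propname);
	free(p->propval);
	free(p);
	return 0;
}
```
.DE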
.LP
.DS
printprop(proplist, outp)
	PROP *proplist;
	FILE *outp;
.DE
.B printprop()
prints the property list in
.B proplist
on the stream named in
.B outp.
This routine is used to format a property list for viewing, 
or store it in a file separate from
its associated data stream.  
N.B.: It is not to be used to put a header
on an output file, since it does not align the buffer for the following
floatsam data.
.LP
.DS
putheader(iop)
	FILE *iop;
.DE
.B putheader()
writes the property list (created previously by calls to 
.B putprop())
on the file.
.LP
.DS
PROP *
getheader(iop)
	FILE *iop;
.DE
.B getheader()
looks to see if a header has already been read from this file pointer.
If so, it returns the address of the 
base of the bi-directional linked list of properties.  
If no header exists for this file yet,
it inspects the buffer of data for this file (fetching a new buffer
if necessary) for
the presence of a header.  If it finds one, it is read and parsed.
It returns NULL if there is an i/o error in reading the
header or if there was no header.  (Examine 
.B errno
for the presence of i/o errors.)
The pointer returned can be handed to, e.g., 
.B printprop().
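.PP
The kind of scan that header detection involves can be sketched as
follows.
This is a simplified illustration and not the code of
.B getheader():
str_at() and sketch_data_offset() are hypothetical names, the whole
header is assumed to be present in one buffer, the sample size is
passed in rather than taken from the FORMAT property, and no property
list is built.
.DS
```c
#include <stddef.h>
#include <string.h>

/* Length of the string at buf[pos], or -1 if it is unterminated. */
static long str_at(const char *buf, size_t pos, size_t len)
{
	const char *end;

	if (pos >= len)
		return -1;
	end = memchr(buf + pos, 0, len - pos);
	return end == NULL ? -1 : (long)(end - (buf + pos));
}

/*
 * Returns the byte offset of the first sample in buf: 0 when the
 * buffer does not begin with a HEAD property (no header), the
 * padded header length when a complete header is present, or -1
 * when a header begins but is truncated or lacks a TAIL pair.
 */
long sketch_data_offset(const char *buf, size_t len, size_t sample_size)
{
	size_t pos = 0;
	int done = 0;
	long n;

	n = str_at(buf, 0, len);
	if (n != 4 || memcmp(buf, "HEAD", 4) != 0)
		return 0;
	while (!done) {
		if ((n = str_at(buf, pos, len)) < 0)	/* name */
			return -1;
		done = (strcmp(buf + pos, "TAIL") == 0);
		pos += (size_t)n + 1;
		if ((n = str_at(buf, pos, len)) < 0)	/* value */
			return -1;
		pos += (size_t)n + 1;
	}
	if (pos % sample_size != 0)	/* skip the alignment padding */
		pos += sample_size - pos % sample_size;
	return pos <= len ? (long)pos : -1;
}
```
.DE
.PP
A stream without a header yields offset 0, which is what lets programs
that know nothing of headers keep working on plain sample data.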
.LP
.DS
cpioheader(ip, op)
	FILE *ip;
	FILE *op;
.DE
.B cpioheader()
is for use in programs that wish simply to copy a header unmodified from
input to output file.  An input header is looked for, and if found,
is copied and written to the named output file.
Note, this function is invoked automatically when stdin and
stdout are read and written, unless disabled.
.LP
.DS
cpoheader(pl, op)
	PROP *pl;
	FILE *op;
.DE
.B cpoheader()
takes a pointer to a property list,
.B pl,
and copies it to the named output file, but
.B doesn't
.B write
.B it.
A subsequent call to
.B putheader()
is required to actually write the header.
.LP
.DS
noautocp()
.DE
.B noautocp()
controls automatic copying of a header on 
.B stdin
to
.B stdout.
Ordinarily, when a header has been read on
.B stdin,
it is copied automatically to
.B stdout
at the first call to
.B fputfloat()
(or one of its relatives) with iop == stdout.
This can be disabled by calling the routine
.B noautocp().
This routine is a toggle.  Calling it once disables the copy,
calling it again re-enables it.
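.PP
The toggle behavior can be pictured with a short sketch.
The names sketch_autocp, sketch_noautocp() and sketch_disable_autocp()
are hypothetical, and the returned state is an assumption added for
the example; no return value is documented for the real routine.
.DS
```c
/* Nonzero while automatic stdin-to-stdout header copying is on. */
static int sketch_autocp = 1;

/* Hypothetical toggle in the spirit of noautocp(); returns new state. */
int sketch_noautocp(void)
{
	sketch_autocp = !sketch_autocp;
	return sketch_autocp;
}

/* Hypothetical helper: turn copying off whatever the current state. */
void sketch_disable_autocp(void)
{
	if (sketch_autocp)
		(void)sketch_noautocp();
}
```
.DE
.PP
Because the routine toggles rather than sets, code that may run more
than once should take care not to call it an even number of times by
accident.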
.LP
.DS
stdheader(iop, name, srate, nchans, format)
	FILE *iop;
	char *name, *srate, *nchans, *format;
.DE
.B stdheader()
creates a standard header.  The value arguments should be strings
derived from the standard property list.  The header is not written.
A subsequent call to 
.B putheader()
or any sample output routine will do so.
.NH
LOW LEVEL FUNCTIONS
.PP
The following are wizard-level functions for mucking about with property
lists.
.DS
PROP *
getpaddr(iop, name)
	FILE *iop;
	char *name;

PROP *
getplist(iop)
	FILE *iop;

putplist(prop, iop)
	PROP *prop;
	FILE *iop;
.DE
.B getpaddr()
returns the address of the named property on the list for 
.B iop.
.B getplist()
returns the address of the head of the property list for iop.
It is equivalent to saying
.B getpaddr(iop, H_HEAD).
.B putplist()
takes the pointer to the property list in
.B prop
and writes it into the
.B p
subfield of the 
.B fltbuf
structure for the file associated with
.B iop.
If the pointer written is not NULL, and no header has yet been
read or written for this file, the header created by this call to 
.B putplist()
is set up to be the header read with
.B getprop()
or
.B getheader()
or written with
.B putprop()
or
.B putheader().
.NH
IMPLEMENTATION
.PP
The implementation has met its specification in all respects.  
In particular, the performance of the new
.B fputfloat()
and
.B fgetfloat()
is about 11% faster than the old versions.
.B fgetfbuf()
and
.B fputfbuf()
are in turn about 5 times faster than the new versions of
.B fputfloat()
and
.B fgetfloat().
.PP
Implementation of the header management routines has not been optimized for
speed, but comparatively little data is dealt with in headers and
this is not felt to be much of a problem.  On the other hand, the
sample data i/o portions have been quite optimized.
.NH
SCREW CASES
.PP
The transition from reading the header to reading the data is a bit
tricky because of the different sizes of char, short and
float.  The header reading routines attempt to align the data stream
after reading the header by examining the H_FORMAT property.  
In the absence of a header, the data type is assumed to be float.
Note that there is a problem where one calls 
.B getheader()
explicitly, and there is no header, and the following data is shorts.
The problem is that 
.B getheader()
does not know how to align for subsequent i/o if it does not find
a header, so it then relies on the default alignment.
If you want the default alignment to be shorts, call the routine
.DS
set_sample_size(size)
	short size;
.DE
which sets the appropriate default sample size in bytes.
If this routine is called before calling
.B getheader()
then the default alignment will be whatever size you set.
.PP
Another problem is where a process wants to be able to do i/o by
other means than the facilities provided, as with write(2).
For sample input, there is no simple
alternative to using PROCOM mechanisms whenever
the data being read might have a header.
For data output however, the following will work:  build a header,
as with 
.B stdheader(),
write it with
.B putheader(),
then call
.B flushfloat()
or
.B flushshort().
The flush will write the header on the pipe, and deallocate the header
buffer.  Subsequent calls to write(2), or any other output call, will correctly
append data after the header.
.NH
ACKNOWLEDGEMENTS
.PP
I would like to thank Steve Haflich at MIT-EMS and Adrian Fried
at IRCAM for their valuable advice and suggestions during the
design phase of this software.
