README for src/contrib/atrade
Version: @(#)README     1.4 1/9/97

Architecture Trade Capability
Copyright (c) 1995-1997 Sanders, a Lockheed Martin Company

Acknowledgement

This work was performed by Sanders, a Lockheed Martin Company, as part
of the Sanders RASSP program under contract N00014-93-C-2172 to the
Naval Research Laboratory, 4555 Overlook Avenue, SW, Washington, DC
20375-5326. The Sponsoring Agency is: Advanced Research Projects
Agency, Electronic System Technology Office, 3701 North Fairfax
Drive, Arlington, VA 22203-1714. The Sanders RASSP team consists of
Sanders, Motorola, Hughes, and ISX.

Permission is hereby granted, without written agreement and without
license or royalty fees, to use, copy, modify, and distribute this
software and its documentation for any purpose, provided that the
above copyright notice and the above acknowledgement and following two
paragraphs appear in all copies of this software.

IN NO EVENT SHALL SANDERS OR THE UNIVERSITY OF CALIFORNIA BE LIABLE TO
ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL
DAMAGES ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION,
EVEN IF SANDERS OR THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED OF
THE POSSIBILITY OF SUCH DAMAGE.

SANDERS AND THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIM ANY
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE
PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND SANDERS AND THE
UNIVERSITY OF CALIFORNIA HAVE NO OBLIGATION TO PROVIDE MAINTENANCE,
SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.

1.0  Introduction

At Sanders, we have developed a proof of concept architectural trade
capability using Ptolemy's Discrete Event domain.  It is meant as a
first cut at a usable capability, and is provided for demonstration
purposes only.  There are many simplifications made in order to get
the tool working for a few example cases.  It is being provided to
members of the Ptolemy community interested in this type of
architectural trade capability work.

A custom graphical front-end to Ptolemy has been developed that allows
a user to sketch a target architecture in one window and quickly map
the stars in an SDF graph in another window to the processors in the
architecture.  Extensions to the DE domain have been implemented to
allow a performance-level model of the architecture to be
simulated. These extensions create a DE domain model representing the
mapping of the algorithm to the architecture and use the Ptolemy
kernel to simulate the performance. The product of the simulation is a
Gantt chart showing the execution of stars over time as well as a
thermometer display of estimates of certain other system level metrics
(weight, size, power, reliability, etc.).  This capability has been
developed as a front-end architectural trade tool for the Sanders
RASSP Program (see above acknowledgement).

2.0  Composition of Architecture Trade Capability

This capability (denoted atrade) consists of extensions to the DE and
SDF domains in the form of new particles and stars.  A custom GUI,
called pigi+, provides the user interface to the capability and works
with the Ptolemy kernel via the ptcl interface.  Since we are using
only the SDF and DE domains and we are using pipes to interface with
Ptolemy, we have chosen to build a custom executable called
pipeptcl.ptiny, based on ptcl.ptiny.  Some minor modifications were
made to the ptcl interface in order to accommodate communication
between ptcl and pigi+ via standard pipe mechanisms, and these are
used in building pipeptcl.ptiny.

Because pigi+ has been built using commercial libraries (RogueWave and
Motif), only a binary which runs under SunOS and Solaris is included
in the distribution.  However, source code for the interface to
pipeptcl.ptiny has been provided.


3.0  Installing atrade

-- The atrade tar file should be installed so that this README file is
        at $PTOLEMY/src/contrib/atrade/README
        
-- Update $PTOLEMY/src/kernel/PortHole.cc with the atrade/kernel/PortHole.cc
        file and rebuild the kernel.  The makefile rule 'updatePtolemy'
        will do this for you; run:
                cd $PTOLEMY/src/contrib/atrade
                make updatePtolemy

-- Create an obj.$PTARCH/contrib/atrade tree for your binaries by running
        MAKEARCH:
                $PTOLEMY/src/contrib/atrade/MAKEARCH

-- Build and install the atrade de libraries and pipeptcl:
                cd $PTOLEMY/obj.$PTARCH/contrib/atrade
		make install

-- cd back to the atrade/gui/bin directory, set an environment variable,
                then start up the pigi+ program
                cd $PTOLEMY/src/contrib/atrade/gui/bin
                setenv PIGI+_HOME `pwd`
        
        pigi+ searches the path for the pipeptcl.ptiny binary to run.

4.0  Getting started

You then invoke pigi+ from the command line

	pigi+

and you will get a small "Architecture Trade" GUI with three buttons.
If you focus your mouse over each button, after a second or so a
banner identifying the button is shown.  From left to right, the
buttons do the following:

Algorithm Schematic:  contains the SDF graph that is to be mapped;
used to draw, save, or load SDF graphs

Architecture Schematic:  contains the architecture that is to be used
for the mapping; used to draw, save, or load architectures

Map Algorithm to Architecture:  brings up dialog box which is used to
specify the system specification and parts data files, and to simulate
the specified mapping

Typically, you first create the algorithm.  This may be best done by
first drawing the graph in pigi and then redrawing it within the
"Algorithm Schematic" window of pigi+.  Unfortunately, there is no way
of importing/exporting graphs between pigi and pigi+.  Next, an
architecture is created using the "Architecture Schematic".  Finally,
algorithmic blocks are grouped and mapped onto the architecture, and
the performance simulation is created and executed, and the results
are displayed via a Gantt chart and thermometer display.  The next
sections describe these steps in more detail.

4.1  Algorithm Schematic

This window is used to draw the SDF graphs which will later be mapped
onto the target architectures.  Algorithms are drawn using a menu and
a set of seven buttons at the top of the Algorithm Schematic window.
Hierarchical graphs are not supported by pigi+.

4.1.1  Menu

Under "File", the user may open a previously saved graph, save the
current graph, or close the schematic window.

Under "Execution", the user may set the run length of the SDF graph by
selecting "Runtime Parameters" and entering a value in the dialog box.

4.1.2  Buttons

There are seven buttons used in creating SDF graphs.  A banner
identifying each button appears after the mouse focuses on the button
for a second or so.  From left to right, the buttons do the following:

Run Graph:  executes the SDF graph by creating and simulating the
universe via the ptcl interface; the run length is set via the dialog
box reached by selecting "Runtime Parameters" under "Execution" on the
menu bar

Star Palette: provides the palette of stars which are used to create
the graphs; you must reselect the SDF domain the first time the
palette is brought up for it to display the available stars; new stars
will appear in the upper left portion of the drawing area and should
be moved before the next star is created; the three buttons (galaxy,
galaxy input, galaxy output) are not currently implemented and should
not be used

Toggle Grid Display:  allows grid display to be toggled on and off

Toggle Connection Display:  allows connections on ports to be toggled
between being displayed or hidden

Pointer: changes the mode of the mouse to pointer so that the state
parameters of the stars may be displayed and edited (left button), and
the stars may be selected and dragged around (middle button) or
deleted (right button); once stars have been mapped, the first right
button click deletes the selection box and the second right button
click will delete the star

Wire Tool:  changes the mode of the mouse to draw connections (wires)
between the stars

Selection Tool:  changes the mode of the mouse to group one or more
stars for mapping purposes; the resultant selection box is always
rectangular and is drawn starting at the upper left to the lower right

4.2  Architecture Schematic

This window has the same menu and buttons as for the "Algorithm
Schematic" window.  The main difference here is that the DE domain
stars are used in creating architectures.  Thus, upon selection of the
Star Palette, the DE domain should be chosen.  The main stars of
interest include:  Processor, I860, SHARC, Raceway, VMEBus.  Another
difference is that the "Runtime Parameters" here defines the end time
of the DE simulation.

4.3  Algorithm to Architecture Mapping

In the "Algorithm Schematic" window, one or more functional blocks
must be grouped using the Selection mode.  The groups are numbered
(starting with zero) and increment as new groups are defined.  All
functional blocks should be a part of a group before the performance
model is run.  In the "Architecture Schematic" window, each processor
star (Processor, I860, or SHARC) must be selected individually using
the Selection mode.  Another set of numbers is assigned to these
selections, also starting at zero.  The functional blocks in group X
in the "Algorithm Schematic" are in effect mapped to the processor in
group X in the "Architecture Schematic".  Thus, the performance model
will simulate the execution of these functional blocks on that
processor using the defined cost functions (see section 4.4.3).  Once
the mapping is complete, the performance simulation is ready to run.
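
The index pairing described above can be pictured with a short
sketch.  This is Python written for illustration only and is not code
from pigi+; the function names and the data layout are assumptions,
and the star and processor names in the usage note are placeholders.

```python
def map_groups(algorithm_groups, processor_groups):
    """Pair algorithm group X with processor group X.

    algorithm_groups: list of lists of star names, indexed by group number.
    processor_groups: list of processor names, indexed by group number.
    Returns a dict from star name to processor name.
    """
    # Assumption for this sketch: a group with no matching processor
    # selection is treated as an error.
    if len(algorithm_groups) > len(processor_groups):
        raise ValueError("more algorithm groups than selected processors")
    mapping = {}
    for number, stars in enumerate(algorithm_groups):
        for star in stars:
            mapping[star] = processor_groups[number]
    return mapping


def unmapped_stars(all_stars, mapping):
    """Return the stars that belong to no group.

    All functional blocks should be in a group before the performance
    model is run, so this list should be empty.
    """
    return [star for star in all_stars if star not in mapping]
```

For example, map_groups([["Norm1"], ["Orth1", "Orth2"]],
["SHARC_0", "SHARC_1"]) places Norm1 on SHARC_0 and both
Orthogonalize stars on SHARC_1.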

4.4  Performance Simulation (Map Algorithm to Architecture)

Selecting "Map Algorithm to Architecture" from the main GUI brings up
a dialog box which is used to specify the system specification and
parts data files, and to simulate the specified mapping.  Default
files have been provided.  Details on the format of
these files are provided in section 4.4.4.  By selecting the "Map"
button, pigi+ invokes Ptolemy via pipeptcl.ptiny (using the ptcl
interface), creates the DE domain performance model of the
architecture, algorithm, and mapping, and then executes the model.
Results are stored to a temporary file in /usr/tmp/gantt.log as the
model executes.  You may need to ensure that write permissions allow
you to write to this area.  At the end of the simulation, pigi+
provides two windows: a Gantt chart and a system thermometer display.

4.4.1  Gantt chart

The Gantt chart displays the activity on each bus and processor as a
function of time.  The Gantt chart window can be resized using the
standard window resizing, and the four direction buttons can be used
to expand or contract the Gantt display within the window.  The left
and middle buttons can each be used to measure time, and "snap" to the
nearest event on the row where the mouse is focused.  The times t1 and
t2 are displayed in red at the bottom of the window for the markers
corresponding to the left and middle buttons, and the difference
between the two markers is also displayed.  The right button is used
to toggle a 100 us grid on and off.  For the busses, blue denotes the
bus is busy, and the number indicates the destination of the current
bus traffic (processor number).  For the processors, yellow is used to
denote the reception or transmission of data and a light green-blue is
used to indicate that processing is taking place.  The name of the
function being processed (suffixed by a unique numeric identifier) is
displayed on each block.
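
The "snap" behavior of the two time markers can be sketched as
follows.  This is an illustrative Python sketch, not the pigi+
implementation; the function names, the tie-breaking rule, and the
sign convention of the difference (t2 - t1) are assumptions.

```python
from bisect import bisect_left


def snap_to_event(event_times, clicked_time):
    """Return the event time on one row that is closest to clicked_time.

    event_times must be non-empty and sorted in ascending order.
    """
    if not event_times:
        raise ValueError("row has no events")
    i = bisect_left(event_times, clicked_time)
    if i == 0:
        return event_times[0]
    if i == len(event_times):
        return event_times[-1]
    before, after = event_times[i - 1], event_times[i]
    # Ties go to the earlier event (an arbitrary choice for this sketch).
    return before if clicked_time - before <= after - clicked_time else after


def marker_difference(event_times, left_click, middle_click):
    """Snap both markers and return (t1, t2, t2 - t1)."""
    t1 = snap_to_event(event_times, left_click)
    t2 = snap_to_event(event_times, middle_click)
    return t1, t2, t2 - t1
```

With events at 0, 120, 310, and 450 on a row, clicks near 130 and 400
snap to 120 and 450, giving a difference of 330.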

4.4.2  Thermometer Display

In selecting an architecture and a mapping, performance is usually
very important.  However, certain other system metrics must often be
considered as well. As a result, in addition to the Gantt chart
provided after each simulation run, the architectural trade tool also
gives some feedback on the following system metrics via a thermometer
display: function, environment, interfaces, schedule, cost, processor,
interconnect, software, size, weight, power, reliability, testability,
maintainability, fault tolerance, scalability, and standards. These
system metrics are estimated using simple models with stored
manufacturer specifications, historical data, and certain information
from the mapping and performance simulation.

The user provides system specifications for each of these metrics, in
terms of minimum, nominal, and maximum values.  In addition, the user
also specifies the relative importance of the metrics to each other
using a numeric weighting.  The estimated system metrics are graphed
against the given system specifications using a thermometer bar graph
display. The height of the individual thermometer bar denotes the
relative importance assigned to each metric.  All thermometers are
normalized so that the center matches the nominal specified value for
each metric.  A thermometer bar filled in to the left of center
indicates that the system metric does not meet the nominal value while
a bar reaching to the right of center shows that the nominal value has
been satisfied. Some of the metrics are displayed using a reverse
scale so that the thermometers are consistent in showing shortfalls--a
value to the left of center always indicates a shortfall, regardless
of whether the actual numeric value is lower or higher than the
nominal value (e.g. power versus reliability). The display has the
option of showing the minimum and maximum specification values for the
metrics as a range around the nominal value.  Because the reported
metric values are estimates, a measure of the relative accuracy of the
calculations can also be shown as a range around the reported
metric. The purpose of these estimates is to provide a first level
measure of system metrics to aid in selecting an architecture instead
of concentrating entirely upon the performance results.
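
One way to picture the normalization and the reverse scale is the
Python sketch below.  It is not taken from pigi+: the piecewise-linear
scaling between the minimum, nominal, and maximum values, the clipping
to the range -1 to 1, and the function and parameter names are all
assumptions made for illustration.  Only the conventions that the
center is the nominal value and that left of center means a shortfall
come from the description above.

```python
def thermometer_position(estimate, minimum, nominal, maximum,
                         lower_is_better=False):
    """Return a bar position in [-1.0, 1.0]; 0.0 is the nominal value.

    Negative values (left of center) mean the estimate falls short of
    the nominal specification.  Set lower_is_better for metrics such as
    power, where a smaller number is the better outcome.
    """
    if not minimum < nominal < maximum:
        raise ValueError("expected minimum < nominal < maximum")
    # Assumed scaling: linear on each side of the nominal value.
    if estimate >= nominal:
        offset = (estimate - nominal) / (maximum - nominal)
    else:
        offset = (estimate - nominal) / (nominal - minimum)
    if lower_is_better:
        # Reverse scale so that a shortfall is always left of center.
        offset = -offset
    return max(-1.0, min(1.0, offset))
```

With a specification of minimum 5, nominal 10, and maximum 20, an
estimate of 15 lands halfway to the right for a metric where higher is
better, and halfway to the left when lower_is_better is set.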

4.4.3  Cost Functions

Cost functions are used to model the computational and memory costs of
executing a functional block.  These functions include a constant
overhead term and a variable term.  The variable term in the cost
function allows the computational and memory costs to be expressed in
terms of the state variables of the functional star they represent.
Typically, the variable term is used to scale the costs as a function
of the amount of data being processed.  In the cost files, the letters
w, x, y, z, are used as follows:

	w : constant overhead computational cost in processor cycles
	x : variable computational cost in processor cycles as
	    a function of the state variables
	y : constant overhead memory cost in bytes
	z : variable memory cost in bytes as a function of the
	    state variables

The cost file has one line defining each of these, followed by four
lines, each with one of the letters.  pigi+ calls the Unix utility bc
in order to do these computations.  For the SDF star SDFOrthogonalize,
here is the corresponding Orthogonalize.cost file:

w = 30
x = 4 + (28*blockSize)
y = 20
z = 20
w
x
y
z

Currently, the memory cost functions are not used or implemented.
Only the cost functions for computations are implemented.
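
To see what a cost file evaluates to for a given state value, the
following Python sketch mimics the evaluation.  pigi+ itself calls bc;
this stand-in is an assumption-laden illustration that handles only
+, -, *, and parentheses (division is left out because bc's default
integer division differs from Python's).  The function names and the
blockSize value of 16 are hypothetical.

```python
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def _evaluate(node, variables):
    """Evaluate a small arithmetic expression tree."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name):
        return variables[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left, variables),
                                   _evaluate(node.right, variables))
    raise ValueError("unsupported expression")


def evaluate_cost_file(text, state):
    """Return the values printed by the bare-letter lines of a cost file.

    text:  contents of a .cost file.
    state: dict of star state variables, e.g. {"blockSize": 16}.
    """
    variables = dict(state)
    results = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if "=" in line:
            # Definition line such as "x = 4 + (28*blockSize)".
            name, expression = line.split("=", 1)
            tree = ast.parse(expression.strip(), mode="eval")
            variables[name.strip()] = _evaluate(tree.body, variables)
        else:
            # Bare letter: report the value defined earlier.
            results[line] = variables[line]
    return results


ORTHOGONALIZE_COST = """\
w = 30
x = 4 + (28*blockSize)
y = 20
z = 20
w
x
y
z
"""

costs = evaluate_cost_file(ORTHOGONALIZE_COST, {"blockSize": 16})
print(costs)  # {'w': 30, 'x': 452, 'y': 20, 'z': 20}
```

With blockSize set to 16, the variable computational cost is
4 + 28*16 = 452 cycles on top of the constant 30-cycle overhead.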

4.4.4  Specification and Parts Files

The specification file is used to specify the system specifications in
18 areas.  An example file is given at
$PTOLEMY/src/contrib/atrade/gui/bin/std.spec.  For each entry in this
file, there is the name of the specification, the type of
specification (intended use, performance, or supportability), the
units of the specification values, the relative system weighting, a
nominal specification value, a minimum specification value, and a
maximum specification value.

The parts file is used to specify the system specifications of the
parts used in the system.  An example file is given at
$PTOLEMY/src/contrib/atrade/gui/bin/std.parts.  A separate line is
used for each part.  The line starts with part name (terminated by a
semicolon), followed by one or more system specifications separated by
colons.  For each system specification, a minimum, nominal, and maximum
value is specified.
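
The parts line layout can be pictured with the Python sketch below.
The exact token layout inside each specification is not described
here, so the sketch assumes a hypothetical form of
"metric minimum nominal maximum" separated by whitespace; consult
std.parts for the real layout.  The function name, the part name, and
the numbers in the usage note are made up for illustration.

```python
def parse_parts_line(line):
    """Split one parts line into (part_name, specifications).

    Assumed layout (hypothetical):
        name; metric min nom max : metric min nom max
    Returns the part name and a dict mapping each metric name to a
    (minimum, nominal, maximum) tuple of floats.
    """
    name, _, rest = line.partition(";")
    specs = {}
    for field in rest.split(":"):
        tokens = field.split()
        if not tokens:
            continue
        if len(tokens) != 4:
            raise ValueError("expected 'metric min nom max': %r" % field)
        metric = tokens[0]
        minimum, nominal, maximum = (float(t) for t in tokens[1:])
        specs[metric] = (minimum, nominal, maximum)
    return name.strip(), specs
```

For instance, the made-up line
"PartA; power 1.0 2.0 3.0 : weight 0.5 1.0 1.5" yields the part name
PartA with two specifications, power and weight.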


5.0  Directory Description

$PTOLEMY/src/contrib/atrade
	contains this README file as well as a postscript version of
	the 1996 International Conference on Acoustics, Speech, and
	Signal Processing (ICASSP) 4-page paper describing this
	capability (arch_trade.ps) and a GIF file showing an example
	mapping (example.gif)

$PTOLEMY/src/contrib/atrade/gui/kernel
        contains a new version of $PTOLEMY/src/kernel/PortHole.cc.
        This version merely modifies the PortHole::print method by adding
        two lines:

            // added to visibility into PortHole objects via ptcl interface
            out << "(numTokens = " << numberTokens << ")";
        
$PTOLEMY/src/contrib/atrade/gui/src
        contains some key sources used to build pigi+ (our GUI),
        namely those that interface with ptcl

$PTOLEMY/src/contrib/atrade/gui/bin
        contains pigi+ executable, and some configuration files for
        pigi+; to run it, you must set the following environment
	variable to where it is installed, e.g.
        
        setenv PIGI+_HOME $PTOLEMY/src/contrib/atrade/gui/bin

$PTOLEMY/src/contrib/atrade/gui/bin/cost
        contains cost functions used by pigi+

$PTOLEMY/src/contrib/atrade/gui/bin/icons
        contains icons used by pigi+

$PTOLEMY/src/contrib/atrade/gui/bin/schematics/*
        contains architecture and SDF graphs used by pigi+

$PTOLEMY/src/contrib/atrade/de/kernel/*
        contains new particles (*.h and *.cc) for atrade (these
        changes can co-exist with the current DE domain kernel)

$PTOLEMY/src/contrib/atrade/de/stars
        contains new DE stars (*.pl) for atrade

$PTOLEMY/src/contrib/atrade/sdf/stars
	contains new SDF stars (*.pl) used in atrade examples

$PTOLEMY/src/contrib/atrade/pipeptcl
        contains files needed to build pipeptcl.ptiny, which is a
	version of ptcl.ptiny that can communicate over a pipe.  This
	directory uses libptcl, and differs from the vanilla ptcl in
	that pipeptclAppInit.cc is used instead of ptclAppInit.cc.

6.0  Troubleshooting

pigi+ is a prebuilt SunOS binary with certain paths hardcoded in.  It
should also run under Solaris.  You may need to set a few environment
variables.

        1) If you get a segmentation error upon startup:
                ptuser@watson 149% ./pigi+
                Warning: locale not supported by Xlib, locale set to C
                Warning: X locale modifiers not supported, using default
                Segmentation Fault
        
        pigi+ looks for /usr/lib/X11/nls
        At UC Berkeley, we had to do:
        setenv XNLSPATH /usr/sww/sunos-X11R5/lib/X11/nls

        2) After starting pigi+, nothing happens:
                ptuser@watson 161% ./pigi+

        Make sure that the pipeptcl.ptiny that is in your path is a
	link to pipeptcl.ptiny.  See the installation instructions.
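
Before starting pigi+, the two setup requirements mentioned in this
README (PIGI+_HOME is set, and pipeptcl.ptiny can be found on the
path) can be checked with a small script.  The Python sketch below is
only an illustration and is not part of the distribution; the function
name and the wording of the messages are assumptions.

```python
import os
import shutil


def check_pigi_environment(environ=None):
    """Return a list of setup problems; an empty list means none found."""
    environ = os.environ if environ is None else environ
    problems = []

    home = environ.get("PIGI+_HOME")
    if not home:
        problems.append("PIGI+_HOME is not set")
    elif not os.path.isdir(home):
        problems.append("PIGI+_HOME is not a directory: %s" % home)

    # pigi+ searches the path for the pipeptcl.ptiny binary.
    if shutil.which("pipeptcl.ptiny", path=environ.get("PATH", "")) is None:
        problems.append("pipeptcl.ptiny was not found on PATH")

    return problems


if __name__ == "__main__":
    for problem in check_pigi_environment():
        print("problem:", problem)
```

Running the script prints one line per problem found and prints
nothing when both checks pass.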

7.0  Example Run

Start up pigi+ and press the left button on the main dialog to bring up
the algorithm schematic window.  Under "File", choose "Open" and select
gramschmidt.drw.

Now press the middle button on the main dialog to bring up the
architecture schematic window.  Under "File", select "Open" and select
sharc_vme.drw.  Now choose "Execution" then "Run Length" and set this
parameter to 0.002 (this prevents the simulation from running for too
many iterations).

Back in the algorithm window, select the rightmost button to activate
the Selection Tool.  Now draw a separate box around each Normalization
block (blue box) and Orthogonalization block (green circle), for a
total of five red dashed boxes numbered 0 to 4.

In the architecture schematic window, activate the Selection Tool and
do the same thing for each of the Processors.  Now each functional
block (Normalization or Orthogonalization) is mapped to a processor.

Now press the right button on the main dialog to bring up the Mapping
Parameters dialog box.  The default files will work fine.  The
algorithm mapping onto the architecture can now be simulated by
pressing the Map button.  Within a few seconds, a thermometer display
and a Gantt chart will appear.

You may try reducing the number of processors in the architecture and
mapping more than one functional block to the processors.  Different
mappings will yield different results.