A UniProcessor keeps two target pointers: one for the multiprocessor target (mtarget) and the other for the processor target (targetPtr). Both are protected members.
MultiTarget* mtarget;
The pointer to the processor target can be obtained by a public method:
CGTarget* target();
The pointers to the multiprocessor target and to the ParProcessors class that this UniProcessor belongs to are set by the following method:
void setTarget(MultiTarget* t, ParProcessors* p);
void setMembers(ParNode* n, int dur);
void resetMembers();
These methods set the information for the associated node: the pointer to the node, its duration, and a flag telling whether it is an idle node or not. The first method sets the idle flag FALSE. The constructor also resets all internal information of the class.
ParNode* getNode();
This method returns the pointer to the node associated with this class; companion access methods return its duration and a flag that is TRUE if the entry represents an idle time slot.
NodeSchedule* nextLink();
This method returns the next link in the linked list; a companion method returns the previous link.
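Taken together, a NodeSchedule entry can be sketched as follows. This is an illustrative reconstruction, not the actual Ptolemy source: the member names other than those quoted above (myNode, idleFlag, and so on) are assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the real ParNode class.
class ParNode {};

// A minimal sketch of a NodeSchedule: one link of the doubly linked
// schedule list, holding the node, its duration, and the idle flag.
class NodeSchedule {
public:
    NodeSchedule() { resetMembers(); }

    // Set the node and its duration; the idle flag is set FALSE.
    void setMembers(ParNode* n, int dur) {
        myNode = n; myDuration = dur; idleFlag = false;
    }

    // Reset all internal information.
    void resetMembers() {
        myNode = nullptr; myDuration = 0; idleFlag = false;
        next = prev = nullptr;
    }

    ParNode* getNode() const { return myNode; }       // associated node
    int getDuration() const { return myDuration; }    // its duration
    bool isIdleTime() const { return idleFlag; }      // idle slot?

    NodeSchedule* nextLink() const { return next; }   // next in list
    NodeSchedule* previousLink() const { return prev; }

    NodeSchedule* next;
    NodeSchedule* prev;

private:
    ParNode* myNode;
    int myDuration;
    bool idleFlag;
};
```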
int availTime;
A protected member indicating the time when the processor becomes available. There are public methods to access this member, for example:
void setAvailTime(int t);
The protected member curSchedule points to the NodeSchedule appended last to the linked list. There are two public methods to access a NodeSchedule:
NodeSchedule* getCurSchedule();
NodeSchedule* getNodeSchedule(ParNode* n);
The first method just returns the curSchedule member while the second returns the NodeSchedule associated with the argument ParNode.
When a ParNode is runnable earlier than the available time of the processor, we want to check whether there is an idle slot before availTime that can fit the ParNode in the middle of the schedule:
int filledInIdleSlot(ParNode*, int start, int limit = 0);
The first two arguments are the pointer to the ParNode to be scheduled and the earliest time when the node can be scheduled. Without the third argument given explicitly, this method returns the earliest time at which the processor is available to schedule the node. If the third argument is given, the available time must be less than limit. If this method cannot find an idle slot to schedule the node, it returns -1; otherwise, it returns the possible scheduling time of the node.
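The idle-slot search can be sketched as follows. This is a simplified illustration, not the Ptolemy implementation: the schedule is flattened into a vector of slots, and the node's duration is passed explicitly rather than read from a ParNode.

```cpp
#include <cassert>
#include <vector>

// Hypothetical flat view of the schedule: each entry is a time slot
// that is either busy or idle.
struct Slot { bool idle; int start; int duration; };

// Sketch of the idle-slot search: return the earliest time >= earliest
// at which a node of length dur fits entirely inside an idle slot.
// With limit > 0, that time must also be less than limit.
// Returns -1 if no idle slot can accommodate the node.
int filledInIdleSlot(const std::vector<Slot>& sched,
                     int earliest, int dur, int limit = 0) {
    for (const Slot& s : sched) {
        if (!s.idle) continue;                            // skip busy slots
        int begin = s.start > earliest ? s.start : earliest;
        if (begin + dur > s.start + s.duration) continue; // does not fit
        if (limit > 0 && begin >= limit) continue;        // too late
        return begin;                                     // earliest fit
    }
    return -1;                                            // no idle slot found
}
```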
int schedInMiddle(ParNode* n, int when, int);
Schedules the node at time when inside an idle-time slot of the processor. The third argument indicates the duration of the node. This method returns the completion time of the schedule if scheduling succeeds. If it fails to find an idle-time slot at when to accommodate the node, it returns -1.
If a node is to be appended at the end of the schedule in a processor,
void appendNode(ParNode* n, int val);
does that blindly. To schedule a non-idle node, we have to use
int schedAtEnd(ParNode* n, int start, int leng);
If start is larger than the processor's available time, this method puts an idle time slot in the processor and calls appendNode. It also sets the schedule information of the node and increases the availTime of the processor.
void addNode(ParNode* node, int start);
This method is given a ParNode and its runnable time, and schedules the node either inside an idle time slot if possible, or at the end of the schedule list. The methods described above are used in this method.
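The overall dispatch in addNode can be sketched like this. It is a deliberately reduced model, not the actual class: the processor state is collapsed to availTime plus at most one idle slot, and the two helpers are simplified stand-ins for filledInIdleSlot/schedInMiddle and schedAtEnd.

```cpp
#include <cassert>

// Reduced processor model for illustrating the addNode decision.
struct Proc {
    int availTime = 0;   // time when the processor becomes free
    int idleStart = -1;  // one idle slot [idleStart, idleStart + idleLen)
    int idleLen = 0;

    // Start time inside the idle slot, or -1 if the node does not fit.
    int filledInIdleSlot(int earliest, int dur) {
        if (idleStart < 0) return -1;
        int begin = idleStart > earliest ? idleStart : earliest;
        if (begin + dur > idleStart + idleLen) return -1;
        return begin;
    }

    // Append at the end, padding with idle time if start > availTime.
    int schedAtEnd(int start, int dur) {
        if (start > availTime) { idleStart = availTime; idleLen = start - availTime; }
        availTime = start + dur;
        return availTime;                       // completion time
    }

    // addNode: fill an idle slot if possible, otherwise append at the end.
    int addNode(int start, int dur) {
        int t = filledInIdleSlot(start, dur);
        if (t >= 0) { idleStart = -1; idleLen = 0; return t + dur; }
        if (start < availTime) start = availTime;
        return schedAtEnd(start, dur);
    }
};
```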
void scheduleCommNode(ParNode* n, int start);
When we schedule the given communication node at time start, we also have to check whether the communication resources are available. For that purpose, we detect the time slot in this processor in which to schedule the node, and check whether the same time slot is available in the communication resources, using the scheduleComm method of the multiprocessor target. If we find a time slot available for both this processor and the communication resources, we schedule the communication node.
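The negotiation between the processor and the communication resources can be sketched as follows. This is a simplified model under stated assumptions: the communication resource is a single bus with explicit busy intervals, and all names are invented for the sketch.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Simplified communication resource: a single bus with busy intervals.
struct Bus {
    std::vector<std::pair<int,int>> busy;   // [start, end) intervals

    // Is the interval [start, start + dur) free on the bus?
    bool slotFree(int start, int dur) const {
        for (const auto& b : busy)
            if (start < b.second && b.first < start + dur) return false;
        return true;
    }
};

// Sketch of scheduleCommNode: starting from the earliest time the
// processor could take the node, advance until the same slot is also
// free on the communication resource, then commit it on both sides.
int scheduleCommNode(Bus& bus, int procAvail, int start, int dur) {
    int t = start > procAvail ? start : procAvail;
    while (!bus.slotFree(t, dur)) ++t;      // scan for a mutually free slot
    bus.busy.push_back({t, t + dur});       // reserve the bus
    return t + dur;                         // completion time
}
```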
int getStartTime();Returns the earliest time when the processor is busy.
All methods described in this sub-section are public.
void createSubGal();
This is the main public method for sub-universe creation. It first creates a galaxy data structure, subGal, a private member. Then it clones each star at least one of whose invocations is assigned to the processor, makes a linked list of all assigned invocations (nodes) of each star in order of increasing invocation number, and sets the pointer to the cloned star. For a wormhole, we create a CGWormStar instead of cloning the wormhole; a CGWormStar will replace the wormhole in the sub-universe. If an original star is not supported by the processor target (for example, with heterogeneous scheduling), we create a star with the same name as the original star in the target domain.
In the next step, we connect the cloned stars by referring to the original galaxy. If a star is at the wormhole boundary in the original graph, we connect the cloned star to the same event horizon; by doing so, the wormhole in the original graph is connected to the sub-universe. If the star at the wormhole boundary is scheduled on more than one processor (that is, not all invocations are assigned to the same processor), the wormhole in the original graph will be connected to the last created sub-universe.
If an arc connects two stars that have some invocations assigned to the same processor, we examine whether all invocations of the two stars are assigned to that processor. If they are, we just connect the cloned stars in the sub-universe. If they are not, only one of the two stars has a clone whose invocations are assigned to the current sub-universe. In this case, we create a send star (createSend method of the multiprocessor target) or a receive star (createReceive method of the target), based on whether the cloned star is the source or the destination of the connection. We create the communication star and set the communication-star pointer of the communication nodes in the APEG with the matchCommNodes method of the ParProcessors class. If the partner communication star was already created in another sub-universe, we pair the send and receive stars with the pairSendReceive method of the multiprocessor target.
The last case is when an arc connects two stars whose invocations are distributed over more than one processor. If no invocation of the destination star is assigned to this processor, we call
void makeBoundary(ParNode* src, PortHole* orgP);
The first argument points to the earliest invocation of the star assigned to this processor, and the second is the pointer to the output porthole in the original SDF graph that is the source of the connection. We examine all assigned invocations to check whether the destination invocations are assigned to the same processor. If they are, we create one send star and connect it to the cloned star. Otherwise, we create a Spread star to distribute the output samples to multiple processors: we connect the cloned star to the Spread star, and the Spread star to multiple send stars.
Otherwise, we call
void makeConnection(ParNode* dest, ParNode* src, PortHole* ref, ParNode* firstS);
The first argument is the pointer to the first assigned invocation of the destination star, and the second is the source node connected to the first argument. The third argument is the pointer to the destination porthole of the connection in the original graph. The last argument is the pointer to the first invocation of the source star; note that this node may not be assigned to the current processor. This method examines all assigned invocations of the destination star to identify the sources of the samples to be consumed by the cloned star in this processor. If there is more than one source, we create a Collect star (createCollect method of the multiprocessor target) and connect it to the cloned destination star in the sub-universe. For each source on another processor, we create a receive star and connect it to the Collect star. Similarly, we examine all assigned invocations of the source star to identify the destinations of the samples produced by the source star in this processor. If there is more than one destination, we create a Spread star and connect it to the source star. For each destination on another processor, we create a send star and connect it to the Spread star. As a result, connecting two stars in the sub-universe may require splicing both a Spread and a Collect star onto the connection.
The following illustrates when we need to create a Spread or a Collect star in a sub-universe.
Suppose we have star A connected to star B in the original graph. Star A produces two samples and star B consumes one; then one invocation of star A is connected to two invocations of star B. Suppose one invocation of star A and only one invocation of star B are assigned to the current processor, so we need to connect the cloned stars of A and B in the sub-universe. We cannot connect stars A and B in the sub-universe directly, since of the two samples generated by star A, one sample should be transferred to another processor through a send star. In this case, we connect a Spread star to star A, and connect one send star and star B to the Spread star in the sub-universe. Star A then produces two samples to the Spread star, while the Spread star spreads the incoming two samples to the send star and star B, one sample each. The number of output portholes and the sample rate of each porthole are determined during sub-universe creation. If there is no initial delay on the connection and neither star A nor star B needs to access past samples, the Spread star does not imply additional copying of data in the generated code.
Similarly, we need to connect a Collect star to the destination star if samples to that star come from more than one source.
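The test from the example above — does star A's output need a Spread star? — can be worked out in a few lines. This is an illustrative simplification (it counts only the processors that consume A's samples, treating each remote processor as one destination), and the function name is invented.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Star A produces `produced` samples per firing; star B consumes
// `consumed` per firing, so sample k of one A firing goes to B
// invocation k / consumed.  Given the processor assignment of each B
// invocation, a Spread star is needed on A's output exactly when the
// samples of one A firing go to more than one destination processor.
bool needSpread(int produced, int consumed,
                const std::vector<int>& procOfB) {
    std::set<int> dests;
    for (int k = 0; k < produced; ++k)
        dests.insert(procOfB[k / consumed]);  // processor of consuming B invocation
    return dests.size() > 1;
}
```

With produced = 2 and consumed = 1, assigning B's two invocations to different processors forces a Spread star, while assigning both to one processor (local or remote) needs only a direct connection or a single send star.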
void prepareCodeGen();
This method performs the following tasks before generating code.
(1) Initialize the sub-universe, which also initializes the cloned stars.
(2) Convert a schedule (a linked list of ParNodes) obtained from the parallel scheduler to the SDFSchedule class format (a list of stars). The code generation routine of the processor target assumes the SDFSchedule class as the schedule output.
(3) Simulate the schedule to compute the maximum number of samples to be collected at runtime if we follow the schedule. This information is used to determine the buffer size to be allocated to the arcs.
void simRunSchedule();
Performs step (3). It is a protected member.
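Step (3) can be sketched as a token-count simulation over a single arc. This is illustrative only, under the assumption that the schedule is flattened into a firing sequence; the real method simulates the whole sub-universe.

```cpp
#include <cassert>
#include <vector>

// Simulate a schedule over one arc to find the maximum number of
// samples ever buffered on it.  Each entry of the firing sequence
// fires either the producer (adding `prod` tokens to the arc) or the
// consumer (removing `cons` tokens).
int maxBufferNeeded(const std::vector<bool>& firesProducer,
                    int prod, int cons, int initialTokens) {
    int tokens = initialTokens;
    int maxTokens = initialTokens;
    for (bool p : firesProducer) {
        tokens += p ? prod : -cons;        // fire producer or consumer
        if (tokens > maxTokens) maxTokens = tokens;
    }
    return maxTokens;                      // buffer size to allocate
}
```

For example, with a producer emitting 2 samples and a consumer taking 1, the schedule A B B needs a buffer of 2, while A A B B B B needs 4.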
StringList& generateCode();
This method generates code for the processor target.
int genCodeTo(Target* t);
This method inserts the code of the sub-universe into the argument target class. It performs almost the same steps as prepareCodeGen, and then calls insertGalaxyCode of the processor target class.
void putFree(NodeSchedule* n);
If a NodeSchedule is removed from the list, this method puts it into a pool of free NodeSchedule objects. When we need a NodeSchedule object, one is extracted from the pool and initialized. A third method deallocates all NodeSchedules in the pool.
void removeLink(NodeSchedule* x);
Removes the argument NodeSchedule from the scheduled list.
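The pool described above is a standard free-list pattern; a minimal sketch follows (the class and member names beyond putFree are assumptions for the sketch, and the NodeSchedule here is reduced to the fields the pool touches).

```cpp
#include <cassert>

// Reduced NodeSchedule: just the link and one data field.
struct NodeSchedule {
    NodeSchedule* next = nullptr;
    int duration = 0;
};

// Pool of recycled NodeSchedule objects, kept as a singly linked list.
class FreePool {
public:
    // Recycle a NodeSchedule removed from the schedule list.
    void putFree(NodeSchedule* n) { n->next = head; head = n; }

    // Get a NodeSchedule: reuse one from the pool if possible,
    // re-initializing it; otherwise allocate a fresh one.
    NodeSchedule* getFree() {
        if (!head) return new NodeSchedule;
        NodeSchedule* n = head;
        head = n->next;
        n->next = nullptr;
        n->duration = 0;
        return n;
    }

    // Deallocate everything still in the pool.
    ~FreePool() {
        while (head) { NodeSchedule* n = head; head = head->next; delete n; }
    }

private:
    NodeSchedule* head = nullptr;
};
```

Recycling through the pool avoids repeated heap allocation as nodes are scheduled and unscheduled.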
int sumIdle;
Indicates the sum of the idle time slots after scheduling is completed. The value is valid only after the display method is called.
The destructor deletes subGal and all NodeSchedule objects associated with this processor.
Galaxy* myGalaxy();
This method returns the pointer to the sub-universe. Other access methods return the index of the current processor, the number of scheduled nodes (DoubleLinkList::size), and the sum of idle time after scheduling is completed (sumIdle).
void initialize();
This method puts all NodeSchedules in the list into the pool of free NodeSchedules and initializes the protected members.
void copy(UniProcessor* org);Copies the schedule from the argument UniProcessor to the current processor.
StringList displaySubUniv();
StringList display(int makespan);
int writeGantt(ostream& os, const char* universe, int numProcs, int span);
The first method displays the sub-universe. The second method displays the schedule in textual form, while the third forms a string to be used by the Gantt-chart display program.
ParNode* nextNode();
Returns the next ParNode in the list.