Edges

Edges represent data flows between components. To connect components with edges, activate the edge tool located in the upper part of the Palette window.

Edges between components are always bound to component ports. Components read data records from their input port(s) and write them to their output port(s). Some components provide a fixed number of ports, others an unlimited number. The Filter component, for example, has exactly two output ports: it writes valid records to port 0 and invalid records to port 1. Readers, on the other hand, may have an unlimited number of output edges and broadcast the parsed records on all of them. The meaning of each port can be found in the components overview.
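
The two port behaviours described above can be illustrated with a small sketch. It is plain Python, not engine code: the function names, the record dictionaries and the validity check are hypothetical and only model "route to port 0 or 1" versus "broadcast to every output port".

```python
def filter_records(records, is_valid):
    """Route each record to port 0 (valid) or port 1 (invalid)."""
    ports = {0: [], 1: []}
    for record in records:
        ports[0 if is_valid(record) else 1].append(record)
    return ports


def broadcast_records(records, port_count):
    """Copy every record to each of the port_count output ports."""
    records = list(records)
    return {port: list(records) for port in range(port_count)}


# Hypothetical sample data: a record is "valid" when its amount is positive.
sample = [{"id": 1, "amount": 5}, {"id": 2, "amount": -1}]
filtered = filter_records(sample, lambda r: r["amount"] > 0)
copies = broadcast_records(sample, 3)
```

Here `filtered[0]` holds the valid record and `filtered[1]` the invalid one, while `copies` holds the same two records on each of its three ports.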


When a new edge is created between components with multiple ports, you can choose the specific port to which the edge should be bound. Components that provide an unlimited number of input/output ports create a new port every time you connect a new edge to them.

Important: Port order may have a special meaning for a component. The Hash Join component, for example, creates its hash table from the records on the slave port (0) and uses each record from the master port (1) to look up a value in the hash table.
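
To show why the two inputs are not interchangeable, here is a minimal sketch of the general hash join technique. It is not the Hash Join component's implementation: the function name, the dictionary records and the merge of matching records are assumptions made for illustration only.

```python
from collections import defaultdict


def hash_join(slave_records, master_records, key):
    """Join master records against a hash table built from slave records."""
    # Build phase: the whole slave input is indexed by the join key.
    table = defaultdict(list)
    for slave in slave_records:
        table[slave[key]].append(slave)

    # Probe phase: each master record is looked up in the hash table.
    joined = []
    for master in master_records:
        for slave in table.get(master[key], []):
            joined.append({**slave, **master})
    return joined


# Hypothetical sample data.
slaves = [{"id": 1, "city": "Prague"}, {"id": 2, "city": "Brno"}]
masters = [{"id": 2, "amount": 10}, {"id": 3, "amount": 5}]
result = hash_join(slaves, masters, "id")
```

In this sketch the slave input is held in memory as a whole while the master input is only streamed through, so connecting the edges to the wrong ports changes which data set ends up in the hash table.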

Every edge carries metadata information. Metadata may be assigned to an edge in the component's editor on the Ports tab or in the Properties view. When an edge is assigned metadata, it can be propagated to all adjacent edges recursively. The rules of propagation are as follows:

  • propagation must start from an edge with assigned metadata
  • when a component is reached (through an input edge), propagation continues on all of its output edges
  • propagation stops when a component that may change the data record structure is encountered (e.g. reformat, join)
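
The propagation rules can be sketched as a simple graph traversal. This is only an illustration in plain Python: the edge dictionary layout, the component names and the function name are hypothetical, and the sketch assumes that an edge which already has metadata is left untouched.

```python
from collections import deque


def propagate_metadata(edges, start_edge, structure_changing):
    """Propagate metadata from start_edge along the data flow.

    edges maps an edge id to {"src": component, "dst": component,
    "metadata": name or None}; structure_changing is the set of
    components that may change the record structure.
    """
    metadata = edges[start_edge]["metadata"]
    if metadata is None:
        raise ValueError("propagation must start from an edge with metadata")

    queue = deque([start_edge])
    while queue:
        component = edges[queue.popleft()]["dst"]
        if component in structure_changing:
            continue  # propagation stops at this component
        for edge_id, edge in edges.items():
            # Assumption: edges that already carry metadata are skipped.
            if edge["src"] == component and edge["metadata"] is None:
                edge["metadata"] = metadata
                queue.append(edge_id)
    return edges


# Hypothetical graph: reader -> copy -> reformat -> writer, copy -> writer2.
graph_edges = {
    "e1": {"src": "reader", "dst": "copy", "metadata": "customer"},
    "e2": {"src": "copy", "dst": "reformat", "metadata": None},
    "e3": {"src": "reformat", "dst": "writer", "metadata": None},
    "e4": {"src": "copy", "dst": "writer2", "metadata": None},
}
propagate_metadata(graph_edges, "e1", structure_changing={"reformat"})
```

After the call, edges `e2` and `e4` carry the `customer` metadata, while `e3` stays unassigned because it lies behind the structure-changing component.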


Types of Edges

Each edge in a graph can behave differently. There are four types of edges:

  • Direct edge - this edge is buffered in memory for better performance; however, the buffer is limited in size. The size is determined by the Defaults.Data.DIRECT_EDGE_INTERNAL_BUFFER_SIZE constant.
  • Fast propagate direct edge - an alternative implementation of the direct edge for fast propagation of records to the reading component. It does not contain any buffer: it sends each data record to the input port of the next component as soon as it receives it from the output port of the previous component.
  • Buffered edge - this edge buffers data in memory and, if this buffer is exhausted, on disk, which allows unlimited buffering for the writer. Internally it allocates two buffers (one for reading, one for writing) of size BUFFERED_EDGE_INTERNAL_BUFFER_SIZE.
  • Phase connection edge - this edge cannot be assigned by the user; it is assigned automatically between two nodes that have different phase numbers.
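
The idea behind the buffered edge (memory first, disk once the memory buffer is exhausted) can be sketched as follows. This is a simplified illustration, not the engine's two-buffer implementation: the class name, the record-count limit and the use of pickle and a temporary file are all assumptions.

```python
import pickle
import tempfile
from collections import deque


class SpillingBuffer:
    """FIFO buffer that keeps records in memory and spills the rest to disk."""

    def __init__(self, memory_limit):
        self.memory_limit = memory_limit  # max records held in memory
        self.memory = deque()
        self.spill = None                 # temporary file, created on demand
        self.read_pos = 0                 # read offset inside the spill file
        self.spilled = 0                  # records on disk not yet read

    def write(self, record):
        # Once spilling has started, later records also go to disk so that
        # the first-in, first-out order is preserved.
        if self.spill is None and len(self.memory) < self.memory_limit:
            self.memory.append(record)
            return
        if self.spill is None:
            self.spill = tempfile.TemporaryFile()
        self.spill.seek(0, 2)  # append at the end of the file
        pickle.dump(record, self.spill)
        self.spilled += 1

    def read(self):
        if self.memory:
            return self.memory.popleft()
        if self.spilled:
            self.spill.seek(self.read_pos)
            record = pickle.load(self.spill)
            self.read_pos = self.spill.tell()
            self.spilled -= 1
            return record
        raise IndexError("buffer is empty")


buffer = SpillingBuffer(memory_limit=2)
for number in range(5):
    buffer.write(number)          # records 2, 3 and 4 are spilled to disk
drained = [buffer.read() for _ in range(5)]
```

The writer never has to wait in this sketch, which is the property the buffered edge provides; a direct edge, by contrast, only has its size-limited in-memory buffer.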

default value and preferences - to be continued

Note: Previous versions of the Engine and GUI supported the fastPropagate attribute, which gave an edge the fast propagate behaviour. This attribute has been preserved for backward compatibility, but if the edgeType attribute has a value, the fastPropagate attribute is ignored.

Debug the Edges

Turning On the Debug

If you obtain incorrect or unexpected results when running a graph and you want to know what errors occur and where they occur, you must debug the graph.
You need to guess where the problem may come from and, sometimes, you also need to specify which records should be saved to the debug files.
If you do not process a great number of records, you do not need to limit the records saved to the debug files; however, if you parse large numbers of records, this option may be useful.

  • To debug the graph, right-click each edge that you suspect and select the Enable debug option from the context menu. A bug icon then appears on the edge, meaning that debugging will be performed on that edge when the graph is executed.
  • The same can be done by clicking the edge and switching to the Properties tab of the Tabs pane. There you only need to set the Debug mode attribute to true. By default, it is set to false. Again, a bug icon appears on the edge.

When you run the graph, one debug file is created for each debug edge. After that, you only need to view and study the data records from these debug files (.dbg extension).

Limiting the Debug Data Records

If you only select the edges that should be debugged and do nothing else, all data records that go through these edges are saved to the debug files.
Nevertheless, as mentioned above, you can restrict the data records that are saved to the debug files.
This is done in the Properties tab of any debug edge, using the following attributes:

  • Debug max. records - Sets the maximum number of records that are saved.
  • Debug sample data - If set to true, data records are selected from all the records flowing through the edge, not only from those at its beginning or end. Records are saved at random: some are omitted, others are saved to the debug file. In this case the Debug max. records value is only a threshold that limits how many data records can be saved to a debug file. If you do not set this attribute, or set it to false, the number of saved debug records equals exactly Debug max. records (provided that at least that many data records flow through the edge). Remember that data records are selected more frequently from the beginning or from the end of the flow if Debug last records is false or true, respectively.
  • Debug filter expression - If you set a filter expression, only the records that satisfy it are saved to the debug files.
  • Debug last records - By default, data records from the end of the flow (the last data records) are saved to the debug files. If you set this attribute to false, data records from the beginning of the flow (the first data records) are saved instead.
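
How these attributes combine can be sketched in plain Python. This is an illustration only, not the engine's logic: the function and parameter names are hypothetical, the filter is assumed to be applied before the record limit, and the sampling mode is modelled with a made-up fixed probability (`rate`) that ignores the first/last preference.

```python
import random
from collections import deque
from itertools import islice


def select_debug_records(records, max_records, last_records=True,
                         sample=False, keep=None, rate=0.5, seed=None):
    """Return the records that would be saved for one debug edge."""
    if keep is not None:
        # Filter expression: only matching records are candidates.
        records = (r for r in records if keep(r))
    if sample:
        # Sampling: pick records at random across the whole flow;
        # max_records is only an upper limit here.
        rng = random.Random(seed)
        picked = (r for r in records if rng.random() < rate)
        return list(islice(picked, max_records))
    if last_records:
        # Keep only the last max_records records of the flow.
        return list(deque(records, maxlen=max_records))
    # Keep only the first max_records records of the flow.
    return list(islice(records, max_records))


last_three = select_debug_records(range(10), 3)
first_three = select_debug_records(range(10), 3, last_records=False)
even_first = select_debug_records(range(10), 2, last_records=False,
                                  keep=lambda r: r % 2 == 0)
sampled = select_debug_records(range(100), 5, sample=True, seed=1)
```

With the default settings the sketch keeps the last three records, with `last_records=False` the first three, and in sampling mode it returns at most five records taken from anywhere in the flow.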

Viewing the Debug Data Records

After the graph has run, you can view and study the debug data on the debug edges. You only need to right-click one of the debug edges and select the View data option from the context menu.
In the View data wizard that opens, you can specify how many of the saved debug data records should be displayed (by default, 10 records are displayed).
You can also define another filter expression that limits the displayed data records.

Turning Off the Debug

If you want to turn off the debug, click anywhere in the Graph editor outside the components and the edges, switch to the Properties tab and set the Debug mode attribute to false. This turns off the debug of all edges at the same time.
Moreover, if you do not define the Debug max. records attribute for some edge, you can specify it in this Properties tab. But remember that if an edge has its own Debug max. records value defined, the global value of this attribute is ignored for that edge and the edge's own value is applied.
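
The precedence between the graph-wide and the per-edge settings can be summarised in a short sketch. The property names and dictionary layout are hypothetical, and the sketch assumes that debugging is enabled at graph level unless it has been switched off there.

```python
def effective_debug_settings(graph_props, edge_props):
    """Combine graph-wide and per-edge debug properties for one edge."""
    # Assumption: a graph-level Debug mode of false switches off every edge.
    enabled = (graph_props.get("debug_mode", True)
               and edge_props.get("debug_mode", False))
    # The edge's own limit, when defined, wins over the graph-wide one.
    max_records = edge_props.get("debug_max_records",
                                 graph_props.get("debug_max_records"))
    return {"enabled": enabled, "max_records": max_records}


graph_props = {"debug_mode": True, "debug_max_records": 1000}
own_limit = effective_debug_settings(
    graph_props, {"debug_mode": True, "debug_max_records": 50})
inherited = effective_debug_settings(graph_props, {"debug_mode": True})
switched_off = effective_debug_settings(
    {"debug_mode": False}, {"debug_mode": True})
```

In this sketch `own_limit` uses the edge's limit of 50, `inherited` falls back to the graph-wide 1000, and `switched_off` is disabled regardless of the edge's own Debug mode.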

Remember that all data records are processed by the graph and saved to the outputs as if no debug were performed.

graph_elements/edges.txt · Last modified: 2009/09/16 12:27 (external edit)