Note: if you are interested in CloverETL Engine internals, see the Engine Overview presentation.
A transformation graph is both an abstraction and a class (TransformationGraph) that performs the transformation. The graph keeps track of all its Nodes, Edges, and metadata objects. It is accompanied by a class that reads the graph definition from an XML file and builds everything dynamically.
The graph also includes a WatchDog, which runs as a separate thread and acts as a dispatcher that oversees all other components of the graph.
Several graph objects can be created and running at the same time.
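The supervision model described above can be sketched with plain java.util.concurrent. This is only an illustration of the idea, not CloverETL code: WatchDogSketch, MiniWatchDog, and the two-value Result enum are hypothetical stand-ins, and each "component" is reduced to a Runnable. Every graph gets its own watchdog task on its own thread, and the caller collects one result per graph.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in types; none of these are CloverETL classes.
class WatchDogSketch {

    enum Result { FINISHED_OK, ERROR }

    // A "watchdog" that runs every component of one graph and reports a single result.
    static class MiniWatchDog implements Callable<Result> {
        private final List<Runnable> components;

        MiniWatchDog(List<Runnable> components) {
            this.components = components;
        }

        @Override
        public Result call() {
            try {
                for (Runnable component : components) {
                    component.run();
                }
                return Result.FINISHED_OK;
            } catch (RuntimeException e) {
                // A failing component turns the whole graph run into an error.
                return Result.ERROR;
            }
        }
    }

    // Runs each watchdog on its own thread; results come back in the order given.
    static List<Result> runAll(List<MiniWatchDog> watchDogs) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, watchDogs.size()));
        try {
            List<Result> results = new ArrayList<>();
            for (Future<Result> future : pool.invokeAll(watchDogs)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        MiniWatchDog healthy = new MiniWatchDog(List.<Runnable>of(() -> {}, () -> {}));
        MiniWatchDog failing = new MiniWatchDog(List.<Runnable>of(() -> {
            throw new IllegalStateException("component failed");
        }));
        System.out.println(runAll(List.of(healthy, failing))); // prints [FINISHED_OK, ERROR]
    }
}
```

The two watchdogs run independently, so a failure in one graph does not affect the result reported for the other.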
The following code shows how to build a graph programmatically:
    // create Graph + Nodes + connections (edges)
    // since version 2.6

    // engine initialization - should be called only once
    EngineInitializer.initEngine(pluginsRootDirectory, configFileName, logHost);

    // runtime customization
    GraphRuntimeContext runtimeContext = new GraphRuntimeContext();

    // create new instance of transformation graph class
    TransformationGraph graph = new TransformationGraph();

    // create graph phase
    Phase phase = new Phase(1);

    // create simple metadata
    DataRecordMetadata metadata = new DataRecordMetadata("RecordMedatada0", DataRecordMetadata.DELIMITED_RECORD);
    metadata.addField(new DataFieldMetadata("FieldMetadata0", "\n"));

    // or load metadata from file
    metadata = MetadataFactory.fromFile(graph, fmtMedataFileName);

    // create edges
    Edge inEdge = new Edge("InEdge", metadata);
    Edge outEdge = new Edge("OutEdge", metadata);
    Edge middleEdge = new Edge("OutEdge0", metadata);

    // create nodes
    Node nodeOne = new SimpleCopy("SimpleCopy1");
    Node nodeTwo = new SimpleCopy("SimpleCopy2");
    Node nodeParser = new DataReader("DataReader1", inputFileName);
    Node nodeWriter = new DataWriter("DataWriter1", outputFileName, "UTF-8", true);

    // add phase to graph; graph has to have at least one phase
    graph.addPhase(phase);

    // add nodes to phase - all nodes in one phase are executed concurrently
    // phases are executed sequentially - in order defined by their number
    phase.addNode(nodeOne);
    phase.addNode(nodeTwo);
    phase.addNode(nodeParser);
    phase.addNode(nodeWriter);

    // assign ports/nodes (input & output)
    // this links together components - creates data flows
    nodeParser.addOutputPort(0, inEdge);
    nodeOne.addInputPort(0, inEdge);
    nodeOne.addOutputPort(0, middleEdge);
    nodeTwo.addInputPort(0, middleEdge);
    nodeTwo.addOutputPort(0, outEdge);
    nodeWriter.addInputPort(0, outEdge);

    // add Edges to graph (Nodes were added via the phase above)
    graph.addEdge(inEdge);
    graph.addEdge(outEdge);
    graph.addEdge(middleEdge);

    // graph initialization
    EngineInitializer.initGraph(graph, runtimeContext);

    // graph running
    IThreadManager threadManager = new SimpleThreadManager();
    WatchDog watchDog = new WatchDog(graph, runtimeContext);
    Future<Result> futureResult = threadManager.executeWatchDog(watchDog);

    // wait for the graph to finish
    Result result = futureResult.get();
    if (result != Result.FINISHED_OK) {
        System.out.println("Something was wrong.");
        System.out.println(watchDog.getErrorMessage());
        watchDog.getCauseException().printStackTrace();
    } else {
        System.out.println("Everything was OK.");
    }

    // do you want to run the same graph instance again?
    // just create a new WatchDog instance and trigger it as many times as you want
    watchDog = new WatchDog(graph, runtimeContext);
    futureResult = threadManager.executeWatchDog(watchDog);
    result = futureResult.get();
    ...
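The comments in the listing above state that phases run sequentially in order of their number, while all nodes within one phase run concurrently. A minimal, self-contained sketch of that scheduling rule follows. It uses only the JDK and does not reflect how the engine implements it: PhaseOrderSketch and its run method are hypothetical, and a "node" is reduced to a name that records when it ran.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration of phase ordering; not a CloverETL class.
class PhaseOrderSketch {

    // Runs phases in ascending phase number; nodes inside a phase run concurrently.
    // Returns a log of "phaseNumber:nodeName" entries in completion order.
    static List<String> run(Map<Integer, List<String>> phases) {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        // TreeMap sorts the phases by their number.
        for (Map.Entry<Integer, List<String>> phase : new TreeMap<>(phases).entrySet()) {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (String node : phase.getValue()) {
                tasks.add(() -> {
                    log.add(phase.getKey() + ":" + node);
                    return null;
                });
            }
            ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
            try {
                // invokeAll blocks until every node of this phase has finished,
                // so the next phase cannot start early.
                pool.invokeAll(tasks);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } finally {
                pool.shutdown();
            }
        }
        return log;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> phases = Map.of(
                2, List.of("writer"),
                1, List.of("readerA", "readerB"));
        // The two phase-1 entries may appear in either order; "2:writer" is always last.
        System.out.println(run(phases));
    }
}
```

The order of entries within a phase is not deterministic, which mirrors the fact that nodes of one phase have no guaranteed ordering relative to each other.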
This example shows how to save some work by loading the graph definition from an XML file:
    // engine customization
    GraphRuntimeContext runtimeContext = new GraphRuntimeContext();

    // engine initialization - should be called only once
    EngineInitializer.initEngine(pluginsRootDirectory, configFileName, logHost);

    // graph loading
    TransformationGraph graph = TransformationGraphXMLReaderWriter.loadGraph(in, runtimeContext.getAdditionalProperties());

    // graph initialization
    EngineInitializer.initGraph(graph, runtimeContext);

    // graph running
    IThreadManager threadManager = new SimpleThreadManager();
    WatchDog watchDog = new WatchDog(graph, runtimeContext);
    threadManager.executeWatchDog(watchDog);
For more details about loading a graph definition from XML and initializing the graph before a run, see the org.jetel.main.runGraph class of the CloverETL engine.
    <!ELEMENT Graph (Global , Phase+)>
    <!ATTLIST Graph
        name ID #REQUIRED
        debugMode NMTOKEN (true | false) #IMPLIED
        debugDirectory CDATA #IMPLIED>

    <!ELEMENT Global (Property*, Metadata+, Connection*, Sequence*, LookupTable*)>

    <!ELEMENT Property (#PCDATA)>
    <!ATTLIST Property
        name CDATA #IMPLIED
        value CDATA #IMPLIED
        fileURL CDATA #IMPLIED>

    <!ELEMENT Metadata (#PCDATA)>
    <!ATTLIST Metadata
        id ID #REQUIRED
        fileURL CDATA #IMPLIED
        connection CDATA #IMPLIED
        sqlQuery CDATA #IMPLIED>

    <!ELEMENT Connection (#PCDATA)>
    <!ATTLIST Connection
        id ID #REQUIRED
        type NMTOKEN #REQUIRED >

    <!ELEMENT Sequence (#PCDATA)>
    <!ATTLIST Sequence
        id ID #REQUIRED
        type NMTOKEN #REQUIRED >

    <!ELEMENT LookupTable (#PCDATA)>
    <!ATTLIST LookupTable
        id ID #REQUIRED
        type NMTOKEN #REQUIRED >

    <!ELEMENT Phase (Node+ , Edge+)>
    <!ATTLIST Phase
        number NMTOKEN #REQUIRED>

    <!ELEMENT Node (#PCDATA)>
    <!ATTLIST Node
        id ID #REQUIRED
        type NMTOKEN #REQUIRED
        enabled NMTOKEN (enabled | disabled) #IMPLIED
        passThroughOutputPort NMTOKEN #IMPLIED
        passThroughInputPort NMTOKEN #IMPLIED >

    <!ELEMENT Edge (#PCDATA)>
    <!ATTLIST Edge
        id ID #REQUIRED
        metadata NMTOKEN #REQUIRED
        fromNode NMTOKEN #REQUIRED
        toNode NMTOKEN #REQUIRED
        debugMode NMTOKEN (true | false) #IMPLIED
        fastPropagate NMTOKEN (true | false) #IMPLIED>
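To make the element structure above more concrete, here is a small sketch that reads a minimal graph document with the JDK's DOM parser and lists its phases, nodes, and edges. It does not use the engine's own XML reader and performs no DTD validation. The sample document and all of its attribute values (graph name, ids, component types, the `NODE:port` form of fromNode and toNode) are assumptions chosen only to fit the declarations above, and GraphXmlSketch is a hypothetical class name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration; the engine has its own XML reader.
class GraphXmlSketch {

    // Assumed sample values; only the element and attribute names come from the DTD.
    static final String SAMPLE =
            "<Graph name=\"sketch\">"
          + "  <Global>"
          + "    <Metadata id=\"Metadata0\" fileURL=\"meta.fmt\"/>"
          + "  </Global>"
          + "  <Phase number=\"0\">"
          + "    <Node id=\"READER\" type=\"DATA_READER\"/>"
          + "    <Node id=\"WRITER\" type=\"DATA_WRITER\"/>"
          + "    <Edge id=\"EDGE0\" metadata=\"Metadata0\" fromNode=\"READER:0\" toNode=\"WRITER:0\"/>"
          + "  </Phase>"
          + "</Graph>";

    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse graph XML", e);
        }
    }

    // Counts all elements with the given tag name anywhere in the document.
    static int count(Document doc, String tag) {
        return doc.getElementsByTagName(tag).getLength();
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("graph:  " + doc.getDocumentElement().getAttribute("name"));
        System.out.println("phases: " + count(doc, "Phase"));
        System.out.println("nodes:  " + count(doc, "Node"));

        // Each edge connects an output port of one node to an input port of another.
        NodeList edges = doc.getElementsByTagName("Edge");
        for (int i = 0; i < edges.getLength(); i++) {
            Element edge = (Element) edges.item(i);
            System.out.println(edge.getAttribute("id") + ": "
                    + edge.getAttribute("fromNode") + " -> " + edge.getAttribute("toNode"));
        }
    }
}
```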