Other components

This components perform various operations like data verification, command performing and so on.



Common attributes



Attribute Description Exp.
id component identification string
type component type. This attribute is automatically generated from gui. string


Check Foreign Key

Checks a defined foreign key against a table of primary keys to verify whether the foreign key is valid. If the foreign key is not found among primary keys, default foreign key is substituted. The resulting foreign record is broadcast to all connected outputs. The table containing the “primary” keys may contain duplicates but they will be ignored.

Input ports: * 0 - foreign records.

  • 1 - primary records.

Output ports:

  • [0] - foreign records with eventually new key
  • [1] - (optional) foreign records with invalid key


Xml attributes:

Attribute Mandatory Description Default ETL Version since
id yes component identification
type yes component type CHECK_FOREIGN_KEY
primaryKey yes , since 2.5 version deprecated (see foreignKey) Name(s) of column(s) in input [1] that contain the primary key(s) (field names separated by [:;|])
foreignKey yes Name(s) of column(s) in input [0] that contain the foreign key(s) (field names separated by [:;|]). Since 2.5 version it is advocated to use new firm syntax for foreign key: $foreignField1=$primaryField1;$foreignField2=$primaryField2;….
defaultForeignKey yes Key that should be substituted if the foreignKey doesn't exist in primaryKey. The order of the fields is as specified in foreignKeys (fields separated by [:;|]).
hashTableSize no should be larger than the number of unique primary keys 512 or properties

Example:

  <Node id="CHECKFOREIGN" 
        type="CHECK_FOREIGN_KEY" 
        primaryKey="CustomerID"
        foreignKey="CustomerID"
        defaultForeignKey="-1"
  />
  <Node defaultForeignKey="-;-;" foreignKey="$city=$city;$country=$country;#" id="CHECK_FOREIGN_KEY0" type="CHECK_FOREIGN_KEY"/>



DB Execute

This component executes specified command (SQL/DML) against specified DB.

Input ports:

  • one input port defined/connected (optional) - input records for stored procedure or (since 2.7 ver.) sql statements

Output ports:

  • [0] (optional) - stored procedure output parameters and/or query result set
  • [1] (optional) - logging error - fields with autofilling=“ErrCode” and/or autofilling=“ErrText” are filled by error code and/or error message of sql error respectively - since 2.6 version


Xml attributes:

Attribute Mandatory Description Default ETL Version since
id yes component identification
type yes component type DB_EXECUTE
dbConnection yes id of the Database Connection object to be used to access the database
url if no sqlQuery url location of the query. The query will be loaded from the file referenced by the url. Since 2.7 version standard clover fileURL syntax is supported. Query can be read from a file or from the input port (using port-reading attribute, e.g.: port:$0.fieldName:discrete).
sqlQuery if no url SQL/DML/DDL statement(s) which has to be executed on database. In case you want to call stored procedure or function with parameters or producing output data set, it has to be in form: {[? = ]call procedreName([?[,?…]])} (Note: remember to close statement in curly brackets), when input/output parameters has to be set in proper attribute. If several statements should be executed, separate them by [;] (semicolon - default value; see sqlStatementDelimiter). They will be executed one by one. stored procedures/functions with parameters since 2.4
inParameters no when calling stored procedure/function with input parameters. Maps out which input fields would be treated as proper input parameters. Parameters are counted from 1. Form: 1:=$inField1;…n:=$infieldN. Can't be set if statements are read from input port. 2.4
outParameters no when calling stored procedure/function with output parameters or returning value. Maps out which output fields would be treated as proper output parameters. Parameters are counted from 1. If function return a value, this is the first parameter Form: 1:=$outField1;…n:=$outfieldN 2.4
outputFields no when stored procedure/function returns set of data its output will be parsed to given output fields. This is list of output fields delimited by semicolon. 2.4
sqlStatementDelimiter no delimiter of sql statement in sqlQuery attribute ;
printStatements no Specifies whether SQL commands are outputted to stdout. No
inTransaction no one of: ONE,SET,ALL,NEVER_COMMIT specifying whether statement(s) should be executed in transaction. For
* ONE - commit is perform after each query execution
* SET - for each input record there are executed all statements. After set of statements there is called commit, so if error occurred during execution of any statement, all statements for this record would be rolled back
* ALL - commit is called only after all statements, so if error occurred all operations would be rolled back
* NEVER_COMMIT - commit isn't call at all (may be called from different component in different phase) - since 2.6.1 ver.
Works only if database supports transactions.
SET since 2.4. For older boolean with default NO
callStatement no boolean value (Y/N) - specifies whether SQL commands should be treated as stored procedure calls - using JDBC CallableStatement. No
errorActions no defines if graph is to stop, when query throws SQLException. Available actions are: STOP or CONTINUE. For CONTINUE action, error message is logged to console or file (if errorLog attribute is specified) and for STOP error is populated and graph execution is stopped. Error action can be set for each error code (value1=action1;value2=action2;…) or for all values the same action (STOP or CONTINUE). It is possible to define error actions for some error values and for all other values (MIN_INT=myAction). -1=CONTINUE;MIN_INT=STOP 2.6
errorLog no path to the error log file. 2.6

Example:

  <Node id="DATABASE_RUN"
        type="DB_EXECUTE"
        dbConnection="NorthwindDB"
        sqlQuery="drop table employee_z"
  />
 
  <Node id="DATABASE_RUN" 
        type="DB_EXECUTE"
        dbConnection="NorthwindDB"
        errorActions="CONTINUE">
        <SQLCode>
            create table testTab (
                name varchar(20)
            );
            insert into testTab ('nobody');
            insert into testTab ('somebody'); 
        </SQLCode>
  </Node>
 
  <Node dbConnection="Connection1" id="DB_EXECUTE1" type="DB_EXECUTE">
    <attr name="sqlQuery">create table proc_table (
  	id INTEGER,
 	    string VARCHAR(80),
 	    date DATETIME
    );
    CREATE PROCEDURE SPDownload
      @last_dl_ts DATETIME
    AS
    BEGIN
      SELECT id, string, date
        FROM proc_table
           WHERE date >=  @last_dl_ts
    END;></attr>
 </Node>
 
<Node callStatement="true" dbConnection="Connection1" id="DB_EXECUTE2" inParameters="1:=$date" 
 			outputFields="id;string;date" type="DB_EXECUTE" sqlQuery="{call SPDownload(?)}" >
 
<Node dbConnection="Connection0" errorActions="MIN_INT=CONTINUE;" id="DB_EXECUTE0" printStatements="true" 
 			type="DB_EXECUTE" url="port:$0.field1:discrete"/>



HTTP Connector

This component is simple HTTP requestor. Connector is dedicated to generate HTTP requests and to receive responses from web servers.

Generally there are two ways how to specify a request content and how to handle a response:

  1. single HTTP connect - If you need just one HTTP interaction, you can use 'Input file URL' for request content and 'Output file URL' as a storage for response. The request content can be also passed via 'Request content' attribute.
  2. multiple HTTP interaction are possible through the use of input port and output port - 'Input field' attribute specifies field name of input record where is the request content. 'Output field' attribute specifies field name of outptu record where the reposonse will be stored. For each incoming record one HTTP interaction is perfomed and one output record sent. Both fields has to be string data field.
  3. no combination of two previous approaches is possible
Attribute Mandatory Description Default ETL Version since
id yes component identification
url yes URL of remote HTTP server, where the requests are sent 2.8
requestMethod no HTTP request method - GET or POST GET 2.8
requestProperties no additional HTTP request properties; given properties will be part of header of HTTP request *) GET 2.8
inFileUrl no body of the single HTTP request 2.8
outFileUrl no body of the single HTTP response 2.8
appendOutput no whether to append data to output file false 2.8
charset no character encoding of the input/output file ISO-8859-1 2.8
inputField no name of input string field with request body 2.8
outputField no name of output string field with response body 2.8
responseAsFileName no if true, output fields will be populated with the names of temporary files, where the response bodies are stored; this is necessary in case response body is too big to be stored in a single string data field; developer is responsible to delete all these temp files false 2.8
responseDirectory no directory where the temporary files will be created 2.8
responseFilePrefix no file name prefix for temporary files 2.8

*) properties are accepted in key=value form i.e.

attribute1=value1
attribute2=value2
attribute3=value3

Example:

<Node guiName="HTTP connector" id="HTTP_CONNECTOR0" 
        outFileUrl="${DATAOUT_DIR}/result.txt" 
        requestMethod="POST" 
        type="HTTP_CONNECTOR" 
        url="http://almighty.pri.univie.ac.at/~mangler/helloService.php">
        <attr name="requestContent">
                <![CDATA[<?xml version="1.0" encoding="UTF-8"?>
                <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
                <soapenv:Body>
                <ns1:sayHelloTo soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:ns1="urn:helloService">
                <name xsi:type="xsd:string">Agata</name>
                </ns1:sayHelloTo>
                </soapenv:Body>
                </soapenv:Envelope>]]>
        </attr>
</Node>

Java Execute

This component executes Java code. It dooesn't handle any input and output. This component is a wrapper around class implementing org.jetel.component.JavaRunnable interface, which contains Java code to be executed (method run()).

Input ports:

  • none

Output ports:

  • none
Attribute Mandatory Description Default ETL Version since
id yes component identification
runnableClass no, if runnable or runnable URL name of the class to be used
runnable no, if runnable URL or runnable Class contains definition of class with Java source
runnableURL no, if runnableClass or runnable path to the file with code to be executed
charset no encoding of extern source
properties no string containing key value pairs in a form key1=value1;key2=velu2;key3=value3;… they can be retrieved in Java code via org.jetel.component.JavaRunnable interface in init() method as a Java Properties object

Example:

<Node id="JAVA_EXECUTE0"
      type="JAVA_EXECUTE"
      properties="testingProperty=Testing output string">
<attr name="runnable"><![CDATA[
import org.jetel.component.BasicJavaRunnable;
 
public class testJavaRunnable extends BasicJavaRunnable {
 
	public void run() {
		System.out.println("Basic example of Java Execute, value of testingProperty = " 
				+ parameters.getProperty("testingProperty"));
	}
}
]]></attr>
</Node>

Lookup Table Reader/Writer

Writes/reads records to/from lookup table. Depending on ports connected records are read put to lookup table and/or read from it.

Input ports:

  • if no input port connected, read from lookup table and write to output ports, output port(s) must be defined/connected.
  • if one input port connected, read from one input port and write to lookup table.

Output ports:

  • if no output port connected, write to lookup table, input port must be defined/connected.
  • if at least one output port connected, write to output port(s), read from lookup table.


Xml attributes:

Attribute Mandatory Description Default ETL Version since
id yes component identification
type yes component type LOOKUP_TABLE_READER_WRITER
lookupTable yes name of lookup table for reading or writing records
freeLookupTable no idicates if close lookup table after finishing execute() method. All records, which are stored only in memory will be lost false

Example:

  <Node id="READ" type="LOOKUP_TABLE_READER_WRITER" lookupTable="LookupTable1" />



Run Graph

since 2.4

This component runs specified graphs. Filenames of the graphs to be executed can be supplied by the input edge. The first field in input metadata must be String. In case the input is not connected a single filename must be specified as the attribute graphName.

Execution modes

Graph can be executed

  1. in the same JVM: Graph is run by existing executor in current JVM instance. Log output is flushed together with current graph.
  2. in separated JVM: Graph is run as external process. Log output is written into specified file. Executed JVM will have the same classpath as current JVM.
    In this mode, supplying the command line for clover is necessary (at least the -plugins argument).
    Command line arguments can be supplied by cloverCmdLineArgs attribute or by second field in input port (Supplying by field can be made only in in/out mode).
    If cloverCmdLineArgs attribute is defined and input port is connected then field from input port has higher priority. Thus when some graph defined in input port hasn't assigned command line arguments cloverCmdLineArgs attribute is used.
    When command line arguments is defined by input port then different command line arguments can be used for each graph.
    Note: using input port to defined command line arguments is implemented from version 2.5.

Adding memory to individual graphs

If you know that your graphs need more memory, set the alternativeJavaCmdLine attribute to “java -Xmx1g -cp” or equivalent. Remember that sameInstance must be set to false, otherwise this alternativeJavaCmdLine setting would be ignored.
If you set sameInstance to true, you should set the maximum needed memory in the Open Run Dialog before running the graph.
Remember that memory cannot be set as Command line argument.

Processing modes

Choosing of processing mode depend on attribute graphName. If attribute graphName is defined then pipeline mode is used else in-out mode is used.

Pipeline mode

In this mode graph to execute is specified by attribute graphName. If a input is connected then it's ignored. Each RunGraph node may be used as resolution node, because it chooses output port according to graph execution success/fail.

Input ports: no input

Output ports:

  • port 0 - one record will be put on this port only if graph execution success
  • port 1 - one record will be put on this port only if graph execution fails

In-out mode

In this mode input is connected and attribute graphName is not set. The first field in input metadata must be String containing path to graph file definition. Thus it executes graph for each incoming record.
If input metadata contains more then one filed then the second field must be String and it contains clover command line arguments.

Input ports: port 0 - first field of record must be String path to graph file definition.
Second field is optional - it must be String and contains clover command line arguments.

Output ports: port 0 - one record for each graph execution is generated; it may describe fail or success otherwise


Xml attributes

Attribute Mandatory Description Default ETL Version since
id yes component identification
type yes component type RUN_GRAPH
graphName no path to XML graph definition
sameInstance no if it's true, graph will be executed in the same instance of JVM; otherwise it will be executed as external process true
ignoreGraphFail no if more graphs is executed and any of them fails, then execution next graphs depends on this attribute.
Note: used only if in/out mode is used.
false 2.5
logFile no path to file for logging output of external process 2.8 - before it was usable only if sameInstance was true

Additional attributes for execution in separated instance of JVM

if sameInstance is true, all following attributes are ignored

Attribute Mandatory Description Default ETL Version since
logAppend no if true, log will be appended to existing, otherwise existing log will be overwritten false
alternativeJavaCmdLine no command line to execute external process java -cp
graphExecClass no full class name which runs graph org.jetel.main.runGraph
cloverCmdLineArgs no additional parameters for graph exec class

Example:

<Node id="RUN1" type="RUN_GRAPH" graphName="${WORKSPACE}/graphXmlWriter.grf" />


Output metadata

<Metadata id="Metadata0">
<Record name="outdata" recordSize="-1" type="delimited">
<Field delimiter=";" name="graph" nullable="true" type="string"/>
<Field delimiter=";" name="result" nullable="true" type="string"/>
<Field delimiter=";" name="description" nullable="true" type="string"/>
<Field delimiter=";" name="message" nullable="true" type="string"/>
<Field delimiter="\n" name="duration" nullable="true" type="decimal"/>
</Record>
</Metadata>

Output metadata fields

Field Description
graph path to file with executed graph
result “Finished successfully” or “Aborted” or “Error”
description text description useful when graph fails
message string value of org.jetel.graph.Result
duration graph execution duration in milliseconds


Sequence checker

since 2.6

Checks incoming records based on specified key whether all records are in the properly order. The key is a name (or combination of names) of field(s) from the input record. The sort order is either Ascending (default) or Descending.

Input ports: * one input port defined/connected.

Output ports:

  • 0-n output ports defined/connected.


Xml attributes:

Attribute Mandatory Description Default ETL Version since
id yes component identification
type yes component type EXT_SORT
sortKey yes sequence of field names followed by (a) or (d) meaning that the field is sorted in ascending or descending order, respectively. The individual expressions are separated by :;| {colon, semicolon, pipe}. Example: Surname(a);FirstName(a);Salary(d).
sortOrder no one of “Ascending|Descending” {the fist letter is sufficient} Ascending
uniqueKeys no true/false permit duplicate keys false
equalNULL no specifies whether two fields containing NULL values are considered equal. true 2.8.1
locale no locale to be used when sorting using I18N rules. If not specified, then system default is used. Example: “fr”
useI18N no true/false perform sorting according to national rules - e.g. Czech or German handling of characters like “i”,”Ă­”. Default is false.¨Use it only if you are sorting data according to key which can contain accented characters or you want sorter to follow certain locale specific rules. false


Example:

  <Node id="SEQUENCE_CHECKER0" 
        type="SEQUENCE_CHECKER"
        sortKey="LastName" />



Speed Limiter

The same as Simple Copy, but it waits specified amount of milliseconds before each record is sent to output. All records from input port are copied onto all connected output ports.

Input ports: * one input port defined/connected.

Output ports:

  • at least one output port defined/connected.


Xml attributes:

Attribute Mandatory Description Default ETL Version since
id yes component identification
type yes component type SPEED_LIMITER
delay yes length of delay between records in milliseconds

Example:

  <Node id="SPEED_LIMITER0" type="SPEED_LIMITER" delay="2000" />



System Execute

This component executes the specified command and arguments in a separate process and send output from process to output port. If there is specified command interpreter, command string is written to tmp file and there is called this interpreter to execute system script. If there is not connected output port, output from process (together with error) can be send to a file.

Input ports:

  • one input port defined/connected (optional) - input records for system command (process' input).

Output ports:

  • one input port defined/connected (optional) - output stream from system command (process' output).


Xml attributes:

Attribute Mandatory Description Default ETL Version since
id yes component identification
type yes component type SYS_EXECUTE
command yes command to be execute by system. If there is specified interpreter, command is saved in tmp file and there is called interpreter,which executes this script
capturedErrorLines no number of lines that are print out if command finishes with errors
outputFile no path to the output file
append no whether to append data at the end if output file exists or replace it (values: true/false) false
interpreter no It has to have form:“interpreter name [parameters] ${} [parameters]”, where in place of ${} System Execute is to put the name of script file
workingDirecory no this component's working directory current directory 3.0
environment no system-dependent mapping from variables to values. Mappings are separated by :;| {colon, semicolon, pipe}. By default the new value is appended to the environment of the current process. It can be changed by adding !false after the new value, eg.: PATH=/home/user/mydir appends /home/user/mydir to the existing PATH, but PATH=/home/user/mydir!false replaces the old value by the new one (/home/user/mydir). 3.0
deleteBatch no whether to delete or not temporary batch file true 3.0
charset no encoding used for formating/parsing data for input/from output of system process ISO-8859-1 3.0

Example:

  <Node id="SYS_EXECUTE0" type="SYS_EXECUTE">
        <attr name="capturedErrorLines">3</attr>
        <attr name="command">rm test.txt</attr>
  </Node>
 
  <Node id="SYS_EXECUTE0" type="SYS_EXECUTE">
        <attr name="interpreter">sh ${}</attr>
        <attr name="outputFile">data/data.txt</attr>
        <attr name="command">touch data/data.txt
                             cat
                             dir /home/username
        </attr>
  </Node>
 
  <Node id="SYS_EXECUTE0" type="SYS_EXECUTE">
        <attr name="interpreter">sh ${} /tmp</attr>
        <attr name="command">cat
                             ls -l $1
        </attr>
 </Node>
 
 <Node append="false" id="SYS_EXECUTE0" interpreter="sh ${}" outputFile="data-tmp/data.txt" type="SYS_EXECUTE">
       <attr name="environment">PATH=/mybin;MY_VARIABLE=my_value</attr>
       <attr name="deleteBatch">false</attr>
       <attr name="workingDirectory">/home/user/home/tmp</attr>
       <attr name="command">echo $PATH
                            echo $MY_VARIABLE
                            touch data/data.txt
                            cat
                            dir .
       </attr>
</Node>
 
When you want to use input records as arguments of your command:
<Node id="SYS_EXECUTE0" interpreter="xargs sh ${}" type="SYS_EXECUTE">
       <attr name="command">mv -t processed-in $@</attr>
</Node>

In following example, the record with file name is sent to System Execute component, that process' the file, creating a xml file as output. This resulting file is sent to XMLExtract component. You can send whole file (if it is small) as one record with following metadata:

<Metadata id="Metadata1" >
  <Record name="oneRecord" type="delimited">
    <Field eofAsDelimiter="true" name="data" nullable="true" type="string"/>
  </Record>
</Metadata>
or cut the file to parts if it is bigger:
<Metadata id="Metadata1" >
  <Record name="oneRecord" type="delimited" recordDelimiter="\n">
    <Field eofAsDelimiter="true" name="data" nullable="true" type="string"/>
  </Record>
</Metadata>
Then you need to set fileURL attribute of your Reader properly (see File URL attribute). If you can parse each portion of data separately, use discrete key word (eg. port:$0.fieldName:discrete), if it is not possible use stream key word (eg. port:$0.fieldName:stream) - then all data are concatenated and parsed as one data source.



components/others.txt · Last modified: 2010/07/14 10:39 by avackova
Back to top
chimeric.de = chi`s home Valid CSS Driven by DokuWiki do yourself a favour and use a real browser - get firefox!! Recent changes RSS feed Valid XHTML 1.0