This components perform various operations like data verification, command performing and so on.
Common attributes
| Attribute | Description | Exp. |
|---|---|---|
id | component identification | string |
type | component type. This attribute is automatically generated from gui. | string |
Checks a defined foreign key against a table of primary keys to verify whether the foreign key is valid. If the foreign key is not found among primary keys, default foreign key is substituted. The resulting foreign record is broadcast to all connected outputs. The table containing the “primary” keys may contain duplicates but they will be ignored.
Input ports: * 0 - foreign records.
Output ports:
Xml attributes:
| Attribute | Mandatory | Description | Default | ETL Version since |
|---|---|---|---|---|
id | yes | component identification | ||
type | yes | component type | CHECK_FOREIGN_KEY | |
primaryKey | yes , since 2.5 version deprecated (see foreignKey) | Name(s) of column(s) in input [1] that contain the primary key(s) (field names separated by [:;|]) | ||
foreignKey | yes | Name(s) of column(s) in input [0] that contain the foreign key(s) (field names separated by [:;|]). Since 2.5 version it is advocated to use new firm syntax for foreign key: $foreignField1=$primaryField1;$foreignField2=$primaryField2;…. | ||
defaultForeignKey | yes | Key that should be substituted if the foreignKey doesn't exist in primaryKey. The order of the fields is as specified in foreignKeys (fields separated by [:;|]). | ||
hashTableSize | no | should be larger than the number of unique primary keys | 512 or properties |
Example:
<Node id="CHECKFOREIGN" type="CHECK_FOREIGN_KEY" primaryKey="CustomerID" foreignKey="CustomerID" defaultForeignKey="-1" /> <Node defaultForeignKey="-;-;" foreignKey="$city=$city;$country=$country;#" id="CHECK_FOREIGN_KEY0" type="CHECK_FOREIGN_KEY"/>
This component executes specified command (SQL/DML) against specified DB.
Input ports:
Output ports:
Xml attributes:
| Attribute | Mandatory | Description | Default | ETL Version since |
|---|---|---|---|---|
id | yes | component identification | ||
type | yes | component type | DB_EXECUTE | |
dbConnection | yes | id of the Database Connection object to be used to access the database | ||
url | if no sqlQuery | url location of the query. The query will be loaded from the file referenced by the url. Since 2.7 version standard clover fileURL syntax is supported. Query can be read from a file or from the input port (using port-reading attribute, e.g.: port:$0.fieldName:discrete). | ||
sqlQuery | if no url | SQL/DML/DDL statement(s) which has to be executed on database. In case you want to call stored procedure or function with parameters or producing output data set, it has to be in form: {[? = ]call procedreName([?[,?…]])} (Note: remember to close statement in curly brackets), when input/output parameters has to be set in proper attribute. If several statements should be executed, separate them by [;] (semicolon - default value; see sqlStatementDelimiter). They will be executed one by one. | stored procedures/functions with parameters since 2.4 | |
inParameters | no | when calling stored procedure/function with input parameters. Maps out which input fields would be treated as proper input parameters. Parameters are counted from 1. Form: 1:=$inField1;…n:=$infieldN. Can't be set if statements are read from input port. | 2.4 | |
outParameters | no | when calling stored procedure/function with output parameters or returning value. Maps out which output fields would be treated as proper output parameters. Parameters are counted from 1. If function return a value, this is the first parameter Form: 1:=$outField1;…n:=$outfieldN | 2.4 | |
outputFields | no | when stored procedure/function returns set of data its output will be parsed to given output fields. This is list of output fields delimited by semicolon. | 2.4 | |
sqlStatementDelimiter | no | delimiter of sql statement in sqlQuery attribute | ; | |
printStatements | no | Specifies whether SQL commands are outputted to stdout. | No | |
inTransaction | no | one of: ONE,SET,ALL,NEVER_COMMIT specifying whether statement(s) should be executed in transaction. For * ONE - commit is perform after each query execution * SET - for each input record there are executed all statements. After set of statements there is called commit, so if error occurred during execution of any statement, all statements for this record would be rolled back * ALL - commit is called only after all statements, so if error occurred all operations would be rolled back * NEVER_COMMIT - commit isn't call at all (may be called from different component in different phase) - since 2.6.1 ver. Works only if database supports transactions. | SET | since 2.4. For older boolean with default NO |
callStatement | no | boolean value (Y/N) - specifies whether SQL commands should be treated as stored procedure calls - using JDBC CallableStatement. | No | |
errorActions | no | defines if graph is to stop, when query throws SQLException. Available actions are: STOP or CONTINUE. For CONTINUE action, error message is logged to console or file (if errorLog attribute is specified) and for STOP error is populated and graph execution is stopped. Error action can be set for each error code (value1=action1;value2=action2;…) or for all values the same action (STOP or CONTINUE). It is possible to define error actions for some error values and for all other values (MIN_INT=myAction). | -1=CONTINUE;MIN_INT=STOP | 2.6 |
errorLog | no | path to the error log file. | 2.6 |
Example:
<Node id="DATABASE_RUN" type="DB_EXECUTE" dbConnection="NorthwindDB" sqlQuery="drop table employee_z" /> <Node id="DATABASE_RUN" type="DB_EXECUTE" dbConnection="NorthwindDB" errorActions="CONTINUE"> <SQLCode> create table testTab ( name varchar(20) ); insert into testTab ('nobody'); insert into testTab ('somebody'); </SQLCode> </Node> <Node dbConnection="Connection1" id="DB_EXECUTE1" type="DB_EXECUTE"> <attr name="sqlQuery">create table proc_table ( id INTEGER, string VARCHAR(80), date DATETIME ); CREATE PROCEDURE SPDownload @last_dl_ts DATETIME AS BEGIN SELECT id, string, date FROM proc_table WHERE date >= @last_dl_ts END;></attr> </Node> <Node callStatement="true" dbConnection="Connection1" id="DB_EXECUTE2" inParameters="1:=$date" outputFields="id;string;date" type="DB_EXECUTE" sqlQuery="{call SPDownload(?)}" > <Node dbConnection="Connection0" errorActions="MIN_INT=CONTINUE;" id="DB_EXECUTE0" printStatements="true" type="DB_EXECUTE" url="port:$0.field1:discrete"/>
This component is simple HTTP requestor. Connector is dedicated to generate HTTP requests and to receive responses from web servers.
Generally there are two ways how to specify a request content and how to handle a response:
| Attribute | Mandatory | Description | Default | ETL Version since |
|---|---|---|---|---|
id | yes | component identification | ||
url | yes | URL of remote HTTP server, where the requests are sent | 2.8 | |
requestMethod | no | HTTP request method - GET or POST | GET | 2.8 |
requestProperties | no | additional HTTP request properties; given properties will be part of header of HTTP request *) | GET | 2.8 |
inFileUrl | no | body of the single HTTP request | 2.8 | |
outFileUrl | no | body of the single HTTP response | 2.8 | |
appendOutput | no | whether to append data to output file | false | 2.8 |
charset | no | character encoding of the input/output file | ISO-8859-1 | 2.8 |
inputField | no | name of input string field with request body | 2.8 | |
outputField | no | name of output string field with response body | 2.8 | |
responseAsFileName | no | if true, output fields will be populated with the names of temporary files, where the response bodies are stored; this is necessary in case response body is too big to be stored in a single string data field; developer is responsible to delete all these temp files | false | 2.8 |
responseDirectory | no | directory where the temporary files will be created | 2.8 | |
responseFilePrefix | no | file name prefix for temporary files | 2.8 |
*) properties are accepted in key=value form i.e.
attribute1=value1 attribute2=value2 attribute3=value3
Example:
<Node guiName="HTTP connector" id="HTTP_CONNECTOR0" outFileUrl="${DATAOUT_DIR}/result.txt" requestMethod="POST" type="HTTP_CONNECTOR" url="http://almighty.pri.univie.ac.at/~mangler/helloService.php"> <attr name="requestContent"> <![CDATA[<?xml version="1.0" encoding="UTF-8"?> <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <soapenv:Body> <ns1:sayHelloTo soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:ns1="urn:helloService"> <name xsi:type="xsd:string">Agata</name> </ns1:sayHelloTo> </soapenv:Body> </soapenv:Envelope>]]> </attr> </Node>
This component executes Java code. It dooesn't handle any input and output. This component is a wrapper around class implementing org.jetel.component.JavaRunnable interface, which contains Java code to be executed (method run()).
Input ports:
Output ports:
| Attribute | Mandatory | Description | Default | ETL Version since |
|---|---|---|---|---|
id | yes | component identification | ||
runnableClass | no, if runnable or runnable URL | name of the class to be used | ||
runnable | no, if runnable URL or runnable Class | contains definition of class with Java source | ||
runnableURL | no, if runnableClass or runnable | path to the file with code to be executed | ||
charset | no | encoding of extern source | ||
properties | no | string containing key value pairs in a form key1=value1;key2=velu2;key3=value3;… they can be retrieved in Java code via org.jetel.component.JavaRunnable interface in init() method as a Java Properties object |
Example:
<Node id="JAVA_EXECUTE0" type="JAVA_EXECUTE" properties="testingProperty=Testing output string"> <attr name="runnable"><![CDATA[ import org.jetel.component.BasicJavaRunnable; public class testJavaRunnable extends BasicJavaRunnable { public void run() { System.out.println("Basic example of Java Execute, value of testingProperty = " + parameters.getProperty("testingProperty")); } } ]]></attr> </Node>
Writes/reads records to/from lookup table. Depending on ports connected records are read put to lookup table and/or read from it.
Input ports:
Output ports:
Xml attributes:
| Attribute | Mandatory | Description | Default | ETL Version since |
|---|---|---|---|---|
id | yes | component identification | ||
type | yes | component type | LOOKUP_TABLE_READER_WRITER | |
lookupTable | yes | name of lookup table for reading or writing records | ||
freeLookupTable | no | idicates if close lookup table after finishing execute() method. All records, which are stored only in memory will be lost | false |
Example:
<Node id="READ" type="LOOKUP_TABLE_READER_WRITER" lookupTable="LookupTable1" />
since 2.4
This component runs specified graphs. Filenames of the graphs to be executed can be supplied by the input edge. The first field in input metadata must be String. In case the input is not connected a single filename must be specified as the attribute graphName.
Graph can be executed
If you know that your graphs need more memory, set the alternativeJavaCmdLine attribute to “java -Xmx1g -cp” or equivalent. Remember that sameInstance must be set to false, otherwise this alternativeJavaCmdLine setting would be ignored.
If you set sameInstance to true, you should set the maximum needed memory in the Open Run Dialog before running the graph.
Remember that memory cannot be set as Command line argument.
Choosing of processing mode depend on attribute graphName. If attribute graphName is defined then pipeline mode is used else in-out mode is used.
In this mode graph to execute is specified by attribute graphName. If a input is connected then it's ignored. Each RunGraph node may be used as resolution node, because it chooses output port according to graph execution success/fail.
Input ports: no input
Output ports:
In this mode input is connected and attribute graphName is not set.
The first field in input metadata must be String containing path to graph file definition.
Thus it executes graph for each incoming record.
If input metadata contains more then one filed then the second field must be String and it contains clover command line arguments.
Input ports: port 0 - first field of record must be String path to graph file definition.
Second field is optional - it must be String and contains clover command line arguments.
Output ports: port 0 - one record for each graph execution is generated; it may describe fail or success otherwise
| Attribute | Mandatory | Description | Default | ETL Version since |
|---|---|---|---|---|
id | yes | component identification | ||
type | yes | component type | RUN_GRAPH | |
graphName | no | path to XML graph definition | ||
sameInstance | no | if it's true, graph will be executed in the same instance of JVM; otherwise it will be executed as external process | true | |
ignoreGraphFail | no | if more graphs is executed and any of them fails, then execution next graphs depends on this attribute. Note: used only if in/out mode is used. | false | 2.5 |
logFile | no | path to file for logging output of external process | 2.8 - before it was usable only if sameInstance was true |
Additional attributes for execution in separated instance of JVM
if sameInstance is true, all following attributes are ignored
| Attribute | Mandatory | Description | Default | ETL Version since |
|---|---|---|---|---|
logAppend | no | if true, log will be appended to existing, otherwise existing log will be overwritten | false | |
alternativeJavaCmdLine | no | command line to execute external process | java -cp | |
graphExecClass | no | full class name which runs graph | org.jetel.main.runGraph | |
cloverCmdLineArgs | no | additional parameters for graph exec class |
Example:
<Node id="RUN1" type="RUN_GRAPH" graphName="${WORKSPACE}/graphXmlWriter.grf" />
<Metadata id="Metadata0"> <Record name="outdata" recordSize="-1" type="delimited"> <Field delimiter=";" name="graph" nullable="true" type="string"/> <Field delimiter=";" name="result" nullable="true" type="string"/> <Field delimiter=";" name="description" nullable="true" type="string"/> <Field delimiter=";" name="message" nullable="true" type="string"/> <Field delimiter="\n" name="duration" nullable="true" type="decimal"/> </Record> </Metadata>
Output metadata fields
| Field | Description |
|---|---|
graph | path to file with executed graph |
result | “Finished successfully” or “Aborted” or “Error” |
description | text description useful when graph fails |
message | string value of org.jetel.graph.Result |
duration | graph execution duration in milliseconds |
since 2.6
Checks incoming records based on specified key whether all records are in the properly order. The key is a name (or combination of names) of field(s) from the input record. The sort order is either Ascending (default) or Descending.
Input ports: * one input port defined/connected.
Output ports:
Xml attributes:
| Attribute | Mandatory | Description | Default | ETL Version since |
|---|---|---|---|---|
id | yes | component identification | ||
type | yes | component type | EXT_SORT | |
sortKey | yes | sequence of field names followed by (a) or (d) meaning that the field is sorted in ascending or descending order, respectively. The individual expressions are separated by :;| {colon, semicolon, pipe}. Example: Surname(a);FirstName(a);Salary(d). | ||
sortOrder | no | one of “Ascending|Descending” {the fist letter is sufficient} | Ascending | |
uniqueKeys | no | true/false permit duplicate keys | false | |
equalNULL | no | specifies whether two fields containing NULL values are considered equal. | true | 2.8.1 |
locale | no | locale to be used when sorting using I18N rules. If not specified, then system default is used. Example: “fr” | ||
useI18N | no | true/false perform sorting according to national rules - e.g. Czech or German handling of characters like “i”,”Ă”. Default is false.¨Use it only if you are sorting data according to key which can contain accented characters or you want sorter to follow certain locale specific rules. | false |
Example:
<Node id="SEQUENCE_CHECKER0" type="SEQUENCE_CHECKER" sortKey="LastName" />
The same as Simple Copy, but it waits specified amount of milliseconds before each record is sent to output. All records from input port are copied onto all connected output ports.
Input ports: * one input port defined/connected.
Output ports:
Xml attributes:
| Attribute | Mandatory | Description | Default | ETL Version since |
|---|---|---|---|---|
id | yes | component identification | ||
type | yes | component type | SPEED_LIMITER | |
delay | yes | length of delay between records in milliseconds |
Example:
<Node id="SPEED_LIMITER0" type="SPEED_LIMITER" delay="2000" />
This component executes the specified command and arguments in a separate process and send output from process to output port. If there is specified command interpreter, command string is written to tmp file and there is called this interpreter to execute system script. If there is not connected output port, output from process (together with error) can be send to a file.
Input ports:
Output ports:
Xml attributes:
| Attribute | Mandatory | Description | Default | ETL Version since |
|---|---|---|---|---|
id | yes | component identification | ||
type | yes | component type | SYS_EXECUTE | |
command | yes | command to be execute by system. If there is specified interpreter, command is saved in tmp file and there is called interpreter,which executes this script | ||
capturedErrorLines | no | number of lines that are print out if command finishes with errors | ||
outputFile | no | path to the output file | ||
append | no | whether to append data at the end if output file exists or replace it (values: true/false) | false | |
interpreter | no | It has to have form:“interpreter name [parameters] ${} [parameters]”, where in place of ${} System Execute is to put the name of script file | ||
workingDirecory | no | this component's working directory | current directory | 3.0 |
environment | no | system-dependent mapping from variables to values. Mappings are separated by :;| {colon, semicolon, pipe}. By default the new value is appended to the environment of the current process. It can be changed by adding !false after the new value, eg.: PATH=/home/user/mydir appends /home/user/mydir to the existing PATH, but PATH=/home/user/mydir!false replaces the old value by the new one (/home/user/mydir). | 3.0 | |
deleteBatch | no | whether to delete or not temporary batch file | true | 3.0 |
charset | no | encoding used for formating/parsing data for input/from output of system process | ISO-8859-1 | 3.0 |
Example:
<Node id="SYS_EXECUTE0" type="SYS_EXECUTE"> <attr name="capturedErrorLines">3</attr> <attr name="command">rm test.txt</attr> </Node> <Node id="SYS_EXECUTE0" type="SYS_EXECUTE"> <attr name="interpreter">sh ${}</attr> <attr name="outputFile">data/data.txt</attr> <attr name="command">touch data/data.txt cat dir /home/username </attr> </Node> <Node id="SYS_EXECUTE0" type="SYS_EXECUTE"> <attr name="interpreter">sh ${} /tmp</attr> <attr name="command">cat ls -l $1 </attr> </Node> <Node append="false" id="SYS_EXECUTE0" interpreter="sh ${}" outputFile="data-tmp/data.txt" type="SYS_EXECUTE"> <attr name="environment">PATH=/mybin;MY_VARIABLE=my_value</attr> <attr name="deleteBatch">false</attr> <attr name="workingDirectory">/home/user/home/tmp</attr> <attr name="command">echo $PATH echo $MY_VARIABLE touch data/data.txt cat dir . </attr> </Node> When you want to use input records as arguments of your command: <Node id="SYS_EXECUTE0" interpreter="xargs sh ${}" type="SYS_EXECUTE"> <attr name="command">mv -t processed-in $@</attr> </Node>
In following example, the record with file name is sent to System Execute component, that process' the file, creating a xml file as output. This resulting file is sent to XMLExtract component. You can send whole file (if it is small) as one record with following metadata:
<Metadata id="Metadata1" > <Record name="oneRecord" type="delimited"> <Field eofAsDelimiter="true" name="data" nullable="true" type="string"/> </Record> </Metadata>or cut the file to parts if it is bigger:
<Metadata id="Metadata1" > <Record name="oneRecord" type="delimited" recordDelimiter="\n"> <Field eofAsDelimiter="true" name="data" nullable="true" type="string"/> </Record> </Metadata>Then you need to set fileURL attribute of your Reader properly (see File URL attribute). If you can parse each portion of data separately, use discrete key word (eg. port:$0.fieldName:discrete), if it is not possible use stream key word (eg. port:$0.fieldName:stream) - then all data are concatenated and parsed as one data source.