Default Properties
The CloverETL framework is configured through the defaultProperties resource file, which is located in the “org/jetel/data/” subdirectory of the CloverETL engine. The engine loads the parameters stored in this file at run time and uses them during transformation execution. These parameters cannot be altered once the engine is running. The predefined values can be overridden at startup by a custom configuration file that defines a subset of the properties. The path to the configuration file can be specified in two ways, listed in order of increasing priority:
1) The system property cloveretl.properties can name a Java resource file.
2) The command-line attribute -config can pass a custom configuration file.
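
The override behaviour described above can be illustrated with plain `java.util.Properties`. This is only a sketch of the layering idea, not the engine's actual loading code: the class `LayeredDefaults` and its `load` method are hypothetical names, and the property texts are passed in as strings so the example stays self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of the CloverETL API.
final class LayeredDefaults {

    // Parses the built-in defaults, then overlays the custom text (may be null).
    static Properties load(String defaultsText, String customText) {
        try {
            Properties defaults = new Properties();
            defaults.load(new StringReader(defaultsText));

            // A Properties object built on a parent falls back to the parent
            // for every key the custom file does not define.
            Properties effective = new Properties(defaults);
            if (customText != null) {
                effective.load(new StringReader(customText));
            }
            return effective;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String defaults = "Record.MAX_RECORD_SIZE = 12288\n"
                        + "DataParser.FIELD_BUFFER_LENGTH = 4096\n";
        // The custom file redefines only a subset of the properties.
        String custom = "Record.MAX_RECORD_SIZE = 16384\n";

        Properties effective = load(defaults, custom);
        System.out.println(effective.getProperty("Record.MAX_RECORD_SIZE"));         // overridden: 16384
        System.out.println(effective.getProperty("DataParser.FIELD_BUFFER_LENGTH")); // default: 4096
    }
}
```

Keys missing from the custom text keep their default values, which matches the "subset of the properties" behaviour described above.
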
The following text describes some of the values stored in defaultProperties and their meaning. The file itself contains comments that may provide additional information.
The following CloverETL run-time constants limit the maximum sizes of different data objects:
| Constant name | Value | Description |
|---|---|---|
| Record.MAX_RECORD_SIZE | 12288 | Limits the maximum size of a data record in binary form, which is the form Clover uses when manipulating data. Parsers convert the text or database representation of data records to Clover's internal form. Some data are larger in text form (dates, numbers) and some are shorter (strings, for example, because Java stores strings in Unicode, 16 bits per character). If you start getting buffer overflow or similar errors, increase this value. The limit is theoretically 2^31, but you should keep it under 64K. At run time Clover allocates several internal buffers of “MAX_RECORD_SIZE”, so increasing this value also increases memory utilization. |
| DataParser.FIELD_BUFFER_LENGTH | 4096 | A constant for textual data parsers (and also formatters). It determines the maximum size of a single data field in text format. If your data contain long text strings, increase this value. The impact on memory utilization is low, as each parser your graph uses allocates only one such buffer. |
Size limits for data types (inherent, cannot be changed)
| Data type | Size |
|---|---|
| String | limited only by available memory. Theoretical limit 2^31. |
| Integer | 4 bytes; the minimum representable value is -2^31, the maximum is 2^31-1. |
| Date | 8 bytes (aka long) |
| Number | 8 bytes (aka double) |
| Long | 8 bytes; the minimum value is -2^63, the maximum is 2^63-1. |
| Decimal | depends on defined length & precision |
| Binary | limited only by available memory. Theoretical limit 2^31. |
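
The 4-byte and 8-byte ranges in the table can be checked against the Java primitive types. The sketch below assumes that Clover's Integer, Long and Number types correspond to Java `int`, `long` and `double`, which the "aka long" and "aka double" notes suggest; `TypeLimits` is a hypothetical class name used only for this illustration.

```java
import java.math.BigInteger;

// Hypothetical demo class, not part of the CloverETL API.
final class TypeLimits {

    // Computes 2^exponent exactly, without overflow.
    static BigInteger twoPow(int exponent) {
        return BigInteger.valueOf(2).pow(exponent);
    }

    public static void main(String[] args) {
        // 4-byte signed integer: -2^31 .. 2^31-1
        System.out.println(BigInteger.valueOf(Integer.MIN_VALUE).equals(twoPow(31).negate()));
        System.out.println(BigInteger.valueOf(Integer.MAX_VALUE).equals(twoPow(31).subtract(BigInteger.ONE)));

        // 8-byte signed integer: -2^63 .. 2^63-1
        System.out.println(BigInteger.valueOf(Long.MIN_VALUE).equals(twoPow(63).negate()));
        System.out.println(BigInteger.valueOf(Long.MAX_VALUE).equals(twoPow(63).subtract(BigInteger.ONE)));

        // Storage sizes in bytes for int, long and double
        System.out.println(Integer.BYTES + " " + Long.BYTES + " " + Double.BYTES);
    }
}
```
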
| Constant name | Value | Description |
|---|---|---|
| DEFAULT_INTERNAL_IO_BUFFER_SIZE | 32768 | Determines the size of the internal buffer that Clover components allocate for I/O operations. Again, increasing this value has little impact on overall memory utilization, as only a few such buffers are used at run time. Increasing it to gain speed is pointless: tests have shown that the performance improvement is negligible. However, if you increase “MAX_RECORD_SIZE”, make sure this value is at least “2*MAX_RECORD_SIZE”. |
| DEFAULT_IOSTREAM_CHANNEL_BUFFER_SIZE | 2048 | The size of the internal buffer of newly created InputStream and OutputStream objects. Used mainly when creating Channels from these streams, e.g. in DelimitedDataReader, FixLenReader, etc. |
| Lookup.LOOKUP_INITIAL_CAPACITY | 512 | The initial capacity (number of records) of a lookup table (or of the internal hash table for HashJoin) when it is created without specifying the size. |
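
The relationship between the record size and the I/O buffer size stated in the table can be written as a simple check. This is a sketch only: `BufferSizeCheck` and its methods are hypothetical and are not part of the engine, which may or may not validate these values itself.

```java
// Hypothetical helper, not part of the CloverETL API.
final class BufferSizeCheck {

    // Smallest I/O buffer size satisfying the "2*MAX_RECORD_SIZE" rule.
    static long minimalIoBufferSize(int maxRecordSize) {
        return 2L * maxRecordSize;
    }

    // True when the I/O buffer is at least twice the maximum record size.
    static boolean isConsistent(int maxRecordSize, int ioBufferSize) {
        return ioBufferSize >= minimalIoBufferSize(maxRecordSize);
    }

    public static void main(String[] args) {
        // Shipped defaults: 2 * 12288 = 24576, which fits into 32768.
        System.out.println(isConsistent(12288, 32768));  // true
        // Raising the record size to 32768 requires a buffer of at least 65536.
        System.out.println(isConsistent(32768, 32768));  // false
        System.out.println(minimalIoBufferSize(32768));  // 65536
    }
}
```
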
| Constant name | Value | Description | Version since |
|---|---|---|---|
| DEFAULT_DATE_FORMAT | yyyy-MM-dd | The default mask used when converting a DATE value to or from a STRING. Only the date (day) information is considered. | |
| DEFAULT_DATETIME_FORMAT | yyyy-MM-dd HH:mm:ss | The default mask used when converting a DATE value to or from a STRING. Both date and time information is considered. | |
| DEFAULT_TIME_FORMAT | HH:mm:ss | The default mask used when converting a DATE value to or from a STRING. Only the time information is considered. | |
| DEFAULT_REGEXP_TRUE_STRING | T\|TRUE\|YES\|Y\|\|t\|true\|1\|yes\|y | The default mask used when converting a BOOLEAN true value to or from a STRING. | 2.4 |
| DEFAULT_REGEXP_FALSE_STRING | F\|FALSE\|NO\|N\|\|f\|false\|0\|no\|n | The default mask used when converting a BOOLEAN false value to or from a STRING. | 2.4 |
| DataFormatter.DEFAULT_CHARSET_ENCODER | ISO-8859-1 | The default character encoder to be used. | |
| DataParser.DEFAULT_CHARSET_DECODER | ISO-8859-1 | The default character decoder to be used. | |
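
To show what the default masks look like in practice, the sketch below applies them with `java.text.SimpleDateFormat` and `java.util.regex.Pattern`. That Clover interprets the masks with exactly these classes is an assumption made for the illustration, and `DefaultMasksDemo` is a hypothetical class name. The time zone is fixed to UTC only to make the output reproducible.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the CloverETL API.
final class DefaultMasksDemo {

    static final String DATE_MASK = "yyyy-MM-dd";
    static final String DATETIME_MASK = "yyyy-MM-dd HH:mm:ss";
    static final String TIME_MASK = "HH:mm:ss";

    // Values copied from the table; the empty alternative between "Y" and "t"
    // (and between "N" and "f") means the empty string matches both patterns.
    static final Pattern TRUE_STRING = Pattern.compile("T|TRUE|YES|Y||t|true|1|yes|y");
    static final Pattern FALSE_STRING = Pattern.compile("F|FALSE|NO|N||f|false|0|no|n");

    private static SimpleDateFormat newFormat(String mask) {
        SimpleDateFormat format = new SimpleDateFormat(mask, Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC")); // fixed zone for reproducible output
        return format;
    }

    static String format(String mask, Date value) {
        return newFormat(mask).format(value);
    }

    static Date parse(String mask, String text) {
        try {
            return newFormat(mask).parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Text does not match mask " + mask, e);
        }
    }

    static boolean isTrue(String text) {
        return TRUE_STRING.matcher(text).matches();
    }

    static boolean isFalse(String text) {
        return FALSE_STRING.matcher(text).matches();
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        System.out.println(format(DATE_MASK, epoch));      // 1970-01-01
        System.out.println(format(DATETIME_MASK, epoch));  // 1970-01-01 00:00:00
        System.out.println(format(TIME_MASK, epoch));      // 00:00:00

        // The date-only mask carries no time information, so the time part is midnight.
        System.out.println(format(DATETIME_MASK, parse(DATE_MASK, "2008-02-29"))); // 2008-02-29 00:00:00

        System.out.println(isTrue("YES") + " " + isFalse("YES")); // true false
        System.out.println(isTrue("0") + " " + isFalse("0"));     // false true
    }
}
```
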