
add_key

The add_key processor adds a deterministic key derived from the properties of a DataEntity, or of each item in a DataWindow. The key is written to both the incoming document and its metadata. It is typically used when indexing or re-indexing data.

Usage

Add a key to a record

Example of a job using the add_key processor

{
    "name": "testing",
    "workers": 1,
    "lifecycle": "persistent",
    "assets": [
        "standard"
    ],
    "operations": [
        {
            "_op": "test-reader"
        },
        {
            "_op": "add_key",
            "key_name": "_key",
            "key_fields": [
                "name",
                "age"
            ],
            "minimum_field_count": 1,
            "hash_algorithm": "md5"
        }
    ]
}

The output from the example job

const data = [
    DataEntity.make({ name: 'joe', age: 34 }),
];

const results = await processor.run(data);

// results: [{ name: 'joe', age: 34, _key: '42mFPfdm-kTh7Q_2E_VjvQ' }]
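
The 22-character key in the output above is consistent with an unpadded, URL-safe base64 encoding of a 16-byte md5 digest. The sketch below illustrates that general idea only. The helper name `sketchKey`, the sorted field order, and the way values are joined are assumptions made for illustration, not the processor's actual implementation, so it is not expected to reproduce the exact key shown above.

```javascript
const { createHash } = require('crypto');

// Hypothetical helper: NOT the add_key implementation, only an illustration
// of deriving a deterministic key from selected field values.
function sketchKey(record, keyFields, hashAlgorithm = 'md5') {
    // Assumption: read the fields in sorted order so the key does not
    // depend on the order of properties or of the keyFields list.
    const values = [...keyFields].sort().map((field) => String(record[field]));

    // Hash the joined values with the configured algorithm.
    const digest = createHash(hashAlgorithm).update(values.join('')).digest('base64');

    // Convert standard base64 to the URL-safe, unpadded form.
    return digest.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

const a = sketchKey({ name: 'joe', age: 34 }, ['name', 'age']);
const b = sketchKey({ age: 34, name: 'joe' }, ['age', 'name']);

console.log(a === b);  // true: the same field values give the same key
console.log(a.length); // 22 for md5
```

Because the key depends only on the selected field values, re-running a job over the same records produces the same keys, which is what makes the key usable for re-indexing.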

Parameters

| Configuration | Description | Type | Notes |
| --- | --- | --- | --- |
| _op | Name of the operation; it must match the file name exactly | String | required |
| key_name | Name of the field that stores the key value; applies to both the document and its metadata | String | defaults to _key |
| key_fields | List of fields whose values are used to create the key; if left empty, all fields are used | Array of Strings | defaults to an empty array |
| invert_key_fields | If set to true, the processor uses the fields not listed in key_fields to create the key | Boolean | defaults to false |
| hash_algorithm | Algorithm used to hash the field values | String; valid options are md4, md5, sha1, sha256, sha512, and whirlpool | defaults to md5 |
| minimum_field_count | The number of fields required to make the key. Fields that are empty or undefined are excluded from the key values. A record that does not meet the minimum count is not returned by the processor | Number | defaults to 0 |
| preserve_original_key | Copies the incoming record's metadata _key value to _original_key; can be useful when re-indexing a data set | Boolean | defaults to false |
| delete_original | Copies the metadata _key to _delete_id in the metadata. This allows a teraslice job to index data with a new key while deleting records by their old key in one pass; must be paired with the elasticsearch-assets bulk_sender | Boolean | defaults to false |
| truncate_location | Geo-point fields whose values should be truncated for keying purposes. Supports the geo-point formats listed here. It does not alter the value in the incoming document | Array of Strings | defaults to an empty array |
| truncate_location_places | The number of digits to keep after the decimal point for the lat and lon values of a geo-point when truncate_location is set | Number | defaults to 4 |
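
The interaction between key_fields, invert_key_fields, minimum_field_count, and truncate_location_places can be sketched roughly as follows. The helpers `selectKeyValues` and `truncatePoint` are hypothetical and exist only for illustration. Treating null and empty strings as "empty", and handling only the `{ lat, lon }` object form of a geo-point, are assumptions rather than documented behavior.

```javascript
// Hypothetical helper: picks the values that would feed the key, or returns
// null when the record does not meet minimum_field_count.
function selectKeyValues(record, options = {}) {
    const { key_fields = [], invert_key_fields = false, minimum_field_count = 0 } = options;
    const allFields = Object.keys(record).sort();

    let fields;
    if (key_fields.length === 0) {
        fields = allFields; // empty key_fields: use every field
    } else if (invert_key_fields) {
        fields = allFields.filter((f) => !key_fields.includes(f));
    } else {
        fields = allFields.filter((f) => key_fields.includes(f));
    }

    // Assumption: undefined, null, and '' all count as "empty".
    const values = fields
        .map((f) => record[f])
        .filter((v) => v !== undefined && v !== null && v !== '');

    return values.length >= minimum_field_count ? values : null;
}

// Hypothetical helper: truncates (does not round) a { lat, lon } geo-point
// and returns a new object, leaving the original value untouched.
function truncatePoint(point, places = 4) {
    const factor = 10 ** places;
    const cut = (n) => Math.trunc(n * factor) / factor;
    return { lat: cut(point.lat), lon: cut(point.lon) };
}

// city is empty, so only one usable value remains and the record is dropped.
console.log(selectKeyValues(
    { name: 'joe', age: 34, city: '' },
    { key_fields: ['name', 'city'], minimum_field_count: 2 }
)); // null

// Inverted selection keys on everything except name.
console.log(selectKeyValues(
    { name: 'joe', age: 34 },
    { key_fields: ['name'], invert_key_fields: true }
)); // [ 34 ]

console.log(truncatePoint({ lat: 40.123456, lon: -105.987654 }, 4));
// { lat: 40.1234, lon: -105.9876 }
```

Truncating coordinates before keying means that points differing only beyond the chosen number of decimal places contribute the same value to the key.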