Shared Transformer Rules

Overview

This article contains relevant information regarding the following Shared Transformer rules:


add_metadata

This rule allows user to add the ability to merge rich dimensions mapping tables when metadata is set to MergeMapping and value is set to true.

Transformer example:

    "add_merge_mapping_metadata": {
      "useSharedRule": true,
      "ruleName"     : "add_metadata",
      "params"       : {
        "metadata": "MergeMapping",
        "value"   : "true"
      }             
    },

calculate_number_of_days

It calculates the number of days between two dates and saves this as a stringified integer in a context key with a name of your choosing. -1 if error

Param Description
date1date2 Cannot be null or blank; date1 and date2 are the dates we'd like to find the difference in
date1_formatdate2_format Cannot be null or blank; date1 and date2 should have corresponding date formats as specified in date1_format and date2_format
contextkey_name Cannot be null or blank 

The return value for this rule is null.

Transformer example:

  "calculate_days_to_respond": {
	"useSharedRule": true,
	"ruleName"     : "calculate_number_of_days",
	"params"       : {
		"date1"          : "Contacted_Date",
		"date2"          : "Response_Date",
		"date1_format"   : "yyyyMMdd",
		"date2_format"   : "yyyy-MM-dd",
		"contextkey_name": "respondDays"
    }
  },

 


clean_column_values

It cleans multiple columns with the same regular expression.

Param Description
columns The list of column names you want to cleanse
replace The regex you want to find the match then apply the replace
with Optional. If not specified, replace with blank. After finding the match by using regex, replace the match using this attribute.
case_insensitive Optional. The default is false. If true, the regex will find the match in a case insensitive way.
delimiter Optional. If not specified the split will occur on ,

Transformer example:

"cleanse_hyphen_values": {
      "useSharedRule": true,
      "ruleName"     : "clean_column_values",
      "params"       : {
        "columns": "Planned Units,Planned Spend,Actualized Spend,Actual Delivery",
        "replace": "-",
        "with"   : "0"
      }
    },

 


concatenated_dimensions

It concatenates the given columns in the file and put the concatenated dimension to the first column in the output file.

Param Description
headerExpr Expression to find the headers
columnsToConcat Columns to concatenate values
concatenatedHeader Output header name 
delimiter Delimiter used for concatenated string

Transformer example:

    "concatenated_dimensions": {
      "useSharedRule": true,
      "ruleName"     : "concatenated_dimensions",
      "noMatch"      : "ERROR",
      "params"       : {
        "headerExpr"        : ".*campaignId.*parentCampaignId.*",
        "columnsToConcat"   : "targetDemo,targetStartAge,targetEndAge",
        "concatenatedHeader": "concatenatedTargetDemo",
        "delimiter"         : "_"
      }
    },

Input example:

targetDemo = "A"
targetStartAge = "B"
targetEndAge = "C"

Output example:

concatenatedTargetDemo = "A_B_C"


context_key_cleansing

This rule will look at the rowLevelCache contextKey that is specified.
Any key-value pair that is put after contextKey param will be treated as the find value, and replace value, respectively.

We can have as many key-value pairs as we'd like to clean

Param Description
contextKey contextKey from rowLevelCache that will be cleansed

Transformer example:

    "context_key_cleansing": {
      "useSharedRule": true,
      "ruleName"     : "context_key_cleansing",
      "noMatch"      : "IGNORE",      
      "params"       : {
        "contextKey": "ad_key",
        "MPPA"      : "Multi Product Post Ad",
        "PPLA"      : "Page Post Link Ad",
        "PPPA"      : "Page Post Photo Ad",
        "PPVA"      : "Page Post Video Ad"
      }
    },

 


convert_date_values

This rule converts the date format of the desiredColumn param in the raw file. For example, if the raw file's column A has a format of MM/dd/yyyy and we'd like the format to be yyyy-MM-dd then originalDateFormat is MM/dd/yyyy, desiredDateFormat is yyyy-MM-dd, desiredColumn is A. Neither of these params can be null. 

Param Description
originalDateFormat Date format of desiredColumn from the input file
desiredDateFormat Date format for the output file
desiredColumn Column from the input file that contains date to be converted
dimensionName Desired dimension name in the output file

Transformer example:

    "convert_dimension_video_createddate": {
      "useSharedRule": true,
      "ruleName"     : "convert_date_values",
      "noMatch"      : "IGNORE",
      "params"       : {
        "originalDateFormat": "MMM dd yyyy",
        "desiredDateFormat" : "yyyy-MM-dd",
        "desiredColumn"     : "Created Date",
        "dimensionName"     : "dimension(Video_CreatedDate)"
      }
    },

 


embeddedlink_to_string

This rule converts an embedded link in an input file and turns it into a string to be transformed.

Param Description
urlExpr The URL that is specified in the input file, if the URL cannot be extracted, try 'inspect element'
sheetIndex

Defaulted to 0, and is 0 based

i.e. if we want the second sheet then sheetIndex = 1

accountName

Cannot be blank or null; Name of account that will be using an input file:

i.e.,

https://https://app.beckon.com/{accountName}/home

transformerName Cannot be blank or null; Name of the transformer that will be used to transform downloaded file from URL

Transformer example:

    "extract_file_from_email" : {
      "useSharedRule": true,
      "ruleName"     : "embeddedlink_to_string",
      "params"       : {
        "urlExpr"        : ".*href=.*(http.*sprinklr.*EXPORT.*zip).*>Download Report.*",
        "sheetIndex"     : "0",
        "accountName"    : "beckon",
        "transformerName": "sprinklr_facebook"
      }     
    },

 


enrich_with_context_key

Same as save_context_key_as_dimension shared rule. The only difference is this rule will only run once for transforming which means the dimension populated by this rule will only contain one value.

Param Description
contextKey Name of context key created
dimensionName Dimension name to import into the Beckon app

Transformer example:

  "extract_report_name": {
	"useSharedRule": true,
	"ruleName"     : "enrich_with_context_key",
		"params": {
		"contextKey": "fileName",
		"dimensionName": "dimension(Report File)"
    }
  },

 


excel_csv_to_beckon_csv

This converts excel format complying csv format to beckon supporting csv format
Beckon csv uses a double quote as the escape character.
Param headerExpr cannot be null and has to be findable

Param Description
headerExpr The header in the .csv file that needs to be converted

Transformer example:

  "convert_to_beckon_csv": {
	"useSharedRule": true,
	"ruleName"     : "excel_csv_to_beckon_csv",
	"params"       : {
      "headerExpr": ".*Date.*"
    }
  },

 


expression_metric

Write expressions to create a metric.

Param Description
expression

The expression to create the metric.

You can use $row['COLUMN_NAME'] and $contextKey['KEY_NAME'] in the expression. Also, you can write any values in the expression too.

For example, $row['spend'] / 1000000'

metric_name

The metric HID of you want to assign to

null_handler

Specify if the column or the contextKey contains a 'bad' value, what the script should do.

For example, if contextKey click_rate_key contains letters instead of numbers, REPLACE_ZERO will change the value to 0 instead. Or, if choosing NO_RETURN, the script will exit.

Transformer example:

  "expression_metric": {
    "useSharedRule": true,
    "ruleName"     : "expression_metric",
    "noMatch"      : "IGNORE",
    "params"       : {
      "expression"  : "$row['column1'] * $row['column2'] + $contextKey['key1']",
      "metric_name" : "Spend",
      "null_handler": "c1=>NO_RETURN,key1=>REPLACE_ZERO"
    }
  }

filter_by_dimension_value

Beckon's built-in SimpleFilterRule is able to filter by values in a particular column in the input file. The SimpleFilterRule will return an error in file transformation when the column is not found in the file.

By using this shared rule, users are able to successfully transform files even when the column is not available in the input file. This rule will SKIP_ROW if the dimension equals to the given value.

Param Description
dimension

The dimension to apply the filter.

value

The value to filter on. If the given value matches the value in the output, the row is excluded from the output file.

Transformer example:

    "ignore_test_countries": {
      "useSharedRule": true,
      "ruleName"     : "filter_by_dimension_value",
      "params"       : {
        "dimension": "dimension(Country)",
        "value"    : "Test"
      }
    },

filter_keep_only

Used for keeping rows with certain values only. Unlike the filter rule where it excludes rows if the value is found, this rule includes only rows where the value is found.

Param Description
column

The column to check for the given values in list. Use either column or contextkey.

contextkey  The contextkey to check for the given value in list. Use either column or contextkey.
list

Rows with the values in this param will be kept. Everything else will be skipped.

list_sep

Optional. By default, the list separator is a , . If a keyword or regex is used, you will need to use a double back slash to escape the list_sep.

For example: | is the preferred separator, you'll have to escape it by using \\|

Transformer example:

    "filter_keep_US_only": {
      "useSharedRule": true,
      "ruleName"     : "filter_keep_only",
      "params"       : {
        "column": "Country",
        "list"  : "USA"
      }
    },

  


historic_load_keep_first_n_dates

This rule applies after transforming the data. The rule goes through the transformed file and includes data for only the first number of days specified in numDays.

Param Description
numDays The first n days you want to keep in the output file

For example,

numDays = 3
Date in the transformed file is 1/1 - 1/7
The output file will only contain data for 1/1, 1/2 and 1/3. 

Transformer example:

    "keep_monday_and_tuesday_only": {
      "useSharedRule": true,
      "ruleName"     : "historic_load_keep_first_n_dates",
      "params"       : {
        "numDays"  : "2"
      }
    },

map_countries_only (DEPRECATED)

Currently used only in accounts prior to 2017. Please refer to dimension cleansing for this purpose.

This rule maps country-like column content into Beckon supported country names. This rule is used to normalize country names across multiple data sources.

For example:

Data Source 1 input value = "USA"
Data Source 2 input value = "United States"
Data Source 3 input value = "US"

Output value = "United States"


map_country_dimensions (DEPRECATED)

Currently only used in accounts prior to 2017. Please refer to dimension cleansing for this purpose.

Similar to map_countries_only rule, this rule maps country-like column content into Beckon supported country names. This rule is used to normalized country names across multiple data sources with an additional dimension for country codes.

Example:

dimension(Country) = United States
dimension(Country Code) = US

 


map_us_state_dimensions (DEPRECATED)

Currently only used in accounts prior to 2017. Please refer to dimension cleansing for this purpose.

This rule maps state-like column content into Beckon supported state names. This rule is used to normalize state names across multiple data sources.

For example:

Data Source 1 input value: "IA"
Data Source 2 input value: "Iowa"

Output value = "Iowa"

 


beckon_mapping_scripted_read_through

This rule takes a raw file and turns it into a map that the user specifies with given parameters.
The params, need to be specified in the connector properties and not in the transformer.

Param

("Advanced" section of the Connector)

Description

mapping.merge_column_input_property

Optional, and is used to specify which columns should be merged

mapping.merge_column_output_property

Required if the previous param exists, and is used to state final merged column's name

mapping.input_header_property

Required, describes which raw file columns are wanted

mapping.output_header_property

Required, output header of the transformed file. If mapping.merge_column_input_property is not specified then mapping.input_header_property and mapping.output_header_property need to be equal in length.

mapping.primary_column_property

Optional, this selects a key for the map, if not stated, the default is the first column.
Please keep the order of mapping.input_header_property and mapping.output_header_property the same otherwise there's no way of knowing each column's alias. For columns that are merged, no need to repeat that reference.

Beckon's standard Twitter Ads data utilizes this particular rule for joining various campaign name and ad name mappings with the rich dimensions feature.

Example of Beckon's Twitter Ads use case:

Input file header              : accountId, campaignId, id, name, objective
Cells to merge                 : accountId, campaignId
Dimension name for merged value: TW_Ad_profileID_campaignID
Output file header             : TW_Ad_profileID_campaignID, TW_Ad_Adset_ID, TW_Ad_Adset_Name, TW_Ad_Objective

The column to use as the key for Rich Dimensions mapping: TW_Ad_Adset_ID

What it looks like in the "Advanced" tab in the connector settings:

Screen_Shot_2018-10-05_at_11.14.36_AM.png

What it looks like in the transformer Beckon_mapping_tx:

    "scripted_read_through" : {
      "useSharedRule": true,
      "ruleName"     : "beckon_mapping_scripted_read_through"
    }

 


mapping_tx_dedupe_pretx_rule

This rule is used for transforming a raw file into a deduplicated mapping file.

Param Description
targetCol Required. This represents the column to check for and remove duplicated rows. The column begins with 0.
textBlob Optional. This identifies the columns that contain foreign characters that would need to be searched and replaced with. The column begins with 0.
illegalChars

Optional. List all characters that should be replaced in the column specified in textBlob.

Eg: illegalChars: `,

replaceWith Optional. If not specified, defaults to blank.
duplicateAction

Optional. If unspecified, default is ERROR.

Allowed param value:

ERROR Immediately raises an error when duplicate values are found
FIRST_DUPE If duplicates are found, use the first value
LAST_DUPE If duplicates are found, use the last value

Transformer example:

    "mapping_tx_dedupe_pretx_rule": {
      "useSharedRule": true,
      "ruleName"     : "mapping_tx_dedupe_pretx_rule",
      "params"       : {
        "targetCol"      : 0,
        "duplicateAction": "ERROR"      
      }
    }

percentage_metric_alias

This rule will alias a metric and divide the value by 100 converting it to a float number.

For example, if the metric value is 89, the script will then provide a final metric value of 0.89. 

Param Description
Param key is the column name from the input file

Use the column from the input file as the key and the metric HID as the value.

Eg: If the column name in the input file is "spend ($)" and the desired metric HID in Beckon is "Ad Spend", then the rule in the transformer looks like:

  "percentage_metric_alias": {
    "useSharedRule": true,
    "ruleName": "percentage_metric_alias",
    "params": {
      "spend ($)": "Ad Spend"
    }
  }

preprocess_search_and_replace

This is a pre-processing rule, which means it takes place before the file is transformed. The rule finds a string and replaces the value in the input file before the transformation begins.

Param Description
search

The string to search for

replace

Value to replace string with 

Transformer example:

  "preprocess_search_and_replace": {
	"useSharedRule": true,
	"ruleName"     : "preprocess_search_and_replace",
	"params"       : {
		"search" : "NULL",
		"replace": "0"
    }
  }
 

pretxform_fix_double_quote (DEPRECATED)

Please use the rule excel_csv_to_beckon_csv instead.

This rule is used to convert double quotes to slash quotes so that the file can be read successfully.


reduce_columns

This rule is used to determine the columns to keep vs exclude from the input file.

Param Description
headerExpr

The regex for determining which row to use as the header for the data file

keepColumns

Use either keepColumns or excludeColumns.

List the columns to keep.

excludeColumns

Use either keepColumns or excludeColumns.

List the columns to exclude.

Transformer example:

    "remove_unused_columns": {
      "useSharedRule": true,
      "ruleName"     : "reduce_columns",
      "params"       : {
		"headerExpr"    : ".*dcm_advertiser_id.*",
		"keepColumns"   : "dcm_advertiser_id,dcm_advertiser_name",
		"excludeColumns": "dcm_placement"
      }
    },
 

rename_headers

This rule is used to either downcase or rename headers after finding them using a list of regular expressions.

Param Description
headerExpr

The regex for determining which row to use as the header for the data file

findRegexes

Header(s) to search for

cleanHeaders

Rename header(s) that have been found

Transformer example:

    "header_rule": {
      "useSharedRule": true,
      "ruleName"     : "rename_headers",
      "params"       : {
        "headerExpr"  : ".*id.*",
        "findRegexes" : "id,name",
        "cleanHeaders": "Twitter-Ads_CampaignID,Ad-Campaign"
      }
    },

save_context_key_as_dimension

This rule can be used to create a dimension HID using a context key.

Param Description
contextKey

Context key is created using the built in transformer rule class RowExtractionRule.

dimensionName

The dimension HID to save contextKey as

Transformer example:

  "store_click_through_url": {
	"useSharedRule": true,
	"noMatch"      : "IGNORE",
	"ruleName"     : "save_context_key_as_dimension",
	"params"       : {
		"contextKey"   : "click_through_url",
		"dimensionName": "dimension(Ads_Click-Through_URL)"
    }
  },

save_dimension_to_rulecontext

This rule is used to save the specified dimension as a contextKey for use in a different rule instead of using the RowExtractionRule to create a contextKey.

Param Description
dimensionName

The dimension HID to save as a contextKey

contextKeyName

The name of the contextKey

Transformer example:

    "extract_brand_for_pagetype_mapping": {
      "useSharedRule": true,
      "noMatch"      : "IGNORE",
      "ruleName"     : "save_dimension_to_rulecontext",
      "params"       : {
        "dimensionName" : "dimension(Brand-Product)",
        "contextKeyName": "ga_brand"
      }
    },

save_keys_to_dimension

Similar to save_context_key_as_dimension. The difference is that this rule can be used to create multiple dimensions using multiple contextKey.

Param Description
Param key is the contextKey

Value should be the dimension HID that you would like to alias using the contextKey.

Transformer example:

    "ad_name_keys_to_dimensions": {
      "useSharedRule": true,
      "ruleName"     : "save_keys_to_dimension",
      "params"       : {
        "brand"       : "dimension(Product Brand)",
        "fiscalYear"  : "dimension(Fiscal Year)",
        "campaign"    : "dimension(Campaigns)",
        "social_buy"  : "dimension(Buy Type)",
        "adTarget"    : "dimension(Ad Target)",
        "creativeType": "dimension(Creative Type)",
        "adType"      : "dimension(Ad Type)",
        "videoLength" : "dimension(Video Length)"
      }
    },

save_merged_columns

This rule is used to merge values from multiple columns into a single string and is saved into a contextKey that can be used in a different rule.

Param Description
context_key

Name of the contextKey

columns

List and order of columns to merge

key_delimiter

The delimiter for merged columns.

Eg: If the values in the columns to merge are A, B and C and the delimiter is _, the output value is A_B_C

case

upper: converts string to upper case

lower: converts strong to lower case

Transformer example:

    "extract_key_for_paid_type_mapping": {
      "useSharedRule": true,
      "ruleName"     : "save_merged_columns",
      "params"       : {
        "context_key"  : "social_paid_type",
        "columns"      : "Is Dark Post,Is Promotable Only(Dark) Tweet,Is Sponsored",
        "key_delimiter": "_",
        "case"         : "upper"
      }
    },

 


save_merged_context_keys

This rule is used to merge multiple contextKeys into a single new contextKey.

Param Description
context_keys

Name of the contextKeys to merge

new_context_key

Name of merged contextKey

params_delimiter

The delimiter used in context_keys to separate the values

new_key_delimiter

The delimiter for values of merged contextKeys

    "merge_context_keys": {
      "useSharedRule": true,
      "ruleName"     : "save_merged_context_keys",
      "params"       : {
        "context_keys"     : "ga_pagepath,ga_brand",
        "new_context_key"  : "ga_pagetypemap",
        "params_delimiter" : ",",
        "new_key_delimiter": "_"
      }
    },

strip_bom_from_header

This rule is used to strip the BOM from the first column of the header row. Use in place of the class SimpleHeaderRule.

Param Description
header_expr

Regex used to determine the row to use as a header

lowercase

Default is false. If true, converts the column to lowercase.

case_insensitive

Default is false. If true, the regex for finding the header ignores case

    "header_rule": {
      "useSharedRule": true,
      "ruleName"     : "strip_bom_from_header",
      "params"       : {
        "header_expr"     : ".*Date.*",
        "lowercase"       : false,
        "case_insensitive": false
      }
    },

truncate_all_dimensions

Beckon has a limit of 510 characters for a dimension instance value. If a dimension instance value has more than 510 characters in it, the import job will fail. This rule will truncate all dimension instance values that are over 509 characters.

Transformer example - this rule can be copied and pasted into the transformer as is:

    "truncate_all_dimensions": {
      "useSharedRule": true,
      "ruleName"     : "truncate_all_dimensions"
    }

 


truncate_dimension_by_param

This rule is used to truncate values after a specific character. For example, if you have the following value:

googleads.com/dataPage/campaignID=12345_utm=doubleclick_email?index.6789@XeYTuVB

And you want everything to be captured up to the ?, this rule will be able to exclude everything after the ?. The final output of the value will be:

googleads.com/dataPage/campaignID=12345_utm=doubleclick_email

Param Description
column

Identify the column to check for values

truncateDelimiters

Characters to truncate after. Users can have multiple characters separated by a comma.

keepDelimiters

Default is false. If true, the character specified in truncateDelimiters will be included in the output.

Using the example provided,

If false, output value will be

googleads.com/dataPage/campaignID=12345_utm=doubleclick_email

If true, output value will be

googleads.com/dataPage/campaignID=12345_utm=doubleclick_email?

Transformer example:

  "truncate_dimension_by_param": {
	"useSharedRule": true,
	"ruleName"     : "truncate_dimension_by_param",
	"params"       : {
		"column"            : "ga_pagePath",
		"truncateDelimiters": "?",
		"keepDelimiters"    : false
    }
  },

truncate_dimension

Similar to truncate_all_dimensions. The only difference is that this rule only truncates the dimension values for a specific dimension that is identified by the user.

Param Description
dimension

Determine the dimension to truncate

max_size

Optional. The default is 510. Users can choose to have a dimension instance value that is shorter than 510.

Transformer example:

    "truncate_source_dimension": {
      "useSharedRule": true,
      "ruleName"     : "truncate_dimension",
      "params"       : {
        "dimension": "dimension(Landing_Page)",
        "max_size" : 400
      }
    },

Return to top of page 

Comments

0 comments

Article is closed for comments.