Substring
The Substring node is used to include only a substring consisting of a given number of characters from a string column. Unlike String Left and String Right, you must define the starting position and the direction. This function is useful if the columns contains long strings, of which only parts are useful or relevant to end users. For example, a column listing transaction Ids may contain strings that are 20 characters in length. Possibly, only a small substring denotes the store in which the transaction occurred and the salesperson, for instance. If that is the only part of the string that end users are interested in, then it makes sense to generate a new column containing only the relevant substring.
Take the Substring
Start by connecting the Substring node to the Select node of the relevant table. Go to the Properties panel to define the substring:
Select Column: choose the column from which to take the substring. Only columns that are assigned to the 'string' type will be listed here.
Start Position: enter a numeric value to define the starting position of the substring. The first character is zero (0). For instance, if the string is 'abcdefg' and 'abc' is not required, the start position should be '3'.
Direction: set the direction from which the start position of the substring should be determine; either from the start of the string of from the end.
Length: the number of characters to include in the substring. This can be all of the remaining characters, or a specified number of characters.
Column Output: decide whether or not to keep the original column.
New Column Name: name the new column.
In this example, let's say that we have a column of id's of customers belonging to our VIP loyalty program. The 2nd, 3rd, and 4th digits of these id's indicates the store in which the customer was signed up to the program. For instance, in the ID '1234567', the store ID is '234'
We want to create a new Customer ID column showing only the section of the ID that denotes the store. To do this, we connect the Substring node to the table, and set Customer ID as the 'Select Column'. To take the required substring, the start position is set to 1, and the number of characters is set to 3.
In the resulting column (green highlight) we see that only the 2nd, 3rd, and 4th digits are listed.