Extracting values from inconsistent patterns using regular expressions
Sometimes, you are looking for patterns on a hit that are not consistent. For example, an error message could be formatted like the following:
<error id="35">Coupon Code is Invalid</error>
The id="35" part is variable, since it changes with the message itself. If you want to retrieve only the text part (Coupon Code is Invalid), you cannot use a Basic Mode hit attribute by itself, because a hit attribute requires strictly consistent patterns to match. Instead, use a hit attribute to return a larger string of text, and then extract the wanted portion to be the value.
First, create a hit attribute that matches the consistent parts of the pattern.
In this example, the hit attribute returns 35">Coupon Code is Invalid as the value. However, the wanted value may be just the Coupon Code is Invalid message.
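The following is a minimal sketch, in plain JavaScript, of what such a hit attribute captures. It assumes the hit attribute is bounded by a start tag of <error id=" and an end tag of </error>; the actual configuration is not shown here, so both tags are assumptions inferred from the returned value. The extractBetween helper is hypothetical and is not part of the product.

```javascript
// Hypothetical helper (not a product API): returns the text between the
// first occurrence of startTag and the next occurrence of endTag.
function extractBetween(text, startTag, endTag) {
  const start = text.indexOf(startTag);
  if (start === -1) return null;            // start tag not found
  const from = start + startTag.length;
  const end = text.indexOf(endTag, from);
  if (end === -1) return null;              // end tag not found
  return text.slice(from, end);
}

// Sample response text, assuming a well-formed closing tag.
const response = '<error id="35">Coupon Code is Invalid</error>';

// Assumed start and end tags: only the consistent parts of the pattern.
const captured = extractBetween(response, '<error id="', '</error>');

console.log(captured); // 35">Coupon Code is Invalid
```

Because the id value sits between the consistent start tag and the message, it is carried along in the captured value and has to be removed in a second step.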
To limit the value to the message only, apply a regular expression to the value returned by the hit attribute and extract the wanted text. Below is the modified JavaScript™:
function NS$E__ERROR_IN_RESPONSE_WITH_REGEXP__634290902173496582()
{
    if ($P["NS.P__ERROR_IN_RESPONSE__634290901401436582"].patternFound())
    {
        // Run the regular expression on the last value matched by the hit attribute.
        $P["NS.P__ERROR_IN_RESPONSE__634290901401436582"].lastValue().
            match('.*?">(.*)$');
        // Store the first extracted value as the fact value.
        $F.setFact("NS.F_E__ERROR_IN_RESPONSE_WITH_REGEXP__634290902173496582", RegExp.$1);
    }
}
The regular expression is defined in the following snippet:
$P["NS.P__ERROR_IN_RESPONSE__634290901401436582"].lastValue().
    match('.*?">(.*)$');
- The match('.*?">(.*)$') part runs the regular expression on the value that is returned by the last match of hit attribute NS.P__ERROR_IN_RESPONSE__634290901401436582 on the hit.
  - If you want to run the RegEx on the first value, replace lastValue() with firstValue().
- Since the starting value is 35">Coupon Code is Invalid and the wanted value is Coupon Code is Invalid, the expression must skip everything up to and including the "> characters. That is what the .*?"> part of the regex does. The (.*)$ part captures the remaining text, which is the part that is extracted.
$F.setFact("NS.F_E__ERROR_IN_RESPONSE_WITH_REGEXP__634290902173496582",
    RegExp.$1);
- For the fact value, this snippet is a standard setFact call except for the modifier RegExp.$1, which takes the first extracted value of the regular expression operation.
  - Regular expressions can extract multiple values. To use the second extracted value, insert RegExp.$2. To use the third value, insert RegExp.$3.
- RegExp is a global variable on an event. You do not have to declare it or set it. It is set automatically.
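The extraction described above can be tried outside of an event. The following is a minimal sketch in plain JavaScript, assuming an engine that supports the legacy RegExp.$1 through RegExp.$9 properties (Node.js does). The second pattern, with two extracted values, is a hypothetical variant added only to illustrate RegExp.$2; it does not come from the event above.

```javascript
// The value that the hit attribute returns in the example.
const value = '35">Coupon Code is Invalid';

// Same pattern as in the event: skip through the first "> and
// capture everything after it.
value.match('.*?">(.*)$');
const messageOnly = RegExp.$1;

// Hypothetical two-group variant: capture the numeric id and the message.
value.match(/^(\d+)">(.*)$/);
const errorId = RegExp.$1;   // first extracted value
const message = RegExp.$2;   // second extracted value

console.log(messageOnly);    // Coupon Code is Invalid
console.log(errorId);        // 35
console.log(message);        // Coupon Code is Invalid
```

Each pair of parentheses in the expression produces one extracted value, numbered from left to right.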
The basic syntax is as follows:
function <EVENT>()
{
    if (<CONDITION>)
    {
        <OBJECT>.match('<REGULAR EXPRESSION>');
        $F.setFact("<FACT>", RegExp.$<EXTRACTED VALUE#>);
    }
}
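As a variation on the basic syntax, the sketch below keeps the return value of match() and sets the fact only when the expression actually matched. In standard JavaScript, match() returns null when there is no match, and index 1 of the result holds the first extracted value. The $F object here is a hypothetical stand-in so that the example runs by itself, and setFactFromMatch, EXAMPLE_FACT, and OTHER_FACT are illustrative names, not product identifiers.

```javascript
// Hypothetical stand-in for the event engine's $F object, so that the
// sketch can run outside of an event. It only records what was set.
const recordedFacts = {};
const $F = {
  setFact(name, value) {
    recordedFacts[name] = value;
  }
};

// Hypothetical event body: receives the hit attribute value as a parameter
// instead of reading it from $P.
function setFactFromMatch(value, pattern, factName) {
  const result = value.match(pattern);   // null when the pattern does not match
  if (result !== null) {
    $F.setFact(factName, result[1]);     // result[1] is the first extracted value
    return true;
  }
  return false;                          // no match, so no fact is set
}

const matched = setFactFromMatch('35">Coupon Code is Invalid', '.*?">(.*)$', "EXAMPLE_FACT");
const unmatched = setFactFromMatch("no delimiter here", '.*?">(.*)$', "OTHER_FACT");

console.log(matched, recordedFacts.EXAMPLE_FACT); // true Coupon Code is Invalid
console.log(unmatched, "OTHER_FACT" in recordedFacts); // false false
```

Whether a guard of this kind is needed depends on how strictly the hit attribute pattern already constrains the value.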
The RegExp variable gets its values from the nearest previous match function. Suppose the code looks like the following:
<OBJECT 1>.match('<REGULAR EXPRESSION 1>');
$F.setFact("<FACT>", RegExp.$1);
<OBJECT 2>.match('<REGULAR EXPRESSION 2>');
$F.setFact("<FACT>", RegExp.$1);
The second RegExp.$1 reference uses the first extracted value from the second regular expression for object 2, instead of the value from the regular expression for object 1.
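This ordering can be observed in plain JavaScript. The sketch below assumes an engine that supports the legacy RegExp.$1 property (Node.js does), and the second sample value is made up for illustration.

```javascript
const pattern = '.*?">(.*)$';

// First match: RegExp.$1 now holds the value extracted from object 1.
'35">Coupon Code is Invalid'.match(pattern);
const afterFirst = RegExp.$1;

// Second match: RegExp.$1 is overwritten with the value from object 2.
'36">Card Declined'.match(pattern);
const afterSecond = RegExp.$1;

console.log(afterFirst);  // Coupon Code is Invalid
console.log(afterSecond); // Card Declined
```

If the value from the first match is still needed after the second match runs, copy RegExp.$1 into a local variable before the second match, as afterFirst does here.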