Extract Regex Values (Regex Match) in Blue Prism

Kristian · Sep 5, 2018 · Viewed 12k times

In Blue Prism, I need to identify specific elements of a Data Item (text), in order to use the information later in my process.

The text string reads:

REKVISITION_NR: 1234567 Dato: 23-07-2018 Rekvirent: ABC, DEF GHI, JKL 60, 8600 MNO Sted: JKL 60, 8600 MNO, Kl.:14:00:00, Bestilt_tid: 60 min Tolkensnavn: PQR STU Koert_fra: VXY , 8600 Silkeborg Vedr.: Z CPR: 123456-7890 Sprog: Arabisk Type: Personlig fremmøde Godkendt: 24-07-2018

As you can see, each element has these traits (e.g. Kl.:14:00:00 or Sprog: Arabisk):

  • A string name (starting with an uppercase letter)
  • Optionally, a period character (.)
  • A colon character (:)
  • Optionally, a space character ( )
  • The value part of the string
  • A space character ( ), which is followed by the next element.

I believe I should use the Extract Regex Values action of the Utility - Strings business object, but I have not successfully matched any data that ends up in the Named Values collection. I have found that ([A-Z])\w+\.?: ?(\w(\d\-){0,3})+ gets me part of the way in terms of matching. I want the solution to copy the field names and values into the Named Values collection generated by the action.

Final notes: I am using Blue Prism 6.2.1, and the action's underlying code is based on VB.NET's Regex.Match method.

Answer

Marek Stejskal · Sep 5, 2018

What you seem to be missing are actual named groups. To capture the values in a Blue Prism collection, you need to assign group names like this:

(?<YourGroupName>[A-Z])
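To illustrate why the names matter, here is a small sketch in Python rather than the VB.NET behind the action (an assumption made for brevity, not Blue Prism's own code). Python writes named groups as (?P<name>...) where .NET accepts (?<name>...). The sample text is one field from the question.

```python
import re

text = "Sprog: Arabisk"

# Unnamed group: the capture exists, but only by position (group 1).
unnamed = re.search(r"([A-Z])\w+", text)

# Named group: the same capture is now addressable by name.
# Python syntax is (?P<name>...); .NET also accepts (?<name>...).
named = re.search(r"(?P<YourGroupName>[A-Z])\w+", text)

print(unnamed.groupdict())  # {}
print(named.groupdict())    # {'YourGroupName': 'S'}
```

Only the second pattern produces a name/value pair, which is the kind of data a name-keyed collection needs.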

Here's the regex pattern that you may use, although you need to verify if it really works for your case in all possible scenarios.

(?<Name>\b\S*?):\s(?<Value>.*?)\s*(?=(?:\b\S*?:\s)|$)
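As a rough way to verify the pattern outside Blue Prism, the sketch below runs it against the sample string from the question. It uses Python's re module instead of .NET (an assumption for illustration), so the groups are written (?P<Name>...) and (?P<Value>...); the rest of the pattern is unchanged.

```python
import re

# Same pattern as above, with Python's (?P<...>) spelling for named groups.
PATTERN = re.compile(r"(?P<Name>\b\S*?):\s(?P<Value>.*?)\s*(?=(?:\b\S*?:\s)|$)")

# Sample string from the question.
TEXT = (
    "REKVISITION_NR: 1234567 Dato: 23-07-2018 "
    "Rekvirent: ABC, DEF GHI, JKL 60, 8600 MNO "
    "Sted: JKL 60, 8600 MNO, Kl.:14:00:00, Bestilt_tid: 60 min "
    "Tolkensnavn: PQR STU Koert_fra: VXY , 8600 Silkeborg "
    "Vedr.: Z CPR: 123456-7890 Sprog: Arabisk "
    "Type: Personlig fremmøde Godkendt: 24-07-2018"
)

# Map each field name to its value, one entry per match.
fields = {m.group("Name"): m.group("Value") for m in PATTERN.finditer(TEXT)}

for name, value in fields.items():
    print(f"{name!r}: {value!r}")
```

With Python's engine this yields twelve pairs. One gap shows up: Kl.:14:00:00 has no space after the colon, so the pattern does not split it out and it stays inside the Sted value ('JKL 60, 8600 MNO, Kl.:14:00:00,'). Cases like this are worth checking against your real data.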

You can also check and test it here.

EDIT: Be aware that Blue Prism's built-in code for extracting multiple values into a collection is barely usable; you may be better off modifying it or writing your own. What I would expect from such an action is a collection where each row is a pattern match and each column is a named group. Sadly, that's not how the default action works.
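The shape I have in mind, one row per match and one column per named group, can be sketched as follows. This is Python rather than the VB.NET of a Blue Prism code stage, and matches_to_rows is a hypothetical helper name, so treat it as an outline of the idea and not as a drop-in replacement for the action.

```python
import re

# Python spelling of the pattern; .NET would use (?<Name>...) and (?<Value>...).
PATTERN = re.compile(r"(?P<Name>\b\S*?):\s(?P<Value>.*?)\s*(?=(?:\b\S*?:\s)|$)")


def matches_to_rows(pattern, text):
    """Return one dict per match, keyed by named group (hypothetical helper)."""
    return [m.groupdict() for m in pattern.finditer(text)]


rows = matches_to_rows(PATTERN, "Dato: 23-07-2018 Sprog: Arabisk Type: Personlig fremmøde")
for row in rows:
    print(row)
```

Each dict plays the role of a collection row, with Name and Value as its columns. In a custom code stage the same loop would presumably iterate over Regex.Matches and add one DataTable row per match.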
