What is the capacity of a BluePrism Internal Work Queue?

codenameGamer picture codenameGamer · Nov 21, 2016 · Viewed 7.5k times · Source

I am working in BluePrism Robotics Process Automation and trying to load an excel sheet with more than 100k records (It might go upwards of 300k in some cases). I am trying to load internal work queue of BluePrism, but I get an error as quoted below:

'Load Data Into Queue' ERROR: Internal : Exception of type 'System.OutOfMemoryException' was thrown.

Is there a way to avoid this problem, in the way where I can free up more memory? I plan to process records one by one from queue, and put them into new excel sheets categorically. Loading all that data in a collection and looping over it may be memory consuming, so I am trying to find out a more efficient way.

I welcome any and all help/tips. Thanks!

Answer

Alex P picture Alex P · Jun 10, 2018

Basic Solution: Break up the number of Excel rows you are pulling into your Collection data item at any one time. The thresholds for this will depend on your resource system memory and architecture, as well as structure and size of the data in the Excel Worksheet. I've been able to quickly move 50k 10-column-rows from Excel to a Collection and then into the Blue Prism queue very quickly.

You can set this up by specifying the Excel Worksheet range to pull into the Collection data item, and then shift that range each time the Collection has been successfully added to the queue.

After each successful addition to the queue and/or before you shift the range and/or at a predefined count limit you can then run a Clean Up or Garbage Collection action to free up memory.

You can do all of this with the provided Excel VBO and an additional Clean Up object.

Keep in mind: Even breaking it up, looping over a Collection this large to amend the data will be extremely expensive and slow. The most efficient way to make changes to the data will be at the Excel Workbook level or when it is already in the Blue Prism queue.

Best Bet: esqew's alternative solution is the most elegant and probably your best bet.

Jarrick hit it on the nose in that Work Queue items should provide the bot with information on what they are to be working on and a Control Room feedback space, but not the actual work data to be implemented/manipulated.

In this case you would want to just use the items Worksheet row number and/or some unique identifier from a single Worksheet column as the queue item data so that the bot can provide Control Room feedback on the status of the item. If this information is predictable enough in format there should be no need to move any data from the Excel Worksheet to a Collection and then into a Work Queue, but rather simply build the queue based on that data predictability.

Conversely you can also have the bot build the queue "as it happens", in that once it grabs the single row data from the Excel Worksheet to work it, can as well add a queue item with the row number of the data. This will then enable Control Room feedback and tracking. However, this would, in almost every case, be a bad practice as it would not prevent a row from being worked multiple times unless the bot checked the queue first, at which point you've negated the speed gains you were looking to achieve in cutting out the initial queue building in the first place. It would also be impossible to scale the process for multiple bots to work the Excel Worksheet data efficiently.