In my previous post, I explained how to read a file in a TIBCO BW process. That code works fine for smaller files, but if you need to process a large file, say more than 50 MB, the BW process may fail with a java.lang.OutOfMemoryError because it exceeds the heap space allowed by the system. In this post, I am going to explain how to avoid this by optimizing the process.
Before we start, I would suggest using the TIBCO File Adapter for processing large files. If the file adapter is not available, or you don’t want to use it, try the methods below to save memory.
Parse Data
The input for this activity is placed in a process variable and takes up memory while it is being processed. We can configure the Parse Data activity to process the file in batches to reduce memory usage: for example, if the file has 10,000 records, you can configure the activity to process 500 records in each iteration. To achieve this, a few configuration changes are needed, as summarized in the sketch after the steps below.
- Create a Parse Data activity.
- Select the Parse Data activity and click the group icon on the toolbar to create a group containing the Parse Data activity.
- Specify “Repeat Until True Loop” as the Group action, and specify an index name (for example, “lineCount”).
The loop should exit when the EOF output item of the Parse Data activity is set to true. For example, the loop condition could be set to: string($ParseData/Output/EOF) = string(true())
- Set the noOfRecords input item for the Parse Data activity to the number of records you wish to process for each execution of the loop. If you do not check the Manually Specify Start Record field on the Configuration tab of the Parse Data activity, the loop processes the specified noOfRecords with each iteration until there are no more input records to parse.
You can optionally check the Manually Specify Start Record field to specify the startRecord on the Input tab. If you do this, you must create an XPath expression that supplies the correct starting record for each iteration of the loop. Because the record count in the input starts at zero, the startRecord input item could be set to the current value of the loop index minus one; with the index named "lineCount" as above, that is $lineCount - 1.

The procedure above is a general guideline for parsing a large set of input records in parts. You may want to extend it with additional processing of the records, or change the XPath expressions to suit your business process.
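To pull the pieces together, here is a minimal sketch of the group and Parse Data mappings, assuming the index name "lineCount", a batch size of 500 records, and Manually Specify Start Record checked (all of these are example values, not requirements):

```
Group (Repeat Until True Loop)
  Index name : lineCount
  Condition  : string($ParseData/Output/EOF) = string(true())

Parse Data - Input tab
  noOfRecords : 500
  startRecord : $lineCount - 1   (only mapped when "Manually Specify Start Record" is checked)
```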
(Screenshots: Parse Data Input and Output tab configuration)
Heap Space
We also need to consider increasing the heap space for the integration, but do so conservatively: take into account the other jobs running on the same server before assigning a value. In the TIBCO Administrator GUI, go to the process service instance of that project -> Monitoring -> Server Settings -> Maximum Heap Size and specify a value there.
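If you manage the engine outside the Administrator GUI, the same limits can also be set in the engine's .tra file. A minimal sketch, assuming the standard TRA heap properties and example sizes only (size them for your own server):

```
# bwengine.tra (or the deployed service .tra) - example values only
java.heap.size.initial=256M
java.heap.size.max=1024M
```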
EnableMemorySavingMode
I would also suggest setting EnableMemorySavingMode (or EnableMemorySavingMode.<processName> for a specific process) to true. This ensures that memory is released with each iteration.
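As a rough illustration, the property can be added as an engine property, for example in the engine's .tra file; the process name below is only a placeholder:

```
# Release per-iteration memory for all jobs (example placement in the engine .tra file)
EnableMemorySavingMode=true
# ...or only for one process definition (placeholder process name)
EnableMemorySavingMode.MyProject/ParseLargeFile.process=true
```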
Hope this helps you. Let me know your queries and feedback in the comments.