How can I efficiently modify text in a stream in a BizTalk pipeline component?

Rise_against · Jan 7, 2011

I have a stream that contains text, and I want to edit some of that text (replace some values) in the stream.

What is the most efficient way to do this without breaking the stream, that is, without reading the whole message into memory? I want to use this in a custom pipeline component for BizTalk.

public IBaseMessage Execute(IPipelineContext pContext, IBaseMessage pInMsg)
{
    string msg = "";
    using (VirtualStream virtualStream = new VirtualStream(pInMsg.BodyPart.GetOriginalDataStream()))
    {
        using (StreamReader sr = new StreamReader(virtualStream))
        {
            msg = sr.ReadToEnd();
        }

        // modify string here
        msg = msg.Replace("\r\n", "");

        while (msg.Contains(" <"))
           msg = msg.Replace(" <", "<");

        VirtualStream outStream = new VirtualStream();
        StreamWriter sw = new StreamWriter(outStream, Encoding.Default);
        sw.Write(msg);
        sw.Flush();
        outStream.Seek(0, SeekOrigin.Begin);

        pInMsg.BodyPart.Data = outStream;
        pContext.ResourceTracker.AddResource(outStream);
    }

    return pInMsg;
}

This is the code, but as you can see I am breaking the stream when I do sr.ReadToEnd().

Is there a better way to do this?

Answer

Maxime Labelle · Jan 20, 2011

The fact that you are using Stream classes in your pipeline component does not make it a streaming pipeline component per se, as you have intuitively suspected.

The most appropriate way is to split the responsibilities into two components:

  • First, you create a custom System.IO.Stream class - that is, a class that wraps the original incoming stream and exposes a streaming interface. In this class, you process the bytes as they are read by the calling code. This class should have no dependency on BizTalk, and you should be able to write a sample unit-test program for it outside BizTalk.

For this first part, I recommend you browse one of the several articles with source code samples.

  • Second, the pipeline component itself, whose sole responsibility is to substitute the incoming stream with an instance of your custom stream. This is an idiomatic pattern that you'll find in good pipeline components. In particular, during the execution of the Execute method, you should not read the original incoming stream. The reading will happen - automagically - when the Messaging Agent takes over.

The following snippet should be the canonical source code for an Execute method, barring additional code for error handling, of course:

IBaseMessage IComponent.Execute(IPipelineContext pContext, IBaseMessage pInMsg)
{
    // assign a new CustomStream to the incoming message

    System.IO.Stream stream = pInMsg.BodyPart.GetOriginalDataStream();
    System.IO.Stream customStream = new CustomStream(stream);

    // return the message for downstream pipeline components (further down in the pipeline)

    pInMsg.BodyPart.Data = customStream;
    pContext.ResourceTracker.AddResource(customStream);

    return pInMsg;
}

See? No reading whatsoever in the preceding method. The entire processing should happen during the (repeated) calls of the Read method in your custom Stream class.
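To make that concrete, here is a minimal sketch of what such a custom stream could look like for the replacements in the question. It is only an illustration, not a reference implementation: the class name CustomStream is simply borrowed from the snippet above, the body is an assumption about how one might write it, and it assumes the message is ASCII or UTF-8 (where the bytes for space, `<`, CR and LF never occur inside a multi-byte character). As a simplification it drops every CR and LF byte, whereas the question's code only removes "\r\n" pairs.

```csharp
using System;
using System.IO;

// Hypothetical read-only wrapper: filters bytes lazily, as the caller reads them.
public sealed class CustomStream : Stream
{
    private readonly Stream inner;
    private readonly byte[] chunk = new byte[4096];
    private int chunkPos, chunkLen;
    private long pendingSpaces;  // spaces seen, fate not yet known
    private long spacesToEmit;   // spaces confirmed as part of the output
    private int heldByte = -1;   // byte to emit once spacesToEmit is drained
    private bool eof;

    public CustomStream(Stream inner)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        this.inner = inner;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int written = 0;
        while (written < count)
        {
            if (spacesToEmit > 0) { buffer[offset + written++] = 0x20; spacesToEmit--; continue; }
            if (heldByte >= 0) { buffer[offset + written++] = (byte)heldByte; heldByte = -1; continue; }
            if (chunkPos == chunkLen)
            {
                if (eof) break;
                chunkLen = inner.Read(chunk, 0, chunk.Length);
                chunkPos = 0;
                if (chunkLen == 0)
                {
                    // End of input: trailing spaces are not followed by '<', so keep them.
                    eof = true;
                    spacesToEmit = pendingSpaces;
                    pendingSpaces = 0;
                }
                continue;
            }
            byte b = chunk[chunkPos++];
            if (b == '\r' || b == '\n') continue;         // drop line breaks
            if (b == ' ') { pendingSpaces++; continue; }  // hold back until we see what follows
            if (b != '<') spacesToEmit = pendingSpaces;   // spaces before '<' are discarded
            pendingSpaces = 0;
            heldByte = b;
        }
        return written; // 0 only once the inner stream is exhausted (or count is 0)
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }

    protected override void Dispose(bool disposing)
    {
        if (disposing) inner.Dispose();
        base.Dispose(disposing);
    }
}
```

Because the class has no BizTalk dependency, it can be exercised from a plain console program or unit test: wrap a MemoryStream holding `<a>\r\n  <b> x </b>\r\n</a>`, read it to the end, and you should get `<a><b> x</b></a>`. The only state carried between Read calls is a space counter and one held byte, so memory use stays constant regardless of message size.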

As I wrote in my response to the following question, I strongly encourage you to check out the whole series of posts Nic Barden has done about Developing Streaming Pipeline Components.