Why are Subjects not recommended in .NET Reactive Extensions?

Anthony picture Anthony · Jan 18, 2013 · Viewed 33.8k times · Source

I am currently getting to grips with the Reactive Extensions framework for .NET and I am working my way through the various introduction resources I've found (mainly http://www.introtorx.com)

Our application involves a number of hardware interfaces that detect network frames, these will be my IObservables, I then have a variety of components that will consume those frames or perform some manner of transform on the data and produce a new type of frame. There will also be other components that need to display every n'th frame for example. I am convinced that Rx is going to be useful for our application, however I am struggling with the implementation details for the IObserver interface.

Most (if not all) of the resources I have been reading have said that I should not implement the IObservable interface myself but use one of the provided functions or classes. From my research it appears that creating a Subject<IBaseFrame> would provide me what I need, I would have my single thread that reads data from the hardware interface and then calls the OnNext function of my Subject<IBaseFrame> instance. The different IObserver components would then receive their notifications from that Subject.

My confusion is coming from the advice give in the appendix of this tutorial where it says:

Avoid the use of the subject types. Rx is effectively a functional programming paradigm. Using subjects means we are now managing state, which is potentially mutating. Dealing with both mutating state and asynchronous programming at the same time is very hard to get right. Furthermore, many of the operators (extension methods) have been carefully written to ensure that correct and consistent lifetime of subscriptions and sequences is maintained; when you introduce subjects, you can break this. Future releases may also see significant performance degradation if you explicitly use subjects.

My application is quite performance critical, I am obviously going to test the performance of using the Rx patterns before it goes in to production code; however I am worried that I am doing something that is against the spirit of the Rx framework by using the Subject class and that a future version of the framework is going to hurt performance.

Is there a better way of doing what I want? The hardware polling thread is going to be running continuously whether there are any observers or not (the HW buffer will back up otherwise), so this is a very hot sequence. I need to then pass the received frames out to multiple observers.

Any advice would be greatly appreciated.

Answer

Lee Campbell picture Lee Campbell · Jan 22, 2013

Ok, If we ignore my dogmatic ways and ignore "subjects are good/bad" all together. Let us look at the problem space.

I bet you either have 1 of 2 styles of system you need to ingrate to.

  1. The system raises an event or a call back when a message arrives
  2. You need to poll the system to see if there are any message to process

For option 1, easy, we just wrap it with the appropriate FromEvent method and we are done. To the Pub!

For option 2, we now need to consider how we poll this and how to do this effciently. Also when we get the value, how do we publish it?

I would imagine that you would want a dedicated thread for polling. You wouldn't want some other coder hammering the ThreadPool/TaskPool and leaving you in a ThreadPool starvation situation. Alternatively you don't want the hassle of context switching (I guess). So assume we have our own thread, we will probably have some sort of While/Sleep loop that we sit in to poll. When the check finds some messages we publish them. Well all of this sounds perfect for Observable.Create. Now we probably cant use a While loop as that wont allow us to ever return a Disposable to allow cancellation. Luckily you have read the whole book so are savvy with Recursive scheduling!

I imagine something like this could work. #NotTested

public class MessageListener
{
    private readonly IObservable<IMessage> _messages;
    private readonly IScheduler _scheduler;

    public MessageListener()
    {
        _scheduler = new EventLoopScheduler();

        var messages = ListenToMessages()
                                    .SubscribeOn(_scheduler)
                                    .Publish();

        _messages = messages;
        messages.Connect();
    }

    public IObservable<IMessage> Messages
    {
        get {return _messages;}
    }

    private IObservable<IMessage> ListenToMessages()
    {
        return Observable.Create<IMessage>(o=>
        {
                return _scheduler.Schedule(recurse=>
                {
                    try
                    {           
                        var messages = GetMessages();
                        foreach (var msg in messages)
                        {
                            o.OnNext(msg);
                        }   
                        recurse();
                    }
                    catch (Exception ex)
                    {
                        o.OnError(ex);
                    }                   
                });
        });
    }

    private IEnumerable<IMessage> GetMessages()
    {
         //Do some work here that gets messages from a queue, 
         // file system, database or other system that cant push 
         // new data at us.
         // 
         //This may return an empty result when no new data is found.
    }
}

The reason I really don't like Subjects, is that is usually a case of the developer not really having a clear design on the problem. Hack in a subject, poke it here there and everywhere, and then let the poor support dev guess at WTF was going on. When you use the Create/Generate etc methods you are localizing the effects on the sequence. You can see it all in one method and you know no-one else is throwing in a nasty side effect. If I see a subject fields I now have to go looking for all the places in a class it is being used. If some MFer exposes one publicly, then all bets are off, who knows how this sequence is being used! Async/Concurrency/Rx is hard. You don't need to make it harder by allowing side effects and causality programming to spin your head even more.