"yield" keyword for C++, How to Return an Iterator from my Function?

moooeeeep picture moooeeeep · Aug 10, 2012 · Viewed 13k times · Source

Consider the following code.

std::vector<result_data> do_processing() 
{
    pqxx::result input_data = get_data_from_database();
    return process_data(input_data);
}

std::vector<result_data> process_data(pqxx::result const & input_data)
{
    std::vector<result_data> ret;
    pqxx::result::const_iterator row;
    for (row = input_data.begin(); row != inpupt_data.end(); ++row) 
    {
        // somehow populate output vector
    }
    return ret;
}

While I was thinking about whether or not I could expect Return Value Optimization (RVO) to happen, I found this answer by Jerry Coffin [emphasis mine]:

At least IMO, it's usually a poor idea, but not for efficiency reasons. It's a poor idea because the function in question should usually be written as a generic algorithm that produces its output via an iterator. Almost any code that accepts or returns a container instead of operating on iterators should be considered suspect.

Don't get me wrong: there are times it makes sense to pass around collection-like objects (e.g., strings) but for the example cited, I'd consider passing or returning the vector a poor idea.

Having some Python background, I like Generators very much. Actually, if it were Python, I would have written above function as a Generator, i.e. to avoid the necessity of processing the entire data before anything else could happen. For example like this:

def process_data(input_data):
    for item in input_data:
        # somehow process items
        yield result_data

If I correctly interpreted Jerry Coffins note, this is what he suggested, isn't it? If so, how can I implement this in C++?

Answer

Konrad Rudolph picture Konrad Rudolph · Aug 10, 2012

No, that’s not what Jerry means, at least not directly.

yield in Python implements coroutines. C++ doesn’t have them (but they can of course be emulated but that’s a bit involved if done cleanly).

But what Jerry meant is simply that you should pass in an output iterator which is then written to:

template <typename O>
void process_data(pqxx::result const & input_data, O iter) {
    for (row = input_data.begin(); row != inpupt_data.end(); ++row)
        *iter++ = some_value;
}

And call it:

std::vector<result_data> result;
process_data(input, std::back_inserter(result));

I’m not convinced though that this is generally better than just returning the vector.