I just find out that standard std deque is really slow when comparing with my "home-made" version of stack that use pre-allocated array.
This is code of my stack:
template <class T>
class FastStack
{
public:
T* st;
int allocationSize;
int lastIndex;
public:
FastStack(int stackSize);
FastStack();
~FastStack();
inline void resize(int newSize);
inline void push(T x);
inline void pop();
inline T getAndRemove();
inline T getLast();
inline void clear();
};
template <class T>
FastStack<T>::FastStack()
{
lastIndex = -1;
st = NULL;
}
template <class T>
FastStack<T>::FastStack(int stackSize)
{
st = NULL;
this->allocationSize = stackSize;
st = new T[stackSize];
lastIndex = -1;
}
template <class T>
FastStack<T>::~FastStack()
{
delete [] st;
}
template <class T>
void FastStack<T>::clear()
{
lastIndex = -1;
}
template <class T>
T FastStack<T>::getLast()
{
return st[lastIndex];
}
template <class T>
T FastStack<T>::getAndRemove()
{
return st[lastIndex--];
}
template <class T>
void FastStack<T>::pop()
{
--lastIndex;
}
template <class T>
void FastStack<T>::push(T x)
{
st[++lastIndex] = x;
}
template <class T>
void FastStack<T>::resize(int newSize)
{
if (st != NULL)
delete [] st;
st = new T[newSize];
}
.
I run this simple benchmark for deque:
std::deque<int> aStack;
int x;
HRTimer timer;
timer.Start();
for (int i = 0; i < 2000000000; i++)
{
aStack.push_back(i);
x = aStack.back();
if (i % 100 == 0 && i != 0)
for (int j = 0; j < 100; j++)
aStack.pop_back();
}
double totalTime = timer.Stop();
stringstream ss;
ss << "std::deque " << totalTime;
log(ss.str());
.
Using std stack with vector as container (as 'Michael Kohne' suggested)
std::stack<int, std::vector<int>> bStack;
int x;
HRTimer timer;
timer.Start();
for (int i = 0; i < 2000000000; i++)
{
bStack.push(i);
x = bStack.top();
if (i % 100 == 0 && i != 0)
for (int j = 0; j < 100; j++)
bStack.pop();
}
double totalTime = timer.Stop();
stringstream ss;
ss << "std::stack " << totalTime;
log(ss.str());
.
And the same one for my FastStack:
FastStack<int> fstack(200);
int x;
HRTimer timer;
timer.Start();
for (int i = 0; i < 2000000000; i++)
{
fstack.push(i);
x = fstack.getLast();
if (i % 100 == 0 && i != 0)
for (int j = 0; j < 100; j++)
fstack.pop();
}
double totalTime = timer.Stop();
stringstream ss;
ss << "FastStack " << totalTime;
log(ss.str());
.
The results after 4 runs are as follows:
deque 15.529
deque 15.3756
deque 15.429
deque 15.4778
stack 6.19099
stack 6.1834
stack 6.19315
stack 6.19841
FastStack 3.01085
FastStack 2.9934
FastStack 3.02536
FastStack 3.00937
The results are in seconds and I was running the code on Intel i5 3570k (on default clock). I used VS2010 compiler with all optimizations that are available.
I know that my FastStack has a lot of limitations but there are plenty situations, where it could be used and when it can give nice boost! (I used it in one project where I get 2x speed up comparing to std::queue).
So now my question is:
Are there any other "inhibitors" in C++ that everybody use but no one knows about them?
EDIT: I don't want to be offensive, I'm just curious if you know some unknown "speedups" like this.
You are comparing apples and oranges. The dequeue is DOUBLE-ENDED, which requires it to be quite a bit different internally than the simple stack you've implemented. Try running your benchmark against an std::stack<int, std::vector<int> >
instead and see how you do. std::stack is a container adapter, and if you use a vector as the underlying container, it should be nearly as fast as your home-grown implementation.
The one down side is that the std::stack implementation doesn't have a way for you to pre-set the size, so in situations where you KNOW what the max size needs to be, it'll be a little slower initially as it has to re-allocate several times. After that, it will be pretty much the same.