I'm fairly new to Python and pandas, and I'm wondering whether there are any Python libraries built on top of pandas that would take a time series of orders with the following columns: timestamp, id, price, size, exchange.
Each record adjusts the total per price and exchange by the size to give you a current view, i.e. records might look like:
9:00:25.123, 1, 1.02, 100, N
9:00:25.123, 2, 1.02, -50, N
9:00:25.129, 3, 1.03, 50, X
9:00:25.130, 4, 1.02, 150, X
9:00:25.131, 5, 1.02, -5, X
I want to be able to get the current view of the market for any given time. For example, if I asked for the market at 9:00:25.130, I would get:
1.02, N, 50
1.02, X, 150
1.03, X, 50
A query for 9:00:25.131 would return:
1.02, N, 50
1.02, X, 145
1.03, X, 50
There may be a million or more of these records, so iterating over all of them for every request would take a long time, particularly for times later in the day. I suppose one could create "snapshots" at some time interval and use them like key frames in MPEG playback, and I could code it myself, but I think book building and playback is such a common need for people using pandas with financial data that there might already be libraries out there to do this.
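To make the key-frame idea concrete, here is a rough sketch in plain Python rather than pandas. Everything in it is an assumption for illustration: the names `build_snapshots` and `market_at`, the snapshot spacing `SNAPSHOT_EVERY` (counted in records rather than time), and timestamps stored as zero-padded strings so that they sort correctly.

```python
from bisect import bisect_right

# Sample records as (timestamp, id, price, size, exchange), sorted by timestamp.
records = [
    ("09:00:25.123", 1, 1.02, 100, "N"),
    ("09:00:25.123", 2, 1.02, -50, "N"),
    ("09:00:25.129", 3, 1.03, 50, "X"),
    ("09:00:25.130", 4, 1.02, 150, "X"),
    ("09:00:25.131", 5, 1.02, -5, "X"),
]
SNAPSHOT_EVERY = 2  # hypothetical key-frame spacing, in records
times = [r[0] for r in records]  # computed once, reused by every query


def build_snapshots(records, every):
    """Copy the book as it stood before record 0, every, 2 * every, ..."""
    snapshots, book = [], {}
    for i, (_, _, price, size, exch) in enumerate(records):
        if i % every == 0:
            snapshots.append(dict(book))
        book[(price, exch)] = book.get((price, exch), 0) + size
    return snapshots


def market_at(when, records, times, snapshots, every):
    """Replay from the nearest earlier snapshot up to and including `when`."""
    end = bisect_right(times, when)  # records[:end] have timestamp <= when
    if not snapshots:
        return {}
    k = min(end // every, len(snapshots) - 1)
    book = dict(snapshots[k])
    for _, _, price, size, exch in records[k * every:end]:
        book[(price, exch)] = book.get((price, exch), 0) + size
    # Drop empty levels and return them ordered by price, then exchange.
    return {key: qty for key, qty in sorted(book.items()) if qty != 0}


snapshots = build_snapshots(records, SNAPSHOT_EVERY)
view = market_at("09:00:25.131", records, times, snapshots, SNAPSHOT_EVERY)
```

With this layout a query only replays at most `SNAPSHOT_EVERY` records after the chosen snapshot, at the cost of storing one copy of the book per snapshot.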
Any ideas, or do I roll my own?
I know this is old, but it's instructive to see the benefits and limits of pandas.
I built a trivial Jupyter notebook to show how an order book like the one you describe could be built and used as you requested.
The core is a loop that updates the state of the order book and saves it for amalgamation into a pandas DataFrame:
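import pandas as pd

# Assumes df is indexed by timestamp, with columns ordered id, price, exchange, size.
states = []
current_timestamp = None
current_state = {}
for timestamp, (id_, price, exch, size) in df.iterrows():
    if current_timestamp is None:
        current_timestamp = timestamp
    if current_timestamp != timestamp:
        # Timestamp changed: drop emptied levels, then save the finished state.
        for key in list(current_state):
            if current_state[key] == 0.:
                del current_state[key]
        # Copy with dict(...); dict(**...) fails because the keys are tuples.
        states.append((current_timestamp, dict(current_state)))
        current_timestamp = timestamp
    key = (exch, price)
    current_state.setdefault(key, 0.)
    current_state[key] += size
# Save the state for the final timestamp.
states.append((timestamp, dict(current_state)))
# One row per timestamp, one column per (exchange, price) level.
order_book = pd.DataFrame.from_dict(dict(states), orient="index")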
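Once the states are in a frame indexed by timestamp, answering "what did the book look like at time t?" becomes an as-of lookup on the sorted index. The following is only a sketch under assumptions: the states are typed in by hand to match the sample records in the question, the date `2000-01-01` is a placeholder, and `view_at` is a hypothetical helper, not part of pandas.

```python
import pandas as pd

# Hand-built per-timestamp states keyed by (exchange, price); illustrative only.
states = [
    ("09:00:25.123", {("N", 1.02): 50.0}),
    ("09:00:25.129", {("N", 1.02): 50.0, ("X", 1.03): 50.0}),
    ("09:00:25.130", {("N", 1.02): 50.0, ("X", 1.03): 50.0, ("X", 1.02): 150.0}),
    ("09:00:25.131", {("N", 1.02): 50.0, ("X", 1.03): 50.0, ("X", 1.02): 145.0}),
]
index = pd.to_datetime(["2000-01-01 " + t for t, _ in states])
order_book = pd.DataFrame([state for _, state in states], index=index)


def view_at(book, when):
    """Return the non-empty levels of the last state at or before `when`."""
    pos = book.index.searchsorted(pd.Timestamp(when), side="right") - 1
    if pos < 0:
        return pd.Series(dtype=float)  # nothing has happened yet
    row = book.iloc[pos].dropna()      # NaN marks levels absent from that state
    return row[row != 0]


view = view_at(order_book, "2000-01-01 09:00:25.130")
```

Because the index is sorted, `searchsorted` does a binary search, so each query costs a lookup rather than a replay of every record.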
Note, however, that the book state has to be built up outside of pandas, and that a pandas.DataFrame of order book state is not well suited to modelling per-level priority or depth (Level 3 data). That can be a major limitation, depending on how accurately you want to model the order book.
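To illustrate what a table of per-level totals loses, here is a minimal sketch of a book that keeps individual orders in a queue at each level. It assumes simple first-in, first-out priority within a level, which is only one possible rule; the names `levels`, `add_order` and `fill` are made up for this example.

```python
from collections import defaultdict, deque

# Each (exchange, price) level holds a FIFO queue of [order_id, remaining_size].
levels = defaultdict(deque)


def add_order(exch, price, order_id, size):
    """Queue a new order at the back of its price level."""
    levels[(exch, price)].append([order_id, size])


def fill(exch, price, size):
    """Consume `size` from the front of the queue; return (order_id, filled) pairs."""
    queue, fills = levels[(exch, price)], []
    while size > 0 and queue:
        order = queue[0]
        take = min(size, order[1])
        order[1] -= take
        size -= take
        fills.append((order[0], take))
        if order[1] == 0:
            queue.popleft()  # fully filled orders leave the queue
    return fills


add_order("N", 1.02, order_id=1, size=100)
add_order("N", 1.02, order_id=2, size=50)
fills = fill("N", 1.02, 120)
```

A frame holding only the total at 1.02 on N would show 150 and then 30, but could not say which order was filled first or which one the remaining 30 belongs to.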
Order books and the orders and quotes that update them (both of which you group into the term "request") have fairly complex interactions in the real world. These interactions are governed by the rules of the exchange that manages them, and these rules change all the time. Since the rules take time to model correctly, matter to very few people, and old rule sets are rarely of even academic interest, the only places that tend to codify them into a library are places not very interested in sharing them with others.
To understand the theory behind a simple ("stylised") model of an order book, its orders and quotes thereupon, see the paper "A stochastic model for order book dynamics" by Rama Cont, Sasha Stoikov, Rishi Talreja, Section 2:
2.1 Limit order books
Consider a financial asset traded in an order-driven market. Market participants can post two types of buy/sell orders. A limit order is an order to trade a certain amount of a security at a given price. Limit orders are posted to an electronic trading system and the state of outstanding limit orders can be summarized by stating the quantities posted at each price level: this is known as the limit order book. The lowest price for which there is an outstanding limit sell order is called the ask price and the highest buy price is called the bid price. [...more useful description]
2.2. Dynamics of the order book
Let us now describe how the limit order book is updated by the inflow of new orders. [...] Assuming that all orders are of unit size [...],
• a limit buy order at price level p<p_A(t) increases the quantity at level p: x → x_{p−1}
• a limit sell order at price level p>p_B(t) increases the quantity at level p: x → x_{p+1}
• a market buy order decreases the quantity at the ask price: x → x_{p_A(t)−1}
• a market sell order decreases the quantity at the bid price: x → x_{p_B(t)+1}
• a cancellation of an outstanding limit buy order at price level p<p_A(t) decreases the quantity at level p: x → x_{p+1}
• a cancellation of an outstanding limit sell order at price level p>p_B(t) decreases the quantity at level p: x → x_{p−1}
The evolution of the order book is thus driven by the incoming flow of market orders, limit orders and cancellations at each price level [...]
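As a rough illustration of these six update rules (this is not code from the paper), the sketch below keeps separate bid and ask quantities per price level instead of the paper's single state vector x, and applies one unit-size event at a time. The event names, integer tick prices and the `apply_event` function are assumptions made for the example, and the constraints p<p_A(t) and p>p_B(t) are not checked.

```python
from collections import Counter

bids, asks = Counter(), Counter()  # price level -> outstanding quantity


def best_bid():
    """Highest price with an outstanding buy order, p_B(t), or None."""
    return max((p for p, q in bids.items() if q > 0), default=None)


def best_ask():
    """Lowest price with an outstanding sell order, p_A(t), or None."""
    return min((p for p, q in asks.items() if q > 0), default=None)


def apply_event(event, price=None):
    """Apply one unit-size event to the book."""
    if event == "limit_buy":
        bids[price] += 1
    elif event == "limit_sell":
        asks[price] += 1
    elif event == "cancel_buy":
        bids[price] -= 1
    elif event == "cancel_sell":
        asks[price] -= 1
    elif event == "market_buy":
        if best_ask() is None:
            raise ValueError("no sell orders to trade against")
        asks[best_ask()] -= 1  # a market buy consumes the ask
    elif event == "market_sell":
        if best_bid() is None:
            raise ValueError("no buy orders to trade against")
        bids[best_bid()] -= 1  # a market sell consumes the bid
    else:
        raise ValueError(f"unknown event: {event}")


for event, price in [("limit_buy", 99), ("limit_buy", 99), ("limit_sell", 101),
                     ("limit_sell", 102), ("market_buy", None), ("cancel_buy", 99)]:
    apply_event(event, price)
```

After this sequence the single sell order at 101 has been consumed by the market buy, so the ask moves up to 102 while one buy order remains at 99.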
Some libraries where you can see people's attempts at modeling or visualising a simple limit order book are:
And there is a good quant.stackoverflow.com question and answers here.