influxdb: Write multiple points vs single point multiple times

HaggarTheHorrible · Dec 16, 2016 · Viewed 7.3k times

I'm using InfluxDB in my project and I'm facing an issue with queries when multiple points are written at once.

I'm using influxdb-python to write 1000 unique points to InfluxDB.

influxdb-python provides a function for this, called as influxclient.write_points().

I have two options now:

  1. Write each point individually (1000 separate writes), or
  2. Consolidate the 1000 points and write them all in a single call.

The code for the first option looks like this (pseudocode only), and it works:

thousand_points = [0...999]
i = 0
while i < 1000:
    ...
    ...
    point = [{thousand_points[i]}]  # A point must be converted to a dictionary object first
    influxclient.write_points(point, time_precision="ms")
    i += 1

After writing all the points, when I write a query like this:

SELECT * FROM "mydb"

I get all the 1000 points.

To avoid the overhead of a separate write on every iteration, I wanted to explore writing multiple points at once, which the write_points function supports:

write_points(points, time_precision=None, database=None, retention_policy=None, tags=None, batch_size=None)

Write to multiple time series names.

Parameters: points (list of dictionaries, each dictionary represents a point) – the list of points to be written in the database
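As a rough illustration of what the batch_size parameter implies, the sketch below splits a list of point dictionaries into chunks, the way a batched writer might before sending each chunk separately. The helper name chunked, the measurement name cpu_load, and the timestamp values are made up for this example; this is not the library's own code.

```python
from typing import Dict, Iterator, List


def chunked(points: List[Dict], batch_size: int) -> Iterator[List[Dict]]:
    """Yield consecutive slices of `points`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(points), batch_size):
        yield points[start:start + batch_size]


# 1000 point dictionaries, each with its own timestamp
# (arbitrary epoch milliseconds, assumed for illustration).
points = [
    {"measurement": "cpu_load", "time": 1481846400000 + i, "fields": {"value": float(i)}}
    for i in range(1000)
]

batches = list(chunked(points, 300))
print([len(b) for b in batches])  # [300, 300, 300, 100]
```

Each chunk would then be passed to one write call, so 1000 points with a batch size of 300 result in four writes instead of 1000.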

So, what I did was:

thousand_points = [0...999]
points = []
i = 0
while i < 1000:
    ...
    ...
    points.append({thousand_points[i]})  # A point must be converted to a dictionary object first
    i += 1

influxclient.write_points(points, time_precision="ms")

With this change, when I query:

SELECT * FROM "mydb"

I only get 1 point as the result. I don't understand why.
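One possible explanation, offered as an assumption because the elided lines hide how each point is built: InfluxDB identifies a point by its measurement, tag set, and timestamp, and a later point with the same identity replaces the earlier one. If the batched points carry no distinct time value, they may all end up with the same timestamp and collapse into one. The sketch below only simulates that rule with a dictionary, in simplified form. The function store_points and the timestamp values are hypothetical, and no database is involved.

```python
from typing import Dict, List, Tuple


def store_points(points: List[Dict]) -> Dict[Tuple, Dict]:
    """Simulate the overwrite rule: same measurement + tags + time, last write wins."""
    stored: Dict[Tuple, Dict] = {}
    for p in points:
        key = (p["measurement"], tuple(sorted(p.get("tags", {}).items())), p["time"])
        stored[key] = p["fields"]
    return stored


# Every point shares one timestamp (assumed scenario).
same_time = [
    {"measurement": "mydb", "time": 1481846400000, "fields": {"value": i}}
    for i in range(1000)
]
# Every point has its own timestamp, one millisecond apart.
distinct_time = [
    {"measurement": "mydb", "time": 1481846400000 + i, "fields": {"value": i}}
    for i in range(1000)
]

print(len(store_points(same_time)))      # 1
print(len(store_points(distinct_time)))  # 1000
```

If this is the cause, giving each point dictionary its own "time" value, or a distinguishing tag, before the batched write should keep all 1000 points.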

Any help will be much appreciated.

Answer

Simon Fraser · Dec 16, 2016

You might have a good case for a SeriesHelper.

In essence, you set up a SeriesHelper class in advance, and every time you discover a data point to add, you make a call. The SeriesHelper will batch up the writes for you, up to bulk_size points per write.
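The buffering idea described above can be sketched in plain Python. This is not the SeriesHelper implementation: BufferedWriter, its add and commit methods, and the measurement name are hypothetical stand-ins, and the write callback only records batch sizes instead of talking to a database. In influxdb-python itself, a SeriesHelper subclass is configured through an inner Meta class that carries settings such as the client and bulk_size.

```python
from typing import Callable, Dict, List


class BufferedWriter:
    """Hypothetical stand-in that collects points and writes them in bulk."""

    def __init__(self, write: Callable[[List[Dict]], None], bulk_size: int) -> None:
        self._write = write
        self._bulk_size = bulk_size
        self._pending: List[Dict] = []

    def add(self, point: Dict) -> None:
        """Queue one point; flush once bulk_size points are pending."""
        self._pending.append(point)
        if len(self._pending) >= self._bulk_size:
            self.commit()

    def commit(self) -> None:
        """Write whatever is pending in a single call."""
        if self._pending:
            self._write(self._pending)
            self._pending = []


batch_sizes: List[int] = []
writer = BufferedWriter(write=lambda pts: batch_sizes.append(len(pts)), bulk_size=250)
for i in range(1000):
    writer.add({"measurement": "cpu_load", "time": 1481846400000 + i, "fields": {"value": i}})
writer.commit()  # flush any remainder (none here, since 1000 is a multiple of 250)
print(batch_sizes)  # [250, 250, 250, 250]
```

With a real client, the write callback could presumably be influxclient.write_points, so each flush becomes one request carrying up to bulk_size points.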