Reputation: 11034
I am trying to use InfluxDB's Python client to retrieve data stored in InfluxDB, but I can't get more than 10k lines. The examples I am (unsuccessfully) following are here. In summary:
import influxdb
dfclient = influxdb.DataFrameClient('localhost', 8086, 'root', 'root', 'mydb')
q = "select * from some_measurement"
df = dfclient.query(q, chunked=True) # Returns only 10k points
The issue seems to be related to InfluxDB's internal limitations documented here (namely, the max-row-limit configuration option). I am going through the sources to try to find out how to get a DataFrame larger than 10k lines, but any help in solving this issue would be highly appreciated.
Upvotes: 9
Views: 10582
Reputation: 922
Have you tried setting the chunked flag on your query to receive the data back in chunks? This can be done using a query like the following:
influxdb.DataFrameClient(host='localhost', port=8086, username='root', password='root', database=None, ssl=False, verify_ssl=False, timeout=None, use_udp=False, udp_port=4444, proxies=None)
You can read more about it here, in section 1.2.3.
Upvotes: 3
Reputation: 11034
The problem is caused by the DataFrameClient's query method simply ignoring the chunked argument [code].
The workaround I found is to use the standard InfluxDBClient instead. The code shown in the question becomes:
import influxdb
import pandas as pd
client = influxdb.InfluxDBClient('localhost', 8086, 'root', 'root', 'btc')
q = "select * from some_measurement"
# Request the result in chunks of 10000 points and build one DataFrame from all of them
df = pd.DataFrame(client.query(q, chunked=True, chunk_size=10000).get_points()) # Returns all points
It is also worth highlighting that from v1.2.2 the max-row-limit setting (i.e. the default value for chunk_size in the above code) has been changed from 10k to unlimited.
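One caveat with the workaround above: depending on the client version, a chunked query may not hand back a single ResultSet. As far as I can tell, some releases of the influxdb package return one ResultSet per chunk instead (this is an assumption on my side, so check the version you have installed). Below is a minimal sketch of a helper that copes with both shapes. The names collect_points and FakeResultSet are hypothetical and not part of the library; FakeResultSet only stands in for a real result so the snippet runs without a server.

```python
from itertools import chain


def collect_points(result):
    """Flatten a query result into a list of point dicts.

    Accepts either a single ResultSet-like object (anything exposing
    get_points()) or an iterable of such objects, one per chunk.
    """
    if hasattr(result, "get_points"):
        chunks = [result]   # single merged result
    else:
        chunks = result     # assumed: an iterable yielding one result per chunk
    return list(chain.from_iterable(chunk.get_points() for chunk in chunks))


class FakeResultSet:
    """Hypothetical stand-in for a ResultSet, used only to exercise the helper."""

    def __init__(self, points):
        self._points = points

    def get_points(self):
        return iter(self._points)


# Works the same way for one result or for a stream of per-chunk results
single = FakeResultSet([{"time": 1, "value": 10}, {"time": 2, "value": 20}])
per_chunk = (FakeResultSet([{"time": i, "value": i * 10}]) for i in range(3))
print(len(collect_points(single)), len(collect_points(per_chunk)))
```

With a real client you would pass client.query(q, chunked=True, chunk_size=10000) to collect_points and then build the frame with pd.DataFrame(...) from the returned list.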
Upvotes: 10