Dhiraj

Reputation: 3696

Additional hot cache by disabling ingestion time policy

I have enabled the ingestion time policy on my table, and I keep the latest 1 day of the table's data in hot cache. I am trying to assess how much extra cache I would get for actual data if I turned the policy off. As I understand it, with the policy enabled every row carries an extra ingestion_time value of type datetime. If a single datetime value takes N bytes to store and I have X rows in hot cache, then X*N bytes of cache are spent on it, which could otherwise hold additional actual data (considering ingestion utilization is over 100% already). How can I assess how much cache the ingestion time policy is costing me?

Upvotes: 0

Views: 183

Answers (1)

Yoni L.

Reputation: 25965

a. I wouldn't necessarily call it a "waste"

  • Unless you're not using ingestion_time() in any of your queries, in which case you can decide to disable the policy, which is enabled by default. (Though you may change your mind later, when you actually need it, and regret that decision.)

b. To estimate the sizes (i.e. disk cache utilization), you can run .show table TABLENAME data statistics to see stats (including original/compressed/index sizes) for each column.

  • Beware: this isn't a lightweight command, so don't run it too frequently.
  • Look at a datetime-typed column whose values are close in time to when the data gets ingested as reference - so that it resembles the content of ingestion_time(). You can compare that to other columns in the table.
  • The actual size on disk is ExtentSize.
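Once you have the per-column ExtentSize values, the comparison is simple arithmetic. Here is a minimal sketch in Python, assuming you have copied the values out of the command output by hand. The column names and byte counts below are made up for illustration, and column_cache_share is a hypothetical helper, not part of any Kusto SDK.

```python
def column_cache_share(column_extent_sizes, reference_column):
    """Return (bytes, fraction of total) for one column's on-disk size.

    column_extent_sizes maps column name -> ExtentSize in bytes,
    as read manually from the per-column stats output.
    """
    total = sum(column_extent_sizes.values())
    if total == 0:
        raise ValueError("total extent size is zero")
    ref_bytes = column_extent_sizes[reference_column]
    return ref_bytes, ref_bytes / total


# Hypothetical numbers: "Timestamp" stands in for a datetime column whose
# values are close to ingestion time, used as a proxy for ingestion_time().
sizes = {
    "Timestamp": 40_000_000,
    "Message": 900_000_000,
    "Level": 60_000_000,
}

ref_bytes, share = column_cache_share(sizes, "Timestamp")
print(f"{ref_bytes} bytes, {share:.1%} of the table's stored size")
```

The resulting fraction is only a rough proxy for what ingestion_time() costs, since the reference column's compression may differ from that of the real ingestion time values.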

c. Remember that Kusto/ADX is (mostly, excluding the RowStore used in streaming ingestion) a column store technology: data is stored by column, not by row as your original question assumes. So the cost isn't a fixed N bytes per row; each column is compressed and indexed on its own. You can read more about the technology in this whitepaper.

Upvotes: 2
