Reputation: 5890
I am trying to create my Druid schema, and I referred to the following example:
{"dimensionsSpec": {"dimensions": ["timestamp","netname"] },
"columns": ["second_time","timestamp"],
"delimiter": "/001"
}
My question is: if I have already specified dimensions, why do I need to specify columns as well? Also, should I put timestamp (it is in seconds) in the dimensions, given that my granularity is MINUTE?
Upvotes: 0
Views: 431
Reputation: 2452
There is no need to specify the columns attribute in your ingestion spec when the input is self-describing (for example JSON); dimensionsSpec and metricsSpec are enough. For delimited input without a header row, columns is still needed, because it names the fields by position. Here's a sample dimensionsSpec:
"dimensionsSpec" : {
  "dimensions": [
    "srcIP",
    { "name" : "srcPort", "type" : "long" },
    { "name" : "dstIP", "type" : "string" },
    { "name" : "dstPort", "type" : "long" },
    { "name" : "protocol", "type" : "string" }
  ]
}
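To illustrate how the two lists differ for delimited input, here is a minimal Python sketch. It is not Druid's code, only a model of the idea, assuming a headerless row split on a delimiter: columns names every field by position, while dimensions selects which of those named fields are kept as dimensions. The function name parse_row, the sample row, and the tab delimiter are all hypothetical.

```python
def parse_row(line, columns, dimensions, delimiter="\t"):
    """Name fields by position using `columns`, then keep only `dimensions`."""
    values = line.split(delimiter)
    # `columns` must follow the field order of the input row.
    named = dict(zip(columns, values))
    # Only the fields listed in `dimensions` are kept as dimensions.
    return {name: named[name] for name in dimensions if name in named}


# Hypothetical row with three fields: second_time, timestamp, netname.
row = "17\t1500000000\tnet-a"
parsed = parse_row(
    row,
    columns=["second_time", "timestamp", "netname"],
    dimensions=["netname"],
)
print(parsed)
```

In this model, dropping columns would leave the parser with no way to know which position holds netname, which is why a headerless delimited file needs both lists.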
Druid has excellent documentation; here are good references on how to write an ingestion spec: Writing Druid Ingestion Spec, Imply Ingestion Spec Docs
Answer to your second question:
There is no need to include timestamp in the dimensions list; the timestamp column is declared separately, in timestampSpec. To specify granularity, use granularitySpec. Here's an example:
"granularitySpec" : {
  "type" : "uniform",
  "segmentGranularity" : "HOUR",
  "queryGranularity" : "MINUTE",
  "rollup" : true
}
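Note that there are two types of granularity you can specify here. segmentGranularity sets the time interval that a single segment covers (here, one hour of data per segment). queryGranularity sets the finest time resolution that is stored, and therefore the finest at which you can query: timestamps are truncated to it at ingestion (here, to the minute), and with rollup enabled, rows that share the same truncated timestamp and dimension values are combined.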
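The effect of a MINUTE queryGranularity with rollup on second-resolution timestamps can be sketched in a few lines of Python. This is only a simulation of the idea, not Druid's implementation; the function names, the sample rows, and the use of a simple row count as the only metric are assumptions made for illustration.

```python
from collections import Counter


def truncate_to_minute(epoch_seconds):
    """Floor a POSIX timestamp (in seconds) to the start of its minute."""
    return epoch_seconds - (epoch_seconds % 60)


def rollup(rows):
    """Count input rows per (truncated timestamp, netname) pair."""
    counts = Counter()
    for epoch_seconds, netname in rows:
        counts[(truncate_to_minute(epoch_seconds), netname)] += 1
    return dict(counts)


# Hypothetical input: (timestamp in seconds, netname).
rows = [
    (1500000005, "net-a"),
    (1500000042, "net-a"),
    (1500000061, "net-a"),
    (1500000010, "net-b"),
]
rolled = rollup(rows)
for key, count in sorted(rolled.items()):
    print(key, count)
```

If the seconds-valued timestamp were also kept as a dimension, each distinct second would form its own group and far fewer rows would be merged, which is the practical reason to leave it out of the dimensions list.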
Upvotes: 1