TxAG98

Reputation: 1120

OrientDB 2.2.7 ETL for CSV not loading DateTime field?

I'm trying to get a simple example to load using the ETL loader but I must be missing something. I've followed various threads on Stack Overflow and have been going by the documentation on extractors, but I'm coming up short in my attempt.

Here's my data: vertices.csv

label,data,Date
v01,0.1234,2015-01-01 02:30
v02,0.5678,2015-02-20 15:32
v03,0.9012,2015-03-30 11:00

I am using two JSON files to try to load this into a PLOCAL database:

vertices.json

{
    "config": {
        "log": "debug",
        "fileDirectory": "./",
        "fileName": "vertices.csv"
    }
}

and commonVertices.json

{
    "begin": [ { "let": { "name": "$filePath",  "expression": "$fileDirectory.append($fileName )" } } ],
    "config": { "log": "info" },
    "source": { "file": { "path": "$filePath" } },
    "extractor": { "csv": { "ignoreEmptyLines": true,
                            "nullValue": "N/A",
                            "columnsOnFirstLine": true,
                            "dateFormat": "yyyy-mm-dd HH:MM",
                            "columns": ["label:string","weight:float","Date:datetime"]
                          }
                 },
    "transformers": [
            { "vertex": { "class": "myVertex" } },
            { "code":   { "language": "Javascript", "code": "print('    Current record: ' + record); record;" } }
        ],
    "loader": {
        "orientdb": {
            "dbURL": "plocal:test.orientdb",
            "dbType": "graph",
            "batchCommit": 1000,
            "classes": [ { "name": "myVertex", "extends": "V" } ],
            "indexes": [ { "class": "myVertex", "fields":["label:string","Date:datetime"], "type":"NOTUNIQUE" } ]
        }
    }
}

I'm loading it with oetl.sh, using the command:

$ oetl.sh commonVertices.json vertices.json

The output, with debug information, is here:

> oetl.sh commonVertices.json vertices.json
OrientDB etl v.2.2.7 (build 2.2.x@rdcab5af4dce4b538bdb4b372abba46e3fc9f19b7; 2016-08-11 15:17:33+0000) www.orientdb.com
[csv] INFO column types: {weight=FLOAT, Date=DATETIME, label=STRING}
BEGIN ETL PROCESSOR
[file] INFO Reading from file ./vertices.csv with encoding UTF-8
Started execution with 1 worker threads
[orientdb] DEBUG orientdb: found 9 vertices in class 'null'
[orientdb] DEBUG orientdb: found metadata field 'null'
Start extracting
[csv] DEBUG document={weight:0.1234,Date:null,label:v01}
[csv] DEBUG document={weight:0.5678,Date:null,label:v02}
[1:vertex] DEBUG Transformer input: {weight:0.1234,Date:null,label:v01}
[csv] DEBUG document={weight:0.9012,Date:null,label:v03}
[1:vertex] DEBUG Transformer output: v(myVertex)[#25:3]
[1:code] DEBUG Transformer input: v(myVertex)[#25:3]
    Current record: myVertex#25:3{weight:0.1234,Date:null,label:v01} v1
[1:code] DEBUG executed code=OCommandExecutorScript [text=print('    Current record: ' + record); record;], result=myVertex#25:3{weight:0.1234,Date:null,label:v01} v1
[1:code] DEBUG Transformer output: myVertex#25:3{weight:0.1234,Date:null,label:v01} v1
[2:vertex] DEBUG Transformer input: {weight:0.5678,Date:null,label:v02}
[2:vertex] DEBUG Transformer output: v(myVertex)[#26:3]
[2:code] DEBUG Transformer input: v(myVertex)[#26:3]
    Current record: myVertex#26:3{weight:0.5678,Date:null,label:v02} v1
[2:code] DEBUG executed code=OCommandExecutorScript [text=print('    Current record: ' + record); record;], result=myVertex#26:3{weight:0.5678,Date:null,label:v02} v1
[2:code] DEBUG Transformer output: myVertex#26:3{weight:0.5678,Date:null,label:v02} v1
[3:vertex] DEBUG Transformer input: {weight:0.9012,Date:null,label:v03}
[3:vertex] DEBUG Transformer output: v(myVertex)[#27:3]
[3:code] DEBUG Transformer input: v(myVertex)[#27:3]
    Current record: myVertex#27:3{weight:0.9012,Date:null,label:v03} v1
[3:code] DEBUG executed code=OCommandExecutorScript [text=print('    Current record: ' + record); record;], result=myVertex#27:3{weight:0.9012,Date:null,label:v03} v1
[3:code] DEBUG Transformer output: myVertex#27:3{weight:0.9012,Date:null,label:v03} v1
[orientdb] INFO committing
Pipeline worker done without errors:: true
all items extracted
END ETL PROCESSOR
+ extracted 3 rows (0 rows/sec) - 3 rows -> loaded 3 vertices (0 vertices/sec) Total time: 149ms [0 warnings, 0 errors]

It loads, but the Date field isn't populated with any data, as this query shows:

orientdb {db=test.orientdb}> SELECT FROM myVertex

+----+-----+--------+------+----+-----+
|#   |@RID |@CLASS  |weight|Date|label|
+----+-----+--------+------+----+-----+
|0   |#25:0|myVertex|0.1234|    |v01  |
|1   |#26:0|myVertex|0.5678|    |v02  |
|2   |#27:0|myVertex|0.9012|    |v03  |
+----+-----+--------+------+----+-----+

3 item(s) found. Query executed in 0.003 sec(s).

So far, from tinkering around, it seems that the ETL will import dates if I leave the "dateFormat" and "columns" fields out of commonVertices.json. In that case, though, it imports only the date and drops the time.

I'm a bit stuck on this one. It looks like a bug to me, but I'm new to OrientDB, so hopefully it's just a user error with a simple solution.

As always, any help is greatly appreciated!

Upvotes: 0

Views: 133

Answers (1)

Alessandro Rota

Reputation: 3570

I tried it with the following extractor settings ("MM" for the month, "mm" for the minutes, and no "columns" entry):

"extractor": { "csv": { "ignoreEmptyLines": true,
                            "nullValue": "N/A",
                            "columnsOnFirstLine": true,
                            "dateFormat": "yyyy-MM-dd hh:mm"
                          }
                 },

and it worked
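
For anyone wondering why the letter case matters, here is a minimal sketch. It assumes the ETL "dateFormat" option follows Java's SimpleDateFormat pattern syntax, where "MM" is the month, "mm" the minutes, "HH" the 24-hour clock and "hh" the 12-hour clock. The class name DateFormatCheck and the helper tryParse are only illustrative and are not part of OrientDB. The sketch parses in strict (non-lenient) mode, which may differ from what the ETL does internally.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

// Illustrative helper, not OrientDB code: checks a pattern against a sample value.
final class DateFormatCheck {

    // Returns the parsed date, or null when the text does not fit the pattern.
    static Date tryParse(String pattern, String text) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setLenient(false); // reject out-of-range fields instead of rolling them over
        try {
            return format.parse(text);
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Pattern from the question: "mm" reads 01 as minutes and "MM" reads 30 as a month.
        Date swapped = tryParse("yyyy-mm-dd HH:MM", "2015-01-01 02:30");
        System.out.println("yyyy-mm-dd HH:MM parsed: " + (swapped != null));

        // Month and minute letters in the right places, 24-hour clock.
        Date fixed = tryParse("yyyy-MM-dd HH:mm", "2015-02-20 15:32");
        Calendar calendar = Calendar.getInstance();
        calendar.setTime(fixed);
        System.out.println("hour of day: " + calendar.get(Calendar.HOUR_OF_DAY)
                + ", minute: " + calendar.get(Calendar.MINUTE));
    }
}
```

Since the sample data contains times such as 15:32, "HH" is probably the safer choice over "hh" if you ever switch to strict parsing.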

Hope it helps.

Upvotes: 1
