agxcv

Reputation: 31

Read CSV and Upload Data to Elasticsearch

I am iterating over the rows of a csv file one by one and I want to insert each row into ES. I'm new to both Python and Elasticsearch. How do I convert one csv row and insert it into ES, one row at a time?

import csv
import json

from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
print(es)


def csv_reader(file_obj, delimiter=','):
    reader = csv.reader(file_obj)
    i = 1
    results = []
    for row in reader:
        print(row)
        es.index(index='product', doc_type='prod', id=i,
                 body=json.dump([row for row in reader], file_obj))
        i = i + 1
        results.append(row)
        print(row)


if __name__ == "__main__":
    with open("/home/Documents/csv/acsv.csv") as f_obj:
        csv_reader(f_obj)

But I'm getting this error:

Traceback (most recent call last):
  File "/home/PycharmProjects/CsvReaderForSyncEs/csvReader.py", line 25, in <module>
    csv_reader(f_obj)
  File "/home/PycharmProjects/CsvReaderForSyncEs/csvReader.py", line 17, in csv_reader
    es.index(index='product', doc_type='prod', id=i, body=json.dump([row for row in reader], file_obj))
  File "/usr/lib/python2.7/json/__init__.py", line 190, in dump
    fp.write(chunk)
IOError: File not open for writing

Upvotes: 3

Views: 12271

Answers (3)

Ngoc Pham

Reputation: 1458

Can you try this? Change reader to DictReader and use json.dumps(row). DictReader turns each input row into a Python dict, and since the for loop already iterates over every row in the reader, pushing row on its own is enough.

import csv
import json

from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
print(es)

def csv_reader(file_obj, delimiter=','):
    reader = csv.DictReader(file_obj)
    i = 1
    results = []
    for row in reader:
        print(row)
        es.index(index='product', doc_type='prod', id=i,
                         body=json.dumps(row))
        i = i + 1

        results.append(row)
        print(row)

if __name__ == "__main__":
    with open("/home/Documents/csv/acsv.csv") as f_obj:
        csv_reader(f_obj)
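
For illustration only (the header and values below are made up, not from the question's CSV), this shows what json.dumps(row) produces once reader is a DictReader: a JSON object keyed by the header row rather than a plain list, which is what Elasticsearch expects as a document body.

import csv
import json

# Hypothetical CSV content with a header row; csv.DictReader accepts any
# iterable of lines, so a list of strings works for a quick demo.
sample = ["id,name,price", "1,apple,0.5", "2,pear,0.75"]

for row in csv.DictReader(sample):
    print(json.dumps(row))
    # e.g. {"id": "1", "name": "apple", "price": "0.5"}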

Upvotes: 0

Ashwani Shakya

Reputation: 439

Try the bulk API.

import csv
from elasticsearch import helpers, Elasticsearch

def csv_reader(file_name):
    es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
    with open(file_name, 'r') as outfile:
        reader = csv.DictReader(outfile)
        helpers.bulk(es, reader, index="index_name", doc_type="type")

For more information about the bulk API, see https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html
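
If you need per-document control (for example an explicit _id per row), helpers.bulk also accepts an iterable of action dicts instead of plain rows. A minimal sketch, where the function name, index name, and file path are placeholders rather than anything from the original answer:

import csv

from elasticsearch import helpers, Elasticsearch


def bulk_index(file_name):
    es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
    with open(file_name, 'r') as csv_file:
        reader = csv.DictReader(csv_file)
        # Build one bulk action per CSV row; _source carries the row dict.
        actions = (
            {"_index": "index_name", "_type": "type", "_id": i, "_source": row}
            for i, row in enumerate(reader, start=1)
        )
        helpers.bulk(es, actions)


bulk_index("/home/Documents/csv/acsv.csv")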

Upvotes: 9

piedra

Reputation: 1453

The problem is that you are passing file_obj as a parameter to json.dump, but the file is only open for reading. Check the mode parameter of the open function in this link.

Also check the first parameter of the json.dump function: [row for row in reader] consumes all the remaining rows in the csv file, but you probably just want to pass one row, so the parameter should be row.

And json.dump writes to a file; you probably want the json.dumps function instead, which returns a string (check here).
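
A minimal sketch applying both fixes to the question's code; it uses csv.DictReader (as in the first answer) so each serialized row is a JSON object, which Elasticsearch expects as a document body, rather than a bare list:

import csv
import json

from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'localhost', 'port': 9200}])


def csv_reader(file_obj, delimiter=','):
    reader = csv.DictReader(file_obj)  # each row becomes a dict keyed by the header
    for i, row in enumerate(reader, start=1):
        # json.dumps returns a JSON string instead of writing to a file,
        # and only the current row is serialized.
        es.index(index='product', doc_type='prod', id=i, body=json.dumps(row))


if __name__ == "__main__":
    with open("/home/Documents/csv/acsv.csv") as f_obj:
        csv_reader(f_obj)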

Upvotes: 0
