Dinosaurius

Reputation: 8638

NiFi crashes when ingesting data

I use ListenHTTP as the entry point for my NiFi flow.

I send data from a CSV file that is around 100 MB in total:

import time

import pandas as pd
import requests

url = 'http://localhost:8085/contentListener'

df = pd.read_csv('demo_dataset.csv')

# Send each CSV row to ListenHTTP as a separate JSON document.
for i in df.index:
    data = df.iloc[i].to_json()
    r = requests.post(url, data=data, allow_redirects=True)
    time.sleep(0.1)  # pause 100 ms between requests

The problem is that NiFi crashes after processing around 3000 entries. I then have to restart it (before restarting, I also manually empty the logs and flowfile_repository folders).

Is there any parameter in the ListenHTTP processor, or in NiFi itself, that would help solve this issue?

Upvotes: 1

Views: 694

Answers (1)

Andy

Reputation: 14194

The question is a little unclear: does Apache NiFi crash after handling 3000 files of 100 MB each, or after 3000 lines from a single file? In the first case, I would imagine that it is a storage/heap problem. Can you please provide a stacktrace from $NIFI_HOME/logs/nifi-app.log?

If you are able to use NiFi 1.2.0+, I would recommend using the record processors to do your operations, as the performance is much better and the flow is easier to design. You can send the 100 MB CSV file in a single operation (or use GetFile) and have various processors operate on each line in the file independently.
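
As a rough sketch of the "single operation" idea, the client could POST the whole CSV file in one request instead of one request per row. The helper name post_csv_file and the text/csv content type are assumptions for illustration, and the URL in the usage comment is simply the one from the question. It uses only the Python standard library, and it reads the file into memory, which should be acceptable at around 100 MB:

```python
import urllib.request


def post_csv_file(path, url, timeout=60):
    """POST the entire file at `path` to `url` in one request.

    Returns the HTTP status code of the response.
    `post_csv_file` is a hypothetical helper, not part of NiFi or any library.
    """
    # Read the whole file; for files much larger than ~100 MB,
    # a streaming upload would be preferable.
    with open(path, "rb") as f:
        body = f.read()

    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "text/csv"},  # assumed content type
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example usage (requires ListenHTTP to be running on this URL):
# status = post_csv_file("demo_dataset.csv", "http://localhost:8085/contentListener")
```

On the NiFi side, the resulting single flowfile would then be handled line by line by the record processors mentioned above, rather than arriving as thousands of small flowfiles.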

Upvotes: 1
