Reputation: 303
I'm trying to upload a PDF file to Confluence using NiFi's ExecuteScript processor. The upload succeeds, but when I download the file and open it, it's blank. There must be something wrong with my conversion. Can anyone help me check?
So this is how I do it:
import org.apache.commons.io.IOUtils
import java.nio.charset.StandardCharsets
flowFile = session.get()
if (!flowFile) return
def text = ''
session.read(flowFile, { inputStream ->
    text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
} as InputStreamCallback)
flowFile = session?.putAttribute(flowFile, "file_content", text)
session.transfer(flowFile, /*ExecuteScript.*/ REL_SUCCESS)
3. ExecuteScript Python - to upload PDF file to Confluence
Here's my code for #3. I think something's wrong here -->
import json
import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder
from org.apache.nifi.processor.io import OutputStreamCallback

class OutputWrite(OutputStreamCallback):
    def __init__(self, obj):
        self.obj = obj
    def process(self, outputStream):
        outputStream.write(bytearray(json.dumps(self.obj).encode('utf-8')))

flowFile = session.get()
if (flowFile != None):
    url = 'https://myconfluence.com/rest/api/content/12345/child/attachment'
    auth = 'myauthorization'
    file_name = 'mypdf.pdf'
    file_content = flowFile.getAttribute('file_content')
    s = requests.Session()
    m = MultipartEncoder(fields={'file': (file_name, file_content, 'application/pdf')})
    headers = {"X-Atlassian-Token":"nocheck", "Authorization":auth, "Content-Type":m.content_type}
    r = s.post(url, data=m, headers=headers, verify=False)
    session.write(flowFile, OutputWrite(json.loads(r.text)))
    session.transfer(flowFile, REL_SUCCESS)
    session.commit()
UPDATE 06/28/2019
I decided to follow Peter's advice and merge codes 1 and 2. It's still not working. Before, the uploaded PDF was 2 MB but blank. Now its size is 0 KB. Any help would be greatly appreciated!
import json
import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder
from org.apache.nifi.processor.io import OutputStreamCallback
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import InputStreamCallback

class PyInputStreamCallback(InputStreamCallback):
    def __init__(self):
        pass
    def process(self, inputStream):
        text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)

class OutputWrite(OutputStreamCallback):
    def __init__(self, obj):
        self.obj = obj
    def process(self, outputStream):
        outputStream.write(bytearray(json.dumps(self.obj).encode('utf-8')))

text = ''
flowFile = session.get()
if(flowFile != None):
    session.read(flowFile, PyInputStreamCallback())
    confluence_attachment_api = flowFile.getAttribute('confluence_attachment_api')
    confluence_authorization = flowFile.getAttribute('confluence_authorization')
    file_name = flowFile.getAttribute('file_name')
    s = requests.Session()
    m = MultipartEncoder(fields={'file': (file_name, text, 'application/pdf')})
    headers = {"X-Atlassian-Token":"nocheck", "Authorization":confluence_authorization, "Content-Type":m.content_type}
    r = s.post(confluence_attachment_api, data=m, headers=headers, verify=False)
    session.write(flowFile, OutputWrite(json.loads(r.text)))
    session.transfer(flowFile, REL_SUCCESS)
    session.commit()
Upvotes: 1
Views: 678
Reputation: 9712
It doesn't look like you are actually sending the FlowFile contents. Instead, you are sending an attribute named file_content as the file contents, which probably isn't what you intended.
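A further reason the attribute approach is likely to give a blank PDF: in the question's first script the attribute is filled by decoding the file as UTF-8 text, and a PDF is binary. The sketch below is plain Python 3 rather than NiFi code, and the sample bytes are only an illustrative PDF-like header, not the asker's file. It shows that bytes which are not valid UTF-8 do not survive a decode and re-encode round trip. Java's decoder is expected to behave in a similar way (substituting a replacement character for malformed input), but that is an assumption here, not something this snippet verifies.

```python
# Illustrative sample: a "%PDF" header line followed by a line of
# high-bit bytes, similar to what many PDF files start with.
pdf_bytes = b'%PDF-1.4\n%\xe2\xe3\xcf\xd3\n'

# Decode as UTF-8, replacing every invalid sequence with U+FFFD.
as_text = pdf_bytes.decode('utf-8', errors='replace')

# Encoding the text again does not restore the original bytes,
# because the replaced sequences are gone for good.
round_tripped = as_text.encode('utf-8')

print(round_tripped == pdf_bytes)
```

The file keeps a plausible size after such a round trip, but its binary streams are no longer what the PDF reader expects, which is consistent with an attachment that opens as blank pages.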
You will need to do a session.read to get the file stream. The code below doesn't work as is, but shows how you can get access to the stream.
class PyInputStreamCallback(InputStreamCallback):
    def __init__(self):
        pass
    def process(self, inputStream):
        m = MultipartEncoder(fields={'file': (file_name, inputStream, 'application/pdf')})

session.read(flowFile, PyInputStreamCallback())
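On the 0 KB result in the question's update, one likely cause is Python scoping: assigning to text inside process() creates a local variable, so the module-level text stays an empty string. The sketch below reproduces that in plain Python, outside NiFi. The names fake_session_read, LocalAssignCallback and CapturingCallback are hypothetical stand-ins invented for this illustration, and io.BytesIO stands in for the Java InputStream that NiFi would pass to the callback.

```python
import io

text = ''  # module-level variable, as in the merged script


class LocalAssignCallback(object):
    """Mirrors the question's callback: assigns to a local name."""

    def process(self, input_stream):
        # This "text" is local to process() and is discarded on return.
        text = input_stream.read()


class CapturingCallback(object):
    """Keeps the bytes on the instance so the caller can read them later."""

    def __init__(self):
        self.content = None

    def process(self, input_stream):
        self.content = input_stream.read()


def fake_session_read(data, callback):
    # Hypothetical stand-in for session.read(flowFile, callback).
    callback.process(io.BytesIO(data))


pdf_bytes = b'%PDF-1.4\n%\xe2\xe3\xcf\xd3\n'

# Variant 1: the module-level "text" is never updated.
fake_session_read(pdf_bytes, LocalAssignCallback())
lost = text

# Variant 2: the bytes are available on the callback after the read.
capturing = CapturingCallback()
fake_session_read(pdf_bytes, capturing)
kept = capturing.content
```

In the actual ExecuteScript script, the callback would extend InputStreamCallback, and IOUtils.toByteArray(inputStream) from commons-io is one way to get the raw bytes without any charset decoding. Whether the resulting Java byte array needs converting before it is handed to MultipartEncoder is not something this sketch verifies.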
Ref: https://community.hortonworks.com/articles/75545/executescript-cookbook-part-2.html
Upvotes: 1