Dave Cassel

Reputation: 8422

Getting binary content from Avro

I have an ExecuteSQL processor that returns a SQL Server varbinary field for a particular row:

select [File]
from dbo.Attachment
where attachmentid=?

The query will find one row. The content gets stored in Avro. The retrieved File could be a text format (CSV, HTML, etc.) or a binary format (PDF, Office docs, images, etc.).

If the content is text, I can run it through ConvertAvroToJSON and then EvaluateJsonPath to get the content that I want. That doesn't work with the binary content, however. When I download the content of a flowfile that has, say, a PowerPoint file, PowerPoint complains about the content.

I'd like to have the Content of my FlowFile be just the binary content (I'll be sending it on to a PutMarkLogic processor later). How can I do that?

Upvotes: 1

Views: 603

Answers (1)

daggett

Reputation: 28564

I haven't tested this, but you could use ExecuteGroovyScript as a workaround to write the binary field directly to the flow file content.

SQL.mydb - add a property with this name to the processor and link it to the required DBCP connection pool.

AttributeWithID - I assume there is a flow file attribute with this name that holds the attachmentid value to use in the SQL query.

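// Take the next flow file from the queue; stop if there is none
def ff=session.get()
if(!ff)return

// SQL.mydb is the connection bound through the processor property of the same name
SQL.mydb.eachRow("""
    select [File]
    from dbo.Attachment
    where attachmentid=${ff.AttributeWithID}
"""){row->
    // Replace the flow file content with the raw bytes of the first column
    ff.write{outStream->
        outStream << row.getBinaryStream(1)
    }
}
REL_SUCCESS << ff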
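
As for why the ConvertAvroToJSON route probably breaks binary files: a likely explanation, which I have not verified against your flow, is that the bytes get represented as a JSON string with one character per byte (code points 0-255, as in Avro's JSON encoding) and are then written back out as UTF-8 text. Any byte above 0x7F then turns into two bytes. The snippet below is a plain Groovy sketch of that effect, not NiFi code; roundTripThroughText is a hypothetical helper and the PNG header bytes are just sample data.

```groovy
import java.nio.charset.StandardCharsets

// Hypothetical helper: mimics a bytes value passing through a JSON string
// and then being written out as UTF-8 text.
byte[] roundTripThroughText(byte[] raw) {
    // Assumption: each byte becomes one character with code point 0-255
    String asText = new String(raw, StandardCharsets.ISO_8859_1)
    // Writing that string as UTF-8 expands every character above 0x7F to two bytes
    return asText.getBytes(StandardCharsets.UTF_8)
}

// First four bytes of a PNG file; 0x89 is outside the ASCII range
byte[] original = [0x89, 0x50, 0x4E, 0x47] as byte[]
byte[] mangled = roundTripThroughText(original)

println "original: ${original.encodeHex()}"   // prints "original: 89504e47"
println "mangled:  ${mangled.encodeHex()}"    // prints "mangled:  c289504e47"
```

Pure ASCII content survives this round trip unchanged, which would match text files working while Office documents do not. Reading the column as a stream, as in the script above, avoids the text conversion entirely.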

Upvotes: 2
