Reputation: 2660
I'm getting the URIs of PDFs from different sources (local on the phone, Google Drive, etc.). For Dropbox I can read a byte array using the URI as input, but the PDF that I'm getting is not a valid PDF. The Base64 is also not correct.
This is my URI:
content://com.dropbox.android.FileCache/filecache/a54cc030-e2e0-4ef5-8e72-0ac3269a16e1
val inputStream = context.contentResolver.openInputStream(Uri.parse(uri))
val allText = inputStream.bufferedReader().use(BufferedReader::readText)
val base64Image = Base64.encodeToString(allText.toByteArray(), Base64.DEFAULT)
allText content (snippet):
%PDF-1.3
%���������
4 0 obj
<< /Length 5 0 R /Filter /FlateDecode >>
.
.
.
13025
%%EOF
Storing the allText content with a .PDF extension doesn't work.
The format looks good, but when I paste base64Image into https://base64.guru/converter/decode/pdf it reports that it's not correct.
Original PDF content (snippet):
2550 4446 2d31 2e33 0a25 c4e5 f2e5 eba7
f3a0 d0c4 c60a 3420 3020 6f62 6a0a 3c3c
.
.
.
.
0a73 7461 7274 7872 6566 0a31 3330 3235
0a25 2545 4f46 0a
Upvotes: 3
Views: 143
Reputation: 15916
"I can read a byte array using the URI as input. But the PDF that I'm getting is not a valid PDF."
"When storing the allText content with .PDF extension doesn't work."
You're reading the PDF's raw bytes and then storing them in the wrong format (text). A PDF is binary data, so it has to stay in a ByteArray from start to finish.
For example, all valid PDF files are expected to begin with the bytes 25 50 44 46. Your allText content snippet starts with %PDF, which is the ASCII/UTF text representation of those bytes.
Problem:
All this is fine because we can just convert the text characters back into their respective byte values, right? Nope, not all byte values can be correctly recovered back from text format.
example #1: can convert...
input bytes : 25 50 44 46
as text : % P D F
into bytes : 25 50 44 46
example #2: cannot convert (the original data is not recovered, because these bytes are not valid text and each one is swapped for a placeholder character)...
input bytes : 25 C4 E5 F2 E5 EB A7 F3 A0 D0
as text : % � � � � � � � � �
into bytes : 25 EF BF BD EF BF BD ... (each placeholder character is written back as EF BF BD in UTF-8, never as the original byte)
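The two examples above can be reproduced with a short sketch. The helper names roundTripThroughText and toHex are made up for illustration, and UTF-8 is assumed because it is the default charset for both bufferedReader() and toByteArray() in Kotlin.

```kotlin
// Sends bytes through a UTF-8 String and back, which is what
// readText() followed by toByteArray() does to the PDF data.
fun roundTripThroughText(original: ByteArray): ByteArray =
    String(original, Charsets.UTF_8).toByteArray(Charsets.UTF_8)

// Formats a byte array as space-separated, two-digit hex values.
fun toHex(bytes: ByteArray): String =
    bytes.joinToString(" ") { "%02X".format(it) }
```

Passing the bytes 25 50 44 46 through roundTripThroughText returns them unchanged, while an array containing bytes such as C4 or E5 comes back with different contents. Comparing toHex output before and after makes the damage visible.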
Solution:
Try something like below. You want the logic as explained within the code comments...
import java.io.File
import java.util.Base64
fun main(args: Array<String>)
{
//# setup access to your file...
val inFile = File("your-file-path-here.pdf")
val fileSize: Long = inFile.length()
//# read file bytes into a bytes Array (never into a String)...
val inBytes: ByteArray = inFile.readBytes()
//# Make as String (of hex values)...
val hexString = inBytes.joinToString(" ") { "%02X".format(it) }
//# check values as hex... a valid PDF should start with: 25 50 44 46
//print(hexString) //could be long print-out for a big file
//# Make Base64 string from the raw bytes...
//# (java.util.Base64 needs Android API 26+; android.util.Base64.encodeToString(inBytes, Base64.NO_WRAP) also works)
val base64 = Base64.getEncoder().encodeToString(inBytes)
}
"Base64 is also not correct."
(option 1)
Convert the inBytes array from the example code above to Base64 (note: now added there as val base64). Encode the raw bytes, not the hex string and not the text.
(option 2)
Directly read file bytes into a Base64 string with simple...
val bytes = File(filePath).readBytes()
val base64 = Base64.getEncoder().encodeToString(bytes)
Upvotes: 2
Reputation: 1007296
This is my URI:
That is not a file.
val file = File(uri)
That is not how you use a Uri. Use a ContentResolver and openInputStream() to get an InputStream on the content identified by the Uri.
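A minimal sketch of that approach, assuming the PDF is small enough to hold in memory. The function name pdfStreamToBase64 is hypothetical, and java.util.Base64 is assumed to be available (on Android that means API 26 or later). It takes a plain InputStream so it can be tried outside an app; the Android call site is shown only as a comment.

```kotlin
import java.io.InputStream
import java.util.Base64

// Reads every byte from the stream without decoding it as text,
// then Base64-encodes the raw bytes. The stream is closed afterwards.
fun pdfStreamToBase64(input: InputStream): String =
    input.use { Base64.getEncoder().encodeToString(it.readBytes()) }

// Hypothetical Android call site (not runnable outside an app):
// val base64 = context.contentResolver
//     .openInputStream(Uri.parse(uri))
//     ?.let(::pdfStreamToBase64)
```

The difference from the code in the question is that readBytes() replaces bufferedReader().readText(), so the data never passes through a String.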
Note that reading in the entire content, let alone converting it to Base64 in memory, may cause you to encounter OutOfMemoryErrors.
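One possible way to reduce that risk is to encode while copying, so that only a small buffer is held in memory at any time. This is a sketch rather than a drop-in fix: copyAsBase64 is a made-up name, java.util.Base64 is assumed to be available (Android API 26 or later), and the destination is assumed to be something like a file or network stream rather than a String.

```kotlin
import java.io.InputStream
import java.io.OutputStream
import java.util.Base64

// Copies input to output as Base64 text, one buffer at a time.
fun copyAsBase64(input: InputStream, output: OutputStream) {
    input.use { source ->
        // wrap() returns a stream that Base64-encodes whatever is written to it.
        // Closing it writes the final padding and also closes the target stream.
        Base64.getEncoder().wrap(output).use { encoder ->
            source.copyTo(encoder)
        }
    }
}
```

With this shape, the InputStream from openInputStream() could be passed as input and, for example, a FileOutputStream as output, so neither the PDF nor its Base64 form has to fit in memory as a whole.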
Upvotes: 0