Sharan

Reputation: 11

Image to PDF conversion without using third party library in C#

I need to convert image files to PDF in C# without using third-party libraries. The images can be in any common format, such as .jpg, .jpeg, .png, or .tiff.

I can already do this with iTextSharp; here is the code.

    // Requires: using System; using System.Collections.Generic; using System.IO;
    //           using iTextSharp.text; using iTextSharp.text.pdf;
    string value = string.Empty; // value contains the data from a json file
    List<string> sampleData;

    public void convertdata()
    {
        // The first line of json.txt holds a JSON array of Base64-encoded images.
        var jsonD = System.IO.File.ReadAllLines(@"json.txt");
        sampleData = Newtonsoft.Json.JsonConvert.DeserializeObject<List<string>>(jsonD[0]);
        Document document = new Document();

        using (var stream = new FileStream("test111.pdf", FileMode.Create, FileAccess.Write, FileShare.None))
        {
            PdfWriter.GetInstance(document, stream);
            document.Open();

            foreach (var item in sampleData)
            {
                // Decode each Base64 string into raw image bytes and add it to the document.
                byte[] newdata = Convert.FromBase64String(item);
                var image = iTextSharp.text.Image.GetInstance(newdata);
                document.Add(image);
            }
            document.Close();
        }
        Console.WriteLine("Conversion done, check folder");
    }

But now I need to do the same thing without any third-party library.

I have searched the internet but could not find a proper answer. Everything I find suggests using iTextSharp, PdfSharp, or the Ghostscript API.

Would someone suggest a possible solution?

Upvotes: 1

Views: 5114

Answers (3)

Dejan Mauer

Reputation: 106

I would write my own ASP.NET Web Service or Web API service and call it within the app :)

Upvotes: 0

Eugene

Reputation: 2878

The structure of the simplest PDF document, with a single page and a single image, is the following:

- pdf header
- pdf document catalog
- pages info 
- image 
  - image header
  - image data 
- page
  - reference to image
- list of references to objects inside pdf document   

Check this Python code, which performs the following steps to convert an image to PDF:

  1. Writes the PDF header;
  2. Checks the image data to find which filter to use. It is better to pick just one format, such as the FlateDecode filter (used by PDF to compress images without loss);
  3. Writes the "catalog" object, which points to the page tree, i.e. the "pages" object holding the array of references to page objects;
  4. Writes the image object header;
  5. Writes the image data (pixel by pixel, converted to the given codec format) as the "stream" object in the PDF;
  6. Writes the "page" object, which references the "image" object;
  7. Writes the cross-reference table and the "trailer" section, with the set of references to objects inside the PDF and their starting offsets. The PDF format stores these object references at the end of the document.
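To make the steps above more concrete, here is a minimal C# sketch of my own (not a port of the Python code) that uses only the base class library. To keep it short it assumes the image is already an 8-bit RGB JPEG, so the bytes can go into the PDF as-is with the DCTDecode filter instead of FlateDecode, and that the caller already knows the pixel dimensions. The class and method names (`MinimalJpegPdf`, `Build`) are made up for illustration, and the local functions need C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text;

public static class MinimalJpegPdf
{
    // Hypothetical helper: wraps already-encoded JPEG bytes in a one-page PDF.
    // Assumes an 8-bit RGB JPEG; grayscale or CMYK files would need another /ColorSpace.
    public static byte[] Build(byte[] jpeg, int width, int height)
    {
        using (var ms = new MemoryStream())
        {
            var offsets = new List<long>(); // byte offset of every object, needed for the xref table

            void Write(string s)
            {
                byte[] b = Encoding.ASCII.GetBytes(s);
                ms.Write(b, 0, b.Length);
            }
            void BeginObject()
            {
                offsets.Add(ms.Position);
                Write(offsets.Count + " 0 obj\n");
            }

            Write("%PDF-1.4\n");                                        // header

            BeginObject();                                               // 1 0: document catalog
            Write("<< /Type /Catalog /Pages 2 0 R >>\nendobj\n");

            BeginObject();                                               // 2 0: pages info
            Write("<< /Type /Pages /Kids [4 0 R] /Count 1 >>\nendobj\n");

            BeginObject();                                               // 3 0: image header + data
            Write("<< /Type /XObject /Subtype /Image /Width " + width + " /Height " + height +
                  " /ColorSpace /DeviceRGB /BitsPerComponent 8 /Filter /DCTDecode" +
                  " /Length " + jpeg.Length + " >>\nstream\n");
            ms.Write(jpeg, 0, jpeg.Length);                              // JPEG bytes copied verbatim
            Write("\nendstream\nendobj\n");

            BeginObject();                                               // 4 0: page referencing the image
            Write("<< /Type /Page /Parent 2 0 R /MediaBox [0 0 " + width + " " + height + "]" +
                  " /Resources << /XObject << /Im0 3 0 R >> >> /Contents 5 0 R >>\nendobj\n");

            // Content stream: scale the 1x1 image square to the page size and paint it.
            string content = "q " + width + " 0 0 " + height + " 0 0 cm /Im0 Do Q";
            BeginObject();                                               // 5 0: page content
            Write("<< /Length " + content.Length + " >>\nstream\n" + content + "\nendstream\nendobj\n");

            long xrefOffset = ms.Position;                               // list of object offsets
            Write("xref\n0 " + (offsets.Count + 1) + "\n");
            Write("0000000000 65535 f \n");                              // each entry is exactly 20 bytes
            foreach (long off in offsets)
                Write(off.ToString("D10", CultureInfo.InvariantCulture) + " 00000 n \n");

            Write("trailer\n<< /Size " + (offsets.Count + 1) + " /Root 1 0 R >>\n" +
                  "startxref\n" + xrefOffset + "\n%%EOF\n");
            return ms.ToArray();
        }
    }
}
```

Usage would be something like `File.WriteAllBytes("out.pdf", MinimalJpegPdf.Build(File.ReadAllBytes("in.jpg"), 640, 480));`, where the file names and dimensions are placeholders. One page per image, one pixel per PDF unit, no metadata: it is only meant to show how little structure the simplest case needs.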

Upvotes: 0

David van Driessche

Reputation: 7046

This is doable but not practical in the sense that it would very likely take way too much time for you to implement. The general procedure is:

  1. Open the image file format
  2. Either copy the encoded bytes verbatim to a stream in a PDF document you have created or decode the image data and re-encode it in a PDF stream (whether it's the former or latter depends on the image format)
  3. Save the PDF

This looks easy (it's only three points after all :-)) but when you start to investigate you'll see that it's very complicated.

First of all you need to understand enough of the PDF specification to write a new PDF file from scratch, doing all of the right things. The PDF specification is way over 1000 pages by now; you don't need all of it but you need to support a good portion of it to write a proper PDF document.

Secondly you will need to understand every image file format you want to support. That by itself is not trivial (the TIFF file format for example is so broad that it's a nightmare to support a reasonable fraction of TIFF files out there). In some cases you'll be able to simply copy the bulk of an image file into your PDF document (JPEG files fall in that category for example). That's a complication you want to support, because uncompressing the JPEG file and then recompressing it in a PDF stream will cause quality loss.
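To give a feel for what "understanding the file format" means even in the easy case, here is a rough C# sketch using only the base class library. It sniffs the format from the leading signature bytes and, for JPEG, walks the marker segments to find the frame header that holds the pixel dimensions you would need when copying the file verbatim into a PDF. The names `ImageSniffer`, `DetectFormat` and `TryReadJpegSize` are invented for this example, and the parser assumes a well-formed file with no standalone markers before the frame header.

```csharp
using System;

public static class ImageSniffer
{
    // Identify the container from its well-known signature bytes.
    public static string DetectFormat(byte[] d)
    {
        if (StartsWith(d, 0xFF, 0xD8, 0xFF)) return "jpeg";
        if (StartsWith(d, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)) return "png";
        if (StartsWith(d, 0x49, 0x49, 0x2A, 0x00) ||      // "II*\0": little-endian TIFF
            StartsWith(d, 0x4D, 0x4D, 0x00, 0x2A))        // "MM\0*": big-endian TIFF
            return "tiff";
        return "unknown";
    }

    private static bool StartsWith(byte[] d, params byte[] signature)
    {
        if (d == null || d.Length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
            if (d[i] != signature[i]) return false;
        return true;
    }

    // Walk the JPEG segments until a start-of-frame (SOF) marker is found.
    public static bool TryReadJpegSize(byte[] d, out int width, out int height)
    {
        width = 0;
        height = 0;
        if (d == null || d.Length < 4 || d[0] != 0xFF || d[1] != 0xD8) return false;

        int i = 2;                                         // skip the SOI marker
        while (i + 3 < d.Length)
        {
            if (d[i] != 0xFF) return false;                // not at a marker: give up
            byte marker = d[i + 1];
            if (marker == 0xFF) { i++; continue; }         // fill byte before a marker
            if (marker == 0xD9 || marker == 0xDA) return false; // end of image / scan data reached

            int length = (d[i + 2] << 8) | d[i + 3];       // big-endian, includes its own 2 bytes
            bool isStartOfFrame = marker >= 0xC0 && marker <= 0xCF &&
                                  marker != 0xC4 && marker != 0xC8 && marker != 0xCC;
            if (isStartOfFrame)
            {
                if (i + 8 >= d.Length) return false;
                // Layout after the length: precision (1 byte), height (2), width (2).
                height = (d[i + 5] << 8) | d[i + 6];
                width = (d[i + 7] << 8) | d[i + 8];
                return true;
            }
            if (length < 2) return false;                  // corrupt segment length
            i += 2 + length;                               // jump to the next segment
        }
        return false;
    }
}
```

This is the cheap path: for JPEG you only need the dimensions (plus the colour space, which this sketch ignores) and can leave the compressed data untouched. PNG and TIFF generally require decoding the pixel data, which is where most of the effort goes.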

So... possible? Yes. Plausible? No. Unless you have lots and lots of time to complete this project.

Upvotes: 1
