Maxymus

Reputation: 1480

How to convert a PDF file to a DataTable

Is there any way to convert a PDF file to a DataTable? The PDF file consists mainly of tables. Any help will be highly appreciated.

Upvotes: 1

Views: 4831

Answers (2)

suneelsarraf

Reputation: 953

using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

    // Reads every page of the PDF as text and builds a DataTable from it.
    // Assumes the first non-empty line holds the column names and that
    // cells are separated by spaces (a cell that contains a space will be split).
    public DataTable ImportPDF(string filename)
    {
        List<string> lines = new List<string>();
        PdfReader reader = new PdfReader(filename);
        try
        {
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                // Use a fresh strategy per page; it orders text chunks by position.
                ITextExtractionStrategy strategy = new LocationTextExtractionStrategy();
                string pageText = PdfTextExtractor.GetTextFromPage(reader, page, strategy);
                lines.AddRange(pageText.Split('\n'));
            }
        }
        finally
        {
            // Close the reader even if extraction fails; errors propagate to the caller.
            reader.Close();
        }

        List<string[]> rows = lines
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => line.Trim().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            .ToList();

        DataTable table = new DataTable();
        if (rows.Count == 0)
        {
            return table;
        }

        // The first row supplies the column names (duplicate names will throw).
        foreach (string columnName in rows[0])
        {
            table.Columns.Add(columnName);
        }

        foreach (string[] row in rows.Skip(1))
        {
            // Rows.Add throws if a row has more values than columns, so trim extras.
            table.Rows.Add(row.Take(table.Columns.Count).ToArray());
        }
        return table;
    }

Upvotes: 2

mark stephens

Reputation: 3184

If the PDF contains marked content (my blog article at http://www.jpedal.org/PDFblog/2010/09/the-easy-way-to-discover-if-a-pdf-file-contains-structured-content/ shows how to check for this), you can extract the table structure directly from the PDF file. Otherwise you will need to extract the text and try to guess the structure from it.
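
As a rough illustration of the "guess the structure" route, here is a minimal sketch that turns already-extracted text lines into a DataTable. The class name TableGuesser and the method FromLines are hypothetical, and the heuristic is an assumption: it treats two or more consecutive whitespace characters as a column gap and the first non-empty line as the header. Whether your extractor preserves wide gaps like that depends on the PDF and the library, so treat this as a starting point rather than a general solution.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Text.RegularExpressions;

public static class TableGuesser
{
    // Hypothetical helper: builds a DataTable from already-extracted text lines.
    // Assumption: cells are separated by two or more whitespace characters,
    // and the first non-empty line is the header row.
    public static DataTable FromLines(IEnumerable<string> lines)
    {
        List<string[]> rows = lines
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => Regex.Split(line.Trim(), @"\s{2,}"))
            .ToList();

        DataTable table = new DataTable();
        if (rows.Count == 0)
        {
            return table;
        }

        foreach (string name in rows[0])
        {
            table.Columns.Add(name);
        }

        foreach (string[] row in rows.Skip(1))
        {
            // Skip lines whose cell count does not match the header;
            // they are probably titles, footers, or wrapped text.
            if (row.Length == table.Columns.Count)
            {
                table.Rows.Add(row);
            }
        }
        return table;
    }
}
```

Splitting on wide gaps instead of single spaces keeps a cell such as "Unit Price" in one piece, and dropping lines that do not match the header width filters out most page titles and footers. Both choices can misfire on real documents, so check the result against the source PDF.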

Upvotes: 1
