Maher Shahmeer

Reputation: 148

PowerPoint to Text C# - Microsoft.Interop

I have been trying to read .ppt files for the last 3 days. I searched a lot on the internet and found various source code snippets, but none of them worked properly. Now I have tried the code below. It never prints "Check4" because something in the foreach statement throws an exception that I cannot identify. Please guide me; I need this badly.

public static void ppt2txt(String source)
{
    string fileName = System.IO.Path.GetFileNameWithoutExtension(source);
    string filePath = System.IO.Path.GetDirectoryName(source);
    Console.Write("Check1");
    Application pa = new Microsoft.Office.Interop.PowerPoint.ApplicationClass();
    Microsoft.Office.Interop.PowerPoint.Presentation pp = pa.Presentations.Open(source,
        Microsoft.Office.Core.MsoTriState.msoTrue,
        Microsoft.Office.Core.MsoTriState.msoFalse,
        Microsoft.Office.Core.MsoTriState.msoFalse);
    Console.Write("Check2");
    String pps = "";
    Console.Write("Check3");
    foreach (Microsoft.Office.Interop.PowerPoint.Slide slide in pp.Slides)
    {
        foreach (Microsoft.Office.Interop.PowerPoint.Shape shape in slide.Shapes)
            pps += shape.TextFrame.TextRange.Text.ToString ();
    }

    Console.Write("Check4");

    Console.WriteLine(pps);
}

The exception thrown is:

System.ArgumentException: The specified value is out of range.
   at Microsoft.Office.Interop.PowerPoint.TextFrame.get_TextRange()
   at KareneParser.Program.ppt2txt(String source) in c:\Users\Shahmeer\Desktop\New folder (2)\KareneParser\Program.cs:line 323
   at KareneParser.Program.Main(String[] args) in c:\Users\Shahmeer\Desktop\New folder (2)\KareneParser\Program.cs:line 150

Line 323, where the exception is thrown:

pps += shape.TextFrame.TextRange.Text.ToString ();

Thanks in advance.

Upvotes: 3

Views: 3793

Answers (2)

Daniel Lane

Reputation: 2593

It looks like you need to check your shape objects to see if they have a TextFrame and Text present.

In your nested foreach loop try this:

foreach (Microsoft.Office.Interop.PowerPoint.Slide slide in pp.Slides)
{
    foreach (Microsoft.Office.Interop.PowerPoint.Shape shape in slide.Shapes)
    {
        if (shape.HasTextFrame == Microsoft.Office.Core.MsoTriState.msoTrue)
        {
            var textFrame = shape.TextFrame;
            if (textFrame.HasText == Microsoft.Office.Core.MsoTriState.msoTrue)
            {
                var textRange = textFrame.TextRange;
                pps += textRange.Text;
            }
        }
    }
}

This is untested on my part, but it looks as though your foreach loops are trying to access shapes in the PowerPoint document that have no text, hence the out-of-range exception. I've added checks so that text is appended to your pps string only when the shape actually has text.
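For completeness, here is a rough sketch of how the same checks might fit into a whole method. It is equally untested and makes a few assumptions that are not in the question: the class and method names (PptText, Extract) are made up, a StringBuilder replaces the string concatenation, each shape's text goes on its own line, and the presentation and the PowerPoint instance are closed in a finally block. It also uses `new PowerPoint.Application()` where the question uses ApplicationClass; which one compiles depends on whether interop types are embedded in your project.

```csharp
using System;
using System.Text;
using Microsoft.Office.Core;
using PowerPoint = Microsoft.Office.Interop.PowerPoint;

public static class PptText
{
    // Hypothetical helper: returns the text of every shape that actually has text.
    public static string Extract(string source)
    {
        var text = new StringBuilder();
        var app = new PowerPoint.Application();
        PowerPoint.Presentation presentation = null;
        try
        {
            // Same arguments as in the question: read-only, not untitled, no window.
            presentation = app.Presentations.Open(source,
                MsoTriState.msoTrue, MsoTriState.msoFalse, MsoTriState.msoFalse);

            foreach (PowerPoint.Slide slide in presentation.Slides)
            {
                foreach (PowerPoint.Shape shape in slide.Shapes)
                {
                    // Skip shapes that have no text frame at all.
                    if (shape.HasTextFrame != MsoTriState.msoTrue) continue;
                    // Skip text frames that contain no text.
                    if (shape.TextFrame.HasText != MsoTriState.msoTrue) continue;

                    text.AppendLine(shape.TextFrame.TextRange.Text);
                }
            }
        }
        finally
        {
            // Close the file and shut PowerPoint down even if something throws.
            if (presentation != null) presentation.Close();
            app.Quit();
        }
        return text.ToString();
    }
}
```

You would call it as `Console.WriteLine(PptText.Extract(source));` in place of the loops in ppt2txt.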

Upvotes: 2

LocEngineer

Reputation: 2917

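Not all shapes have text; lines, for example, are shapes too. Check HasTextFrame and HasText first. Both return an MsoTriState rather than a bool, so compare them against msoTrue:

foreach (Microsoft.Office.Interop.PowerPoint.Shape shape in slide.Shapes)
{
    if (shape.HasTextFrame == Microsoft.Office.Core.MsoTriState.msoTrue
        && shape.TextFrame.HasText == Microsoft.Office.Core.MsoTriState.msoTrue)
    {
        pps += shape.TextFrame.TextRange.Text;
    }
}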

Upvotes: 0
