JLin

Reputation: 23

Parse email body paragraph in PowerShell

I am creating a script to parse an Outlook email body so that I can extract values such as an ID number, date, and name that follow the strings ID: xxxxxx Date: xxxxxx Name: xxxxx. I looked around and could not find anything that lets me take the string after a match.

What I have managed so far is to query Outlook for the emails that were sent by specific users.

    Add-Type -Assembly "Microsoft.Office.Interop.Outlook"
    $Outlook = New-Object -ComObject Outlook.Application
    $namespace = $Outlook.GetNameSpace("MAPI")
    $inbox = $namespace.GetDefaultFolder([Microsoft.Office.Interop.Outlook.OlDefaultFolders]::olFolderInbox)
    foreach ($items in $inbox.items) {
        if (($items.to -like "*email*") -or ($items.cc -like "*email.add*")) {
            $FindID = $items.body
        }
    }

Now that I have the email body in the foreach loop, how can I parse its content?

Somewhere between the paragraphs there will be text like this:

ID: xxxxxxxx
Name: xxxxxxxxx
Date Of Birth : xxxxxxxx

I tested the code below to see whether I could add it to the foreach loop, but it seems I cannot break the body into separate lines.

    $FindID | ForEach-Object {if (($_ -match 'ID:') -and ($_ -match ' ')){$testID = ($_ -split 'ID: ')[1]}}

I get the following result, which contains everything after the ID rather than just the ID.

Sample result when I output $testID:

xxxxxxxx
 Name: xxxxxxxxx 
 Date Of Birth : xxxxxxxx

 Regards,
 xxxxx xxxxx
 xxxxxxxxxxxxxxxxxxxxxxxxxxxxx

How do I get just the values I want? That is the part I am struggling with.

Upvotes: 2

Views: 6860

Answers (2)

user6811411

You'll need a regular expression with (named) capture groups to grab the values. See the example on regex101.com.
Provided $item.body is not HTML and is a single string, this could work:

## Q:\Test\2018\07\24\SO_51492907.ps1
Add-Type -Assembly "Microsoft.Office.Interop.Outlook"
$Outlook = New-Object -ComObject Outlook.Application
$namespace = $Outlook.GetNameSpace("MAPI")
$inbox = $namespace.GetDefaultFolder(
    [Microsoft.Office.Interop.Outlook.OlDefaultFolders]::olFolderInbox)
## see $RE on https://regex101.com/r/1B2rD1/1
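## \r? added before each $ so CRLF line endings do not leave a trailing carriage return in the captures
$RE = [RegEx]'(?sm)ID:\s+(?<ID>.*?)\r?$.*?Name:\s+(?<Name>.*?)\r?$.*?Date Of Birth\s*:\s*(?<DOB>.*?)\r?$.*'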

$Data = ForEach ($item in $inbox.items){
    if (($item.to -like "*email*") -or 
        ($item.cc -like "*email.add*")){
        if (($item.body -match $RE )){
            [PSCustomObject]@{
                ID   = $Matches.ID
                Name = $Matches.Name
                DOB  = $Matches.DOB
            }
        }
    }
}
$Data
$Data | Export-Csv '.\data.csv' -NoTypeInformation

Sample output with the anonymized mail above:

> Q:\Test\2018\07\24\SO_51492907.ps1

ID        Name       DOB
--        ----       ---
xxxxxx... xxxxxxx... xxxxxx...
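
To try named capture groups without Outlook, here is a minimal sketch that runs a similar pattern against a made-up body. The sample text and the `$body`, `$pattern`, and `$result` names are assumptions for illustration only. The pattern is a simplified variant that stops each capture at the line break, not the exact `$RE` above.

```powershell
# Hypothetical sample body; joined with CRLF because Outlook plain-text
# bodies typically use Windows line endings (an assumption here).
$body = @(
    'Hello,'
    ''
    'ID: 12345'
    'Name: Jane Doe'
    'Date Of Birth : 01/02/1990'
    ''
    'Regards,'
    'Someone'
) -join "`r`n"

# Each named group captures the rest of its line; [^\r\n]+ stops at the line break.
$pattern = 'ID:\s*(?<ID>[^\r\n]+)\s+Name:\s*(?<Name>[^\r\n]+)\s+Date Of Birth\s*:\s*(?<DOB>[^\r\n]+)'

if ($body -match $pattern) {
    # $Matches is filled by -match; named groups are available as properties.
    $result = [PSCustomObject]@{
        ID   = $Matches.ID.Trim()
        Name = $Matches.Name.Trim()
        DOB  = $Matches.DOB.Trim()
    }
    $result
}
```

If the three labels can appear in a different order in real mails, a single pattern like this will not match, and the fields would need to be matched separately.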

Upvotes: 2

Theo

Reputation: 61028

I don't have Outlook available at the moment, but I think this will work:

Add-Type -Assembly "Microsoft.Office.Interop.Outlook"
$Outlook = New-Object -ComObject Outlook.Application
$namespace = $Outlook.GetNameSpace("MAPI")
$inbox = $namespace.GetDefaultFolder([Microsoft.Office.Interop.Outlook.OlDefaultFolders]::olFolderInbox)
$inbox.items | Where-Object { $_.To -like "*email*" -or $_.CC -like "*email.add*" } | ForEach-Object {
    $body = $_.body
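    # Lazy captures stop before the next label; dob takes the rest of its line (e.g. 01/02/1990)
    if ($body -match '(?s)ID\s*:\s*(?<id>.+?)\s*Name\s*:\s*(?<name>.+?)\s*Date Of Birth\s*:\s*(?<dob>[^\r\n]+)') {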
        New-Object -TypeName PSObject -Property @{
            'Subject'        = $_.Subject
            'Date Received'  = ([datetime]$_.ReceivedTime).ToString()
            'ID'             = $matches['id']
            'Name'           = $matches['name']
            'Date of Birth'  = $matches['dob']

        }
    }
}
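
An alternative, closer to the line-by-line attempt in the question, is to split the body into lines first and match each line on its own. The body is a single string, so piping it to ForEach-Object processes it once rather than once per line. This is only a sketch: the sample body and the `$fields` hashtable are made up for illustration, and it assumes each label sits on its own line.

```powershell
# Hypothetical sample body with CRLF line endings, standing in for $_.Body.
$body = "Hi,`r`n`r`nID: 12345`r`nName: Jane Doe`r`nDate Of Birth : 01/02/1990`r`n`r`nRegards,`r`nSomeone"

$fields = @{}
# Split on LF with an optional preceding CR so both line-ending styles work.
foreach ($line in ($body -split '\r?\n')) {
    # Match "Label : value" only for the labels of interest; trim surrounding whitespace.
    if ($line -match '^\s*(?<key>ID|Name|Date Of Birth)\s*:\s*(?<value>.+?)\s*$') {
        $fields[$Matches.key] = $Matches.value
    }
}
$fields
```

This version does not depend on the order of the labels, at the cost of keeping only the last value if a label appears more than once.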

Upvotes: 1
