How to extract text from the lines following a specific keyword in Python?

Question

I am working on a problem where I have multi-line strings taken from emails that contain a table-style snapshot.

Example below:

Hello,

please provide an update on the following invoice

Invoice#        Status    Invoice_Amount        Account#
646464646       Open      7446.00               53334444
645543333       Open      6443.00               23599499
874646553       Open      6223.50               94744663

Thanks,

My task is to extract the invoice numbers, which in this case are 646464646, 645543333 and 874646553. After looking at a few examples, I know that they normally appear on the lines directly below a heading such as Invoice# or Invoice Numbers.

I am trying to use regular expressions to solve this problem, but I have not been able to build a pattern that matches a keyword like "Invoice#" in the header and extracts the numbers just below that header (the table snapshot can have any number of rows).

My desired output from this example is:

[646464646,645543333,874646553]
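For the layout in this example, where Invoice# is the first column, a regular expression along the following lines would produce that list. This is only a minimal sketch: the names `EMAIL`, `TABLE` and `extract_first_column` are made up for illustration, and it assumes the header line starts with the keyword and every data row starts with the invoice number.

```python
import re

# Hypothetical sample, copied from the example email in the question.
EMAIL = """Hello,

please provide an update on the following invoice

Invoice#        Status    Invoice_Amount        Account#
646464646       Open      7446.00               53334444
645543333       Open      6443.00               23599499
874646553       Open      6223.50               94744663

Thanks,
"""

# Assumption: the header line begins with "Invoice#" and the rows that
# follow it begin with digits. The block ends at the first line that does not.
TABLE = re.compile(
    r"^[ \t]*Invoice#[^\n]*\n"             # header line starting with the keyword
    r"((?:[ \t]*\d+[^\n]*(?:\n|$))+)",     # one or more rows starting with digits
    re.MULTILINE,
)


def extract_first_column(text):
    """Return the leading number of each row under the Invoice# header."""
    match = TABLE.search(text)
    if match is None:
        return []
    rows = match.group(1)
    # Take the first run of digits at the start of every captured row.
    return [int(n) for n in re.findall(r"^[ \t]*(\d+)", rows, re.MULTILINE)]


print(extract_first_column(EMAIL))
```

The key point is `re.MULTILINE`, which lets `^` match at the start of every line, so the pattern can anchor on the header line and then consume the rows below it. Converting to `int` drops leading zeros; keeping the values as strings avoids that if it matters.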

I searched for an existing solution but didn't find any example of matching text on the line after a keyword. Please suggest an approach if you have an idea how to solve this.

Please let me know if further details are required. Thanks.

Edit: The example shown above is not a standard format; it is just one of the emails. Actual emails may present this snapshot differently: there could be more than 4 columns, with different headers and names, and the invoice number could have more or fewer than 9 digits. The only consistent thing, I believe, is the "Invoice#" keyword in the header.
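Given those variations, one possible approach is to locate the header line by the keyword, remember which column it sits in, and then read that column from the rows below. The sketch below is illustrative only: `extract_invoice_numbers` and `SAMPLE` are hypothetical names, and it assumes columns are separated by whitespace and that no cell contains a space (a cell like "Past Due" would shift the column index).

```python
import re

# Hypothetical email with five columns and Invoice# in the third one.
SAMPLE = """Hi team,

Customer  Region  Invoice#    Status  Amount
Acme      EU      12345       Open    10.00
Globex    US      9876543210  Paid    20.50

Thanks,
"""


def extract_invoice_numbers(text, keyword="Invoice#"):
    """Return the numbers found under the `keyword` column header.

    Assumptions: cells are whitespace-separated and contain no spaces,
    and the table ends at the first row whose cell in that column is
    missing or not purely numeric.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        cells = line.split()
        if keyword not in cells:
            continue
        col = cells.index(keyword)          # position of the Invoice# column
        numbers = []
        for row in lines[i + 1:]:
            row_cells = row.split()
            # Stop at a blank line, a short row, or a non-numeric cell.
            if len(row_cells) <= col:
                break
            if not re.fullmatch(r"[0-9]+", row_cells[col]):
                break
            numbers.append(int(row_cells[col]))
        return numbers
    return []


print(extract_invoice_numbers(SAMPLE))
```

Because the column is found from the header rather than hard-coded, this does not depend on how many columns there are or on the number of digits in the invoice number. If the header keyword varies (for example "Invoice Numbers"), the header match would need to be loosened, which this sketch does not attempt.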


Answers (1)
