Pythenx

Reputation: 72

Regex for chat messages

I have a chat conversation log and I want each match to have the form (Stranger|You): message. Here is the format of the chat log:

foofoofoofoofoofooStranger: heyy You: asdasdasdassdasad Stranger: asdasdasd You: Stranger:asdasdasd You: bye You have disconnected.\n\n \n\n \n\x0c

I tried (Stranger:\s|You:\s)(.*?)(Stranger:\s|You:\s), but it doesn't quite work.

Upvotes: 1

Views: 843

Answers (2)

Ryszard Czech

Reputation: 18621

Use

((?:Stranger|You):\s+)((?:(?!(?:Stranger|You):\s).)*)


EXPLANATION

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    (?:                      group, but do not capture:
--------------------------------------------------------------------------------
      Stranger                 'Stranger'
--------------------------------------------------------------------------------
     |                        OR
--------------------------------------------------------------------------------
      You                      'You'
--------------------------------------------------------------------------------
    )                        end of grouping
--------------------------------------------------------------------------------
    :                        ':'
--------------------------------------------------------------------------------
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (                        group and capture to \2:
--------------------------------------------------------------------------------
    (?:                      group, but do not capture (0 or more
                             times (matching the most amount
                             possible)):
--------------------------------------------------------------------------------
      (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
        (?:                      group, but do not capture:
--------------------------------------------------------------------------------
          Stranger                 'Stranger'
--------------------------------------------------------------------------------
         |                        OR
--------------------------------------------------------------------------------
          You                      'You'
--------------------------------------------------------------------------------
        )                        end of grouping
--------------------------------------------------------------------------------
        :                        ':'
--------------------------------------------------------------------------------
        \s                       whitespace (\n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
      )                        end of look-ahead
--------------------------------------------------------------------------------
      .                        any character except \n
--------------------------------------------------------------------------------
    )*                       end of grouping
--------------------------------------------------------------------------------
  )                        end of \2
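
A minimal Python sketch of how this pattern could be applied. The shortened sample log and the names SPEAKER_MESSAGE, log, and pairs are assumptions made for illustration, not part of the question's data.

```python
import re

# Group 1 is the speaker prefix. Group 2 takes one character at a time for as
# long as no new "Stranger: " / "You: " prefix starts at that position.
SPEAKER_MESSAGE = re.compile(
    r"((?:Stranger|You):\s+)((?:(?!(?:Stranger|You):\s).)*)"
)

# Shortened, made-up log in the same shape as the one in the question.
log = "fooStranger: heyy You: ok Stranger: bye You have disconnected."

# Strip the surrounding whitespace from both groups for readability.
pairs = [
    (m.group(1).strip(), m.group(2).strip())
    for m in SPEAKER_MESSAGE.finditer(log)
]

for speaker, message in pairs:
    print(speaker, message)
```

Because "You have disconnected." has no colon after "You", it is not treated as a new prefix and ends up inside the last message, so it may need to be removed separately. Also note that . does not match \n by default, so each message stops at the first line break unless re.DOTALL is passed.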

Upvotes: 1

The fourth bird

Reputation: 163447

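You could change the last capturing group into a positive lookahead, (?=...), so that the next Stranger: or You: prefix is asserted but not consumed and can start the following match.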

To also match the last message, you could add $ as an alternative inside the lookahead so that it also succeeds at the end of the string.

(Stranger:\s|You:\s)(.*?)(?=Stranger:\s|You:\s|$)
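
The difference between the consuming group and the lookahead can be sketched in Python as follows. The sample log and the names CONSUMING, LOOKAHEAD, cleaned, skipped, and messages are assumptions for illustration. The sketch also assumes the trailing whitespace is stripped first: in Python, . does not match \n and $ (without re.MULTILINE) only matches at the very end of the string or just before a final newline, so the last message would otherwise be missed.

```python
import re

# Original attempt: the third group consumes the next prefix, so the
# following message can no longer start a match of its own.
CONSUMING = re.compile(r"(Stranger:\s|You:\s)(.*?)(Stranger:\s|You:\s)")

# Suggested fix: the lookahead only checks for the next prefix (or the end).
LOOKAHEAD = re.compile(r"(Stranger:\s|You:\s)(.*?)(?=Stranger:\s|You:\s|$)")

# Made-up log with trailing whitespace debris similar to the question's.
log = "fooStranger: heyy You: ok Stranger: bye\n\n \n\x0c"
cleaned = log.strip()  # removes the trailing "\n\n \n\x0c"

skipped = CONSUMING.findall(cleaned)   # only the first message is found
messages = LOOKAHEAD.findall(cleaned)  # one (prefix, message) pair per turn

print(skipped)
print(messages)
```

Each message still carries its trailing space (for example "heyy "), so a .strip() on the second group may be wanted before further processing.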


Upvotes: 1
