user1927396

Reputation: 459

Parsing a piece of data to get id and email in one shot

I need a suggestion on whether this can be done. The following data comes out of a command (this is just one record; the real output contains many such blocks). Is there a way to use grep and awk to parse each block and get the number and the owner's email in one shot, like below?

Output:-

12345 [email protected]

Input:-
change I5e55796844350e543f8460c53ec6e755ebe663d4
  project: platform/vendor/company-proprietary/chip
  branch: master
  id: I5e55796844350e543f8460c53ec6e755ebe663d4
  number: 12345
  subject: chip: changes to tl logging structure
  owner:
    name: Gord barry
    email: [email protected]
    username: mbarry
  url: https://review-android.quicinc.com/12345
  commitMessage: chip: changes to tl logging structure

                 The existing TL logging has been divided into three distinct modules:
                 TL_BA (14), TL_HO (13) and TL (existing module). Thus the log with
                 loglevel 5 in file chip_qct_tl_hostsupport.c can be viewed by issuing
                 the following command - iwpriv chip0 setchipdbg 13 5 1.

                 Change-Id: I5e55796844350e543f8460c53ec6e755ebe663d4
  createdOn: 2012-08-09 15:40:57 PDT
  lastUpdated: 2012-08-21 16:43:08 PDT
  sortKey: 001f390f00023ead
  open: true
  status: NEW
  currentPatchSet:
    number: 3
    revision: 922872178946a712ab9f04483bc93216573cec6e
    parents:
 [ae259408e6ab530be62e02fdeafef34834d68709]
    ref: refs/changes/17/12345/3
    uploader:
      name: Gord barry
      email: [email protected]
      username: mbarry
    createdOn: 2012-08-21 16:43:08 PDT
    files:
      file: /COMMIT_MSG
      type: ADDED
    files:
      file: rich/CORE/TL/inc/tlDebug.h
      type: MODIFIED
    files:
      file: rich/CORE/TL/inc/chip_qct_tl.h
      type: MODIFIED
    files:
      file: rich/CORE/TL/src/chip_qct_tl.c
      type: MODIFIED
    files:
      file: rich/CORE/TL/src/chip_qct_tl_ba.c
      type: MODIFIED
    files:
      file: rich/CORE/TL/src/chip_qct_tl_hosupport.c
      type: MODIFIED
    files:
      file: rich/CORE/VOSS/inc/vos_types.h
      type: MODIFIED
    files:
      file: rich/CORE/VOSS/src/vos_trace.c
      type: MODIFIED
    files:
      file: rich/CORE/WDA/src/chip_qct_wda_ds.c
      type: MODIFIED
    files:
      file: rich/CORE/WDI/TRP/DTS/src/chip_qct_wdi_dts.c
      type: MODIFIED

Upvotes: 0

Views: 308

Answers (2)

Chris Seymour

Reputation: 85845

Using grep in one shot:

$ grep -Po '(?<=(email|umber): )\S+' file
12345 
[email protected]
3 
[email protected]

Use xargs -n2 to get both on one line:

$ grep -Po '(?<=(email|umber): )\S+' file | xargs -n2
12345 [email protected]
3 [email protected]

Or use paste - - to join each pair with a tab:

$ grep -Po '(?<=(email|umber): )\S+' file | paste - -
12345   [email protected]
3       [email protected]

Explanation:

This is a positive lookbehind: '(?<=a)b' matches b only when it is preceded by a. In your case you want to match the string following email: or number:. However, positive lookbehinds have to be fixed length, so we have to drop the n in number (email and umber are both five characters). \S+ matches one or more non-whitespace characters.

(?<=   # Positive lookbehind 
(      # Group for alternation
email  # Literal string email
|      # Alternation (or)
umber  # Literal string umber
)      # Close group
:      # Literal colon and single space
)      # Close positive lookbehind
\S+    # One or more non-whitespace characters
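
As a possible alternative to trimming the word (a sketch, not part of the command above): PCRE's \K discards everything matched so far from the reported match, so the full word number can be used without the fixed-length restriction. This assumes GNU grep built with PCRE support (-P). The sample lines are a trimmed copy of the posted input, and user@example.com is a placeholder address.

```shell
# Trimmed sample input; user@example.com is a placeholder address.
sample='  number: 12345
  owner:
    email: user@example.com'

# \K drops "number: " or "email: " from the reported match, so only the value prints.
# paste -d" " - - then joins each pair of output lines with a single space.
result=$(printf '%s\n' "$sample" | grep -Po '(email|number): \K\S+' | paste -d' ' - -)
echo "$result"
```

Like the lookbehind version, this still picks up every number: and email: line in the file, including the patch set number and the uploader email.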

With awk:

$ awk -F: '/email|number/{print $2}' file | xargs -n2
12345 [email protected]
3 [email protected]
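
If the output contains several change blocks and only the change number and the owner's email are wanted (not the patch set number and the uploader's email), a single awk pass can use the indentation to tell them apart. A minimal sketch, assuming every block has the same layout as the posted sample (number: at two spaces of indentation, owner: followed by an email: at four spaces). The function name owner_emails is made up for illustration.

```shell
# Hypothetical helper: print "<change number> <owner email>" for every change block.
# Assumes the indentation of the posted sample.
owner_emails() {
  awk '
    /^  number:/ { num = $2 }               # change number (two-space indent only)
    /^  owner:/  { in_owner = 1; next }     # entering the owner sub-block
    in_owner && /^    email:/ {             # owner email (four-space indent only)
      print num, $2
      in_owner = 0
    }
    /^  [^ ]/    { in_owner = 0 }           # any other top-level key ends the owner block
  ' "$@"
}
```

It reads the files given as arguments, or standard input when there are none, so the command output could be piped straight into owner_emails.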

Upvotes: 3

Ed Morton

Reputation: 203995

Try these:

awk -F'[[:space:]:]+' '{a[$2]=$3} END{ print a["number"], a["email"] }' file
awk -F'[[:space:]:]+' '{a[$2]=$3} /email:/{ print a["number"], a["email"] }' file
awk -F'[[:space:]:]+' '{a[$2]=$3} /email:/{ print a["number"], a["email"]; exit }' file

If none of those is what you're looking for, then provide more details on what it is you're looking for.

Here's how the last script above works for me with the posted sample input:

$ head -15 file
change I5e55796844350e543f8460c53ec6e755ebe663d4
  project: platform/vendor/company-proprietary/chip
  branch: master
  id: I5e55796844350e543f8460c53ec6e755ebe663d4
  number: 12345
  subject: chip: changes to tl logging structure
  owner:
    name: Gord barry
    email: [email protected]
    username: mbarry
  url: https://review-android.quicinc.com/12345
  commitMessage: chip: changes to tl logging structure

                 The existing TL logging has been divided into three distinct modules:
                 TL_BA (14), TL_HO (13) and TL (existing module). Thus the log with

$ awk -F'[[:space:]:]+' '{a[$2]=$3} /email:/{ print a["number"], a["email"]; exit }' file
12345 [email protected]

Upvotes: 1
