Shax

Reputation: 4307

Lua patterns - How to remove unwanted string within a string

I am receiving a bunch of rows like the ones below:

2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    <a href="/cgi-bin/recon_detail?rectent=AAUUM&benificary=ARRTC&period=2011/02&svctype=Voice">AAUUM_ARRTC_0211_TBT</a>    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    <a href="/cgi-bin/recon_detail?rectent=AAUUM&benificary=ARRTC&period=2011/02&svctype=Voice">AAUUM_ARRTC_0211_TBT</a>    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    <a href="/cgi-bin/recon_detail?rectent=AAUUM&benificary=ARRTC&period=2011/02&svctype=Voice">AAUUM_ARRTC_0211_TBT</a>    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    <a href="/cgi-bin/recon_detail?rectent=AAUUM&benificary=ARRTC&period=2011/02&svctype=Voice">AAUUM_ARRTC_0211_TBT</a>    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    <a href="/cgi-bin/recon_detail?rectent=AAUUM&benificary=ARRTC&period=2011/02&svctype=Voice">AAUUM_ARRTC_0211_TBT</a>    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    <a href="/cgi-bin/recon_detail?rectent=AAUUM&benificary=ARRTC&period=2011/02&svctype=Voice">AAUUM_ARRTC_0211_TBT</a>    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    <a href="/cgi-bin/recon_detail?rectent=AAUUM&benificary=ARRTC&period=2011/02&svctype=Voice">AAUUM_ARRTC_0211_TBT</a>    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    <a href="/cgi-bin/recon_detail?rectent=AAUUM&benificary=ARRTC&period=2011/02&svctype=Voice">AAUUM_ARRTC_0211_TBT</a>    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    <a href="/cgi-bin/recon_detail?rectent=AAUUM&benificary=ARRTC&period=2011/02&svctype=Voice">AAUUM_ARRTC_0211_TBT</a>    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    <a href="/cgi-bin/recon_detail?rectent=AAUUM&benificary=ARRTC&period=2011/02&svctype=Voice">AAUUM_ARRTC_0211_TBT</a>    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    <a href="/cgi-bin/recon_detail?rectent=AAUUM&benificary=ARRTC&period=2011/02&svctype=Voice">AAUUM_ARRTC_0211_TBT</a>    18.03   18.03   EUR 1.14977 20.73   20.73

In every row, one value arrives wrapped in HTML tags, like this:

<a href="/cgi-bin/recon_detail?rectent=AAUUM&benificary=ARRTC&period=2011/02&svctype=Voice">AAUUM_ARRTC_0211_TBT</a>

I want to get rid of these HTML tags and replace the whole element with the actual value inside it, which here is AAUUM_ARRTC_0211_TBT.

After processing, the data above should look like this:

2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    AAUUM_ARRTC_0211_TBT    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    AAUUM_ARRTC_0211_TBT    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    AAUUM_ARRTC_0211_TBT    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    AAUUM_ARRTC_0211_TBT    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    AAUUM_ARRTC_0211_TBT    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    AAUUM_ARRTC_0211_TBT    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    AAUUM_ARRTC_0211_TBT    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    AAUUM_ARRTC_0211_TBT    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    AAUUM_ARRTC_0211_TBT    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    AAUUM_ARRTC_0211_TBT    18.03   18.03   EUR 1.14977 20.73   20.73
2011/02 ARRTC   AAUUMCO ZZITNWMOBILE COMMUNICATIONS CENTER  ARRTC-AAUUM-TBT-2011-02 0.00    AAUUM_ARRTC_0211_TBT    18.03   18.03   EUR 1.14977 20.73   20.73

I need to achieve this using Lua patterns. Can anybody help and share their experience?

Thanks

Upvotes: 3

Views: 4280

Answers (2)

Stomp

Reputation: 920

This removes everything enclosed in angle brackets, i.e. every tag, and keeps the text between the tags: s=s:gsub("%b<>", "")
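As a minimal sketch of how this behaves, the snippet below applies the same pattern to a shortened fragment of one row from the question. The variable names `row`, `cleaned`, and `count` are only illustrative. Note that `%b<>` strips every tag it finds, not just anchors, so this assumes the rows contain no other angle brackets that should survive.

```lua
-- Shortened fragment of one row from the question (leading columns omitted).
local row = '0.00    <a href="/cgi-bin/recon_detail?rectent=AAUUM&benificary=ARRTC&period=2011/02&svctype=Voice">AAUUM_ARRTC_0211_TBT</a>    18.03'

-- %b<> matches a balanced pair: from a "<" up to its matching ">".
-- gsub returns the new string and the number of substitutions made.
local cleaned, count = row:gsub("%b<>", "")

print(cleaned)  --> 0.00    AAUUM_ARRTC_0211_TBT    18.03
print(count)    --> 2  (the opening <a ...> tag and the closing </a> tag)
```

Because `gsub` returns two values, `s = s:gsub(...)` keeps only the string; the count is silently dropped.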

Upvotes: 3

lhf

Reputation: 72312

Try s=s:gsub("<a.->(.-)</a>","%1").
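A small sketch of this pattern applied row by row, assuming the rows are already held in a Lua table. The helper name `strip_anchor` and the abbreviated sample rows are hypothetical, not part of the answer. One caveat: `<a` also matches the start of other tags beginning with "a" (such as `<abbr>`), which should not matter for data shaped like the question's.

```lua
-- Hypothetical helper; the name strip_anchor is not from the answer.
local function strip_anchor(s)
  -- "<a.->"  : the opening tag, matched lazily up to the first ">"
  -- "(.-)"   : capture the link text, as short as possible
  -- "</a>"   : the closing tag
  -- The outer parentheses discard gsub's second return value (the count).
  return (s:gsub("<a.->(.-)</a>", "%1"))
end

-- Abbreviated sample rows (assumed shape; columns and href shortened).
local rows = {
  '2011/02 ARRTC 0.00 <a href="/cgi-bin/recon_detail?rectent=AAUUM&svctype=Voice">AAUUM_ARRTC_0211_TBT</a> 18.03',
  '2011/02 ARRTC 0.00 no link here 18.03',
}

for i, row in ipairs(rows) do
  rows[i] = strip_anchor(row)
  print(rows[i])
end
```

Rows without an anchor pass through unchanged, and text outside the `<a>...</a>` element is left untouched.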

Upvotes: 4
