Liu Silong

Reputation: 5532

Problem using a regular expression in a shell script

I have a regex that removes the content of the activity tag: \s*<activity .*>(?:\s|\S)*<\/activity>. It works in Java, but it does not work when I use it in a shell script. This is how I wrote it in the shell:

sed 's+\s*<activity .*>(?:\s|\S)*<\/activity>++g' AndroidManifest.xml

AndroidManifest.xml

<?xml version="1.0" encoding="utf-8"?>
<!-- GENERATED BY UNITY. REMOVE THIS COMMENT TO PREVENT OVERWRITING WHEN EXPORTING AGAIN-->
<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.unity3d.player" xmlns:tools="http://schemas.android.com/tools">
  <application>
    <activity android:name="com.unity3d.player.UnityPlayerActivity" android:theme="@style/UnityThemeSelector" android:screenOrientation="fullSensor" android:launchMode="singleTask" android:configChanges="mcc|mnc|locale|touchscreen|keyboard|keyboardHidden|navigation|orientation|screenLayout|uiMode|screenSize|smallestScreenSize|fontScale|layoutDirection|density" android:hardwareAccelerated="false">
      <intent-filter>
        <action android:name="android.intent.action.MAIN" />
        <category android:name="android.intent.category.LAUNCHER" />
      </intent-filter>
      <meta-data android:name="unityplayer.UnityActivity" android:value="true" />
      <meta-data android:name="android.notch_support" android:value="true" />
    </activity>
    <meta-data android:name="unity.splash-mode" android:value="0" />
    <meta-data android:name="unity.splash-enable" android:value="True" />
    <meta-data android:name="notch.config" android:value="portrait|landscape" />
    <meta-data android:name="unity.build-id" android:value="07a923ed-bdbd-46ed-98bd-afef17a7904a" />
  </application>
  <uses-feature android:glEsVersion="0x00030000" />
  <uses-feature android:name="android.hardware.vulkan.version" android:required="false" />
  <uses-feature android:name="android.hardware.touchscreen" android:required="false" />
  <uses-feature android:name="android.hardware.touchscreen.multitouch" android:required="false" />
  <uses-feature android:name="android.hardware.touchscreen.multitouch.distinct" android:required="false" />
</manifest>

What should I do? Thanks.

Upvotes: 0

Views: 61

Answers (1)

tshiono

Reputation: 22032

Regular expression syntax falls roughly into three variants: BRE, ERE and PCRE, in increasing order of features and expressive power. Your regex is written in PCRE, while sed supports only BRE and ERE. Another problem is that sed processes the input line by line, so it takes a trick to make a sed regex match across lines.
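
To make both points concrete, here is a minimal sketch on made-up input (the `<b>` element and the variable names are illustrative only, not taken from the manifest). It assumes a sed that accepts -E. The PCRE-only group (?:\s|\S) is rewritten as a plain ERE group with POSIX character classes, and the same command is then run on one-line and two-line input:

```shell
# ERE has no (?:...) group; a plain group with POSIX classes matches the same text here.
ere='<b>([[:space:]]|[^[:space:]])*</b>'

# One-line input: the whole element is in a single pattern space, so it matches.
one_line=$(printf 'a<b>x y</b>c\n' | sed -E "s+$ere++")

# Two-line input: sed sees "a<b>x" and "y</b>c" separately, so nothing matches.
two_lines=$(printf 'a<b>x\ny</b>c\n' | sed -E "s+$ere++")

printf '%s\n' "$one_line"     # ac
printf '%s\n' "$two_lines"    # both lines come back unchanged
```

So translating the syntax to ERE is not enough on its own; the lines also have to be joined before the substitution runs.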

With sed please try the following:

sed -E '
:l              # define a label "l"
N               # append the next line of input into the pattern space
$!b l           # repeat until the last line
                # then whole lines are stored in the pattern space
s+[[:blank:]]*<activity .*>.*<\/activity>++g
                # perform the replace command over the pattern space
' AndroidManifest.xml
  • The -E option enables ERE.
  • The script first slurps the whole file into the pattern space, then performs the replacement on it.
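
For comparison, a different technique that avoids slurping is a sed address range. This is not the script above but a sketch of an alternative, and it rests on an assumption: every opening <activity ...> tag and every closing </activity> tag sits on its own line, as in the manifest shown in the question. A self-closing <activity ... /> on a single line would break that assumption. The sample XML below is shortened and made up:

```shell
xml='<application>
  <activity android:name="A">
    <intent-filter />
  </activity>
  <meta-data android:name="keep" />
  <activity android:name="B">
  </activity>
</application>'

# Delete every line from one matching "<activity " through the next line
# matching "</activity>"; lines outside such ranges are printed unchanged.
result=$(printf '%s\n' "$xml" | sed '/<activity /,/<\/activity>/d')
printf '%s\n' "$result"
```

Because each range ends at the first closing tag after it opens, the meta-data line between the two activity blocks is kept.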

If Perl is an option, you can use your regex as is (the -0777 option makes perl read the whole file at once):

perl -0777 -pe 's+\s*<activity .*>(?:\s|\S)*<\/activity>++g' AndroidManifest.xml

There is one caveat regarding the (?:\s|\S)* expression. The quantifier * is greedy and matches as much as possible. If the XML file contains multiple <activity> ... </activity> elements, everything from the first opening tag to the last closing tag is removed, including the lines in between that should be kept. It is better to make the quantifier non-greedy: (?:\s|\S)*?, or more commonly [\s\S]*?.

Upvotes: 2
