Sarah Bergquist
Sarah Bergquist

Reputation: 385

Hive: regexp_replace square brackets

I need to replace the square brackets around a key:value pair similar to the following. Any help is much appreciated!

data in 'properties' looks like this:

name: property1
value: [12345667:97764458]

**code**
SELECT p.name, regexp_replace(p.value,'[','') AS value
FROM properties p

Solved: Revised code

SELECT p.name, regexp_replace(p.value,'\\[|\\]','') AS value
FROM properties p;

Upvotes: 6

Views: 18430

Answers (3)

Faiz
Faiz

Reputation: 5453

If you are attempting this from Python virtual environment, triple the /// to escape [ or ]

spark.sql("""    
    select 
        REGEXP_REPLACE(
            'arn:scheduler:::job/3bf19fab[xyz]-“#date{dt($YYY_MM_DD)}”',
            '[^a-zA-Z0-9:\/\\\-_{}()\\\[\\\]#$%,.]+',
            '',
            1) 
""").show(truncate=False)
arn:scheduler:::job/3bf19fab[xyz]-#date{dt($YYY_MM_DD)} 

Upvotes: 0

iggy
iggy

Reputation: 722

You always need to double your backslashes in Hive regexes. That's because a single backslash is used as an escape character in Hive strings so it would get stripped off before the regex parser can see it. A double backslash becomes a single backslash, which is what you need.

To see how backslashes get stripped just run a select using your regex as a string literal:

select '\\[' from t limit 1;
OK
\[

Upvotes: 5

João Pinho
João Pinho

Reputation: 3775

Here is you regex [\[\]]+ this will match one or more [ and ] in a string.

Upvotes: 2

Related Questions