afropunk
afropunk

Reputation: 91

Regex last part of string not consistent pattern or length

I'm trying to write a regex style condition to only extract the last part of a long string. I'm having trouble getting it to work as the string format is not consistent. I've included an example below:

2:0000:PlaceOne|2:30000:PlaceTwo|187768:20003:PlaceThree|187904:20011:PlaceFour|2614991:20033:PlaceFive|1166533:60006:PlaceSix

In this example what I need the output to be is PlaceSix. In other examples it could be PlaceFive, PlaceSeven etc so the length is not always the same for the entire string or for what I'm extracting. The only consistent pattern is it will always be at the end of the string and comes after the last colon (all characters after that colon). I'm sure this must be possible with regex but so far regrettably have been unable to get this to work.

Also expanding on this logic I have a related question, if it is possible to do this via regex, can I also use the same logic to extract another part of the string if needed? So for instance for the same example above, if I wanted to extract PlaceTwo instead is that possible? Problem is there is no consistent pattern I can think of, not even being at the end of the string like with the previous PlaceSix example. It does come after a colon but as you can see there are multiple colons within the string. The numbers will also seem random corresponding to the places. Was thinking maybe something along the lines of after x number of colons extract text inbetween: and | although I'm not even sure if that's actually possible.

Main question is the first question though, if that's possible I'll consider this question answered. The second part is more of a bonus question if the first one is possible, being so similar didn't think it was worth posting two separate questions.

Hopefully I've explained this correctly, please let me know if any further clarifications are required. Many thanks.

Upvotes: 1

Views: 63

Answers (2)

David דודו Markovitz
David דודו Markovitz

Reputation: 44991

drop table t;
create table t (str varchar(1000));
insert into t (str) values ('2:0000:PlaceOne|2:30000:PlaceTwo|187768:20003:PlaceThree|187904:20011:PlaceFour|2614991:20033:PlaceFive|1166533:60006:PlaceSix
');

select  split_part(split_part(str,'|',1),':',3) as c1
       ,split_part(split_part(str,'|',2),':',3) as c2
       ,split_part(split_part(str,'|',3),':',3) as c3
       ,split_part(split_part(str,'|',4),':',3) as c4
       ,split_part(split_part(str,'|',5),':',3) as c5
       ,split_part(split_part(str,'|',6),':',3) as c6

from    t
;

+----------+----------+------------+-----------+-----------+----------+
| c1       | c2       | c3         | c4        | c5        | c6       |
+----------+----------+------------+-----------+-----------+----------+
| PlaceOne | PlaceTwo | PlaceThree | PlaceFour | PlaceFive | PlaceSix |
+----------+----------+------------+-----------+-----------+----------+

select  arr[1*3] as c1   
       ,arr[2*3] as c2
       ,arr[3*3] as c3
       ,arr[4*3] as c4
       ,arr[5*3] as c5
       ,arr[6*3] as c6

from   (select  regexp_split_to_array(str,'[|:]') as arr
        from    t
        ) t

+----------+----------+------------+-----------+-----------+----------+
| c1       | c2       | c3         | c4        | c5        | c6       |
+----------+----------+------------+-----------+-----------+----------+
| PlaceOne | PlaceTwo | PlaceThree | PlaceFour | PlaceFive | PlaceSix |
+----------+----------+------------+-----------+-----------+----------+

select  arr[1+1] as c1   
       ,arr[2+1] as c2
       ,arr[3+1] as c3
       ,arr[4+1] as c4
       ,arr[5+1] as c5
       ,arr[6+1] as c6 

from   (select  regexp_split_to_array('|'||str,'\|([^:]+:){2}') as arr
        from    t
        ) t

+----------+----------+------------+-----------+-----------+----------+
| c1       | c2       | c3         | c4        | c5        | c6       |
+----------+----------+------------+-----------+-----------+----------+
| PlaceOne | PlaceTwo | PlaceThree | PlaceFour | PlaceFive | PlaceSix |
+----------+----------+------------+-----------+-----------+----------+

select  arr[1] as c1 
       ,arr[2] as c2
       ,arr[3] as c3
       ,arr[4] as c4
       ,arr[5] as c5
       ,arr[6] as c6

from   (select  regexp_matches(str,'^.*?:([^:|]*)\|.*?:([^:|]*)\|.*?:([^:|]*)\|.*?:([^:|]*)\|.*?:([^:|]*)\|.*?:([^:|]*)$') as arr               
        from    t
        ) t

+----------+----------+------------+-----------+-----------+----------+
| c1       | c2       | c3         | c4        | c5        | c6       |
+----------+----------+------------+-----------+-----------+----------+
| PlaceOne | PlaceTwo | PlaceThree | PlaceFour | PlaceFive | PlaceSix |
+----------+----------+------------+-----------+-----------+----------+

Upvotes: 1

maraaaaaaaa
maraaaaaaaa

Reputation: 8193

How's this

:([^:]*)$

You can test it here

And to answer your other question, i would say that as long as you know what it is you are looking for, there should be a way to find it.

To give an example, if i want to get the content after the fourth : every single time, its as simple as:

^(?:[^:]*:){4}([^:]*)

And to change the position, all you have to do is change the 4 to whatever you want

Test it here

Upvotes: 1

Related Questions