rshdzrt

Reputation: 135

How to get strings shorter than 4 characters using sed, awk, or grep

I'm trying to get strings of fewer than 4 characters (0 to 3), which might include some special characters too.

The issue is that I'm not sure which special characters are involved.

The data can contain names of any length, with special characters that I can't enumerate.

Sample input data is below:

r@nger
d!nger
'iterr
4#e
c#nuidig
c@niting
c^neres
sample

The sample output should look like this:

r@n
d!n
'it
4#e
c#n
c@n
c^n
sam

I have tried the commands below. They run, but they have flaws: besides the strings of 0 to 3 characters, I also get 1-character outputs, which is incorrect.

For example, I get just C, which does not appear by itself in the input.

grep -iE '^[a-z0-9\.-+?$_,@]{0,3}$'

sed -n '/^.\{0,3\}$/p'

grep uid: file.csv | awk '{print $2}' | sed -En 's/^([^[:space:]]{3}).*/\1/p' | sort -f > output

Sample Output from above

I suspect there is some special character after the first character that breaks the match, so only the first character is printed.

Can someone please suggest how to get this working as expected?
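One way to check the hidden-character suspicion is to print each line with non-printing characters made visible. This is only a sketch: the helper name show_hidden is made up for illustration, and it relies on the POSIX sed `l` command, which writes tabs, carriage returns and similar characters as backslash escapes and marks each line end with `$`.

```shell
# Hypothetical helper: print lines with non-printing characters shown
# as backslash escapes; the end of every line is marked with "$".
show_hidden() {
    sed -n 'l' "$@"
}

# Example: a line hiding a tab after the first character and a
# carriage return before the newline.
printf 'c\tab\r\n' | show_hidden
# prints: c\tab\r$
```

Running show_hidden file.csv on the real data should reveal whether a tab, carriage return, or other control character follows the first character.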

Thanks,

Upvotes: 0

Views: 164

Answers (4)

stevesliva

Reputation: 5665

sed 's/.//4g' file

This deletes every character from the 4th onward. It relies on GNU sed, whose manual says:

Note: the POSIX standard does not specify what should happen when you mix the g and number modifiers, and currently there is no widely agreed upon meaning across sed implementations. For GNU sed, the interaction is defined to be: ignore matches before the numberth, and then match and replace all matches from the numberth on.

Alternatively: grep -o '^...' file (this prints only lines that have at least 3 characters)
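Since the g-plus-number combination is not specified by POSIX, a version that avoids it may be safer on non-GNU systems. The sketch below is one such alternative, not part of the original answer; the function name first3 is hypothetical. It captures up to three leading characters and discards the rest of the line.

```shell
# Hypothetical helper: keep at most the first 3 characters of each line,
# using only POSIX basic regular expression syntax.
first3() {
    sed 's/^\(.\{0,3\}\).*$/\1/' "$@"
}

printf 'r@nger\n4#e\nab\n' | first3
# prints:
# r@n
# 4#e
# ab
```

Lines shorter than three characters pass through unchanged, because the captured group may match fewer than three characters.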

Upvotes: 0

Ed Morton

Reputation: 203995

Getting the output you posted from the input you posted takes just:

$ cut -c1-3 file
r@n
d!n
'it
4#e
c#n
c@n
c^n
sam

If that's not all you need then edit your question to more clearly state your requirements and provide more truly representative sample input/output including cases where this doesn't work.

Upvotes: 2

j_b

Reputation: 2020

Using awk:

awk '{print (length($0)<=3 ? $0 : substr($0,1,3))}' src.dat

Output:

r@n
d!n
'it
4#e
c#n
c@n
c^n
sam
1
11
-1
.

Contents of src.dat:

r@nger
d!nger
'iterr
4#e
c#nuidig
c@niting
c^neres
sample
1
11
-1
.

Upvotes: 1

anubhava

Reputation: 785531

You may use this grep with -o and -E options:

grep -oE '^[^[:blank:]]{1,3}' file

r@n
d!n
'it
4#e
c#n
c@n
c^n
sam

Regex ^[^[:blank:]]{1,3} matches and outputs 1 to 3 non-blank characters (anything other than a space or tab) from the start of each line.
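To see how this behaves on lines that are empty or contain a space, here is a small sketch run on made-up input that is not from the question. The function name first3_nonblank is hypothetical, and the expected output assumes GNU grep.

```shell
# Hypothetical helper wrapping the grep command from this answer.
first3_nonblank() {
    grep -oE '^[^[:blank:]]{1,3}' "$@"
}

# Input: a long word, a line with a space, an empty line, a 3-char string.
printf 'sample\nab cd\n\n4#e\n' | first3_nonblank
# prints:
# sam
# ab
# 4#e
```

The match stops at the first space or tab, and empty lines produce no output at all, which differs from cut -c1-3 where an empty line is still printed.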

Upvotes: 1
