user4221591

Reputation: 2150

Regular expression to get only the first word from each line

I have a text file:

@sp_id      int,    
@sp_name                varchar(120),
@sp_gender              varchar(10),
@sp_date_of_birth       varchar(10),
@sp_address             varchar(120),
@sp_is_active           int, 
@sp_role            int

I want to get only the first word from each line. How can I do this? The words may be separated by spaces, tabs, or other whitespace.

Upvotes: 0

Views: 17498

Answers (4)

Joshua Okonkwo

Reputation: 41

I did something similar with this:

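import re

with open('handles.txt', 'r') as handles:
    handlelist = [line.rstrip('\n') for line in handles]
    # Raw string for the pattern; lines without any word character are skipped
    newlist = [re.findall(r"\w+", line)[0] for line in handlelist if re.search(r"\w", line)]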

This reads all the lines of the file into a list, then uses a regex to extract the first word from each line (ignoring whitespace).

My file (handles.txt) contained info like this:

JoIyke - personal twitter link;

newMan - another twitter handle;

yourlink - yet another one.

The code produces this list: ['JoIyke', 'newMan', 'yourlink']

Upvotes: 2

Judson Cruz

Reputation: 11

Find the first word of each line with /^\w+/gm.
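As a rough Python sketch of this pattern (the sample text is made up for illustration), the g and m flags correspond to re.findall with re.MULTILINE. Note that \w does not match @, so a line starting with @, as in the question, yields no match with ^\w+; ^\S+ is shown alongside for comparison.

```python
import re

# Hypothetical sample: two plain lines plus one line in the question's "@name" style.
text = "JoIyke - personal twitter link;\nnewMan - another twitter handle;\n@sp_id      int,"

# re.MULTILINE makes ^ match at the start of every line, like the m flag.
word_chars = re.findall(r"^\w+", text, flags=re.MULTILINE)
non_space = re.findall(r"^\S+", text, flags=re.MULTILINE)

print(word_chars)  # ['JoIyke', 'newMan']
print(non_space)   # ['JoIyke', 'newMan', '@sp_id']
```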

Upvotes: 0

vks

Reputation: 67968

Find What: ^(\S+).*$

Replace with: \1

You can simply use this to get the first word. Here we capture the first word in a group and replace the whole line with the captured group.
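The same find-and-replace can be tried outside an editor. This is a minimal sketch in Python, assuming re.sub with re.MULTILINE behaves like the editor's line-by-line replace; the sample text is a shortened copy of the question's file.

```python
import re

# Three lines taken from the question, separated by spaces or tabs.
text = "@sp_id      int,    \n@sp_name\t\tvarchar(120),\n@sp_role            int"

# Group 1 captures the first run of non-whitespace; .*$ consumes the rest of the line.
result = re.sub(r"^(\S+).*$", r"\1", text, flags=re.MULTILINE)

print(result)
# @sp_id
# @sp_name
# @sp_role
```

Lines that contain no non-whitespace characters do not match and are left unchanged.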

Upvotes: 1

Wiktor Stribiżew

Reputation: 626845

Here is what I suggest:

Find what: ^([^ \t]+).*

Replace with: $1

Explanation: ^ matches the start of a line, ([^ \t]+) captures one or more (due to +) characters other than space and tab (due to [^ \t]), and .* then matches any remaining characters up to the end of the line.


If the lines may have leading whitespace, you might want to use

^\s*([^ \t]+).*
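To see how the leading-whitespace variant behaves, here is a small Python sketch (the sample text is invented, and Python writes the backreference as \1 rather than $1). One caveat when porting the pattern: [^ \t] also matches a newline, so on a line that holds a single word the group could run into the next line; \S+ avoids that.

```python
import re

# Made-up sample with leading spaces and a leading tab; every line has at least two words.
text = "  @sp_id      int,\n\t@sp_name\tvarchar(120),\n@sp_role int"

# \s* skips the indentation, group 1 keeps the first word, .* drops the rest of the line.
first_words = re.sub(r"^\s*([^ \t]+).*", r"\1", text, flags=re.MULTILINE)

print(first_words)
# @sp_id
# @sp_name
# @sp_role
```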

Upvotes: 7
