tempid

Reputation: 8198

regex - extract strings at specific positions

I have a huge fixed-width string that looks something like below:

B100000DA3F19C                                     Android                                                                                              600             AND                                                2011-08-29 15:03:21.537
352a0D21ffd800000a3a95911801700e                   iPad                                                                                                 600             iOS                                                2011-08-29 19:35:12.753
.
.
.

I need to extract the first part (id) and the fourth part (device type - "AND" or "iOS"). The first column starts at 0 and ends at the 51st position for all lines. The fourth part starts at 168 and ends at 171 for all lines. Each line is 244 characters long. If this is complicated, the other option is to delete everything in this file except the id and device type. This single file has around 800K records and is about 180 MB, but Notepad++ seems to be handling it okay.

I tried the SQL Server data import, but even though the preview looks fine, the data that gets inserted into the table is not accurate.

I have the following so far which gives me the first 51 characters -

^(.{51}).*

It would be great if I could have one regex that keeps the id and device type and deletes the rest.

Upvotes: 0

Views: 64

Answers (1)

axic

Reputation: 161

Well, if you are certain it is always at that position, a very simple way is this:

^(.{51}).{117}(.{3})

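The parentheses are the capture groups (the results you get out), while the curly braces are the repetition counts. The uncaptured .{117} skips the 117 characters between the end of the id at position 51 and the start of the device type at position 168.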

EDIT: Use the following to explicitly discard the rest of the line:

^(.{51}).{117}(.{3}).*$
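
If you end up scripting this instead of doing it in the editor, here is a rough Python sketch of the same pattern. The function names and the `build_sample` helper are made up for illustration, and the sample line only reproduces the two offsets from the question (id at 0, device type at 168, total length 244), not the real column layout.

```python
import re

# Same pattern as above: 51-char id, skip 117 chars, 3-char device type,
# then match (and ignore) the rest of the line.
PATTERN = re.compile(r"^(.{51}).{117}(.{3}).*$")


def extract(line):
    """Return (id, device_type) for one line, or None if the line is too short."""
    match = PATTERN.match(line)
    if match is None:
        return None
    # The id column is space-padded to 51 characters, so trim the padding.
    return match.group(1).rstrip(), match.group(2)


def keep_id_and_type(line):
    """Replace the whole line with just the two captured columns."""
    return PATTERN.sub(r"\1\2", line)


def build_sample(record_id, device, code):
    # Hypothetical helper: pads the fields so that the id occupies 0-50 and
    # the device type starts at 168. The middle columns are an assumption.
    return (record_id.ljust(51) + device.ljust(117)
            + code.ljust(53) + "2011-08-29 15:03:21.537")


sample = build_sample("B100000DA3F19C", "Android", "AND")
print(extract(sample))
print(repr(keep_id_and_type(sample)))
```

Since the positions are fixed, plain slicing (`line[:51]` and `line[168:171]`) gives the same two fields without a regex, which may be simpler for a file of this size.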

Upvotes: 3
