fangio

Reputation: 1796

vim filter/delete a line

We have a tyfoons.txt file that contains several attributes of real typhoons from the past. One line from the file looks like this:

Name - identification number - wind - duration - beginning date - ending date
----------------------------------------------------------------------------------
Mary ( North ) − 1977.21 − 945 − 12 Days 18 Hours − 1977−12−21 00:00 − 1978−01−02 18:00

Now I have to filter out the typhoons that span 2 different years, like the example above, which spans 1977 and 1978. We assume that a typhoon never lasts longer than 30 days.

How can I filter those lines out of the file?

Upvotes: 0

Views: 306

Answers (1)

jamessan

Reputation: 42757

A naive solution (just checking whether a year is crossed) could be achieved using the :g command.

E.g., to delete any line which doesn't have the same year for start and end:

:g!/\(\d\{4}\)−\d\d−\d\d \d\d:\d\d − \1−\d\d−\d\d \d\d:\d\d$/d

The key part is the back-reference: the year captured from the first date by \(\d\{4}\) is reused as the year of the second date via \1. :g!/{pat}/{cmd} executes {cmd} on every line that doesn't match {pat}, so when the years differ, the :d command is run and deletes that line.
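Before deleting anything, the pattern can be tried out without changing the buffer. The commands below are a sketch rather than part of the original answer. They assume the file uses the same "−" character and spacing as the sample line, and that the header and separator rows shown in the question really are lines 1 and 2 of the file (an assumption):

```vim
" :g with no command prints the selected lines, so this lists every
" line that the :g! ... d command would delete.
:g!/\(\d\{4}\)−\d\d−\d\d \d\d:\d\d − \1−\d\d−\d\d \d\d:\d\d$/

" :v is shorthand for :g!. Restricting the range to 3,$ leaves lines 1-2
" alone (assumed to be the header and separator rows, which contain no
" dates and would otherwise be deleted as non-matching lines).
:3,$v/\(\d\{4}\)−\d\d−\d\d \d\d:\d\d − \1−\d\d−\d\d \d\d:\d\d$/d
```

If the result is not what was expected, a single u undoes the whole :g or :v command.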

Performing more complex checks (within a 30 day window) would probably be better achieved by writing a function that pulls apart the relevant data from the line and does the necessary calculations. A basic skeleton would be:

function DeleteInvalidDateRanges()
    " matchlist() returns a list of the entire string that matched and all matched groups
    " so slicing the list to extract items 1-4 gives a list of [year, month, date, time]
    let dateTime1 = matchlist(getline('.'), '\(\d\{4}\)−\(\d\d\)−\(\d\d\) \(\d\d:\d\d\)', 0, 1)[1:4]
    let dateTime2 = matchlist(getline('.'), '\(\d\{4}\)−\(\d\d\)−\(\d\d\) \(\d\d:\d\d\)', 0, 2)[1:4]
    if MoreThan30Days(dateTime1, dateTime2)
        delete
    endif
endfunction
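" Run it via :g rather than :%call. :g marks the lines first, so deleting a line
" inside the function does not cause the following line to be skipped.
:g/^/call DeleteInvalidDateRanges()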
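The skeleton leaves MoreThan30Days() undefined. One possible way to fill it in is sketched below. It assumes the arguments are the [year, month, day, 'HH:MM'] string lists produced by the matchlist() slices above; the helper name DayNumber() is made up for this illustration. Dates are converted to a Julian Day Number with plain integer arithmetic, and the day difference is taken before converting to minutes so the numbers stay small.

```vim
" Hypothetical helper: Julian Day Number of a Gregorian calendar date.
" Only differences between two results are used below.
function! DayNumber(year, month, day)
    " shift is 1 for January and February, 0 otherwise
    let shift = (14 - a:month) / 12
    let y = a:year + 4800 - shift
    let m = a:month + 12 * shift - 3
    return a:day + (153 * m + 2) / 5 + 365 * y + y / 4 - y / 100 + y / 400 - 32045
endfunction

" Returns 1 if dateTime2 is more than 30 days after dateTime1, else 0.
function! MoreThan30Days(dateTime1, dateTime2)
    " Lines without two timestamps give empty lists; never flag those.
    if len(a:dateTime1) < 4 || len(a:dateTime2) < 4
        return 0
    endif
    " str2nr() parses in base 10, so '08' and '09' are read correctly.
    let [y1, m1, d1] = map(a:dateTime1[0:2], 'str2nr(v:val)')
    let [y2, m2, d2] = map(a:dateTime2[0:2], 'str2nr(v:val)')
    let [h1, min1] = map(split(a:dateTime1[3], ':'), 'str2nr(v:val)')
    let [h2, min2] = map(split(a:dateTime2[3], ':'), 'str2nr(v:val)')
    let days = DayNumber(y2, m2, d2) - DayNumber(y1, m1, d1)
    let minutes = days * 1440 + (h2 * 60 + min2) - (h1 * 60 + min1)
    return minutes > 30 * 1440
endfunction
```

For the sample line, :echo MoreThan30Days(['1977', '12', '21', '00:00'], ['1978', '01', '02', '18:00']) prints 0, since the two timestamps are 12 days and 18 hours apart.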

For more information on scripting in Vim, see the relevant part of the user manual (:help usr_41.txt).


To simply delete any line with a start month of 12 and an end month of 01, only a slight change to the initial "different years" example is needed:

:g/\d\{4}−12−\d\d \d\d:\d\d − \d\{4}−01−\d\d \d\d:\d\d$/d

Upvotes: 1
