LPH

Reputation: 163

Why is a line selected twice by a Select-String command?

This script,

$lt="_cdbc ","_dié","_diéq ","_cost ","_coste "
foreach($x in $lt){sls -path GA0.txt -pattern $x -CaseSensitive >> GA1.txt}

run on the following content of file GA0.txt (encoding label in status bar: utf-8),

1 _cdbc \> contenu  diaphane\\
2 _dié \> cable deux\\
3 _diéq \> vingt \\
4 _cost \> pin parasol\\
5 _coste \> thyme\\

yields the following result in GA1.txt (resulting encoding label in status bar: utf-16 LE, whatever the initial label, utf-8 or utf-16 LE),

1                                        <empty line>
2  GA0.txt:1:_cdbc \> contenu  diaphane\\
3                                        <empty line>
4                                        <empty line>
5                                        <empty line>
6  GA0.txt:2:_dié \> cable deux\\
7  GA0.txt:3:_diéq \> vingt \\
8                                        <empty line>
9                                        <empty line>
10                                       <empty line>
11 GA0.txt:3:_diéq \> vingt \\
12                                       <empty line>
13                                       <empty line>
14                                       <empty line>
15 GA0.txt:4:_cost \> pin parasol\\
16                                       <empty line>
17                                       <empty line>
18                                       <empty line>
19 GA0.txt:5:_coste \> thyme\\

1/ It seems likely, though not certain, that line 3 of GA0.txt is listed twice (on lines 7 and 11 of GA1.txt) because of the accented letter (é): for instance, there is no such duplication for the patterns "_cost " and "_coste ". I tried adding the parameter -Encoding utf8 and that made no difference. Would someone know how to fix this?
2/ What is the reason for the first empty line and the series of 3 empty lines between the result lines in GA1.txt, except before the problematic line? How can the code be changed so as to have a listing without any empty lines?

Upvotes: 1

Views: 81

Answers (1)

js2010

Reputation: 27516

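>> or | out-file -append sends the match objects through the formatter (format-custom), which adds the extra empty lines. Try | add-content instead. The accent is not the cause of the duplicate: "_dié" doesn't have the space after it, so it matches two lines ("_dié " and "_diéq "). Since "_cost " has the space at the end, it only matches one line.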
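As a rough sketch of the add-content suggestion, the loop from the question could write plain strings instead of formatted match objects. The scratch folder name sls-demo and the explicit "file:line:text" string built with -f are assumptions made for illustration, not part of the original script; -Encoding utf8 is also an assumption about the wanted output encoding.

```powershell
# Hypothetical scratch folder so the sketch does not touch real files.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'sls-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$src = Join-Path $dir 'GA0.txt'
$dst = Join-Path $dir 'GA1.txt'

# Sample content copied from the question.
@(
    '_cdbc \> contenu  diaphane\\'
    '_dié \> cable deux\\'
    '_diéq \> vingt \\'
    '_cost \> pin parasol\\'
    '_coste \> thyme\\'
) | Set-Content -Path $src -Encoding utf8

# Start from an empty result file on every run.
Remove-Item -Path $dst -ErrorAction SilentlyContinue

# Same patterns as in the question ("_dié" still without a trailing space).
$lt = "_cdbc ", "_dié", "_diéq ", "_cost ", "_coste "
foreach ($x in $lt) {
    Select-String -Path $src -Pattern $x -CaseSensitive |
        # Turn each match into a plain string, so no formatter is involved.
        ForEach-Object { '{0}:{1}:{2}' -f $_.Filename, $_.LineNumber, $_.Line } |
        Add-Content -Path $dst -Encoding utf8
}

Get-Content -Path $dst -Encoding utf8
```

This should remove the empty lines only. Line 3 is still listed twice, because "_dié" still matches the "_diéq" line.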

You can also do all the patterns at once:

select-string $lt GA0.txt

For a whole word match, you need a little extra regex: Select-String -pattern '\bwholeword\b'
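One way to try the whole-word idea without touching any file is to pipe sample strings into Select-String and compare a plain pattern with one that ends in the regex word boundary \b. The sample lines are copied from the question; the variable names $lines, $loose and $strict are made up for this sketch, and it assumes the script is read with the right encoding so that é is seen as a letter.

```powershell
# Sample lines copied from GA0.txt in the question.
$lines = @(
    '_cdbc \> contenu  diaphane\\'
    '_dié \> cable deux\\'
    '_diéq \> vingt \\'
    '_cost \> pin parasol\\'
    '_coste \> thyme\\'
)

# Plain pattern: '_dié' is also found inside '_diéq'.
$loose = @($lines | Select-String -Pattern '_dié' -CaseSensitive)

# \b after the word: the match has to end at a word boundary,
# so a following letter such as 'q' rules the line out.
$strict = @($lines | Select-String -Pattern '_dié\b' -CaseSensitive)

"loose : $($loose.Count) match(es)"
"strict: $($strict.Count) match(es)"
$strict | ForEach-Object { $_.Line }
```

With the boundary in place the pattern no longer needs a trailing space to tell "_dié" from "_diéq".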

Upvotes: 2
