Reputation: 23
I need to extract ~5000 lines from a file with ~300,000 lines on bash (OSX). Running
sed '128082p;128083p;...(4996 numbers)....;159845q;d' file > output
gives the error
sed: 1: "128082p;128083p;128084p ...": command expected
This same command works if I try to extract 10 lines only. Whereas running
for i in `cat line_file`; do sed -n "$ip" file; done >> output
creates a file that's more than ~5000 lines long. What's the right command in either case?
Edit: this is not a range of numbers.
Upvotes: 2
Views: 356
Reputation: 437833
Tip of the hat to Jonathan Leffler for his help.
It looks like BSD sed as used on macOS (as of macOS 10.12.1) has a hard limit on the size of each line of a script that can be passed to it: 2048 bytes.
When passed as a command-line argument (implicitly as the first operand, or explicitly via -e options), scripts are typically passed as a single line, as you did.
If that single line gets too long, it is regrettably blindly cut off, typically resulting in a seemingly random syntax error, like the one you saw.
There are two workarounds:
Make sure that your script contains only short-enough lines by separating commands with \n (newlines) instead of ; and/or split your script across multiple -e options (which is cumbersome).
Provide the entire script via a file, using the -f option, in which case all commands must be separated with \n rather than ; anyway.
In the unlikely event that your script is too long to fit on a single command line (a limit imposed by the system - see bottom), using -f is your only option.
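As a rough illustration of the multiple -e variant mentioned above, the following sketch builds one -e option per line number. It assumes line_file holds one line number per line; the scratch directory and the sample contents of file and line_file are made up for the demonstration.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data: a 20-line input file and three wanted line numbers.
seq 1 20 | sed 's/^/line /' > file
printf '%s\n' 3 7 12 > line_file

# Collect one "-e Np" pair per line number in the positional parameters.
set --
while IFS= read -r n; do
  set -- "$@" -e "${n}p"
done < line_file

# With the sample data this runs: sed -n -e 3p -e 7p -e 12p file
sed -n "$@" file > output
cat output
```

Each -e option contributes its own short script line, so no single line grows with the number of commands. With thousands of line numbers the argument list gets long, which is why this variant is the cumbersome one.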
Here's an example of a command-line script that is too long:
$ sed -n "$(printf '%sp;' {1..432})" <<<'line 1'
sed: 1: "1p;2p;3p;4p;5p;6p;7p;8p ...": command expected # !! ERROR
Even though the script is syntactically correct, cutting its one and only line off at 2048 bytes leaves it incorrect, resulting in the seemingly random command expected error.
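To see how close a script is to that limit, one can measure its longest line. This is only a sketch: the 2048 figure is the limit described above for BSD sed, and the variable names script and longest are made up here.

```shell
# Rebuild the one-line script from the example: "1p;2p;...;432p;"
script=$(seq 1 432 | sed 's/$/p;/' | tr -d '\n')

# Length of the longest line in the script (characters equal bytes here: ASCII only).
longest=$(printf '%s\n' "$script" |
  awk '{ n = length($0); if (n > max) max = n } END { print max + 0 }')

echo "longest script line: $longest bytes"   # prints: longest script line: 2052 bytes
if [ "$longest" -gt 2048 ]; then
  echo "longer than the 2048-byte per-line limit described above"
fi
```

The 432 commands add up to 2052 bytes on one line, just over the limit, which is why the example fails while a slightly shorter one would not.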
In this case, working around the limitation is simple: by replacing ; with \n, the individual lines become short enough:
$ sed -n "$(printf '%sp\n' {1..432})" <<<'line 1'
line 1 # OK
Since you already have a file of line numbers - line_file - you can use an auxiliary sed command to create your \n-separated script from it:
$ sed -n "$(sed 's/$/p/' line_file)" file > output
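The command in the question ended with a q command so that sed stops reading once the last wanted line has been seen. A minimal sketch of combining that with the \n-separated script, assuming line_file holds one line number per line; the sample data and the variable names last and script are illustrative only.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data: a 20-line input file and three wanted line numbers.
seq 1 20 | sed 's/^/line /' > file
printf '%s\n' 3 7 12 > line_file

# Highest wanted line number; sed can quit once it has been processed.
last=$(sort -n line_file | tail -n 1)

# One "Np" command per line, followed by "<last>q" on a line of its own.
script=$(sed 's/$/p/' line_file)
sed -n "$script
${last}q" file > output
cat output
```

The q command comes after the p command for the same line, so the last wanted line is still printed before sed exits. For a 300,000-line input whose wanted lines end around the middle, this avoids reading the rest of the file.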
Here's how to solve the problem via a script file passed with -f, in which the commands are \n-separated:
$ printf '%sp\n' {1..432} > script.sed # Create script file with \n-separated commands.
$ sed -n -f "script.sed" <<<'line 1' # Pass script file via -f
line 1 # OK
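Applied to the situation in the question, the script file can be generated from line_file directly. The following is a sketch under the assumption that line_file holds one line number per line; the sample data is made up, and script.sed is just the file name used in the example above.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data: a 20-line input file and three wanted line numbers.
seq 1 20 | sed 's/^/line /' > file
printf '%s\n' 3 7 12 > line_file

# Turn each line number N into the command "Np", one command per line.
sed 's/$/p/' line_file > script.sed

# Run the generated script file against the input.
sed -n -f script.sed file > output
cat output
```

Because the script lives in a regular file, its total size does not count toward the command-line length limit.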
Note: Using a process substitution (sed -n -f <(printf ...) ...) as an ad-hoc script file inexplicably does not work.
Also note that the overall max. length of a command line for invoking an external utility such as sed on macOS (as of 10.12) is 262144 bytes (256 KB; determined with getconf ARG_MAX), and in practice the limit is lower, because the size of the environment-variable block counts toward it as well.
If you were to hit that limit, however, you'd get a more helpful error message: Argument list too long.
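As a rough check, the size of a generated script can be compared against the value reported by getconf ARG_MAX. This sketch uses a made-up script of 5000 print commands; the variable names are illustrative, and the comparison ignores the environment block and the other arguments, so it is only an upper-bound estimate.

```shell
# System limit on the combined size of arguments and environment, in bytes.
arg_max=$(getconf ARG_MAX)

# Hypothetical script: 5000 print commands, one per line.
script=$(seq 1 5000 | sed 's/$/p/')

# Size of the script in bytes (tr strips the padding some wc versions add).
script_bytes=$(printf '%s' "$script" | wc -c | tr -d ' ')

echo "ARG_MAX:     $arg_max bytes"
echo "script size: $script_bytes bytes"
if [ "$script_bytes" -lt "$arg_max" ]; then
  echo "script is below ARG_MAX (the environment also counts toward the limit)"
else
  echo "script alone exceeds ARG_MAX; pass it with -f from a script file instead"
fi
```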
Upvotes: 3