Reputation: 1144
I have a tab-separated file myfile.tsv of the form:
"a" "b"
"c" "d"
How can I process it with xargs (i.e. which options do I need to pass) so that each line is split on tab and its fields are passed as two arguments to an external program, with all characters preserved, including the double quotes? I don't want to mess up escaping, which is crucial if the values in the TSV are actually JSON-formatted.
E.g. I'd like to have cat myfile.tsv | xargs ... python -c 'import sys;print(sys.argv[1:])'
to print
['"a"', '"b"']
['"c"', '"d"']
Thanks!
Upvotes: -1
Views: 98
Reputation: 11
If you're not sure that there are exactly 2 fields per line, you have to read the file line by line and then split each line individually.
To demonstrate this, I added a line to the TSV file that includes a backslash, since some tools (such as read without -r) treat backslashes as escape characters:
$ cat myfile.tsv
"a" "b"
"c" "d"
"e" "f" "g\t"
To split on tab, you have to set the xargs delimiter to a tab character with -d (a GNU xargs option). We can produce the tab with $(printf \\t):
$ cat myfile.tsv | while read -r line ; do echo -n "$line" | xargs -d "$(printf \\t)" python -c 'import sys;print(sys.argv[1:])' ; done
['"a"', '"b"']
['"c"', '"d"']
['"e"', '"f"', '"g\\t"']
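To see why the explicit delimiter matters, here is a small Python sketch (Python only because the question already uses it as the receiving program). shlex.split stands in for shell-like, quote-aware tokenising; that is a rough analogy of my own and not a reproduction of xargs' exact default rules. The variable names are made up for the illustration.

```python
import shlex

# One row of the example file; "\t" below is a real tab character.
row = '"a"\t"b"'

# Shell-like tokenising: the double quotes are interpreted and removed.
shell_like = shlex.split(row)

# Splitting on the tab only: every other character is left untouched.
tab_only = row.split("\t")

print(shell_like)  # ['a', 'b']
print(tab_only)    # ['"a"', '"b"']
```

Only the second form keeps the quotes, which is the behaviour the question asks for.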
However, read can also do the field splitting itself: set the IFS variable (input field separator) to a tab, read into an array variable with -a (a bash feature), and pass the array to the next command (python):
$ cat myfile.tsv | while IFS="$(printf \\t)" read -ra line ; do python -c 'import sys;print(sys.argv[1:])' "${line[@]}" ; done
['"a"', '"b"']
['"c"', '"d"']
['"e"', '"f"', '"g\\t"']
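If bash arrays are not available, the same per-row idea can be sketched in Python: split each line on tab and hand the fields to the external program as an argument list, without going through a shell. The helper name run_per_row and the PRINT_ARGS command are hypothetical and only serve the illustration.

```python
import subprocess
import sys


def run_per_row(lines, command):
    """Run `command` once per row, passing the tab-separated fields as argv."""
    outputs = []
    for line in lines:
        # Remove only the line terminator, then split on tab.
        fields = line.rstrip("\n").split("\t")
        # A list argument means no shell is involved, so quotes and
        # backslashes reach the child process unchanged.
        result = subprocess.run(
            command + fields, capture_output=True, text=True, check=True
        )
        outputs.append(result.stdout)
    return outputs


# The same test program as in the question, run with the current interpreter.
PRINT_ARGS = [sys.executable, "-c", "import sys;print(sys.argv[1:])"]

rows = ['"a"\t"b"\n', '"e"\t"f"\t"g\\t"\n']
for out in run_per_row(rows, PRINT_ARGS):
    print(out, end="")
```

One difference to keep in mind: str.split("\t") keeps empty fields, whereas read with IFS set to a tab treats consecutive tabs as a single separator, so empty fields disappear in the shell loop.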
Upvotes: 0
Reputation: 69388
You should use NUL characters as delimiters for xargs:
tr '\t\n' '\0\0' < myfile.tsv | xargs -0 -n 2 python -c 'import sys;print(sys.argv[1:])'
['"a"', '"b"']
['"c"', '"d"']
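What this pipeline does to the data can be modelled in a few lines of Python. This is only a sketch of the idea (turn every tab and newline into NUL, then take the items in groups of two), not of how xargs is implemented; the function names are invented for the example.

```python
def nul_delimited(text):
    """Mimic the tr step: replace every tab and newline with a NUL character."""
    return text.translate(str.maketrans("\t\n", "\0\0"))


def group_args(stream, n):
    """Split a NUL-terminated stream into items and group them n at a time."""
    items = stream.split("\0")
    if items and items[-1] == "":
        items.pop()  # the final NUL terminates the last item, it does not start a new one
    return [items[i:i + n] for i in range(0, len(items), n)]


two_columns = '"a"\t"b"\n"c"\t"d"\n'
print(group_args(nul_delimited(two_columns), 2))

# A row with three fields shifts the grouping for everything after it.
ragged = '"a"\t"b"\n"e"\t"f"\t"g"\n'
print(group_args(nul_delimited(ragged), 2))
```

The second call shows the limitation: once row boundaries are turned into the same delimiter as field boundaries, the approach relies on every row having exactly two fields.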
Upvotes: 0
Reputation: 26220
A bit hacky, but this works:
$ cat myfile.tsv | xargs -n2 -I{} bash -c 'sed -e s/^/\"/ -e s/$/\"/ -e s/\ /\"\ \"/ <<<"{}"' | python -c 'import fileinput
for line in fileinput.input():
    print(line.split())
'
['"a"', '"b"']
['"c"', '"d"']
Upvotes: 0
Reputation: 922
I'm not sure why you want or need xargs. Here is a trivial example that processes the file directly in Python instead:
cat parseMe.py
#!/usr/bin/env python
import sys
for row in sys.stdin:
    # Drop only the line terminator, then split on tab; quotes stay untouched
    lst = row.rstrip('\n').split('\t')
    print(lst)
cat myfile.tsv | ./parseMe.py
['"a"', '"b"']
['"c"', '"d"']
['"e"', '"f"']
Feel free to ignore this if it's not suitable or if I've missed the reason you need xargs.
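If you do stay in Python, the standard csv module may be a slightly more robust variant of the same idea. A minimal sketch, assuming the fields never contain tabs or newlines themselves: with quoting=csv.QUOTE_NONE the reader treats double quotes as ordinary characters, so JSON-style values come through unchanged. The helper name read_tsv_rows is hypothetical.

```python
import csv
import io


def read_tsv_rows(stream):
    """Return each row as a list of fields, leaving quote characters untouched."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader]


# io.StringIO stands in for the opened file or sys.stdin.
sample = io.StringIO('"a"\t"b"\n"c"\t"d"\n')
for row in read_tsv_rows(sample):
    print(row)
```

When reading a real file, open it with newline='' as the csv documentation recommends.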
Upvotes: 0