WestCoastProjects

Reputation: 63192

Sort command is seeing data stream out of sequence with respect to other shell commands?

Sort is exhibiting unexpected behavior by resurrecting fields already removed by a "cut" command:

Consider the following bash pipeline:

history | cut -d' ' -f3- | grep mvn

Here is an excerpt of the output:

 mvn  -pl sql/hbase -DMAVEN_OPTS="XX:MaxPermSize=394m -Xmx1500m" -Pyarn -Phadoop-2.3 -Phive  -Phbase install compile package -DskipTests
 mvn -Dspark.testing.use-external-hbase=true  -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -DwildcardSuites=org.apache.spark.sql.hbase.JoinsSuite
 mvn -Dspark.testing.use-external-hbase=false -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -     mvn -Dspark.testing.use-external-hbase=false -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -     mvn -Dspark.testing.use-external-hbase=true -DforkMode=never -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -DwildcardSuites=org.apache.spark.sql.hbase.JoinsSuite
 date; mvn   -pl sql/hbase -DskipTests -Phbase -Pyarn -Phadoop-2.3 -Phive compile package install ; date
 mvn   -pl sql/hbase -DskipTests -Phbase -Pyarn -Phadoop-2.3 -Phive compile package install
 history | grep mvn
 mvn -Pyarn -Phadoop-2.3  install compile package -DskipTests
 mvn -Phbase -Pyarn -Phadoop-2.3  install compile package -DskipTests
 mvn -Phbase -Pyarn -Phadoop-2.3  clean install compile package -DskipTests

Now let us append the sort command:

history | cut -d' ' -f3- | grep mvn | sort

And now the first column, containing the history number that the cut command had already removed, has magically reappeared:

734  mvn -Pyarn -Phadoop-2.3 -Phive  install compile package -DskipTests
735  mvn -Pyarn -Phadoop-2.3  install compile package -DskipTests
745  mvn -pl sql/core -Pyarn -Phadoop-2.3  install compile package -DskipTests
748  mvn -pl sql/core -Pyarn -Phadoop-2.3 -DwildcardSuites=org.apache.spark.sql.SparkSQLJoinSuite test
763  mvn -Pyarn -Phadoop-2.3  install compile package -DskipTests
768  mvn -Pyarn -Phadoop-2.3  install compile package -DskipTests
769  mvn -pl sql/core -Pyarn -Phadoop-2.3 -DwildcardSuites=org.apache.spark.sql.SparkSQLJoinSuite test
798  mvn -pl sql/core -Pyarn -Phadoop-2.3 -DwildcardSuites=org.apache.spark.sql.SparkSQLJoinSuite test
825  mvn -Dspark.testing.use-external-hbase=false  -Dlog4j.rootLogger=DEBUG,CA,FA  -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -DwildcardSuites=org.apache.spark.sql.hbase.JoinsSuite
831  mvn -Dspark.testing.use-external-hbase=true  -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -850  hist100 | grep mvn
855  mvn -Pyarn -Phadoop-2.3  install compile package -DskipTests

So why did the first column (the history number) come back? It is as if the sort command were seeing the original output of the history command before the cut command:

 cut -d' ' -f3-

had a chance to operate on it.

Is there something different about the way bash streaming works with sort?

UPDATE: I installed and tried GNU sort ("gsort"), and the same behavior occurs.

history | cut -d' ' -f3- | grep mvn | gsort

Another update: There seems to be some confusion about what the history output format is. Here is another excerpt. Although it shows nothing different from the first section above, it is intended to settle some lingering questions raised in one of the answers. Specifically, there are NOT two sets of numbers for each history line.

history | tail -n 200 | grep mvn

 1968  mvn -DskipTests=true -Pyarn -Phadoop-2.3  -Phive -Phbase install compile package
 1969  mvn -Dspark.testing.use-external-hbase=false -DforkMode=never -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -DwildcardSuites=org.apache.spark.sql.hbase.JoinsSuite
 1982  mvn -Dspark.testing.use-external-hbase=false -DforkMode=never -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -DwildcardSuites=org.apache.spark.sql.hbase.JoinsSuite
 1985  mvn -Dspark.testing.use-external-hbase=false -DforkMode=never -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -DwildcardSuites=org.apache.spark.sql.hbase.JoinsSuite
 1987  mvn -Dspark.testing.use-external-hbase=false -DforkMode=never -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -DwildcardSuites=org.apache.spark.sql.hbase.JoinsSuite
 1989  mvn -Dspark.testing.use-external-hbase=false -DforkMode=never -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -DwildcardSuites=org.apache.spark.sql.hbase.JoinsSuite
 1996  mvn -Dspark.testing.use-external-hbase=false -DforkMode=never -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -DwildcardSuites=org.apache.spark.sql.hbase.JoinsSuite
 2010  mvn -Dspark.testing.use-external-hbase=false -DforkMode=never -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -DwildcardSuites=org.apache.spark.sql.hbase.JoinsSuite
 2013  mvn -Dspark.testing.use-external-hbase=false -DforkMode=never -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -DwildcardSuites=org.apache.spark.sql.hbase.JoinsSuite
 2014  mvn -Dspark.testing.use-external-hbase=true -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -DwildcardSuites=org.apache.spark.sql.hbase.JoinsSuite
 2015  mvn -Dspark.testing.use-external-hbase=true -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -DwildcardSuites=org.apache.spark.sql.hbase.JoinsSuite
 2016  mvn -Dspark.testing.use-external-hbase=true -DforkrkMode=never -pl sql/hbase -Pyarn -Phadoop-2.3  -Phive -Phbase test   -DwildcardSuites=org.apache.spark.sql.hbase.JoinsSuite
 2017  mvn -Dspark.testing.use-external-hbase=true -DforkMode=neve

Another update:

history | grep 734


 734  mvn -Pyarn -Phadoop-2.3 -Phive  install compile package -DskipTests
 1734  bashrc
 2045  history | grep 734
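
A small experiment may help narrow this down. It is only a sketch, and it assumes that history right-aligns its numbers in a fixed-width column, so that shorter numbers get more leading spaces than longer ones. The sample lines and the sample_history helper below are hypothetical stand-ins for real history output:

```shell
# Hypothetical sample lines mimicking padded history output:
# two leading spaces before a 3-digit number, one before a 4-digit number.
sample_history() {
    printf '  734  mvn install\n'
    printf ' 1968  mvn test\n'
}

# cut -d' ' treats every single space as a delimiter, so each leading
# space creates an empty field and shifts the field the number lands in.
sample_history | cut -d' ' -f3-
# Line 1 keeps its number ("734  mvn install");
# line 2 loses it (" mvn test").
```

If the real history output is padded this way, cut never removed the shorter numbers in the first place, and sort merely reorders the lines so that the numbered ones may end up grouped together.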

Upvotes: 0

Views: 143

Answers (1)

mirabilos

Reputation: 5327

Instead of cutting away the line numbers, you should suppress them in the first place.

fc -l is the equivalent of the history command, although you may need to give it a starting line number; fc -l 1 usually works for me.

Then, fc's -n option suppresses line number printing.

This gives me fc -nl 1.

Now, the remaining “problem” is that fc still likes to echo a leading tab character, even when it is not printing line numbers. You can cut that away, though, and it is always just one of them.
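
Put together, this could look roughly like the sketch below. The strip_fc_indent helper name is made up for illustration, and the sample input is hypothetical. It removes all leading whitespace rather than exactly one tab, on the assumption that some shells may follow the tab with a space:

```shell
# Strip the leading whitespace that fc -n leaves in front of each command.
strip_fc_indent() {
    sed 's/^[[:space:]]*//'
}

# Intended use in an interactive shell (assumption: history is enabled):
#   fc -nl 1 | strip_fc_indent | grep mvn | sort

# Demonstration on hypothetical input shaped like fc -nl output:
printf '\t mvn test\n\t mvn install\n' | strip_fc_indent | sort
```

Because no line number is ever printed, there is no field counting to get wrong, whatever the width of the history numbers.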

Upvotes: 1
