kamikater

Reputation: 101

Change a value of find -printf in bash

I'm using find to print a line for each file and directory:

find ${rootdirectory} -printf '%p,%T@\n' >> ${outputfile}

However, I'd like to convert %T@ from Unix epoch time to Windows FILETIME:

filetime=$(( (%T@ + 11644473600) * 10000000 ))
find ${rootdirectory} -printf '%p,${filetime}\n' >> ${outputfile}

This doesn't work, of course, because %T@ is only expanded by find's -printf, not by the shell.

What is the fastest way to process millions of files while converting each timestamp? I already have a solution using stat, but it's extremely slow:

find ${rootdirectory} -exec 1>${outputfile} sh -c 'for file in "${1}"/* ;
  do
    unixtime=$(stat -c%Y "${file}")
    filetime=$(( (${unixtime} + 11644473600) * 10000000 ))
    stat -c "%n,${filetime}" "${file}"
  done' none {}  \;

I changed this to a variation with -printf, but %T@ is not recognized:

find ${rootdirectory} -exec 1>${outputfile} sh -c 'for file in "${1}"/* ;
  do
    unixtime=$(printf %T@)
    filetime=$(( (${unixtime} + 11644473600) * 10000000 ))
    -printf %p,${filetime}
  done' none {}  \;

My last hope was this:

print_format="%p,$(( %T@ + 11644473600 ))\n"
find ${rootdirectory} -printf "$print_format"

For the sake of completeness, this doesn't work:

find ${rootdirectory} -printf '%p,$(( (%T@ + 11644473600) * 10000000 ))\n'

Does anybody have any ideas? And would xargs be faster than exec?

Upvotes: 0

Views: 655

Answers (1)

dash-o

Reputation: 14493

The 'killer' in your solution (given the large number of files) is the repeated execution of the shell, one per file. As you have already pointed out, find's -printf does not support arithmetic on its directives.

One alternative is to use a post-processor (awk, Perl, Python) that reads the output from find and performs the conversion.

# Using printf
find ${rootdirectory} -printf '%p,%T@\n' | awk -v FS=, -v OFS=, '{ printf ("%s,%d\n",  $1, ($2+ 11644473600) * 10000000)}'

# On a 32-bit environment, use %.0f to avoid overflowing %d
find ${rootdirectory} -printf '%p,%T@\n' | awk -v FS=, -v OFS=, '{ printf ("%s,%.0f\n",  $1, ($2+ 11644473600) * 10000000)}'

Given only one invocation of awk, this will be much faster than the attempted solutions.
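A Python post-processor (one of the alternatives mentioned above) works the same way; this is a sketch, with ${rootdirectory} as in the question. Doing the math on integers sidesteps the precision loss that doubles have at this magnitude:

```shell
# Sketch: same pipeline, with Python instead of awk as the post-processor.
find ${rootdirectory} -printf '%p,%T@\n' |
  python3 -c '
import sys
for line in sys.stdin:
    path, t = line.rstrip("\n").rsplit(",", 1)
    secs = int(t.split(".")[0])          # drop the fractional part of %T@
    print("%s,%d" % (path, (secs + 11644473600) * 10000000))
'
```

rsplit keeps paths that contain commas intact, since only the last comma separates the timestamp.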

Using xargs can speed up the code, but only if you use some 'bulking', where a large number of files is processed by a single command. Even then, it is unlikely to be faster than the single-process awk pipeline.
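For comparison, a bulked xargs variant might look like this (a sketch, not a benchmark): stat runs once per batch of files instead of once per file, and a single awk still does the arithmetic. Note that GNU stat's %Y prints whole seconds, without the fraction %T@ has:

```shell
# Hypothetical bulked variant: -print0 | xargs -0 batches many files into
# each stat invocation; one awk process converts all timestamps.
find ${rootdirectory} -type f -print0 |
  xargs -0 stat -c '%n,%Y' |
  awk -F, '{ printf ("%s,%.0f\n", $1, ($2 + 11644473600) * 10000000) }'
```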

A bash-only solution will be hard, since bash does not support math on floating-point values (on Mint 19, %T@ includes a fractional part).
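If sub-second precision is not needed, the fraction can simply be stripped before the integer math. A sketch of that workaround, with the caveat that the read loop runs once per line, so for millions of files it will still be far slower than the awk pipeline:

```shell
# Sketch: bash-only post-processing. ${t%.*} cuts off the fractional part
# of %T@ so that 64-bit integer arithmetic can be used.
find ${rootdirectory} -printf '%p,%T@\n' |
  while IFS=, read -r path t; do
    printf '%s,%s\n' "$path" $(( (${t%.*} + 11644473600) * 10000000 ))
  done
```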

Upvotes: 1
