Scott

Reputation: 611

awk, IFS, and file name truncations

Updated question based on new information…

Here is a gist of my code, with the general idea that I store items in Dropbox at:

~/Dropbox/Public/drops/xx.xx.xx/whatever

Where the date is always 2 chars, 2 chars, and 2 chars, dot separated. Within that folder can be more folders and more files, which is why when I use find I do not set the depth and allow it to scan recursively. https://gist.github.com/anonymous/ad51dc25290413239f6f

Below is a shortened version of the gist; I don't believe it will run as it stands, but the gist itself will run, assuming you have Dropbox installed and there are files at the path location that I set up.

General workflow:
SIZE="+250k" # For `find` this is the value in size I am looking for files to be larger than
# Location where I store the output to `find` to process that file further later on.
TEMP="/tmp/drops-output.txt" 

Next I rm the tmp file and touch a new one.

I will then cd into
DEST=/Users/$USER/Dropbox/Public/drops

Perform a quick conditional check to make sure that I am working where I want to be, 
with all my values as variables, I could mess up easily and not be working where I 
thought I would be.
# Conditional check: is the current directory the one I want to be the working directory?
if [ "$(pwd)" = "${DEST}" ]; then
    echo -e "Destination and current working directory are equal, this is good!:\n    $(pwd)\n"
fi
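A minimal sketch of the reset-and-check steps above (rm, touch, cd, verify), with an error branch added so a failed cd stops the script instead of letting find run against the wrong tree. It uses a throwaway mktemp directory as a stand-in for the real $DEST so the example is self-contained; substitute /Users/$USER/Dropbox/Public/drops in the actual script.

```shell
DEST="$(mktemp -d)"                 # stand-in for ~/Dropbox/Public/drops
TEMP="/tmp/drops-output.txt"

rm -f "$TEMP" && touch "$TEMP"      # start with a fresh results file
cd "$DEST" || { echo "cannot cd to $DEST" >&2; exit 1; }

if [ "$(pwd)" = "$DEST" ]; then
    echo "Destination and current working directory are equal, this is good!"
else
    echo "In $(pwd) but expected $DEST" >&2
    exit 1
fi
```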

The meat of step one is the `find` command
# Use `find` to locate a subset of files that are larger than a certain size
# save that to a temp file and process it.  I believe this could all be done in 
# one find command with -exec or similar but I can't figure it out
find . -type f -size "${SIZE}" -exec ls -lh {} \; >> "$TEMP"

Inside $TEMP will be a data set that looks like this:
-rw-r--r--@ 1 me  staff    61K Dec 28  2009 /Users/me/Dropbox/Public/drops/12.28.09/wor-10e619e1-120407.png
-rw-r--r--@ 1 me  staff   230K Dec 30  2009 /Users/me/Dropbox/Public/drops/12.30.09/hijack-loop-d6250496-153355.pdf
-rw-r--r--@ 1 me  staff    49K Dec 31  2009 /Users/me/Dropbox/Public/drops/12.31.09/mt-5a819185-180538.png

The trouble is, not all file names are free of spaces, though I have done all I can to make sure variables are quoted 
and wrapped in braces or quotes where applicable.
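As an aside (an alternative, not part of the original script): the space problem can be sidestepped entirely by letting `du` report sizes instead of parsing `ls` output. `du -k` separates the size and the path with a tab, and tabs do not occur in these file names, so paths containing spaces survive awk intact. The demo tree below is fabricated for the example.

```shell
demo="$(mktemp -d)"
mkdir -p "$demo/12.28.09"
head -c 300000 /dev/zero > "$demo/12.28.09/Bruna Legal Name.pdf"   # above the 250k cutoff
head -c 1000   /dev/zero > "$demo/12.28.09/small.txt"              # below the cutoff

# du emits "<KB-blocks><TAB><path>", so awk -F'\t' keeps spacey paths whole
find "$demo" -type f -size +250k -exec du -k {} + |
  sort -n |
  awk -F'\t' '{ printf "%sK\t%s\n", $1, $2 }'

rm -r "$demo"
```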

With the results in /tmp I run:
# Number of results located as a result of the find `command` above
RESULTS=$(wc -l "$TEMP" | awk '{print $1}')
echo -e "Located: [$RESULTS] total files larger than $SIZE\n"

# With a result set found via `find`, now use awk to print out the sorted list of file 
# sizes and paths.
echo -e "SIZE    DATE      FILE PATH"
#awk '{print "["$5"]          ", $9, $10}' < "$TEMP" | sort -n
awk '{for(i=5;i<=NF;i++) {printf $i " "} ; printf "\n"}' "$TEMP" | sort -n

With the changes to awk from how I had it originally, my result now looks like this:
751K Oct 21 19:00 ./10.21.14/netflix-67-190039.png 
760K Sep 14 19:07 ./01.02.15/logos/RCA_old_logo.jpg 
797K Aug 21 03:25 ./08.21.14/girl-88-032514.zip 
916K Sep 11 21:47 ./09.11.14/small-shot-4d-214727.png

I want it to look like this:
SIZE    FILE PATH
========================================
751K    ./10.21.14/netflix-67-190039.png 
760K    ./01.02.15/logos/RCA_old_logo.jpg 
797K    ./08.21.14/girl-88-032514.zip 
916K    ./09.11.14/small-shot-4d-214727.png
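Given the `ls -lh` layout in $TEMP (field 5 = size, fields 9 onward = the path), one sketch that produces the desired layout without truncating spacey names is to rebuild the path from field 9 to the end of the line and drop the date columns. The sample lines below are canned; in the real script the awk would read from "$TEMP" instead.

```shell
printf '%s\n' \
  '-rw-r--r--@ 1 me  staff    61K Dec 28  2009 ./11.26.14/Bruna Legal Name.pdf' \
  '-rw-r--r--@ 1 me  staff   230K Dec 30  2009 ./12.30.09/hijack-loop.pdf' |
awk '{
    path = $9
    for (i = 10; i <= NF; i++) path = path " " $i   # re-join names that contain spaces
    printf "%-7s %s\n", $5, path                    # size column, then full path
}' | sort -n
```

One caveat: re-joining with single spaces would collapse runs of consecutive spaces inside a name; `substr($0, index($0, $9))` avoids that, at the small risk of matching $9 earlier in the line.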

# All Done
if [ "$?" -eq 0 ]; then
    echo -e "find of drop files larger than $SIZE completed without errors.\n"
    exit 0
fi

Original Post to Stack prior to gaining some new information leading to new questions…

Original Post is below, given new information, I tried some new tactics and have left myself with the above script and info.

I have a simple script on Mac OS X; it performs a find on a dir and locates all files of type file and of size greater than +SIZE.

These are then appended to a file via >>

From there, I have a file that essentially contains a ls -la listing, so I use awk to get to the file size and the file name with this command:

# With a result set found via `find`, now use awk to print out the sorted list of file 
# sizes and paths.
echo -e "SIZE          FILE PATH"
awk '{print "["$5"]          ", $9, $10}' < "$TEMP" | sort -n

All works as I want it to, but I get some filename truncation right at the above code. The entire script is around 30 lines, and I have pinned the problem to this line. I think if I threw in a different internal field separator that would fix it; I could use \t, since there can't be a \t in a Mac OS X filename.

I thought it was just quoting, but I can't seem to see where if that is the case. Here is a sample of the data returned, usually I get about 50 results. The first one I stuffed in this file has filename truncation:

[1.0M]           ./11.26.14/Bruna Legal
[1.4M]           ./12.22.14/card-88-082636.jpg 
[1.6M]           ./12.22.14/thrasher-8c-082637.jpg 
[11M]           ./01.20.15/td-6e-225516.mp3 

Bruna Legal is "Bruna Legal Name.pdf" on the filesystem.

Upvotes: 0

Views: 450

Answers (1)

Birei

Reputation: 36282

You can avoid parsing the output of the ls command and do the whole work with find using the printf action, like:

find /tmp -maxdepth 1 -type f -size +4k 2>/dev/null -printf "%kKB %f\n" |
  sort -nrk1,1

In my example it outputs every file that is bigger than 4 kilobytes. The issue is that the find command cannot print formatted output with the size in MB. In addition, the numeric ordering does not work for me with square brackets surrounding the number, so I omit them. In my test it yields:

140KB +~JF7115171557203024470.tmp
140KB +~JF3757415404286641313.tmp
120KB +~JF8126196619419441256.tmp
120KB +~JF7746650828107924225.tmp
120KB +~JF7068968012809375252.tmp
120KB +~JF6524754220513582381.tmp
120KB +~JF5532731202854554147.tmp
120KB +~JF4394954996081723171.tmp
24KB +~JF8516467789156825793.tmp
24KB +~JF3941252532304626610.tmp
24KB +~JF2329724875703278852.tmp
16KB 578829321_2015-01-23_1708257780.pdf
12KB 575998801_2015-01-16_1708257780-1.pdf
8KB adb.log

EDIT: I've noticed that %k is not accurate enough, so you can use %s to print the size in bytes and convert to KB or MB using awk, like:

find /tmp -maxdepth 1 -type f -size +4k 2>/dev/null -printf "%s %f\n" | 
  sort -nrk1,1 | 
  awk '{ $1 = sprintf( "%.2fKB", $1 / 1024 ); print }'
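The conversion step can be sanity-checked in isolation on canned input (the file names here are invented for the demo); folding the KB suffix into sprintf keeps the unit attached to the formatted value instead of being dropped by awk's numeric coercion:

```shell
printf '%s\n' '140279 a.tmp' '4104 adb.log' |
awk '{ $1 = sprintf("%.2fKB", $1 / 1024); print }'
# → 136.99KB a.tmp
# → 4.01KB adb.log
```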

It yields:

136.99KB +~JF7115171557203024470.tmp
136.99KB +~JF3757415404286641313.tmp
117.72KB +~JF8126196619419441256.tmp
117.72KB +~JF7068968012809375252.tmp
117.72KB +~JF6524754220513582381.tmp
117.68KB +~JF7746650828107924225.tmp
117.68KB +~JF5532731202854554147.tmp
117.68KB +~JF4394954996081723171.tmp
21.89KB +~JF8516467789156825793.tmp
21.89KB +~JF3941252532304626610.tmp
21.89KB +~JF2329724875703278852.tmp
14.14KB 578829321_2015-01-23_1708257780.pdf
10.13KB 575998801_2015-01-16_1708257780-1.pdf
4.01KB adb.log

Upvotes: 2
