codef0rmer

Reputation: 10530

How to store directory files listing into an array?

I'm trying to store a directory's file listing in an array and then loop through that array. Below is what I get when I run the ls -ls command from the console.

total 40
36 -rwxrwxr-x 1 amit amit 36720 2012-03-31 12:19 1.txt
4 -rwxrwxr-x 1 amit amit  1318 2012-03-31 14:49 2.txt

I've written the following bash script to store the above data in a bash array.

i=0
ls -ls | while read line
do
    array[ $i ]="$line"        
    (( i++ ))
done

But when I echo $array, I get nothing!

FYI, I run the script this way: ./bashscript.sh

Upvotes: 65

Views: 177269

Answers (11)

Gustavo Espinoza

Reputation: 11

This should work

array=($(ls -l path/to/directory | awk '{print $9}'))

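You may need to adjust the 9 to match the column that holds the file name in your ls output. Note that names containing whitespace are split across several columns and array elements.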

Upvotes: 1

F. Hauri  - Give Up GitHub

Reputation: 70852

My two cents

The asker wanted to parse the output of ls -ls:

Below is what I get when I run ls -ls command from the console.

total 40
36 -rwxrwxr-x 1 amit amit 36720 2012-03-31 12:19 1.txt
4 -rwxrwxr-x 1 amit amit  1318 2012-03-31 14:49 2.txt

But few answers address this parsing operation.

ls's output

Before trying to parse something, we have to ensure the command output is as consistent, stable and easy to parse as possible:

  • To ensure the output is not altered by an alias, you may prefer to specify the full path of the command: /bin/ls.
  • To avoid variations in the output due to locale, prefix the command with LANG=C LC_ALL=C.
  • Use the --time-style switch to print times as UNIX epoch values, which are easier to parse.
  • Use the -b switch to escape special characters.

So we will prefer

LANG=C LC_ALL=C /bin/ls -lsb --time-style='+%s.%N'

to just

ls -ls

Full bash sample

#!/bin/bash

declare -a bydate=() bysize=() byname=() details=()
declare -i cnt=0 vtotblk=0 totblk
{
    read -r _ totblk # ignore 1st line
    while read -r blk perm lnk usr grp sze date file;do
        byname[cnt]="${file//\\ / }"
        details[cnt]="$blk $perm $lnk $usr $grp $sze $date"
        bysize[sze]+="$cnt "
        bydate[${date/.}]+="$cnt "
        cnt+=1 vtotblk+=blk
    done
} < <(LANG=C LC_ALL=C /bin/ls -lsb --time-style='+%s.%N')

From there, you can easily sort by date, by size or by name (names are already sorted by the ls command).

echo "Path '$PWD': Total: $vtotblk, sorted by dates"
for dte in ${!bydate[@]};do
    printf -v msec %.3f .${dte: -9}
    for idx in ${bydate[dte]};do
        read -r blk perm lnk usr grp sze date <<<"${details[idx]}"
        printf ' %11d %(%a %d %b %T)T%s %s\n' \
               $sze "${date%.*}" ${msec#0} "${byname[idx]}"
    done
done

echo "Path '$PWD': Total: $vtotblk, sorted by sizes"
for sze in ${!bysize[@]};do
    for idx in ${bysize[sze]};do
        read -r blk perm lnk usr grp sze date <<<"${details[idx]}"
        printf -v msec %.3f .${date#*.}
        printf ' %11d %(%a %d %b %T)T%s %s\n' \
               $sze "${date%.*}" ${msec#0} "${byname[idx]}"
    done
done

echo "Path '$PWD': Total: $vtotblk, sorted by names"
for((idx=0;idx<cnt;idx++));{
    read -r blk perm lnk usr grp sze date <<<"${details[idx]}"    
    printf -v msec %.3f .${date#*.}
    printf ' %11d %(%a %d %b %T)T%s %s\n' \
           $sze "${date%.*}" ${msec#0} "${byname[idx]}"
}

(As an accessory check, you can verify that the total block count printed by ls matches the sum of the per-line block counts:

(( vtotblk == totblk )) ||
    echo "WARN: Total blocks: $totblk != Block count: $vtotblk" >&2

Of course, this check could be inserted before the first echo "Path... ;)
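The sorting above relies on two properties of bash indexed arrays: they are sparse, and ${!array[@]} lists the indices that are set in ascending numeric order. Here is a minimal sketch of that idea in isolation. The names and sizes are made up for illustration; in the script above they come from the ls output.

```bash
#!/usr/bin/env bash
# Hypothetical sample data standing in for parsed ls output.
names=(big.bin small.txt medium.log twin.txt)
sizes=(1913 13 247 13)

declare -a bysize=()
for i in "${!names[@]}"; do
    # Use the size as the array index; entries of equal size share one slot.
    bysize[${sizes[i]}]+="$i "
done

sorted=()
for sze in "${!bysize[@]}"; do      # indices come back in ascending order
    for idx in ${bysize[sze]}; do   # intentionally unquoted: split the index list
        sorted+=("${names[idx]}")
        printf '%6d %s\n' "$sze" "${names[idx]}"
    done
done
```

No external sort is needed, but the keys must be non-negative integers, which is why the date key in the script has its dot removed.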

Here is an output sample. (Note: there is a filename with a newline)

Path '/tmp/so': Total: 16, sorted by dates
           0 Sun 04 Sep 10:09:18.221 2.txt
         247 Mon 05 Sep 09:11:50.322 Filename with\nsp\303\251cials characters
          13 Mon 05 Sep 10:12:24.859 1.txt
        1313 Mon 05 Sep 11:01:00.855 parseLs.00
        1913 Thu 08 Sep 08:20:20.836 parseLs
Path '/tmp/so': Total: 16, sorted by sizes
           0 Sun 04 Sep 10:09:18.221 2.txt
          13 Mon 05 Sep 10:12:24.859 1.txt
         247 Mon 05 Sep 09:11:50.322 Filename with\nsp\303\251cials characters
        1313 Mon 05 Sep 11:01:00.855 parseLs.00
        1913 Thu 08 Sep 08:20:20.836 parseLs
Path '/tmp/so': Total: 16, sorted by names
          13 Mon 05 Sep 10:12:24.859 1.txt
           0 Sun 04 Sep 10:09:18.221 2.txt
         247 Mon 05 Sep 09:11:50.322 Filename with\nsp\303\251cials characters
        1913 Thu 08 Sep 08:20:20.836 parseLs
        1313 Mon 05 Sep 11:01:00.855 parseLs.00

If you want to render the escaped characters, take care: this can cause problems if you don't know who created the contents of the directory. But if the folder is yours, you could use:

echo "Path '$PWD': Total: $vtotblk, sorted by dates, with special chars"
printf -v spaces '%*s' 37 ''
for dte in ${!bydate[@]};do
    printf -v msec %.3f .${dte: -9}
    for idx in ${bydate[dte]};do
        read -r blk perm lnk usr grp sze date <<<"${details[idx]}"
        printf ' %11d %(%a %d %b %T)T%s %b\n' $sze \
            "${date%.*}" ${msec#0} "${byname[idx]//\\n/\\n$spaces}"
    done
done

Could output:

Path '/tmp/so': Total: 16, sorted by dates, with special chars
           0 Sun 04 Sep 10:09:18.221 2.txt
         247 Mon 05 Sep 09:11:50.322 Filename with
                                     spécials characters
          13 Mon 05 Sep 10:12:24.859 1.txt
        1313 Mon 05 Sep 11:01:00.855 parseLs.00
        1913 Thu 08 Sep 08:20:20.836 parseLs

Upvotes: 1

David Golembiowski

Reputation: 175

Following the conversation over at https://stackoverflow.com/a/9954738/11944425, the behavior can be wrapped in a convenience function that applies some action to each directory entry, passed as a string value.

#!/bin/bash
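iterfiles() {
    i=0
    # IFS= and -r keep leading whitespace and backslashes intact
    while IFS= read -r filename
    do 
        files[ $i ]="$filename"
        (( i++ ))
    done < <( ls -l )
    for (( idx=0 ; idx<${#files[@]} ; idx++ ))
    do 
        # "$@" keeps each argument of the passed command as a separate word
        "$@" "${files[$idx]}" &
        wait $!
    done
}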

where $@ expands to all the arguments passed to the function. This lets the function take an arbitrary command, as a partial function of sorts, and run it on each filename:

iterfiles head -n 1 | tee -a header_check.out

When a script needs to iterate over files, returning an array of them from a function is not possible. The workaround is to define the array outside the function (and possibly unset it later) and modify it inside the function. After the script calls the function, the array variable is available. For instance, the assignment to files below shows how this could be done.

declare -a files # or just `files=()` (an empty array)
iterfiles() {
    # ...
    files=...
}
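A fuller sketch of the same pattern follows. The helper name collect_files and the temporary directory are assumptions made for illustration, and a glob is used here in place of ls so that names with spaces stay intact.

```bash
#!/usr/bin/env bash
declare -a files=()          # defined at script scope, outside the function

# Hypothetical helper: fills the global `files` with the entries of a directory.
collect_files() {
    local dir=$1 entry
    files=()
    for entry in "$dir"/*; do
        [[ -e $entry ]] || continue   # skip the literal pattern when nothing matches
        files+=("$entry")
    done
}

# Throwaway directory with known contents.
demo_dir=$(mktemp -d)
touch "$demo_dir/one.txt" "$demo_dir/two words.txt"

collect_files "$demo_dir"
printf '%s\n' "${files[@]##*/}"   # print the base names only
rm -rf "$demo_dir"
```

After the call, the caller reads files directly; nothing is returned through the function's exit status or output.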

Extending the conversation above, @Jean-BaptistePoittevin pointed out a valuable detail.

#!/bin/bash
# Adding a section to unset certain variable names that
# may already be active in the shell.
unset i
unset files
unset omit

i=0
omit='^([\n]+)$'
while read file
do
    files[ $i ]="$file"
    (( i++ ))
done < <(ls -l | grep -Pov ${omit} )

 

Note: This can be tested using echo "${files[0]}" or for entry in "${files[@]}"; do ... ; done

Often you need an absolute path in double quotes, because the file (or one of its ancestor directories) has spaces or unusual characters in its name. find is one answer here. The simplest usage looks like the loop above, except that done < <(ls -l ... ) is replaced with:

done < <(find /path/to/directory ! -path /path/to/directory -type d)

When you need absolute paths in double quotes as an iterable collection, it's convenient to use a recipe like the one below. The export is optional here: the shell expands $DIRECTORY before find runs, so find does not need the variable in its own environment.

#!/bin/bash

export DIRECTORY="$PWD" # For example
declare -a files

i=0
while IFS= read -r filename; do 
    files[ $i ]="$filename"
    (( i++ ))   # advance the index so each path gets its own slot
done < <(find "$DIRECTORY" ! -path "$DIRECTORY" -type d)

for (( idx=0; idx<${#files[@]}; idx++ )); do

    # Make a templated string for macro script generation
    quoted_path="\"${files[$idx]}\""
    
    if [[ $quoted_path == *some_substring* ]]; then
        echo "mv $quoted_path /some/other/watched/folder/" >> run_nightly.sh
    fi

done

After running this, run_nightly.sh will be populated with commands that move each matching quoted path to /some/other/watched/folder/. This pattern lets one script generate bulk commands for another script to run later.
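Paths that contain newlines or quote characters can still break a line-based read and hand-written double quotes. One hedged alternative, assuming a find that supports -print0 and -mindepth (GNU and BSD find both do), is to read NUL-delimited paths and let printf %q do the quoting. The directory names below are made up for the demonstration.

```bash
#!/usr/bin/env bash
# Throwaway tree with awkward directory names.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/plain" "$demo_dir/with space" "$demo_dir"/$'new\nline'

dirs=()
# -mindepth 1 skips the starting directory itself; -print0 ends each path with NUL.
while IFS= read -r -d '' path; do
    dirs+=("$path")
done < <(find "$demo_dir" -mindepth 1 -type d -print0)

echo "found ${#dirs[@]} directories"
# %q prints each path in a form that bash can read back safely.
printf 'mv %q /some/other/watched/folder/\n' "${dirs[@]}"
rm -rf "$demo_dir"
```

The printed mv lines can be appended to a generated script just like the quoted paths above, with the difference that %q also protects characters such as $, backticks and embedded quotes.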

Upvotes: 0

Rifwan Jaleel

Reputation: 19

You can simply use the for loop below. Leave the glob itself unquoted so that it expands, and quote the variable expansions so that filenames with spaces are handled:

declare -a arr
arr=()
for file in *.txt
do
    # Append each name as a single array element
    arr+=("$file")
done

Run

for file in "${arr[@]}"
do
    echo "<$file>"
done

to test.

Upvotes: -1

OldManRiver

Reputation: 156

Aren't these two lines of code, either using scandir or including the directory pull in the declaration line, supposed to work?

src_dir="/3T/data/MySQL";
# src_ray=scandir($src_dir);
declare -a src_ray ${src_dir/*.sql}
printf ( $src_ray );

Upvotes: 0

Dan Bray

Reputation: 7822

You may be tempted to use (*) but what if a directory contains the * character? It's very difficult to handle special characters in filenames correctly.

You can use ls -ls. However, it fails to handle newline characters.

# Store the output of ls -ls as an array, one line per element
readarray -t files <<< "$(ls -ls)"
for (( i=1; i<${#files[@]}; i++ ))
{
    # Convert current line to an array
    line=(${files[$i]})
    # Get the filename, rejoining any parts that were split on spaces
    fileName=${line[@]:9}
    echo "$fileName"
}

If all you want is the file name, then just use ls:

for fileName in $(ls); do
    echo $fileName
done

See this article or this post for more information about some of the difficulties of dealing with special characters in file names.
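For comparison, here is a small sketch of how a glob-populated array behaves with awkward names when every later expansion is quoted. The file names are made up, and the demo runs in a temporary directory so it does not touch real data.

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
cd "$demo_dir" || exit 1

# Create files named '*', 'two words', a name with a newline, and 'plain'.
touch -- '*' 'two words' $'line\nbreak' plain

files=(*)                      # the results of the glob are not expanded again
echo "count: ${#files[@]}"
for f in "${files[@]}"; do     # quoted: each name stays one word
    printf '<%q>\n' "$f"       # %q makes the special characters visible
done
```

In this sketch the file literally named * ends up as one ordinary element. Problems appear when an element is later expanded without quotes.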

Upvotes: 1

rashok

Reputation: 13444

Running a shell command inside $(...) captures its output so that it can be stored in a variable. Using that, we can split the file list into an array with IFS.

IFS=' ' read -r -a array <<< $(ls /path/to/dir)

Upvotes: 3

Mat

Reputation: 206785

Try with:

#! /bin/bash

i=0
while read line
do
    array[ $i ]="$line"        
    (( i++ ))
done < <(ls -ls)

echo ${array[1]}

In your version, the while loop runs in a subshell because it is part of a pipeline, so the variables you modify in the loop are not visible outside it.

(Do keep in mind that parsing the output of ls is generally not a good idea at all.)
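The difference can be seen side by side in a minimal sketch. Here printf stands in for ls -ls so that the input is fixed, and the array names piped and direct are chosen only for this demonstration. It assumes bash's default settings (the lastpipe option is off).

```bash
#!/usr/bin/env bash
# 1) Pipeline: the while loop runs in a subshell, so `piped` stays empty here.
piped=()
printf '%s\n' 'total 40' '1.txt' '2.txt' | while IFS= read -r line; do
    piped+=("$line")
done
echo "pipeline: ${#piped[@]} element(s)"

# 2) Process substitution: the loop runs in the current shell.
direct=()
while IFS= read -r line; do
    direct+=("$line")
done < <(printf '%s\n' 'total 40' '1.txt' '2.txt')
echo "process substitution: ${#direct[@]} element(s)"
```

With default settings the first echo reports 0 elements and the second reports 3.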

Upvotes: 43

harschware

Reputation: 13414

Here's a variant that lets you use a regex pattern for initial filtering; change the regex to get the filtering you desire. (The -E flag is the BSD/macOS spelling; GNU find uses -regextype instead.)

files=($(find -E . -type f -regex "^.*$"))
for item in "${files[@]}"
do
  printf "   %s\n" "$item"
done

Upvotes: 7

glenn jackman

Reputation: 246992

I'd use

files=(*)

And then if you need data about the file, such as size, use the stat command on each file.
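A minimal sketch of that approach follows, assuming GNU stat (the -c '%s' format is GNU-specific; BSD/macOS stat would need -f '%z' instead). The temporary directory and file names are made up so that the example has known contents.

```bash
#!/usr/bin/env bash
# Work in a throwaway directory with known contents.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
cd "$demo_dir" || exit 1
printf 'hello' > 'a file.txt'
: > empty.txt

files=(*)                          # one element per entry, spaces preserved
for f in "${files[@]}"; do
    size=$(stat -c '%s' -- "$f")   # assumption: GNU stat
    printf '%s\t%s bytes\n' "$f" "$size"
done
```

Because the names come from a glob rather than from parsed ls output, each array element is exactly one file name, and the per-file details are fetched only when needed.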

Upvotes: 147

potong

Reputation: 58463

This might work for you:

OIFS=$IFS; IFS=$'\n'; array=($(ls -ls)); IFS=$OIFS; echo "${array[1]}"

Upvotes: 6
