user2209700

Reputation: 23

Read in non-standard data structure, Linux file listing

I am attempting to parse a Linux directory listing into a clean flat file. A subset of the data is presented below.

./DIRECTORY1/SUBDIR1:
total 5
drwxrwx---  2 user1 group1  2048 Sep  8 13:40 .
drwxrws--- 13 user2 group1  2048 Sep  8 17:00 ..
-rwxrwx---  1 user1 group1 56362 Dec 18  2014 file12112012.csv
-rwxrwx---  1 user1 group1 65233 Dec 18  2014 file12112013.csv
-rwxrwx---  1 user1 group1 66322 Dec 22  2014 file20140902.csv
-rwxrwx---  1 user1 group1 65443 Dec 22  2014 file20140918.csv
-rwxrwx---  1 user1 group1 64003 Dec 22  2014 file20141016.csv

./DIRECTORY1/SUBDIR2:
total 5
-rw-r--r--  1 user1 group1     133 Jun 25 16:05 test.sas
-rwxrwx---  1 user1 group1     338 Sep 19  2014 threads.sas
-rwxrwx---  1 user1 group1    5997 Apr  8 16:05 comparison.sas
-rwxrwx---  1 user1 group1 5341617 May  6 20:02 univariate.pdf
-rwxrwx---  1 user1 group1     814 Jan 30  2015 avg_fix.sas

./DIRECTORY2:
total 44
drwxrwx---  8 user1 group1 3864 May 20  2014 .
drwxrws--- 13 user2 group1 2048 Sep  8 17:00 ..
drwxrwx---  2 user1 group1 3864 May 20  2014 DataSources
drwxrwx---  2 user1 group1 3864 May 20  2014 HPDM
drwxrwx---  2 user1 group1 3864 May 20  2014 Meta
drwxrwx---  2 user1 group1 3864 May 20  2014 Reports
drwxrwx---  2 user1 group1 3864 May 20  2014 System
drwxrwx---  2 user1 group1 3864 May 20  2014 Workspaces
-rwxrwx---  1 user1 group1   83 May 20  2014 project.emp

Ideally, I would like the output data to look like:

filename                               user  group   size  date
./DIRECTORY1/SUBDIR1/file12112012.csv  user1 group1 56362  12/18/2014
./DIRECTORY1/SUBDIR1/file12112013.csv  user1 group1 65233  12/18/2014
..etc..

I can't just disregard the "header" lines, because they contain the beginning of each filename, but the non-header rows are close to what I would expect a standard input data step to be able to handle.

Is there a way to control input by line characteristic? Has anyone had experience reading in a file like this?

For reference, the file can be created in a Linux environment using

ll -R

Upvotes: 0

Views: 51

Answers (1)

Mr. Llama

Reputation: 20909

I wouldn't recommend using ls for this.
Instead, use find. GNU find has a -printf option that lets you choose which details of each found file are printed and how they are formatted.

For example:

find /path/to/folder -type f -printf "%p\t%g\t%s\n"

This prints each found file's path, group, and size, separated by tabs.
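To get closer to the layout asked for in the question, the same command can be extended with a few more format directives. The following is only a sketch, assuming GNU find (BSD/macOS find has no -printf) and GNU touch; the temporary directory and the sample file are made up so the command can be tried without touching real data.

```shell
#!/bin/sh
# Build a small throwaway tree (names are hypothetical, for demonstration only).
demo=$(mktemp -d)
mkdir -p "$demo/DIRECTORY1/SUBDIR1"
printf 'a,b\n' > "$demo/DIRECTORY1/SUBDIR1/file12112012.csv"
touch -d '2014-12-18 12:00' "$demo/DIRECTORY1/SUBDIR1/file12112012.csv"

# %p = path, %u = owner, %g = group, %s = size in bytes,
# %Tm/%Td/%TY = modification month/day/four-digit year.
listing=$(cd "$demo" && find . -type f -printf '%p\t%u\t%g\t%s\t%Tm/%Td/%TY\n')

# Print a header row followed by one tab-separated row per file.
printf 'filename\tuser\tgroup\tsize\tdate\n%s\n' "$listing"

# Clean up the throwaway tree.
rm -rf "$demo"
```

Running find from inside the top-level directory keeps the paths relative (./DIRECTORY1/...), as in the desired output. A tab-separated file like this should be much easier to read with an ordinary delimited input step than the raw ls listing.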

See the man pages for find for additional information.
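If only the saved listing is available and find cannot be re-run, the "header" lines can still be carried forward onto the file rows. The sketch below uses awk rather than a SAS data step, and it assumes filenames contain no spaces and that directory headers end with a colon, as in the sample from the question. The date is passed through as printed by ls, because that column holds either a year or a time.

```shell
#!/bin/sh
# Sample listing, trimmed from the question, written to a temporary file.
listing_file=$(mktemp)
cat > "$listing_file" <<'EOF'
./DIRECTORY1/SUBDIR1:
total 5
drwxrwx---  2 user1 group1  2048 Sep  8 13:40 .
-rwxrwx---  1 user1 group1 56362 Dec 18  2014 file12112012.csv

./DIRECTORY1/SUBDIR2:
total 5
-rw-r--r--  1 user1 group1     133 Jun 25 16:05 test.sas
EOF

flat=$(awk '
  # Directory header: remember the path without its trailing colon.
  /:$/ && !/^[-dl]/ { dir = substr($0, 1, length($0) - 1); next }
  # Regular files only: prefix the name with the remembered directory.
  /^-/ && NF >= 9 {
      print dir "/" $9 "\t" $3 "\t" $4 "\t" $5 "\t" $6 " " $7 " " $8
  }
' "$listing_file")

printf '%s\n' "$flat"
rm -f "$listing_file"
```

The key step is remembering the most recent header in a variable and reusing it on every following row; the "total" lines, blank lines, and directory entries simply match neither pattern and are skipped.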

Upvotes: 2
