Reputation: 23
I am attempting to parse a Linux directory listing into a clean flat file. A subset of the data is presented below.
./DIRECTORY1/SUBDIR1:
total 5
drwxrwx--- 2 user1 group1 2048 Sep 8 13:40 .
drwxrws--- 13 user2 group1 2048 Sep 8 17:00 ..
-rwxrwx--- 1 user1 group1 56362 Dec 18 2014 file12112012.csv
-rwxrwx--- 1 user1 group1 65233 Dec 18 2014 file12112013.csv
-rwxrwx--- 1 user1 group1 66322 Dec 22 2014 file20140902.csv
-rwxrwx--- 1 user1 group1 65443 Dec 22 2014 file20140918.csv
-rwxrwx--- 1 user1 group1 64003 Dec 22 2014 file20141016.csv
./DIRECTORY1/SUBDIR2:
total 5
-rw-r--r-- 1 user1 group1 133 Jun 25 16:05 test.sas
-rwxrwx--- 1 user1 group1 338 Sep 19 2014 threads.sas
-rwxrwx--- 1 user1 group1 5997 Apr 8 16:05 comparison.sas
-rwxrwx--- 1 user1 group1 5341617 May 6 20:02 univariate.pdf
-rwxrwx--- 1 user1 group1 814 Jan 30 2015 avg_fix.sas
./DIRECTORY2:
total 44
drwxrwx--- 8 user1 group1 3864 May 20 2014 .
drwxrws--- 13 user2 group1 2048 Sep 8 17:00 ..
drwxrwx--- 2 user1 group1 3864 May 20 2014 DataSources
drwxrwx--- 2 user1 group1 3864 May 20 2014 HPDM
drwxrwx--- 2 user1 group1 3864 May 20 2014 Meta
drwxrwx--- 2 user1 group1 3864 May 20 2014 Reports
drwxrwx--- 2 user1 group1 3864 May 20 2014 System
drwxrwx--- 2 user1 group1 3864 May 20 2014 Workspaces
-rwxrwx--- 1 user1 group1 83 May 20 2014 project.emp
Ideally, I would like the output data to look like:
filename user group size date
./DIRECTORY1/SUBDIR1/file12112012.csv user1 group1 56362 12/18/2014
./DIRECTORY1/SUBDIR1/file12112013.csv user1 group1 65233 12/18/2014
..etc..
I can't just disregard the "header" lines, since each one holds the leading part of the file path, but the non-header rows are fairly regular and look like something a standard input data step should be able to handle.
Is there a way to control input by line characteristic? Has anyone had experience reading in a file like this?
For reference, the file can be created in a Linux environment using
ll -R
Upvotes: 0
Views: 51
Reputation: 20909
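I wouldn't recommend using ls for this. Its output is meant for people to read: the directory path appears only on the header lines, and the date column shows either a time or a year depending on how old the file is.
Instead, use find. It has a -printf option (a GNU find extension) that lets you choose which attributes of each found file are printed and in what format.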
For example:
find /path/to/folder -type f -printf "%p\t%g\t%s\n"
This prints each found file's path, group, and size, separated by tabs, one file per line.
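To get closer to the layout asked for in the question (user and date as well), the same idea could be extended along these lines. This is only a sketch: it assumes GNU find, list_files is a made-up helper name, and the mm/dd/yyyy date is the file's last-modification time.

```shell
#!/bin/sh
# Sketch only: requires GNU find (-printf is not part of POSIX find).
# list_files is a hypothetical helper; its argument is the directory to scan.
list_files() {
    # %p  path            %u  owner name     %g  group name
    # %s  size in bytes   %Tm/%Td/%TY  last-modification date as mm/dd/yyyy
    find "$1" -type f -printf '%p\t%u\t%g\t%s\t%Tm/%Td/%TY\n'
}

# Header row, then one tab-separated row per regular file under the current directory.
printf 'filename\tuser\tgroup\tsize\tdate\n'
list_files .
```

Redirecting the output to a file should give a flat, tab-delimited file with the full path on every row, so the header lines of the recursive listing no longer need to be carried forward while reading it.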
See the man page for find for additional information.
Upvotes: 2