Rohit Chattopadhyay

Reputation: 67

Conditional extraction of files from an Archive file

I have a large tar.gz archive containing nxml files; its total size is around 5 GB. I want to extract files from it, but not all of them: only those files whose numeric name is greater than a threshold value.

For example, let us take 1000 as the threshold value. Then
path/to/file/900.nxml will not be extracted, but
path/to/file/1100.nxml will be extracted.

So my requirement is to make a conditional extraction of files from the archive.
Thanks

Upvotes: 1

Views: 192

Answers (2)

Kapil Kumar

Reputation: 64

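You can also use the --wildcards option of tar.
For example, when your threshold is 1000 you can use tar -xf tar.gz --wildcards 'path/to/files/????*.nxml' (quote the pattern so that the shell does not expand it before tar sees it). Each ? matches exactly one character and * matches any number of characters, so this pattern selects any file whose name has 4 or more characters before the .nxml extension. Note that it selects by name length, not by numeric value: 1000.nxml itself and zero-padded names such as 0999.nxml would also be extracted.
Hope this helps.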
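
Before running the extraction, it can help to preview which names a pattern like this would select. The following is a rough sketch using Python's fnmatch module, which only approximates tar's own wildcard matching (tar may treat slashes and anchoring differently); the helper name matches and the sample names are made up for illustration.

```python
from fnmatch import fnmatchcase

# Pattern taken from the answer above.
PATTERN = "path/to/files/????*.nxml"


def matches(name, pattern=PATTERN):
    """Return True if the member name matches the shell-style pattern."""
    return fnmatchcase(name, pattern)


if __name__ == "__main__":
    # Hypothetical member names, chosen to show the edge cases.
    samples = [
        "path/to/files/900.nxml",
        "path/to/files/1000.nxml",
        "path/to/files/1100.nxml",
        "path/to/files/0999.nxml",
    ]
    for name in samples:
        print(name, matches(name))
```

If the preview includes names you do not want, a numeric comparison on the file list is probably the safer route.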

Upvotes: 1

Roland Weber

Reputation: 3655

  1. Use tar -tf <archive> to get a list of files in the archive.
  2. Process the list of files to determine those you need to extract. Write the file list to a temporary file <filelist>, one line per file.
    • Looking at the tags you chose, you can use either Python or bash for this string filtering, whichever you prefer.
  3. Use tar -xf <archive> -T <filelist> to extract the files you need.
    The option -T or --files-from reads the filenames to process from the given file.
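
The three steps above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: it assumes a tar command that supports -T is on the PATH, that the file names are purely numeric (such as 1100.nxml), and that "greater than" means strictly greater. The helper names numeric_stem, select_members and extract_above are hypothetical.

```python
import os
import posixpath
import re
import subprocess
import tempfile


def numeric_stem(member):
    """Return the numeric file name of an .nxml member as an int, else None."""
    stem, ext = posixpath.splitext(posixpath.basename(member))
    if ext != ".nxml" or not re.fullmatch(r"[0-9]+", stem):
        return None  # directories and non-numeric names are skipped
    return int(stem)


def select_members(members, threshold):
    """Step 2: keep members whose numeric name is strictly above threshold."""
    selected = []
    for member in members:
        value = numeric_stem(member)
        if value is not None and value > threshold:
            selected.append(member)
    return selected


def extract_above(archive, threshold):
    """Run all three steps against the archive and return the extracted names."""
    # Step 1: list the members of the archive.
    listing = subprocess.run(
        ["tar", "-tf", archive], check=True, capture_output=True, text=True
    ).stdout.splitlines()

    # Step 2: filter the list and write it to a temporary file, one per line.
    wanted = select_members(listing, threshold)
    if not wanted:
        return wanted
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write("\n".join(wanted) + "\n")
        filelist = handle.name

    # Step 3: extract only the listed members.
    try:
        subprocess.run(["tar", "-xf", archive, "-T", filelist], check=True)
    finally:
        os.unlink(filelist)
    return wanted
```

A call such as extract_above("archive.tar.gz", 1000) would extract into the current directory; the archive name here is a placeholder. Both the listing and the extraction read through the compressed archive, so expect two passes over the 5 GB file.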

Upvotes: 1
