Reputation: 888
I have many files in a GCS bucket and I would like to find the name of the file that contains my data (for example, the rows matched by grep "APPLE"). Is there any way to find file names based on a grep command?
The following command only prints the matching rows, but I want the name of the file in which the matched data resides:
gsutil cat gs://my-bucket/part-2020-01-09** | grep 'APPLE'
Is there any way to find the respective file names?
Upvotes: 1
Views: 3696
Reputation: 151
You just have to use the -h command-line option:
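gsutil cat -h gs://my-bucket/part-2020-01-09** | grep -e '^==> ' -e 'APPLE'
The -h option prints a short header with the object name before the contents of each object that matches the wildcard. Piping through grep 'APPLE' alone would filter those header lines out, so the extra pattern '^==> ' keeps them (this assumes the header has the form ==> gs://bucket/object <==). Each object name is then followed by the matching rows from that object; a name with nothing after it had no match.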
https://cloud.google.com/storage/docs/gsutil/commands/cat
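To get only the names of the objects that contain a match, the headers printed by gsutil cat -h can be tracked with awk. This is a minimal sketch, not part of gsutil: match_objects is a hypothetical helper name, and it assumes each header line has the form ==> gs://bucket/object <== and that no data row has that same shape.

```shell
# Hypothetical helper: reads `gsutil cat -h` output on stdin and prints
# each object name once if any of its rows contains the fixed string $1.
# Assumes header lines look like: ==> gs://bucket/object <==
match_objects() {
  awk -v pat="$1" '
    # Header line: remember the object name between "==> " and " <==".
    /^==> .* <==$/ { name = substr($0, 5, length($0) - 8); next }
    # Data line containing the string: report the current object once.
    index($0, pat) && name != "" && !(name in seen) { seen[name] = 1; print name }
  '
}

# Usage (requires gsutil and access to the bucket):
#   gsutil cat -h gs://my-bucket/part-2020-01-09** | match_objects 'APPLE'
```

The whole listing is still read through a single gsutil cat invocation, so this keeps the one-command approach while printing names instead of rows.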
Upvotes: 4
Reputation: 75745
To do this, I would write a script like this:
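# Check each matching object separately and print its name if it contains APPLE.
# The unquoted $(...) splits on whitespace, so object names must not contain spaces.
for i in $(gsutil ls gs://my-bucket/part-2020-01-09**)
do
  if gsutil cat "${i}" | grep 'APPLE' > /dev/null
  then
    echo "${i}"
  fi
done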
But it's not as efficient as a single cat because it runs a separate gsutil command, and so a separate API call, for each file. I don't know how many files you have, so I can't say whether this is an acceptable solution.
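If the per-file cost is a concern, one option is to check several objects at a time. The following is only a sketch under stated assumptions: match_parallel and the FETCH variable are hypothetical names introduced here, xargs must support -P (GNU and BSD xargs do, but it is not required by POSIX), and object names must not contain quotes, backslashes, or newlines. Each check still starts its own gsutil process, so this only overlaps the waiting time; it does not reduce the number of calls.

```shell
# FETCH is a hypothetical knob: the command used to read one object.
FETCH=${FETCH:-gsutil cat}
export FETCH

# Hypothetical helper: reads object names on stdin (one per line), checks
# up to $2 of them concurrently (default 4), and prints the names whose
# contents contain the fixed string $1. Output order is not guaranteed.
match_parallel() {
  xargs -P "${2:-4}" -I{} sh -c '
    if $FETCH "$1" | grep -F -- "$2" > /dev/null; then
      printf "%s\n" "$1"
    fi
  ' sh {} "$1"
}

# Usage (requires gsutil and access to the bucket):
#   gsutil ls gs://my-bucket/part-2020-01-09** | match_parallel 'APPLE' 8
```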
Upvotes: 2