Reputation: 73
I have finished writing the code that uploads files (text files) to Azure Blob Storage. Now I want to provide search based on the content of those text files. For example, if I search for "Hello", the names of the files that contain the word "Hello" should appear in the search results. Here is my search code:
class BlobSearch
{
    static void Main(string[] args)
    {
        string searchText = "Hello";
        CloudStorageAccount account = CloudStorageAccount.Parse(azureConString);
        CloudBlobClient blobClient = account.CreateCloudBlobClient();
        CloudBlobContainer blobContainer = blobClient.GetContainerReference("MyBlobContainer");
        blobContainer.FetchAttributes();
        var blobItemList = blobContainer.ListBlobs();
        foreach (var item in blobItemList)
        {
            string line = string.Empty;
            CloudBlockBlob blockBlob = blobContainer.GetBlockBlobReference(item.Uri.ToString());
            if (blockBlob.Name.Contains(".txt"))
            {
                int lineno = 1;
                using (var stream = blockBlob.OpenRead())
                {
                    using (StreamReader reader = new StreamReader(stream))
                    {
                        while ((line = reader.ReadLine()) != null)
                        {
                            if (line.IndexOf(searchText) != -1)
                            {
                                Console.WriteLine("Line : " + lineno + " => " + blockBlob.Name);
                            }
                            lineno++;
                        }
                    }
                }
            }
        }
        Console.WriteLine("SEARCH COMPLETE");
        Console.ReadLine();
    }
}
The code above works, but it is too slow. Is there a way to make it faster, or to otherwise improve it?
Upvotes: 3
Views: 1336
Reputation: 12498
That is a very bad way to do it, and it will be very slow. The best option for this is Azure Search, which can now index your blobs automatically!
Upvotes: 1
Reputation: 171178
Your code is not bad. Find out where most of the time is spent; it is probably the network or the CPU. If it is the network, you are out of luck. If it is the CPU, you can parallelize.
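As a rough illustration of the parallelizing idea, here is a minimal sketch that scans several documents on worker threads. It assumes the blob contents are already in memory: the dictionary of file name to text is a hypothetical stand-in for downloaded blobs, and ParallelSearchSketch and FindMatches are made-up names, not part of the Azure SDK. The line test is the same IndexOf call as in the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

static class ParallelSearchSketch
{
    // Scans every document in parallel and returns the sorted names of
    // the documents that contain searchText on at least one line.
    // 'documents' (name -> content) is a hypothetical stand-in for blob contents.
    public static List<string> FindMatches(IDictionary<string, string> documents, string searchText)
    {
        // ConcurrentBag is safe to add to from several threads at once.
        var matches = new ConcurrentBag<string>();

        Parallel.ForEach(documents, doc =>
        {
            using (var reader = new StringReader(doc.Value))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    if (line.IndexOf(searchText) != -1)
                    {
                        matches.Add(doc.Key);
                        break; // one hit is enough to report the file
                    }
                }
            }
        });

        // Sort so the result does not depend on thread scheduling.
        return matches.OrderBy(name => name, StringComparer.Ordinal).ToList();
    }

    static void Main()
    {
        var documents = new Dictionary<string, string>
        {
            { "a.txt", "Hello there\nsecond line" },
            { "b.txt", "nothing here" },
            { "c.txt", "first line\nsay Hello" }
        };

        foreach (var name in FindMatches(documents, "Hello"))
        {
            Console.WriteLine(name); // prints a.txt, then c.txt
        }
    }
}
```

This only helps if the CPU is the bottleneck; if the time goes into downloading, measuring first will show that.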
You are using culture-specific string processing. StringComparison.Ordinal is far less CPU-intensive (roughly 10x). It has different semantics, though.
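To make the ordinal suggestion concrete, here is a small sketch of the same line test with an explicit StringComparison.Ordinal argument. OrdinalSearchSketch and ContainsOrdinal are illustrative names only. The second call shows one of the semantic consequences: ordinal matching is case-sensitive and applies no culture rules.

```csharp
using System;

static class OrdinalSearchSketch
{
    // Returns true when 'line' contains 'searchText', comparing UTF-16
    // code units directly instead of applying culture-specific rules.
    public static bool ContainsOrdinal(string line, string searchText)
    {
        return line.IndexOf(searchText, StringComparison.Ordinal) >= 0;
    }

    static void Main()
    {
        Console.WriteLine(ContainsOrdinal("Hello world", "Hello")); // True
        Console.WriteLine(ContainsOrdinal("hello world", "Hello")); // False: case differs
    }
}
```

If case-insensitive matching is wanted, StringComparison.OrdinalIgnoreCase is the corresponding option.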
Upvotes: 0
Reputation: 1932
// get blob data
CloudBlob cloudBlob = blobContainer.GetBlobReference(blobName);
string text = cloudBlob.DownloadText();
Maybe downloading the blob in one go is faster than reading it line by line in a loop?
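If the whole text is downloaded at once, the line numbers that the question prints can still be recovered from the match positions. The sketch below assumes `text` holds the string returned by the download and that the search text does not itself contain a line break; WholeTextSearchSketch and FindLineNumbers are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

static class WholeTextSearchSketch
{
    // Returns the 1-based numbers of the lines that contain searchText.
    // Each matching line is reported once, even with several hits on it.
    public static List<int> FindLineNumbers(string text, string searchText)
    {
        var lines = new List<int>();
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(searchText))
        {
            return lines;
        }

        int lineNo = 1;   // line number of the position 'scanned'
        int scanned = 0;  // newlines before this index are already counted
        int index = text.IndexOf(searchText, StringComparison.Ordinal);

        while (index >= 0)
        {
            // Count the newlines between the last position and this match.
            for (int i = scanned; i < index; i++)
            {
                if (text[i] == '\n') lineNo++;
            }
            scanned = index;
            lines.Add(lineNo);

            // Jump to the next line so the same line is not reported twice.
            int lineEnd = text.IndexOf('\n', index);
            if (lineEnd < 0) break;
            index = text.IndexOf(searchText, lineEnd + 1, StringComparison.Ordinal);
        }

        return lines;
    }

    static void Main()
    {
        string text = "a\nHello\nb Hello Hello\nc";
        foreach (int n in FindLineNumbers(text, "Hello"))
        {
            Console.WriteLine("Line : " + n); // prints Line : 2, then Line : 3
        }
    }
}
```

Whether this beats the streaming loop depends on blob size and network behaviour, so it is worth timing both.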
Upvotes: 1