Reputation: 659
I have a collection of strings. In a foreach loop, I appended a "." to the end of each one and concatenated them into a single string.
However, not every string needs a "." after concatenation, so I now have one long string from which I want to remove the unnecessary "." characters.
For example, given "awefeaefe. efewgwe waggrgrgae. Weefwafewf ewfefw. Ewfewfgewgr. ewgfewg", I need to check each ". ":
Where ". " is followed by a lowercase character, for example "e", I want to delete the ".".
Where ". " is followed by an uppercase character, for example "E", do nothing.
I have tried a foreach (char c in parsedPara) loop that collects every 3 characters and checks each group, but it misses 2 out of every 3 combinations of letters because it walks the characters in consecutive, non-overlapping groups. Even if I do find a combination, I don't know how to get the index of the matching "." character in my original string from inside the loop.
I have also tried creating badString = ". " + char.ToLower(), but I don't have a character to pass to ToLower. I know the 3rd character is going to be lowercase, but I don't know which character it will be.
Example as requested:
public class AnalyzeImage
{
    public async Task analyzeImage(string imageUri)
    {
        string endpoint = Environment.GetEnvironmentVariable("VISION_ENDPOINT");
        string key = Environment.GetEnvironmentVariable("VISION_KEY");
        ImageAnalysisClient client = new ImageAnalysisClient(new Uri(endpoint), new AzureKeyCredential(key));
        ImageAnalysisResult result = client.Analyze(new Uri(imageUri), VisualFeatures.Read, new ImageAnalysisOptions { GenderNeutralCaption = true });
        string cleanLine;
        string parsedPara = string.Empty;
        string miniString = string.Empty;
        int i = 0;
        foreach (DetectedTextBlock block in result.Read.Blocks)
        {
            foreach (DetectedTextLine line in block.Lines)
            {
                cleanLine = line.Text.Replace("'", "");
                if (!cleanLine.EndsWith(".") || !cleanLine.EndsWith(",") || !cleanLine.EndsWith("!") || !cleanLine.EndsWith("?") || !cleanLine.EndsWith("-"))
                {
                    cleanLine += ". "; //Add period character to the end of strings missing a closing character.
                }
                if (cleanLine.EndsWith(".."))
                {
                    cleanLine.Remove(cleanLine.Length - 1);
                }
                parsedPara += cleanLine; //Concatenate strings into a single string.
                foreach (char c in parsedPara) //This is where I start trying to check every combination of 3 characters to identify any misplaced period characters mid sentence.
                {
                    miniString = miniString + c;
                    if (miniString.Length > 2)
                    {
                        miniString = string.Empty;
                    }
                    else if (miniString.Length == 3)
                    {
                        char firstChar = miniString[0];
                        char secondChar = miniString[1];
                        char thirdChar = miniString[2];
                        if (firstChar.ToString() == "." && secondChar.ToString() == " " && char.IsLower(thirdChar))
                        {
                            Debug.WriteLine("HIT!");
                        }
                    }
                    i++;
                }
            }
        }
        Debug.WriteLine("parsedPara: " + parsedPara);
    }
}
Upvotes: 0
Views: 61
Reputation: 2829
Try using regex by matching:
\.(?=\s*[a-z])
and replacing with an empty string. See: regex101
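In C#, this could look roughly like the sketch below, which applies the pattern with Regex.Replace from System.Text.RegularExpressions. The class name PeriodCleaner and the method name RemoveStrayPeriods are made up for illustration, and the input is the sample string from the question. Matching is case-sensitive by default, so [a-z] only matches lowercase ASCII letters.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative, not from the question.
public static class PeriodCleaner
{
    // A literal dot, matched only when followed by optional whitespace
    // and then a lowercase ASCII letter. The lookahead consumes nothing,
    // so only the dot itself is replaced.
    private static readonly Regex StrayPeriod = new Regex(@"\.(?=\s*[a-z])");

    public static string RemoveStrayPeriods(string text)
    {
        return StrayPeriod.Replace(text, string.Empty);
    }

    public static void Main()
    {
        string input = "awefeaefe. efewgwe waggrgrgae. Weefwafewf ewfefw. Ewfewfgewgr. ewgfewg";
        // Prints: awefeaefe efewgwe waggrgrgae. Weefwafewf ewfefw. Ewfewfgewgr ewgfewg
        Console.WriteLine(RemoveStrayPeriods(input));
    }
}
```

Because \s* also allows zero whitespace, a dot directly before a lowercase letter (as in "file.txt") would be removed too. If that is a concern, requiring at least one whitespace character in the lookahead is one way to narrow it.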
Explanation
MATCH:
\. : match a literal dot
(?= ... ) : only if it is followed by
\s* : any amount of whitespace characters (change to a single literal space if you only ever have a single space)
[a-z] : and a lowercase letter.
Upvotes: 2
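For comparison, a regex-free version closer to the loop attempted in the question might look like the sketch below. It walks the string by index rather than in groups of three, so every position is checked and the index of each "." is always known. The names IndexedPeriodCleaner and RemoveStrayPeriods are hypothetical, and the sketch assumes exactly one space between the period and the next letter.

```csharp
using System;
using System.Text;

// Hypothetical helper; the names are illustrative, not from the question.
public static class IndexedPeriodCleaner
{
    public static string RemoveStrayPeriods(string text)
    {
        var result = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            // Look ahead from the current index instead of buffering
            // three characters at a time, so no position is skipped.
            bool strayPeriod =
                text[i] == '.' &&
                i + 2 < text.Length &&
                text[i + 1] == ' ' &&
                char.IsLower(text[i + 2]);

            // Copy every character except a period that precedes " x"
            // where x is lowercase.
            if (!strayPeriod)
            {
                result.Append(text[i]);
            }
        }
        return result.ToString();
    }

    public static void Main()
    {
        string input = "awefeaefe. efewgwe waggrgrgae. Weefwafewf ewfefw. Ewfewfgewgr. ewgfewg";
        Console.WriteLine(RemoveStrayPeriods(input));
    }
}
```

Building the output in a StringBuilder avoids having to remove characters from the original string while iterating over it, which is where index bookkeeping usually goes wrong.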