Reputation: 13
I'm attempting to write a Java program that searches for a specific substring (xyz) within a user-entered string, and keeping a running count, unless that substring is preceded by a period. At this point in the class, we've only used charAt and length, so if possible I need to stick to that. Additionally, we haven't used regular expressions at all, so that's out the window too.
I've managed to get the program working as desired, with one notable exception: if the String entered begins with a period, it fails to count any successive matches. This is what I've got so far:
System.out.println("Give me a String:");
String s1 = kb.nextLine();
int index = 0;
int count = 0;
while(index <= s1.length() - 1 && s1.charAt(index) != '.')
{
if(s1.charAt(index) == 'x' && s1.charAt(index + 2) == 'z')
{
count++;
}
index++;
}
System.out.println(count);
Upvotes: 1
Views: 1241
Reputation: 4713
As this seems like a homework type of question I will attempt to guide you in the right direction first and provide a solution at a later time. I strongly encourage you to work through the problem on your own first to the best of your ability before you look at my solution (once I post it) and to read this page before going ANY further
First, consider the kinds of inputs you could receive. Since you didn't specify any limitations you could get things like:
I could go on, but I'm sure you can come up with all the various combinations of weird things you might receive. These are just a few examples to get you started (along with those I posted in the comments already)
Next, think about what you need your algorithm to do. As I said in the comments it sounds like you want to count the occurrences of the substring "xyz" while ignoring the occurrences of the substring ".xyz". Now consider how you're going to look for these substrings - you're going to advance one character at a time from left to right across the String looking for a substring that matches one of these two possibilities. When you find one of them you'll either ignore it or count it.
Hopefully this helps and as I said, I will post a solution later after you've had some time to wrestle with the code. If you do solve it go ahead and post your solution (maybe edit your question to add the new code or add an answer) Finally I once again strongly urge you to read this page if you have not already.
EDIT #1:
I wanted to add a little more information and that is: you already have a pretty good idea of what you need to do in order to count your "xyz" substring at this point - despite the small flaw in the logic for inputs like "xaz", which is easily fixable. What you need to focus on is how to ignore the substring ".xyz" so think about how you could implement the ignore logic, what does it mean to ignore it? Once you answer that it should start coming together for you.
EDIT #2:
Below you will find my solution to the problem. Once again it's important to understand how the solution works not just copy and paste it. If you simply copy my code without understanding it you're cheating yourself out of the education that you're trying to gain. I don't have time at the moment to describe in detail why and how this code works, but I do plan to edit again later to add those details.
import java.util.Scanner;
public class Main {
private static Scanner scan = new Scanner(System.in);
public static void main(String[] args) {
System.out.println("Give me a String:");
String s1 = scan.nextLine();
System.out.println(countSubstrings(s1));
}
public static int countSubstrings(String s1){
int index = 0;
int count = 0;
while (index < s1.length()-2) {
if(s1.charAt(index) == '.' && s1.charAt(index+1) != '.'){
index++;
}
else if (index+2 < s1.length() && s1.charAt(index) == 'x' && s1.charAt(index + 1) == 'y'
&& s1.charAt(index + 2) == 'z') {
count++;
index+=2;
}
index++;
}
return count;
}
}
EDIT #3:
Here is the nuts and bolts of why the above code does what it does. First, we think about the fact that we're looking for 3 items (a triple) in a specific order within an array and if we see a fourth item (a period) immediately preceding the first item of the triple then we need to ignore the triple.
Per my previous edit we need to define what it means to ignore. In this case what we mean is to simply not count it and move on with our search for valid substrings to count. The simplest way to do that is to advance the index
without incrementing the count
.
So, ask yourself the following:
When should my loop stop? Since we're looking for triples we know we can stop if the length of the input String is less than 3 or when there are less than 3 characters left in the String that we have not examined yet. For example if the input is "xyzab" by the time we get to index 3 we know there's no possible way to form a triple where "a" is the first character in the triple and that therefore our counting is done.
Is there ever a time when I would not want to skip the next 3 characters after a period? After all the goal is to look for triples so wouldn't I want to skip 3 characters not just 1? Yes there is a time when you do NOT want to skip 3 characters and that's when you have something like ".axyz" because a valid triple could start as soon as the 2nd character past the period. So in fact you want to skip only 1 character.
This, and the fact that index is always incremented by 1 at the end of the loop (more on this later), is why the first condition inside the while
only advances the index
by 1:
if(s1.charAt(index) == '.' && s1.charAt(index+1) != '.'){
index++;
}
This is why the second half of the above condition exists:
`&& s1.charAt(index+1) != '.'`
Now ask yourself how to identify a valid triple. I'm sure by now you can see how to do this - check the current character, the next character, and the character after that for the values you want. This logic is the latter portion of the second if
condition within the while:
s1.charAt(index) == 'x' && s1.charAt(index + 1) == 'y'
&& s1.charAt(index + 2) == 'z'
Whenever you're using calculations like index +1 or index +2 inside of a loop that is incrementing the index until it reaches a boundary you have to consider the possibility that your calculation will exceed the boundary because you can't rely on the loop to check this for you as the loop will not perform that check until either the end or beginning of the loop (depending on which kind of loop it is)
index+2 < s1.length()
You may be wondering - why not add two checks since we're using index+1 and index+2? We only need one check to see if the greatest index we use will exceed the boundary in this case. If index +2 is beyond the bounds we don't care if index+1 is or is not because it won't matter we can't possibly have a matching substring.
if
inside the while
you see there is code to increment the index by 2: index+=2;
This is done for efficiency since once we have identified a triple we know there is no way to form another triple with characters that are already part of another triple. Therefore we want to skip over them and just like the first bullet point we take advantage of the loop incrementing the index so we only need to increment by 2 and let the loop add the extra 1 later.Finally we reach the end of the logic within the loop. This part you're already familiar with and that's the index++;
which simply increments the position within the String that we're currently examining. Note that this works in tandem with the first bullet point. Take the example from the first bullet point of ".axyz". There is a period in index 0 and the character in index 1 is not another period so the logic from the first bullet point will increment index by 1, making it 1. At the end of the loop index is incremented again making it 2 thereby skipping over the period - at the start of the next loop index is 2, it was never 1 at the start of the loop.
Well, I hope this helps to explain how it all works and illustrate how to think about these sorts of problems. The basic principle is to visualize where your current element is and how you can use that to achieve your goal. At the same time think about what kinds of properties the different elements of your program have and how you can take advantage of them - such as the fact that once you identify a triple it is safe to skip over those characters because they have the property of only being usable once. As with any program you always want to try to create as many test inputs as you can to test all the weird boundary cases that might occur to ensure the correctness of your code. I realize you probably are not familiar with JUnit but it is a very useful tool and you might try researching the basics of using it when you have a little spare time, and the bonus is that if you use the Eclipse IDE it has integrated JUnit functionality you can use.
Upvotes: 0
Reputation: 139
You can simply check the input string whether it starts with period. If so then you can use the following piece of code to handle the validation.
if(s1.charAt(0)!='.')
{
while(index <= s1.length() - 1 && s1.charAt(index) != '.')
{
if(s1.charAt(index) == 'x' && s1.charAt(index + 2) == 'z')
{
count++;
}
index++;
}
}
else
{
index=1;
while(index <= s1.length() - 1 && s1.charAt(index) != '.')
{
if(s1.charAt(index) == 'x' && s1.charAt(index + 2) == 'z')
{
count++;
}
index++;
}
}
System.out.println(count);
}
Upvotes: 1