Reputation: 131
You are given a string, One has to constitute all the substring of a given string,
Example, String "baceb" is given, the substrings are {b, ba, bac, bace, a, ac, ace, aceb, c, ce, ceb, e, eb, baceb} and each element in the list contains, {0, 1, 1, 2, 1, 1, 2, 2, 0, 1, 1, 1, 1, 2} number of vowels, the sum is 16. The size of the string goes up to 10^5.
This is how far I have come, works for smaller case files, for bigger cases,I get timeout error.
Scanner sc=new Scanner(System.in);
int n=sc.nextInt();
sc.nextLine();
while(n-->0){
ArrayList<String> list=new ArrayList<>();
String s=sc.nextLine();
int len=s.length();
s=s.toLowerCase();
for(int i=0;i<len;i++){
for(int j=i+1;j<=len;j++){
String temp=s.substring(i,j);
if(!list.contains(temp)){
list.add(temp);
}
}
}
// System.out.println(list);
int count=0;
for(String str:list){
for(int k=0;k<str.length();k++){
char ch=str.charAt(k);
if(ch=='a'||ch=='e'||ch=='i'||ch=='o'||ch=='u'){
count++;
}
}
}
System.out.println(count);
}
n is the number of test cases.
Upvotes: 1
Views: 1164
Reputation: 86276
You’ve done everything correctly this far. You need to do some more.
First approach to a programming task is to write clear code that solves it. In this case write code that generates the substrings and counts the vowels in the generated substrings. Every programmer can understand this. Very good work.
Next, if the coded solution turns out not to perform well enough, it’s time for optimization (for 99.9 % of real-world programming we will never reach this point, but in coding challenges like yours it’s commonplace). You will want code that performs well enough that no timeout error occurs.
For an optimized solution we don’t need to generate the substrings. Instead we observe: For your example string 5 characters long: The first character (index 0) can be part of 5 substrings: b
, ba
, bac
, bace
and baceb
. However, it’s a consonant, so it doesn’t really matter how many. The next character, a
at index 1, is part of 8 substrings: 4 beginning at index 0 and 4 beginning at index 1. So it contributes 8 towards the total of 16 vowels in all substrings. Had the next character (c
at index 2) been a vowel, we would have needed to calculate that it goes into 9 substrings: 3 beginning at index 0, 3 beginning at index 1 and 3 beginning at index 2. Can you begin to see a pattern? I think that we can calculate the number of substrings that a character contributes to by multiplying the number of characters up to and including that character by the number of characters from that character inclusive to the end of the string. Please check if I am correct.
So an efficient algorithm can be: iterate through the string indices. If the character at a given index is a vowel, calculate how many substrings it is in, and add this count to a total.
Edit:
But without constructing the substring how is one going to know how many vowels actually are in the substring?
The point is: You do not need to know how many vowels are in each substring. You only need to know the sum of all those counts. So we are obtaining that sum in quite a different manner. We are exploiting the fact that every time there’s a vowel in a substring, that vowel must come from one particular index in the original string. So instead of counting the vowels in each substring we are counting the substrings that each vowel is in. The result has got to be the same.
Take the example string from your question, baceb
. There are two vowels, a
at index 1 and e
at index 3. a
is in substrings ba
, bac
, bace
, baceb
, a
, ac
, ace
and aceb
, 8 in total. So contributes 8 to the count of vowels in all substrings. e
too is in 8 substrings. 8 + 8 equals 16, which is the sum of counts of vowels in all substings.
Let me try a more formal argument. Consider a vowel at index i
in a string of length len
(0 <= i < len
). Now the question is: out of the substrings of the string, in how many is this particular vowel included? For it to be included in a substring, that substring must being at index 0, 1, … i
(inclusive), so there are i + 1
possible start indices. The substring must end at index i + 1
, i + 2
, … len
, giving len - i
possibilities. Since every possible start index can be combined with any possible end index to define a substring, we can multiply the two numbers. The product gives the number of substrings that this vowel is in. And hence the contribution from this vowel towards the sum of counts of vowels in all substrings. So what’s left to do is add up all the products for the vowels in the original string. Then you’ve got your result.
PS I have assumed that substrings need not be unique. In the string bobo
the substring bo
comes twice and o
constributes to the vowel count both times. I see from your code that this disagrees with your understanding, but I would still assume that mine is correct.
PPS Also be aware that for strings of length up to 100 000 the total may overflow an int
. Use a long
for the total count.
PPPS For an addition slight optimization you may make a faster check on whether a character is a vowel. Create a BitSet
once, and set the 10 bits corresponding to the upper case and lower case variants of each vowel. Now to check whether a character is a vowel simply inquire whether the corresponding bit in the BitSet
is set. No need to convert to lower case first.
Upvotes: 4