Reputation: 38919
I was recently asked to submit a solution to a problem for a job.
Problem: Find a sub-string in a string.
Input: "Little star's deep dish pizza sure is fantastic."
Search: "deep dish pizza"
Output: "Little star's [[HIGHLIGHT]]deep dish pizza[[ENDHIGHLIGHT]] sure is fantastic."
Note that the highlighter doesn't have to have the exact same result on this example, since you are defining what a good snippet is and return the the most relevant snippet with the query terms highlighted.
The most important requirement was to write it as I would write a production code.
My solution was not accepted. How could I have improved it? I know, I could have used:
My QUESTION:
What do tech companies take into consideration when they review a code for a job. I submitted the code the same day, does that help in any way?
In one of the comments, it pointed out, it looks like a school code than production code. How? Any suggestions?
My solution:
FindSubString.java
/**
* FindSubString.java: Find sub-string in a given query
*
* @author zengr
* @version 1.0
*/
public class FindSubstring {
private static final String startHighlight = "[[HIGHLIGHT]]";
private static final String endHighlight = "[[ENDHIGHLIGHT]]";
/**
* Find sub-string in a given query
*
* @param inputQuery: A string data type (input Query)
* @param highlightDoc: A string data type (pattern to match)
* @return inputQuery: A String data type.
*/
public String findSubstringInQuery(String inputQuery, String highlightDoc) {
try {
highlightDoc = highlightDoc.trim();
if (inputQuery.toLowerCase().indexOf(highlightDoc.toLowerCase()) >= 0) {
// update query if exact doc exists
inputQuery = updateString(inputQuery, highlightDoc);
}
else {
// If exact doc is not in the query then break it up
String[] docArray = highlightDoc.split(" ");
for (int i = 0; i < docArray.length; i++) {
if (inputQuery.toLowerCase().indexOf(docArray[i].toLowerCase()) > 0) {
inputQuery = updateString(inputQuery, docArray[i]);
}
}
}
} catch (NullPointerException ex) {
// Ideally log this exception
System.out.println("Null pointer exception caught: " + ex.toString());
}
return inputQuery;
}
/**
* Update the query with the highlighted doc
*
* @param inputQuery: A String data type (Query to update)
* @param highlightDoc: A String data type (pattern around which to update)
* @return inputQuery: A String data type.
*/
private String updateString(String inputQuery, String highlightDoc) {
int startIndex = 0;
int endIndex = 0;
String lowerCaseDoc = highlightDoc.toLowerCase();
String lowerCaseQuery = inputQuery.toLowerCase();
// get index of the words to highlight
startIndex = lowerCaseQuery.indexOf(lowerCaseDoc);
endIndex = lowerCaseDoc.length() + startIndex;
// Get the highlighted doc
String resultHighlightDoc = highlightString(highlightDoc);
// Update the original query
return inputQuery = inputQuery.substring(0, startIndex - 1) + resultHighlightDoc + inputQuery.substring(endIndex, inputQuery.length());
}
/**
* Highlight the doc
*
* @param inputString: A string data type (value to be highlighted)
* @return highlightedString: A String data type.
*/
private String highlightString(String inputString) {
String highlightedString = null;
highlightedString = " " + startHighlight + inputString + endHighlight;
return highlightedString;
}
}
TestClass.java
/**
* TestClass.java: jUnit test class to test FindSubString.java
*
* @author zengr
* @version 1.0
*/
import junit.framework.Test;
import junit.framework.TestCase;
import junit.framework.TestSuite;
public class TestClass extends TestCase
{
private FindSubstring simpleObj = null;
private String originalQuery = "I like fish. Little star's deep dish pizza sure is fantastic. Dogs are funny.";
public TestClass(String name) {
super(name);
}
public void setUp() {
simpleObj = new FindSubstring();
}
public static Test suite(){
TestSuite suite = new TestSuite();
suite.addTest(new TestClass("findSubstringtNameCorrect1Test"));
suite.addTest(new TestClass("findSubstringtNameCorrect2Test"));
suite.addTest(new TestClass("findSubstringtNameCorrect3Test"));
suite.addTest(new TestClass("findSubstringtNameIncorrect1Test"));
suite.addTest(new TestClass("findSubstringtNameNullTest"));
return suite;
}
public void findSubstringtNameCorrect1Test() throws Exception
{
String expectedOutput = "I like fish. Little star's deep [[HIGHLIGHT]]dish pizza[[ENDHIGHLIGHT]] sure is fantastic. Dogs are funny.";
assertEquals(expectedOutput, simpleObj.findSubstringInQuery(originalQuery, "dish pizza"));
}
public void findSubstringtNameCorrect2Test() throws Exception
{
String expectedOutput = "I like fish. Little star's [[HIGHLIGHT]]deep dish pizza[[ENDHIGHLIGHT]] sure is fantastic. Dogs are funny.";
assertEquals(expectedOutput, simpleObj.findSubstringInQuery(originalQuery, "deep dish pizza"));
}
public void findSubstringtNameCorrect3Test() throws Exception
{
String expectedOutput = "Hello [[HIGHLIGHT]]how[[ENDHIGHLIGHT]] are [[HIGHLIGHT]]you[[ENDHIGHLIGHT]]r?";
assertEquals(expectedOutput, simpleObj.findSubstringInQuery("Hello how are your?", "how you"));
}
public void findSubstringtNameIncorrect1Test() throws Exception
{
String expectedOutput = "I like fish. Little star's deep dish pizza sure is fantastic. Dogs are funny.";
assertEquals(expectedOutput, simpleObj.findSubstringInQuery(originalQuery, "I love Ruby too"));
}
public void findSubstringtNameNullTest() throws Exception
{
String expectedOutput = "I like fish. Little star's deep dish pizza sure is fantastic. Dogs are funny.";
assertEquals(expectedOutput, simpleObj.findSubstringInQuery(originalQuery, null));
}
}
Upvotes: 10
Views: 1542
Reputation: 977
It sounds like you missed the point of the problem. The original problem statement says:
Note that the highlighter doesn't have to have the exact same result on this example, since you are defining what a good snippet is and return the the most relevant snippet with the query terms highlighted.
It sounds like they wanted you to identify a good snippet to return, not to just highlight words in the original input. If the input was long, you'd want to return a smaller snippet of text with the highlighted words.
One possible approach would be:
Things like good sentence identification, word stemming, and spelling corrections might also be within scope, especially if you are allowed to use third party libraries.
Upvotes: 2
Reputation: 18455
A few comments;
findSubstringInQuery
's main task isn't to find, it is to highlight and the inQuery
part is superflous. Just call the method highlight
or maybe highlightIgnoreCase
if you are going to have a highlight
that respects case.searchTerm
and text
.System.out.println()
.trim()
unless this was specified as a requirement. If I did, then obviously this behaviour would be documented in the javadoc.I wouldn't worry about the algorithm used for searching, Knuth-Morris-Pratt looks good but they shouldn't expect you to know about it and implement it unless the job spec specifically asked for experience/expertise in string searching.
Upvotes: 8
Reputation: 22025
If this code was submitted to me for review, this is what I would think:
trim()
toLowerCase()
unless explicitly asked for, although I would have added a comment stating that they could be added if needed. Anyway even if the search is meant to be case insensitive, there are too many redundant calls to toLowerCase()
in your code.NullPointerException
-- instead, you should ensure that this exception is never thrown.Upvotes: 3
Reputation: 14789
As they seem to emphasize that you can define what a good output is ... perhaps how do you do the parsing is not what they want to know. Maybe they want that you realize that marking the text in the string with a marker is not a very good solution. If the result is to be used in the program for further processing something like this might be more appropriate.
class MarkedText {
String highlightDoc;
String inputQuery;
List<Range> ranges;
}
class Range {
int offset;
int length;
}
public MarkedText findSubstringInQuery(String inputQuery, String highlightDoc) {
[...]
}
Upvotes: 1
Reputation: 21903
I don't know if I'm missing something, but I'd write a simple block of code using indexOf() and other string methods. Depending on the defintion of the problem, I'd probably use a StringBuilder.insert() method to inject the highlighting. I'd not be looking do to tokenising and looping as you have done here. I'd justify it by saying that it's the simplest solution to the problem as specified. KISS is the best approach to open questions like this.
Although they have given you the inputs and outputs, did they specify what was to happen if the inputs changed and there was no match?
I also notice the catching of a null pointer. Good idea if you know where it's likely to occur and intend to do something about it. But as you are just logging, perhaps you should have done a more general catch. For example, can your code trigger a IndexOutOfBoundsException. So I'd be looking to catch Exception or Throwable.
The other thing I'd be asking is a definition of what they consider "Production code". At first glance it sounds simple enough, but my experience is that it can be interpreted in many different ways.
The real problem is that they will be expecting certain things, and you don't know them. So you code what works for you and hope that it matches what they expect.
Upvotes: 1
Reputation: 27900
Do you mean to be matching on partial words in the case where you break up the query? You also aren't accounting for the possibility that the search string contains the word "HIGHLIGHT".
For example, try this:
Input: "Little star's deep dish pizza sure is fantastic."
Search: "a HIGHLIGHT"
Output: (probably not what you want)
Upvotes: -1