Reputation: 627
I have a list of strings that I am going to write to a CSV file. The list elements look like this:
List<String> list1 = new ArrayList<String>();
list1.add("one, Aug 21, 2018 11:08:51 PDT, last");
list1.add("two, newlast, Aug 22, 2018 11:08:52 PDT");
The problem is that when I write to the CSV file, "Aug 21" and "2018 11:08:51" get separated into different columns. I need them kept together as "Aug 21, 2018 11:08:51 PDT".
Also, the position might change: there is no guarantee that "Aug 21" will always appear at the same index in the string.
I tried the code below to fix this, and it works. But is there a better way to do it, without splitting into an array and iterating?
list1.forEach(s -> {
    String[] s1 = s.split(",");
    // stop one element early so that s1[i + 1] is always in bounds
    for (int i = 0; i < s1.length - 1; i++) {
        if (isValidMonthDate(s1[i]) && isValidYearTime(s1[i + 1])) {
            System.out.println("\"" + s1[i].trim() + "," + s1[i + 1] + "\""); //i will concatenate this string and write to csv
        }
    }
});
}
public static boolean isValidMonthDate(String inDate) {
SimpleDateFormat dateFormat = new SimpleDateFormat("MMM dd");
dateFormat.setLenient(false);
try {
dateFormat.parse(inDate.trim());
} catch (ParseException pe) {
return false;
}
return true;
}
public static boolean isValidYearTime(String inDate) {
SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy HH:mm:ss zzz");
dateFormat.setLenient(false);
try {
dateFormat.parse(inDate.trim());
} catch (ParseException pe) {
return false;
}
return true;
}
This gives me the output:
"Aug 21, 2018 11:08:51 PDT"
"Aug 22, 2018 11:08:52 PDT"
Is there a better way to achieve this without splitting into an array and iterating over it?
Upvotes: 0
Views: 478
Reputation: 3554
You could use the normal date parser and attempt a parse at each index of the string via a ParsePosition, then see where it succeeds.
Since I try to avoid the old date API nowadays, here's a simple demo with the newer java.time one:
public static void main(String[] args) {
List<String> inputs = Arrays.asList(
"Aug 21, 2018 11:08:51 PDT",
"one, Aug 21, 2018 11:08:51 PDT, last",
"two, newlast, Aug 22, 2018 11:08:52 PDT"
);
String formatPattern = "MMM dd, yyyy HH:mm:ss zzz";
DateTimeFormatter pattern = DateTimeFormatter.ofPattern(formatPattern, Locale.US);
for(String input : inputs) {
System.out.println("Processing " + input);
int[] matchStartEnd = null;
TemporalAccessor temp = null;
// check all possible offsets i in the input string
for(int i = 0, n = input.length() - formatPattern.length(); i <= n; i++) {
try {
ParsePosition pt = new ParsePosition(i);
temp = pattern.parse(input, pt);
matchStartEnd = new int[] { i, pt.getIndex() };
break;
}
catch(DateTimeParseException e) {
// ignore this
}
}
if(matchStartEnd != null) {
System.out.println(" Found match at indexes " + matchStartEnd[0] + " to " + matchStartEnd[1]);
System.out.println(" temporal accessor is " + temp);
}
else {
System.out.println(" No match");
}
}
}
Upvotes: 1
Reputation: 29159
When writing the output, put the date in double quotes. That is how CSV escapes a field that contains commas.
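To make the quoting concrete, here is a minimal sketch of a field-quoting helper. The class and method names (CsvQuoting, quoteField, toCsvRow) are made up for illustration, and it assumes the usual CSV convention of doubling any embedded quote characters; it is not a full CSV writer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any library.
class CsvQuoting {
    // Quote a field only when it contains a comma, a double quote, or a line break.
    // Embedded double quotes are doubled (assumed convention for CSV).
    static String quoteField(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join already-separated fields into one CSV row.
    static String toCsvRow(List<String> fields) {
        return fields.stream().map(CsvQuoting::quoteField).collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // prints: one,"Aug 21, 2018 11:08:51 PDT",last
        System.out.println(toCsvRow(Arrays.asList("one", "Aug 21, 2018 11:08:51 PDT", "last")));
    }
}
```

This only helps once the date has been kept together as a single field, which is what the parsing step is for.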
To parse your input, use a regex. This one reads each date or word and consumes the comma separator:
(\w{3} \d{1,2}, \d{4})|(\w+),?
You can elaborate on it with more parentheses to pre-parse your date. If the first group matches, it's the date. I will leave it to the OP to order the final CSV.
Here is the regex in JavaScript as a proof of concept. I know the question is about Java, but the regex is the same.
// read word or date followed by comma
const rx = /(\w{3} \d{1,2}, \d{4})|(\w+),?/g
const input = ['one, Aug 2, 1999, two', 'three, four, Aug 3, 2000', 'Aug 3, 2010, five, six']
let csv2 = ''
input.forEach(it => {
let parts = []
let m2 = rx.exec(it)
while (m2) {
parts.push(m2[1] || m2[2])
m2 = rx.exec(it)
}
csv2 += parts.map(it => '"' + it + '"').join(',') + '\n'
})
console.log(csv2)
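For reference, the same idea could be sketched in Java roughly as follows. Note that this is an adaptation rather than a direct port: the date alternative is extended to also cover the time and zone abbreviation from the question's data ("11:08:51 PDT"), which is an assumption about the input format, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the "date or word" regex in Java.
class DateAwareSplitter {
    // Group 1: a date such as "Aug 21, 2018 11:08:51 PDT" (assumed format).
    // Group 2: any other word. Separators are simply skipped by find().
    private static final Pattern TOKEN = Pattern.compile(
            "(\\w{3} \\d{1,2}, \\d{4} \\d{2}:\\d{2}:\\d{2} \\w{3,4})|(\\w+)");

    static List<String> tokenize(String line) {
        List<String> parts = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            // the date alternative is tried first, so it wins when both could match
            parts.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return parts;
    }

    // Quote every token and join with commas, like the JavaScript demo.
    static String toCsvRow(String line) {
        StringBuilder row = new StringBuilder();
        for (String part : tokenize(line)) {
            if (row.length() > 0) {
                row.append(',');
            }
            row.append('"').append(part).append('"');
        }
        return row.toString();
    }

    public static void main(String[] args) {
        // prints: "one","Aug 21, 2018 11:08:51 PDT","last"
        System.out.println(toCsvRow("one, Aug 21, 2018 11:08:51 PDT, last"));
    }
}
```

Because \w+ only matches word characters, fields containing spaces or punctuation would be split further; the pattern would need adjusting for such data.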
Upvotes: 0
Reputation: 44476
I suggest using a regex to extract the date:
^(.*?)(\w{3} \d{1,2}, \d{4} \d{2}:\d{2}:\d{2} PDT)(.*?)$
Then use Stream::map to extract the date and try to parse it. Don't forget to filter out the null values, since those are the entries that failed to parse.
SimpleDateFormat sdf = new SimpleDateFormat("MMM dd, yyyy HH:mm:ss Z", Locale.ENGLISH);
list1.stream()
.map(s -> {
try {
return sdf.parse(s.replaceAll("^(.*?)(\\w{3} \\d{1,2}, \\d{4} \\d{2}:\\d{2}:\\d{2} PDT)(.*?)$", "$2"));
} catch (ParseException e) {} return null; })
.filter(Objects::nonNull)
.forEach(System.out::println);
I suggest you wrap the try-catch and the regex extraction into a separate method.
static SimpleDateFormat sdf = new SimpleDateFormat("MMM dd, yyyy HH:mm:ss Z", Locale.ENGLISH);
static Date validate(String date) {
String s = date.replaceAll("^(.*?)(\\w{3} \\d{1,2}, \\d{4} \\d{2}:\\d{2}:\\d{2} PDT)(.*?)$", "$2");
try {
return sdf.parse(s);
} catch (ParseException e) { }
return null;
}
... which significantly simplifies the Stream:
list1.stream()
.map(Main::validate)
.filter(Objects::nonNull)
.forEach(System.out::println);
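If only the date text is needed (for example, to quote it in the CSV), a variation is to use Matcher.find() and return an Optional instead of null. This is just a sketch under some assumptions: the class and method names are hypothetical, the zone is matched as any 3 to 4 uppercase letters rather than the hard-coded PDT, and the matched text is returned as a String without being parsed into a Date.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: extracts the date substring, wherever it sits in the line.
class DateExtractor {
    // Assumed format: "Aug 21, 2018 11:08:51 PDT", with any 3-4 letter zone abbreviation.
    private static final Pattern DATE = Pattern.compile(
            "\\w{3} \\d{1,2}, \\d{4} \\d{2}:\\d{2}:\\d{2} [A-Z]{3,4}");

    static Optional<String> extractDate(String line) {
        Matcher m = DATE.matcher(line);
        // find() scans the whole line, so no ^(.*?) ... (.*?)$ wrapper is needed
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    // Keep only the lines that actually contain a date.
    static List<String> extractAll(List<String> lines) {
        return lines.stream()
                .map(DateExtractor::extractDate)
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> list1 = Arrays.asList(
                "one, Aug 21, 2018 11:08:51 PDT, last",
                "two, newlast, Aug 22, 2018 11:08:52 PDT",
                "no date here");
        // prints: [Aug 21, 2018 11:08:51 PDT, Aug 22, 2018 11:08:52 PDT]
        System.out.println(extractAll(list1));
    }
}
```

The extracted string can still be handed to the date parser afterwards if an actual Date value is required.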
Upvotes: 0