Reputation:
I am trying to put all the values from a file into a map. I have more than 20k values, and the idea is that key 1 holds values 1 to 1000, key 2 holds the next 1000, and so on (key i ends at i * 1000). However, the output I'm getting contains duplicate values (keys 1 and 2 share values), and I'm not sure what I am doing wrong.
Here is the code:
public class GetNumbers {
    public static List<String> createList() throws IOException {
        List<String> numbers = new LinkedList<>();
        Path path = null;
        File file = null;
        BufferedReader reader = null;
        String read = "";
        try {
            path = Paths.get("file.txt");
            file = path.toFile();
            reader = new BufferedReader(new FileReader(file));
            while ((read = reader.readLine()) != null) {
                numbers.add(read);
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
        return numbers;
    }
    public static Map<Integer, List<String>> createNewFiles() throws IOException {
        Map<Integer, List<String>> myMap = new HashMap<>();
        List<String> getList = GetNumbers.createList();
        List<String> list = null;
        int count = getList.size() / 1000;
        // --------------------------- doubtful code ---------------------------
        for (int i = 1; i <= count; i++) {
            if (getList.size() > 1000) {
                list = getList.subList(i, i * 1000);
            } else if (getList.size() < 999) {
                list = getList.subList(i, getList.size());
            }
            // ----------------------------------------------------------------------
            myMap.put(i, list);
        }
        return myMap;
    }
    public static void getMap() throws IOException {
        Map<Integer, List<String>> map = GetNumbers.createNewFiles();
        List<String> listAtIndexOne = map.get(2);
        List<String> listAtIndexTwo = map.get(1);
        for (String elementFromFirstList : listAtIndexOne) {
            for (String elementFromSecondList : listAtIndexTwo) {
                if (elementFromFirstList.equals(elementFromSecondList)) {
                    System.out.println("duplicate copy");
                }
            }
        }
    }
    public static void main(String[] args) {
        try {
            GetNumbers.getMap();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
EDIT
If I change my code to:
for (int i = 0; i <= count; i++) {
    if (getList.size() > (i * 1000)) {
        list = getList.subList(i, (i + 1) * 1000);
    } else if (getList.size() < 999) {
        list = getList.subList(i, getList.size());
    }
    myMap.put(i, list);
}
I'm getting
Exception in thread "main" java.lang.IndexOutOfBoundsException: toIndex = 25000
    at java.util.SubList.<init>(Unknown Source)
    at java.util.AbstractList.subList(Unknown Source)
    at com.dnd.GetNumbers.createNewFiles(GetNumbers.java:43)
    at com.dnd.GetNumbers.getMap(GetNumbers.java:54)
    at com.dnd.GetNumbers.main(GetNumbers.java:69)
Any help is appreciated
Thanks
Upvotes: 2
Views: 96
Reputation: 533442
There are quite a few things I would change, but one bug is in this line:
subList(i, i * 1000);
On the first iteration you take 1 to 1000, which ignores the value at index 0, but on the second iteration you take 2 to 2000, which overlaps the first range, and so on. Most likely what you intended was 0 to 999, then 1000 to 1999, and so on. BTW, performing a subList on a LinkedList is pretty inefficient. I would build these lists as you read the file.
I would write it like this:
public static void splitFile(String inputFile, String outputTemplate, int count) throws IOException {
    int fileCount = 0, lineCount = 0;
    // check for duplicates.
    Set<String> previous = new HashSet<>();
    // file to write to
    PrintWriter pw = null;
    // file to read from
    try (BufferedReader in = new BufferedReader(new FileReader(inputFile))) {
        // while there is another line to read.
        for (String line; (line = in.readLine()) != null; ) {
            // skip duplicates.
            if (!previous.add(line))
                continue;
            // if the current file is full or we haven't started a file.
            if (pw == null || lineCount++ >= count) {
                // close the old one if there was one.
                if (pw != null)
                    pw.close();
                // start a new file using the template, i.e. where do we put the number.
                pw = new PrintWriter(String.format(outputTemplate, fileCount++));
                // we will have one line in this file.
                lineCount = 1;
            }
            // add the line.
            pw.println(line);
        }
    }
    // close the file if we had one left open.
    if (pw != null)
        pw.close();
}
public static void main(String[] args) throws IOException {
    // split the file into multiple files with up to 1000 lines each.
    // %d is replaced by the file number (%n would insert a line separator instead).
    splitFile("file.txt", "file-part-%d.txt", 1000);
}
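If the goal is the in-memory map from the question rather than separate files, the same read-as-you-go idea might look like the sketch below. It is only an illustration, not part of the code above: the class `ChunkReader`, the method `readChunks`, and the choice to take a `BufferedReader` argument (so that in-memory input can stand in for `file.txt`) are all assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ChunkReader {
    // Hypothetical helper: reads lines and groups them into lists of up to
    // chunkSize elements, keyed 1, 2, 3, ... in reading order.
    public static Map<Integer, List<String>> readChunks(BufferedReader in, int chunkSize) throws IOException {
        Map<Integer, List<String>> chunks = new LinkedHashMap<>();
        List<String> current = null;
        for (String line; (line = in.readLine()) != null; ) {
            // Start a new chunk when there is none yet or the current one is full.
            if (current == null || current.size() >= chunkSize) {
                current = new ArrayList<>();
                chunks.put(chunks.size() + 1, current);
            }
            current.add(line);
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        // In-memory input stands in for file.txt so the example runs anywhere.
        try (BufferedReader in = new BufferedReader(new StringReader("a\nb\nc\nd\ne"))) {
            System.out.println(readChunks(in, 2)); // prints {1=[a, b], 2=[c, d], 3=[e]}
        }
    }
}
```

Because each chunk is its own `ArrayList`, no `subList` call on a `LinkedList` is needed and no index arithmetic can overlap.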
Upvotes: 2
Reputation: 393771
To split the list into sub-lists of 1000 elements, you can write something like this:
for (int i = 1; i <= count; i++) {
    if (getList.size() >= i*1000) {
        list = getList.subList((i-1) * 1000, i * 1000);
    } else {
        list = getList.subList((i-1) * 1000, getList.size());
    }
    myMap.put(i, list);
}
or, more simply:
for (int i = 1; i <= count; i++) {
    list = getList.subList((i-1) * 1000, Math.min(getList.size(),i * 1000));
    myMap.put(i, list);
}
Note that the indices are 0-based and the end index of subList is exclusive, so the first sub-list covers elements 0 to 999, the second 1000 to 1999, and so on.
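A self-contained sketch of the same idea follows. One caveat it tries to address: if `count` is computed as `getList.size() / 1000`, as in the question, a final chunk with fewer than 1000 elements is never reached, so this sketch rounds the chunk count up instead. The class `ChunkDemo`, the method `partition`, and the sample data are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ChunkDemo {
    // Hypothetical helper: splits source into chunks of at most chunkSize
    // elements, keyed 1, 2, 3, ... as in the question.
    public static Map<Integer, List<String>> partition(List<String> source, int chunkSize) {
        Map<Integer, List<String>> result = new HashMap<>();
        // Round up so that a trailing partial chunk is included.
        int count = (source.size() + chunkSize - 1) / chunkSize;
        for (int i = 1; i <= count; i++) {
            int from = (i - 1) * chunkSize;                    // inclusive start
            int to = Math.min(source.size(), i * chunkSize);   // exclusive end
            // Copy the view so each chunk is independent of the source list.
            result.put(i, new ArrayList<>(source.subList(from, to)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data (an assumption): the strings "0" to "2499".
        List<String> numbers = new ArrayList<>();
        for (int n = 0; n < 2500; n++) {
            numbers.add(String.valueOf(n));
        }
        Map<Integer, List<String>> chunks = partition(numbers, 1000);
        System.out.println(chunks.size());        // prints 3
        System.out.println(chunks.get(3).size()); // prints 500
    }
}
```

Copying each `subList` into a new `ArrayList` is a design choice rather than a requirement: `subList` returns a view backed by the original list, so the copy keeps the chunks usable even if the source list is later modified.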
Upvotes: 2