Yashasvi Raj Pant

Reputation: 1424

How to read from particular header in opencsv?

I have a CSV file and want to extract a particular column from it. For example, say I have this CSV:

id1,caste1,salary,name1
63,Graham,101153.06,Abraham
103,Joseph,122451.02,Charlie
63,Webster,127965.91,Violet
76,Smith,156150.62,Eric
97,Moreno,55867.74,Mia
65,Reynolds,106918.14,Richard

How can I use opencsv to read only the data under the header caste1?

Upvotes: 14

Views: 27727

Answers (5)

ccpizza

Reputation: 31666

From the opencsv docs:

Starting with version 4.2, there’s another handy way of reading CSV files that doesn’t even require creating special classes. If your CSV file has headers, you can initialize a CSVReaderHeaderAware instance and read values as a map:

  CSVReaderHeaderAware reader = new CSVReaderHeaderAware(new FileReader("yourfile.csv"));
  Map<String, String> record = reader.readMap();

.readMap() will return a single record. You need to call .readMap() repeatedly to get all the records until you get null when it runs to the end (or to the first empty line), e.g.:

Map<String, String> values;
while ((values = reader.readMap()) != null) {
    // consume the values here
}

The class also has another constructor which allows more customization, e.g.:

CSVReaderHeaderAware reader = new CSVReaderHeaderAware(
        new InputStreamReader(inputStream),
        0,      // skipLines
        parser, // custom parser
        false,  // keep end of lines
        true,   // verify reader
        0,      // multiline limit
        null    // null for default locale
);

A possible downside is that the reader is lazy and so does not offer a record count. If you need to know the total number up front (for example, to display correct progress information), you'll need to use another reader just for counting lines.
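
The separate counting pass could be sketched with plain JDK classes, as below. This is only a minimal sketch and not part of opencsv: the class name CsvLineCounter and the method countDataLines are made up for illustration, and it assumes a single header line and one record per physical line (a quoted field containing a line break would be counted as extra lines).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class CsvLineCounter {

    // Counts non-empty physical lines after the header line.
    // Assumption: no quoted field spans several lines.
    static long countDataLines(Path csv) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(csv, StandardCharsets.UTF_8)) {
            if (in.readLine() == null) {
                return 0; // empty file: no header, no data
            }
            long count = 0;
            String line;
            while ((line = in.readLine()) != null) {
                if (!line.isEmpty()) {
                    count++;
                }
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sample", ".csv");
        Files.write(tmp, Arrays.asList(
                "id1,caste1,salary,name1",
                "63,Graham,101153.06,Abraham",
                "103,Joseph,122451.02,Charlie"));
        System.out.println(countDataLines(tmp)); // prints 2
        Files.delete(tmp);
    }
}
```

The count can then be used as the denominator for progress reporting while the header-aware reader makes the real pass over the file.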

Upvotes: 3

elkoo

Reputation: 762

I had a task to remove several columns from an existing CSV file. Example CSV:

FirstName,LastName,City,County,Zip
Steve,Hopkins,London,Greater London,15554
James,Bond,Vilnius,Vilniaus,03250

I needed only the FirstName and LastName columns with their values, and it was very important that the column order stay the same. The map returned by rd.readMap() does not preserve column order, so I used this code for the task:

        String[] COLUMN_NAMES_TO_REMOVE = new String[]{"", "City", "County", "Zip"};
        CSVReaderHeaderAware rd = new CSVReaderHeaderAware(new FileReader("old.csv"));
        CSVWriter writer = new CSVWriter(new FileWriter("new.csv"),
                CSVWriter.DEFAULT_SEPARATOR, CSVWriter.NO_QUOTE_CHARACTER, CSVWriter.NO_ESCAPE_CHARACTER, CSVWriter.DEFAULT_LINE_END);

        // headerIndex is a private field (header name -> column position), so read it via reflection
        Field privateField = CSVReaderHeaderAware.class.getDeclaredField("headerIndex");
        privateField.setAccessible(true);
        Map<String, Integer> headerIndex = (Map<String, Integer>) privateField.get(rd);

        // do ordering in natural order - 0, 1, 2 ... n
        Map<String, Integer> sortedInNaturalOrder = headerIndex.entrySet().stream()
                .sorted(Map.Entry.comparingByValue(Comparator.naturalOrder()))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (oldValue, newValue) -> oldValue, LinkedHashMap::new));

        // let's get headers in natural order
        List<String> headers = sortedInNaturalOrder.keySet().stream().distinct().collect(Collectors.toList());

        // let's remove headers
        List<String> removedColumns = new ArrayList<String>(Arrays.asList(COLUMN_NAMES_TO_REMOVE));
        headers.removeAll(removedColumns);
        // save column names
        writer.writeNext(headers.toArray(new String[headers.size()]));

        Map<String, String> values;
        while ((values = rd.readMap()) != null) {
            // collect this record's values in header order
            String[] itemsArray = new String[headers.size()];
            for (int i = 0; i < headers.size(); i++) {
                itemsArray[i] = values.get(headers.get(i));
            }
            // save values
            writer.writeNext(itemsArray);
        }
        writer.flush();

Output:

FirstName,LastName
Steve,Hopkins
James,Bond
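
If column order is the main concern, one alternative is to work from the raw header row, which is already in file order, instead of from the map. The sketch below uses only JDK collections and treats each row as a String[], the shape returned by CSVReader.readNext(). The class ColumnProjector and its method names are hypothetical and not part of opencsv.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ColumnProjector {

    // Returns the positions of all header columns that are not in the removal set,
    // in their original left-to-right order.
    static int[] keptIndices(String[] header, Set<String> toRemove) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < header.length; i++) {
            if (!toRemove.contains(header[i].trim())) {
                kept.add(i);
            }
        }
        return kept.stream().mapToInt(Integer::intValue).toArray();
    }

    // Copies only the kept positions out of one row.
    static String[] project(String[] row, int[] indices) {
        String[] out = new String[indices.length];
        for (int i = 0; i < indices.length; i++) {
            // A short row yields null for a missing trailing column.
            out[i] = indices[i] < row.length ? row[indices[i]] : null;
        }
        return out;
    }

    public static void main(String[] args) {
        String[] header = {"FirstName", "LastName", "City", "County", "Zip"};
        String[] row = {"Steve", "Hopkins", "London", "Greater London", "15554"};
        Set<String> remove = new HashSet<>(Arrays.asList("City", "County", "Zip"));

        int[] keep = keptIndices(header, remove);
        System.out.println(String.join(",", project(header, keep))); // FirstName,LastName
        System.out.println(String.join(",", project(row, keep)));    // Steve,Hopkins
    }
}
```

Each projected array could be passed straight to writer.writeNext, which avoids both the reflection and the re-sorting step.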

Upvotes: 0

Magnilex

Reputation: 11958

There is no built-in functionality in opencsv for reading from a column by name.

The official FAQ has the following example of how to read from a file:

CSVReader reader = new CSVReader(new FileReader("yourfile.csv"));
String [] nextLine;
while ((nextLine = reader.readNext()) != null) {
   // nextLine[] is an array of values from the line
   System.out.println(nextLine[0] + nextLine[1] + "etc...");
}

You simply fetch the value in the second column of each row by accessing nextLine[1] (remember, array indices are zero-based).

So, in your case you could simply read from the second column (note that the first value printed is the header itself):

CSVReader reader = new CSVReader(new FileReader("yourfile.csv"));
String [] nextLine;
while ((nextLine = reader.readNext()) != null) {
   System.out.println(nextLine[1]);
}

For a more sophisticated way of determining the column index from its header, refer to the answer from Scott Conway.

Upvotes: 5

Scott Conway

Reputation: 983

Magnilex and Sparky are right in that CSVReader does not support reading values by column name. But that being said there are two ways you can do this.

Given that you have the column names and the default CSVReader reads the header, you can search the header row for the column's position and then use that position from there on:

private int getHeaderLocation(String[] headers, String columnName) {
   return Arrays.asList(headers).indexOf(columnName);
}

So your method would look like this (leaving out a lot of error checks you will need to put in):

CSVReader reader = new CSVReader(new FileReader("yourfile.csv"));
String [] nextLine;
int columnPosition;

nextLine = reader.readNext();
columnPosition = getHeaderLocation(nextLine, "caste1");

while ((nextLine = reader.readNext()) != null && columnPosition > -1) {
   // nextLine[] is an array of values from the line
   System.out.println(nextLine[columnPosition]);
}
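
Some of the error checks mentioned above could look like the following sketch. It uses only JDK classes and takes the rows as a List<String[]>, the shape returned by CSVReader.readAll(). The class HeaderLookup and the method readColumn are hypothetical names, and throwing IllegalArgumentException for a missing header is just one possible policy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HeaderLookup {

    // Returns the values under the named header; the first row is treated as the header row.
    static List<String> readColumn(List<String[]> rows, String columnName) {
        if (rows.isEmpty()) {
            throw new IllegalArgumentException("no header row");
        }
        int position = Arrays.asList(rows.get(0)).indexOf(columnName);
        if (position < 0) {
            throw new IllegalArgumentException("unknown column: " + columnName);
        }
        List<String> values = new ArrayList<>();
        for (String[] row : rows.subList(1, rows.size())) {
            // Skip rows too short to contain the column (e.g. blank lines).
            if (position < row.length) {
                values.add(row[position]);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[]{"id1", "caste1", "salary", "name1"},
                new String[]{"63", "Graham", "101153.06", "Abraham"},
                new String[]{"103", "Joseph", "122451.02", "Charlie"});
        System.out.println(readColumn(rows, "caste1")); // [Graham, Joseph]
    }
}
```

Failing loudly on an unknown header makes a misspelled column name show up immediately instead of silently printing nothing.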

I would only do the above if you were pressed for time and it was only one column you cared about. That is because opencsv can convert each row directly into an object whose fields have the same names as the header columns, using the CsvToBean class and the HeaderColumnNameMappingStrategy.

So first you would define a class that has the fields (and really you only need to put in the fields you want - extras are ignored and missing ones are null or default values).

public class CastleDTO {
   private int id1;
   private String caste1;   // must match the header name in the CSV
   private double salary;
   private String name1;

   // have all the getters and setters here....
}

Then your code would look like this:

CSVReader reader = new CSVReader(new FileReader("yourfile.csv"));
HeaderColumnNameMappingStrategy<CastleDTO> castleStrategy = new HeaderColumnNameMappingStrategy<CastleDTO>();
castleStrategy.setType(CastleDTO.class); // tell the strategy which bean class to populate
CsvToBean<CastleDTO> csvToBean = new CsvToBean<CastleDTO>();

List<CastleDTO> castleList = csvToBean.parse(castleStrategy, reader);

for (CastleDTO dto : castleList) {
   System.out.println(dto.getCaste1());
}

Upvotes: 10

Shawn Mehan

Reputation: 4568

Looking at the javadoc, if you create a CSVReader object, you can use the .readAll method to pull in the entire file. It returns a List of String[], with each String[] representing one line of the file. So now you have the tokens of each line, and you only want the second element. The idea is the same as splitting a line on its delimiter and taking index 1:

public static void main(String[] args){
    String data = "63,Graham,101153.06,Abraham";
    String result[] = data.split(",");
    System.out.print(result[1]);
}
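
One caveat with plain String.split is worth illustrating: it knows nothing about CSV quoting, so a quoted field that contains a comma is broken apart. The sketch below shows the effect using only the JDK. The class name SplitPitfall is made up, and the quoted sample line is invented for illustration.

```java
class SplitPitfall {

    // Splits on every comma, ignoring any CSV quoting rules.
    static String[] naiveSplit(String line) {
        return line.split(",");
    }

    public static void main(String[] args) {
        String plain = "63,Graham,101153.06,Abraham";
        String quoted = "63,\"Graham, Jr.\",101153.06,Abraham";

        System.out.println(naiveSplit(plain).length);  // 4
        System.out.println(naiveSplit(quoted).length); // 5, the quoted field was cut in two
        System.out.println(naiveSplit(quoted)[1]);     // "Graham (leading quote still attached)
    }
}
```

A CSV parser such as opencsv's CSVReader handles the quoting itself, which is why the String[] it returns can be indexed directly without splitting again.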

Upvotes: -4
