Danijel Babic

Reputation: 21

How to import CSV file into Octave and keep the column headers

I am trying to import a CSV file so that I can use it with the k-means clustering algorithm. The file contains 6 columns and over 400 rows, and it started as an Excel document that I exported to CSV. In essence, I want to refer to the columns by their header names in my code, both when plotting the data and when clustering it.

I looked through some other documentation and came up with this code, but it produced no output when I entered it in the command window:

[Player BA OPS RBI OBP] = CSVIMPORT( 'MLBdata.csv', 'columns', {'Player', 'BA', 'OPS', 'RBI', 'OBP'}

The only thing that has worked for me so far is the dlmread function, but it returns 0 wherever a field contains text:

N = dlmread('MLBdata.csv')

Upvotes: 1

Views: 4554

Answers (1)

Tasos Papastylianou

Reputation: 22225

Octave

Given file data.csv with the following contents:

Player,Year,BA,OPS,RBI,OBP
SandyAlcantara,2019,0.086,0.22,4,0.117
PeteAlonso,2019,0.26,0.941,120,0.358
BrandonLowe,2019,0.27,0.85,51,0.336
MikeSoroka,2019,0.077,0.22,3,0.143

Open an Octave terminal and type:

pkg load io
C = csv2cell( 'data.csv' )

resulting in the following cell array:

C =
{
  [1,1] = Player
  [2,1] = SandyAlcantara
  [3,1] = PeteAlonso
  [4,1] = BrandonLowe
  [5,1] = MikeSoroka

  [1,2] = Year
  [2,2] = 2019
  [3,2] = 2019
  [4,2] = 2019
  [5,2] = 2019

  [1,3] = BA
  [2,3] = 0.086000
  [3,3] = 0.2600
  [4,3] = 0.2700
  [5,3] = 0.077000

  [1,4] = OPS
  [2,4] = 0.2200
  [3,4] = 0.9410
  [4,4] = 0.8500
  [5,4] = 0.2200

  [1,5] = RBI
  [2,5] = 4
  [3,5] = 120
  [4,5] = 51
  [5,5] = 3

  [1,6] = OBP
  [2,6] = 0.1170
  [3,6] = 0.3580
  [4,6] = 0.3360
  [5,6] = 0.1430
}

From there on, you can collect that data into arrays or structs as you like and continue working. One nice option is Andrew Janke's 'tablicious' package:

octave:13> pkg load tablicious
octave:14> T = cell2table( C(2:end,:), 'VariableNames', C(1,:) );
octave:15> prettyprint(T)
-------------------------------------------------------
| Player         | Year | BA    | OPS   | RBI | OBP   |
-------------------------------------------------------
| SandyAlcantara | 2019 | 0.086 | 0.22  | 4   | 0.117 |
| PeteAlonso     | 2019 | 0.26  | 0.941 | 120 | 0.358 |
| BrandonLowe    | 2019 | 0.27  | 0.85  | 51  | 0.336 |
| MikeSoroka     | 2019 | 0.077 | 0.22  | 3   | 0.143 |
-------------------------------------------------------
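If you would rather not install another package, the "collect into arrays or structs" route can be done by hand. The following is only a minimal sketch: the cell array C is typed in literally as a stand-in for the csv2cell output above, and the names S, X and features are illustrative choices, not part of any package.

```octave
% Stand-in for C = csv2cell('data.csv'), so the sketch runs without the io package
C = {'Player',         'Year', 'BA',  'OPS', 'RBI', 'OBP';
     'SandyAlcantara', 2019,   0.086, 0.22,  4,     0.117;
     'PeteAlonso',     2019,   0.26,  0.941, 120,   0.358;
     'BrandonLowe',    2019,   0.27,  0.85,  51,    0.336;
     'MikeSoroka',     2019,   0.077, 0.22,  3,     0.143};

headers = C(1,:);        % first row holds the column names
data    = C(2:end,:);    % remaining rows hold the values

% Build a struct with one field per column, named after its header
S = struct();
for k = 1:numel(headers)
  col = data(:,k);
  if all(cellfun(@isnumeric, col))
    S.(headers{k}) = cell2mat(col);   % numeric column vector
  else
    S.(headers{k}) = col;             % text column stays a cell array of strings
  end
end

% Assemble a numeric matrix from the columns chosen by name (hypothetical selection)
features = {'BA', 'OPS', 'RBI', 'OBP'};
X = zeros(size(data, 1), numel(features));
for k = 1:numel(features)
  X(:,k) = S.(features{k});
end
```

Columns are then available by name, e.g. plot(S.BA, S.OPS, 'o') with xlabel('BA') and ylabel('OPS'), and X should be in the shape a clustering routine expects (one row per player), for instance kmeans from the statistics package, assuming you have that package installed.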

Upvotes: 2
