adam Wadsworth

Reputation: 784

Pull out columns 2 and 3 of a 2D array in Python

Hi all, I have a 2D array and I want to create a new 2D array containing only columns 2 and 3.

Here is my code:

    #!/user
    # -*- coding: utf-8 -*-

    import csv
    import urllib2
    import numpy as np


    url = 'https://api.bmreports.com/BMRS/FUELINST/v1?APIKey=66ky5jo5p5w0vbd&ServiceType=CSV'
    url2 = 'https://api.bmreports.com/BMRS/FUELINSTHHCUR/v1?APIKey=66ky5jo5p5w0vbd&ServiceType=CSV'
    response = urllib2.urlopen(url2)
    cr = csv.reader(response)


    arr = np.genfromtxt(response,delimiter=",", skip_header=1, skip_footer=2,dtype=None)

    data = arr[:, [1:2]]


    print data

The data comes back like this:

    [('FUELINSTHHCUR', 'CCGT', 10430, 35.8, 10282, 35.2, 205996, 32. )
     ('FUELINSTHHCUR', 'OCGT',     0,  0. ,     0,  0. ,     17,  0. )
     ('FUELINSTHHCUR', 'OIL',     0,  0. ,     0,  0. ,      0,  0. )
     ('FUELINSTHHCUR', 'COAL',     0,  0. ,     0,  0. ,      0,  0. )
     ('FUELINSTHHCUR', 'NUCLEAR',  6963, 23.9,  6970, 23.9, 167591, 26. )
     ('FUELINSTHHCUR', 'WIND',  6986, 24. ,  7061, 24.2, 160036, 24.9)
     ('FUELINSTHHCUR', 'PS',   297,  1. ,   412,  1.4,   8136,  1.3)
     ('FUELINSTHHCUR', 'NPSHYD',   322,  1.1,   319,  1.1,   8015,  1.2)
     ('FUELINSTHHCUR', 'OTHER',   129,  0.4,   128,  0.4,   3093,  0.5)
     ('FUELINSTHHCUR', 'INTFR',  1494,  5.1,  1494,  5.1,  31731,  4.9)
     ('FUELINSTHHCUR', 'INTIRL',     0,  0. ,     0,  0. ,   2650,  0.4)
     ('FUELINSTHHCUR', 'INTNED',   882,  3. ,   880,  3. ,  18991,  2.9)
     ('FUELINSTHHCUR', 'INTEW',     0,  0. ,     0,  0. ,      0,  0. )
     ('FUELINSTHHCUR', 'BIOMASS',  1608,  5.5,  1630,  5.6,  37688,  5.9)]

I'm trying to create a new 2D array that contains only the second and third columns, so that it looks like this:

    [('CCGT', 10430)
     ('OCGT',     0)
     ('OIL',     0)
     ('COAL',     0)
     ('NUCLEAR',  6963)
     ('WIND',  6986)
     ('PS',   297)
     ('NPSHYD',   322)
     ('OTHER',   129)
     ('INTFR',  1494)
     ('INTIRL',     0)
     ('INTNED',   882)
     ('INTEW',     0)
     ('BIOMASS',  1608)]

Upvotes: 0

Views: 278

Answers (2)

webmite

Reputation: 575

Reviewing this code sample may help.

It assumes that the values you collected turn out to be individual strings in a NumPy array, so each row contains commas and quoted elements, all embedded in a single string element.

The variable 'lst' is a list built by stepping through the array elements; it becomes a list of lists containing the comma-separated elements as individual strings.

The variable 'tmp' is a NumPy array constructed from the newly parsed 'lst'.

The variable 'data' is the NumPy array sliced from 'tmp' with just the columns you were looking for.

Hope that helps...


    import numpy as np

    # Each row holds one string containing the whole comma-separated record.
    arr = np.array([["'FUELINSTHHCUR', 'CCGT', 10430, 35.8, 10282, 35.2, 205996, 32. "],
     ["'FUELINSTHHCUR', 'OCGT',     0,  0. ,     0,  0. ,     17,  0. "],
     [" 'FUELINSTHHCUR', 'OIL',     0,  0. ,     0,  0. ,      0,  0. "],
     [" 'FUELINSTHHCUR', 'COAL',     0,  0. ,     0,  0. ,      0,  0. "],
     [" 'FUELINSTHHCUR', 'NUCLEAR',  6963, 23.9,  6970, 23.9, 167591, 26. "],
     [" 'FUELINSTHHCUR', 'WIND',  6986, 24. ,  7061, 24.2, 160036, 24.9"],
     [" 'FUELINSTHHCUR', 'PS',   297,  1. ,   412,  1.4,   8136,  1.3"],
     [" 'FUELINSTHHCUR', 'NPSHYD',   322,  1.1,   319,  1.1,   8015,  1.2"],
     [" 'FUELINSTHHCUR', 'OTHER',   129,  0.4,   128,  0.4,   3093,  0.5"],
     [" 'FUELINSTHHCUR', 'INTFR',  1494,  5.1,  1494,  5.1,  31731,  4.9"],
     [" 'FUELINSTHHCUR', 'INTIRL',     0,  0. ,     0,  0. ,   2650,  0.4"],
     [" 'FUELINSTHHCUR', 'INTNED',   882,  3. ,   880,  3. ,  18991,  2.9"],
     [" 'FUELINSTHHCUR', 'INTEW',     0,  0. ,     0,  0. ,      0,  0. "],
     [" 'FUELINSTHHCUR', 'BIOMASS',  1608,  5.5,  1630,  5.6,  37688,  5.9"]])

    print(" --arr--")
    print(arr)

    # Split each record on ", " to get a list of lists of strings.
    lst = []
    for items in arr:
        lst.append(items[0].split(", "))

    print(" --lst--")
    print(lst)

    # Convert the parsed list of lists into a 2D array of strings.
    tmp = np.array(lst)

    print(" --tmp--")
    print(tmp)

    # Keep only the second and third columns (zero-based indices 1 and 2).
    data = tmp[:,[1,2]]

    print(" --data--")
    print(data)

The output looks like this:


    $python main.py
     --arr--
    [["'FUELINSTHHCUR', 'CCGT', 10430, 35.8, 10282, 35.2, 205996, 32. "]
     ["'FUELINSTHHCUR', 'OCGT',     0,  0. ,     0,  0. ,     17,  0. "]
     [" 'FUELINSTHHCUR', 'OIL',     0,  0. ,     0,  0. ,      0,  0. "]
     [" 'FUELINSTHHCUR', 'COAL',     0,  0. ,     0,  0. ,      0,  0. "]
     [" 'FUELINSTHHCUR', 'NUCLEAR',  6963, 23.9,  6970, 23.9, 167591, 26. "]
     [" 'FUELINSTHHCUR', 'WIND',  6986, 24. ,  7061, 24.2, 160036, 24.9"]
     [" 'FUELINSTHHCUR', 'PS',   297,  1. ,   412,  1.4,   8136,  1.3"]
     [" 'FUELINSTHHCUR', 'NPSHYD',   322,  1.1,   319,  1.1,   8015,  1.2"]
     [" 'FUELINSTHHCUR', 'OTHER',   129,  0.4,   128,  0.4,   3093,  0.5"]
     [" 'FUELINSTHHCUR', 'INTFR',  1494,  5.1,  1494,  5.1,  31731,  4.9"]
     [" 'FUELINSTHHCUR', 'INTIRL',     0,  0. ,     0,  0. ,   2650,  0.4"]
     [" 'FUELINSTHHCUR', 'INTNED',   882,  3. ,   880,  3. ,  18991,  2.9"]
     [" 'FUELINSTHHCUR', 'INTEW',     0,  0. ,     0,  0. ,      0,  0. "]
     [" 'FUELINSTHHCUR', 'BIOMASS',  1608,  5.5,  1630,  5.6,  37688,  5.9"]]
     --lst--
    [["'FUELINSTHHCUR'", "'CCGT'", '10430', '35.8', '10282', '35.2', '205996', '32. '], ["'FUELINSTHHCUR'", "'OCGT'", '    0', ' 0. ', '    0', ' 0. ', '    17', ' 0. '], [" 'FUELINSTHHCUR'", "'OIL'", '    0', ' 0. ', '    0', ' 0. ', '     0', ' 0. '], [" 'FUELINSTHHCUR'", "'COAL'", '    0', ' 0. ', '    0', ' 0. ', '     0', ' 0. '], [" 'FUELINSTHHCUR'", "'NUCLEAR'", ' 6963', '23.9', ' 6970', '23.9', '167591', '26. '], [" 'FUELINSTHHCUR'", "'WIND'", ' 6986', '24. ', ' 7061', '24.2', '160036', '24.9'], [" 'FUELINSTHHCUR'", "'PS'", '  297', ' 1. ', '  412', ' 1.4', '  8136', ' 1.3'], [" 'FUELINSTHHCUR'", "'NPSHYD'", '  322', ' 1.1', '  319', ' 1.1', '  8015', ' 1.2'], [" 'FUELINSTHHCUR'", "'OTHER'", '  129', ' 0.4', '  128', ' 0.4', '  3093', ' 0.5'], [" 'FUELINSTHHCUR'", "'INTFR'", ' 1494', ' 5.1', ' 1494', ' 5.1', ' 31731', ' 4.9'], [" 'FUELINSTHHCUR'", "'INTIRL'", '    0', ' 0. ', '    0', ' 0. ', '  2650', ' 0.4'], [" 'FUELINSTHHCUR'", "'INTNED'", '  882', ' 3. ', '  880', ' 3. ', ' 18991', ' 2.9'], [" 'FUELINSTHHCUR'", "'INTEW'", '    0', ' 0. ', '    0', ' 0. ', '     0', ' 0. '], [" 'FUELINSTHHCUR'", "'BIOMASS'", ' 1608', ' 5.5', ' 1630', ' 5.6', ' 37688', ' 5.9']]
     --tmp--
    [["'FUELINSTHHCUR'" "'CCGT'" '10430' '35.8' '10282' '35.2' '205996' '32. ']
     ["'FUELINSTHHCUR'" "'OCGT'" '    0' ' 0. ' '    0' ' 0. ' '    17' ' 0. ']
     [" 'FUELINSTHHCUR'" "'OIL'" '    0' ' 0. ' '    0' ' 0. ' '     0' ' 0. ']
     [" 'FUELINSTHHCUR'" "'COAL'" '    0' ' 0. ' '    0' ' 0. ' '     0' ' 0. ']
     [" 'FUELINSTHHCUR'" "'NUCLEAR'" ' 6963' '23.9' ' 6970' '23.9' '167591'
      '26. ']
     [" 'FUELINSTHHCUR'" "'WIND'" ' 6986' '24. ' ' 7061' '24.2' '160036' '24.9']
     [" 'FUELINSTHHCUR'" "'PS'" '  297' ' 1. ' '  412' ' 1.4' '  8136' ' 1.3']
     [" 'FUELINSTHHCUR'" "'NPSHYD'" '  322' ' 1.1' '  319' ' 1.1' '  8015'
      ' 1.2']
     [" 'FUELINSTHHCUR'" "'OTHER'" '  129' ' 0.4' '  128' ' 0.4' '  3093'
      ' 0.5']
     [" 'FUELINSTHHCUR'" "'INTFR'" ' 1494' ' 5.1' ' 1494' ' 5.1' ' 31731'
      ' 4.9']
     [" 'FUELINSTHHCUR'" "'INTIRL'" '    0' ' 0. ' '    0' ' 0. ' '  2650'
      ' 0.4']
     [" 'FUELINSTHHCUR'" "'INTNED'" '  882' ' 3. ' '  880' ' 3. ' ' 18991'
      ' 2.9']
     [" 'FUELINSTHHCUR'" "'INTEW'" '    0' ' 0. ' '    0' ' 0. ' '     0'
      ' 0. ']
     [" 'FUELINSTHHCUR'" "'BIOMASS'" ' 1608' ' 5.5' ' 1630' ' 5.6' ' 37688'
      ' 5.9']]
     --data--
    [["'CCGT'" '10430']
     ["'OCGT'" '    0']
     ["'OIL'" '    0']
     ["'COAL'" '    0']
     ["'NUCLEAR'" ' 6963']
     ["'WIND'" ' 6986']
     ["'PS'" '  297']
     ["'NPSHYD'" '  322']
     ["'OTHER'" '  129']
     ["'INTFR'" ' 1494']
     ["'INTIRL'" '    0']
     ["'INTNED'" '  882']
     ["'INTEW'" '    0']
     ["'BIOMASS'" ' 1608']]
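The strings in the --data-- output still carry stray single quotes and padding spaces. As a follow-up, here is a minimal sketch of one way to clean them, assuming a small hand-typed sample shaped like that output; the names `names` and `values` are introduced here only for illustration.

    ```python
    import numpy as np

    # A few rows shaped like the --data-- output (quoted names, padded numbers).
    data = np.array([["'CCGT'", "10430"],
                     ["'OCGT'", "    0"],
                     ["'NUCLEAR'", " 6963"]])

    # Strip surrounding spaces and single quotes from the fuel-type column.
    names = np.char.strip(data[:, 0], " '")

    # Strip the padding, then convert the second column to integers.
    values = np.char.strip(data[:, 1]).astype(int)

    print(names)
    print(values)
    ```

Keeping the two columns as separate arrays avoids forcing strings and integers into a single string-typed 2D array.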

Upvotes: 0

Kasravnd

Reputation: 107347

First, keep your items in a 2D NumPy array rather than a list. You can then pass a list of column indices for the second axis to get the desired result. NumPy indices are zero-based, so the second and third columns are indices 1 and 2:

data = arr[:, [1, 2]]

Or slice, as follows:

data = arr[:, 1:3]
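To illustrate the two forms on a true 2D array, here is a small sketch using a made-up integer array (not the question's data). It picks the second and third columns, which are zero-based indices 1 and 2. Both forms return the same values, but the list form copies the data while the slice is a view onto the original array.

    ```python
    import numpy as np

    # Made-up 3x4 array; the values encode row and column for easy checking.
    arr = np.array([[0, 1, 2, 3],
                    [10, 11, 12, 13],
                    [20, 21, 22, 23]])

    by_list = arr[:, [1, 2]]   # index list: returns a copy
    by_slice = arr[:, 1:3]     # slice: returns a view of arr

    print(by_list)
    print(np.array_equal(by_list, by_slice))
    print(np.shares_memory(arr, by_list), np.shares_memory(arr, by_slice))
    ```

This only works when `arr.ndim` is 2; a 1-D array has no second axis to index.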

Also, if the string you're reading is correctly formatted and purely numeric, you can load it with NumPy's fromstring() function instead of using csv.
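The tuples in the question's printed output suggest that genfromtxt with dtype=None returned a 1-D structured array rather than a 2D array. If so, there is no second axis to index, and columns are selected by field name instead. A minimal sketch, assuming the default field names (f0, f1, ...) and a made-up three-row sample standing in for the downloaded CSV:

    ```python
    import io
    import numpy as np

    # Hypothetical sample in place of the HTTP response (header/footer rows omitted).
    sample = io.StringIO(
        "FUELINSTHHCUR,CCGT,10430,35.8,10282,35.2,205996,32.0\n"
        "FUELINSTHHCUR,OCGT,0,0.0,0,0.0,17,0.0\n"
        "FUELINSTHHCUR,NUCLEAR,6963,23.9,6970,23.9,167591,26.0\n"
    )

    # Mixed string/number columns with dtype=None give one record per row.
    arr = np.genfromtxt(sample, delimiter=",", dtype=None, encoding="utf-8")

    print(arr.ndim)         # number of axes
    print(arr.dtype.names)  # auto-generated field names

    # Select the second and third fields by name.
    data = arr[["f1", "f2"]]
    print(data)
    ```

A single field such as `arr["f2"]` comes back as an ordinary 1-D array, which may be more convenient for arithmetic.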

Upvotes: 1
