Reputation: 7490
I would like to understand how to access variable values (the observations for a particular variable) in a file using the variable names.
So my question is this:
Suppose I have a file with four variables and the following example data.
ID Name Marks Rank
1 Tom 76 3
2 Dick 95 2
3 Harry 97 1
Now, instead of accessing the data values of each variable above by first removing the first line (the variable names) with the fob.readline() method and then iterating over the remaining lines with a for loop, I want to use the variable names present in the file to access the values for each variable.
So if I want to access '1' from the variable ID, can I do it just by using the variable name ID, through some function or method?
I guess what I am trying to find out is that instead of reading each line of a data file and storing it as a list, is it possible to access the observation/records in a data file using just the variable names of that data?
In SAS or other statistical tools, if I use a variable name in a Data Step, I can access the values of that variable for each observation. So is it possible to access the values of a variable using the variable name? Could something like ID[0], ID[1], etc. give us each observation value for that variable? I know ID[0], ID[1], etc. won't work as written, but this might give an idea of what I am asking.
This would help because, in a file with many variables, we might want to use a variable name to access its data values when running an algorithm on that data.
Upvotes: 3
Views: 1829
Reputation: 101072
Given that your file really looks like
ID Name Marks Rank
1 Tom 76 3
2 Dick 95 2
3 Harry 97 1
you can create a DataFrame with Pandas' read_csv function:
from pandas import read_csv
data = read_csv('your_data.txt', sep=r'\s+')  # columns are separated by whitespace
Now you can access the values the easy way:
>>> data
   ID   Name  Marks  Rank
0   1    Tom     76     3
1   2   Dick     95     2
2   3  Harry     97     1
>>> data.Marks
0 76
1 95
2 97
Name: Marks
>>> data.Name[2]
'Harry'
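If you would rather not depend on Pandas, the same idea can be roughly sketched with the standard library alone: read the header line once, then collect each column into a list keyed by its name. This is only a minimal sketch under some assumptions: the file is whitespace-separated with no spaces inside values, `read_columns` and `SAMPLE` are hypothetical names made up for illustration, and `io.StringIO` stands in for the open file object.

```python
import io

# Sample data from the question; io.StringIO stands in for an open file.
SAMPLE = """ID Name Marks Rank
1 Tom 76 3
2 Dick 95 2
3 Harry 97 1
"""


def read_columns(fileobj):
    """Return a dict mapping each header name to a list of its values (as strings)."""
    header = fileobj.readline().split()
    columns = {name: [] for name in header}
    for line in fileobj:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        for name, value in zip(header, fields):
            columns[name].append(value)
    return columns


columns = read_columns(io.StringIO(SAMPLE))

print(columns["ID"][0])    # first observation of ID
print(columns["Name"][2])  # third observation of Name

# Values are read as strings, so convert before doing arithmetic.
marks = [int(m) for m in columns["Marks"]]
print(max(marks))
```

With a real file you would pass the object returned by `open(...)` instead of the `StringIO`. Unlike the DataFrame, every value here stays a string until you convert it yourself, which is one reason Pandas is the more convenient choice for anything beyond simple lookups.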
Upvotes: 2