Reputation: 47
Being new to Python I am finding it a little difficult to comprehend solutions to problems of 'similar' (not sure though) nature posted on the forum and all my attempts to correlate them with mine have not been successful
I have a .csv file (part of which) shown below
Rank Day Parameter
10 1 x_time
10 2 x_time
9 3 x_time
2 15 x_time
7 16 x_time
10 18 x_time
3 25 x_time
5 31 x_time
2 35 x_time
4 1 t_msg
3 5 t_msg
5 9 t_msg
8 10 t_msg
4 90 t_msg
8 4 g_up
3 5 g_up
3 56 g_up
Problem Statement: The .csv file has been extracted from a dataset; the aim of which is to study the pattern of the "parameter" over a span of period (say 90 days) along with its "rank"(gravity) on any given "day" of a period. The said parameter may or may not occur on a particular day during the said period.
A model now exists where every instance of occurrence of a parameter is being put in a separate row (of the csv file). What I am now attempting (in vain,till) is that for every unique parameter only one row may exist with 90 corresponding columns for each day(as the analysis window is 90 days). For all days when the parameter is ranked, its ranking comes in column as it is and the others are left as 0.
If may explanation has aided confusion let me put it in this way. Considering the csv file had been as shown could I reshape this to a one having 3 rows and 90 columns
day1 day2 day3 day4 ............ day90
x_time 10 10 9 0 0
t_msg 4 0 0 0 4
g_up 0 0 0 8 0
.
.
.
PS:I understand that I have been mentioning .csv file all throughout but have put 'table' in the title as I'm made to believe that such a feat may only be accomplished through a table. Please correct me if wrong.
Thanking in anticipation
Upvotes: 0
Views: 103
Reputation: 77900
I did this with a dictionary of 90-element lists. The parameter serves as the dictionary key; the day is the index into the list. You stuff the rank into that list location.
my_table = {
"x_time": 90*[0],
"t_msg": 90*[0],
"g_up": 90*[0],
}
with open("p.csv") as f:
for line in f:
r, d, p = line.split(',')
rank = int(r)
day = int(d)
param = p.strip()
my_table[param][day-1] = rank
print my_table
... and here's the output on your given data (which I put into p.csv):
{'x_time': [10, 10, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 7, 0, 10, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 5, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
'g_up': [0, 0, 0, 8, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
't_msg': [4, 0, 0, 0, 3, 0, 0, 0, 5, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4]}
Upvotes: 2