Mohammad Alshehri

Reputation: 63

multiple datatypes in one column

I'm building a predictive model to predict which students will pass a course, based on the time in minutes they spent on each task.

Course   StudentID  Task1   Task2   Task3 ...
AA       3547       2       9       2
AA       3548       5       2       5
AA       3549       1       7       3
AA       3550       2       9       2
AA       3551       5       2       5
AA       3542       2       9       2
BB       3543       5       2       5
BB       3544       1       7       3
BB       3555       2       9       2
CC       3556       5       2       5

However, every task in every course is different: e.g. Task1 in course AA is a question and answer, but in another course it might be a video to watch or a wiki to read. I feel that feeding the data to the network as it is would be inappropriate, and I'm wondering if there's any way to sort this out. I thought of adding a column next to each task, but that would be impossible due to the great number of tasks and students. Below is the type of each task in each course:

Course  Task1   Task2   Task3
AA      Q/A     Video   Wiki
BB      Video   Wiki    Q/A
CC      Wiki    Wiki    Video 

Upvotes: 1

Views: 54

Answers (1)

theDBA

Reputation: 239

I would say adding a column for each of the tasks is the clean way to do this. However, if you do not want to add an additional column for each task, you can store each task as a tuple instead.

Each task tuple would look like (taskId, timeSpent):

Course  Student  T1       T2
AA      357      (2, 22)  (21, 14)
AA      358      (9, 33)  (6, 35)
BB      359      (4, 19)  (14, 19)
BB      360      (8, 34)  (3, 28)
CC      361      (6, 9)   (6, 19)
CC      362      (3, 14)  (5, 22)
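
As a rough illustration of the tuple idea, here is a minimal sketch in plain Python. It uses a few rows and the task-type table from the question. Two things are assumptions rather than part of the answer above: the task type stands in for the taskId, and the helper names `to_tuples` and `minutes_per_type` are hypothetical. Since a network needs numeric input, the second helper shows one possible way to flatten the tuples into fixed columns (total minutes per task type); other encodings would work too.

```python
# Task types per course, copied from the question's second table.
TASK_TYPES = {
    "AA": ["Q/A", "Video", "Wiki"],
    "BB": ["Video", "Wiki", "Q/A"],
    "CC": ["Wiki", "Wiki", "Video"],
}

# A few rows from the question's first table:
# (course, student_id, minutes spent on Task1..Task3).
ROWS = [
    ("AA", 3547, [2, 9, 2]),
    ("BB", 3543, [5, 2, 5]),
    ("CC", 3556, [5, 2, 5]),
]


def to_tuples(course, minutes):
    """Pair each task's minutes with its type: [(task_type, minutes), ...].

    Assumption: the task type is used in place of the answer's taskId.
    """
    return list(zip(TASK_TYPES[course], minutes))


def minutes_per_type(course, minutes):
    """Flatten the tuples into fixed numeric columns (hypothetical encoding):
    total minutes spent on each task type."""
    totals = {"Q/A": 0, "Video": 0, "Wiki": 0}
    for task_type, spent in to_tuples(course, minutes):
        totals[task_type] += spent
    return totals


if __name__ == "__main__":
    for course, student_id, minutes in ROWS:
        print(course, student_id,
              to_tuples(course, minutes),
              minutes_per_type(course, minutes))
```

With this layout the number of columns depends on the number of task types, not on the number of tasks or students, which may sidestep the concern about adding a column next to every task.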

Upvotes: 1
