Reputation: 8014
I am new to python and I need to get these results from SQL database.
This is how my data look like in the database
Features List X
Group Type Code Frequency
1 A A1 3
1 A A2 6
1 A A3 2
1 B B1 9
1 B B2 4
2 A A1 5
2 A A4 7
2 B B1 2
2 B B2 1
2 C C1 7
...
Results List Y
Group Result
1 P938
1 H266
1 J992
2 F882
2 K391
these are in SQL Server database
My target to get these into python code where I can use predictive modelling like this
Group A1 A2 A3 A4 ... A383 B1 B2 ... B93838 C1 .. C3838 ...
1 3 6 2 7 1 9 4 8 1 4
2 2 1 5 7 2 8 2 6 2 6
.
.
.
The header is going to be the values from the Features List X and the values are the frequency in X
while the results are going to be array
Group Results
1 [0,1,0,0,1,..,1]
2 [1,0,1,0,1,..,0]
Edit
My result will should be in a single row array
Example
Group Result
1 B
1 D
1 E
2 A
2 B
2 C
2 E
3 C
3 F
4 B
4 E
I need the result to be an array that has the binary of columns A,B,C,D,E,F
Group ResultArray
1 [0,1,0,1,1,0]
2 [1,1,1,0,1,0]
3 [0,0,1,0,0,1]
4 [0,1,0,0,1,0]
Upvotes: 0
Views: 38
Reputation: 3739
Have you considered using Pandas?
import pandas as pd
it should go something along these lines:
get your data into a pandas DataFrame, either by downloading it to a file and loading it:
df = pd.read_csv(<file_name>)
or if you have a direct connection:
df = pd.read_sql(<query>, <connection>)
for the features use Pivot:
features_df = df.pivot(index='Group', columns='Code', values='Frequency')
for the results use groupby:
results_df = df.groupby('Group').apply(lambda row: row.values[0])
merge/join the 2 dataframes:
pd.merge(features_df, results_df)
Upvotes: 1