Reputation: 2140
I have a dataframe like this:
    ids  dim
0     1    2
1     1    0
2     1    1
3     2    1
4     2    2
5     3    0
6     3    2
7     4    1
8     4    2
9   NaN    0
10  NaN    1
11  NaN    0
I want to build a TensorFlow tensor from it. The columns of the tensor correspond to the dim column in df: since dim has three distinct values (0, 1, 2), the tensor has three columns. The values of the tensor are the associated ids in df. The result should look like this:
1    1    1
NaN  2    2
3    NaN  3
NaN  4    4
What I tried:
I converted df to a NumPy array and then to a tensor, but the result does not look like what I want:
tf.constant(df[['ids', 'dim']].values, dtype=tf.int32)
Upvotes: 1
Views: 1581
Reputation: 11333
You can use pandas' pivot_table() for a concise solution:
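import numpy as np
import pandas as pd
import tensorflow as tf
df = pd.DataFrame([[1, 2],
                   [1, 0],
                   [1, 1],
                   [2, 1],
                   [2, 2],
                   [3, 0],
                   [3, 2],
                   [4, 1],
                   [4, 2],
                   [np.nan, 0],
                   [np.nan, 1],
                   [np.nan, 0]], columns=['ids', 'dim'])
# Mark every (ids, dim) pair that occurs; pivot_table leaves out the rows whose ids is NaN.
df['val'] = 1
df = df.pivot_table(index='ids', columns='dim', values='val')
# Scale each row by its id: present cells now hold the id, absent cells stay NaN.
df = df.multiply(np.array(df.index), axis=0)
tensor = tf.constant(df.to_numpy())  # float tensor, so NaN is preserved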
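For comparison, the same layout can be sketched with NumPy alone, by scattering each id into an all-NaN grid. This is only a minimal sketch, not part of the approach above: the names keep, rows, cols and dense are made up for illustration, and it assumes that rows with a missing id should be dropped, as in the expected output of the question.

```python
import numpy as np

# Same data as in the question; NaN marks rows without an id.
ids = np.array([1, 1, 1, 2, 2, 3, 3, 4, 4, np.nan, np.nan, np.nan])
dim = np.array([2, 0, 1, 1, 2, 0, 2, 1, 2, 0, 1, 0])

# Assumption: rows whose id is missing are not part of the result.
keep = ~np.isnan(ids)
ids_kept, dim_kept = ids[keep], dim[keep]

# Map each distinct id / dim value to a row / column position.
row_labels, rows = np.unique(ids_kept, return_inverse=True)
col_labels, cols = np.unique(dim_kept, return_inverse=True)

# Start from an all-NaN grid and write each id into its (row, column) cell.
dense = np.full((row_labels.size, col_labels.size), np.nan)
dense[rows, cols] = ids_kept

print(dense)
```

The resulting float array can then be passed to tf.constant(dense) to obtain the tensor; because the dtype stays floating point, the NaN cells survive the conversion.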
Upvotes: 1
Reputation: 2162
Check my code:
import numpy as np
import pandas as pd
import tensorflow as tf
df = pd.DataFrame([[1, 2],
[1, 0],
[1, 1],
[2, 1],
[2, 2],
[3, 0],
[3, 2],
[4, 1],
[4, 2],
[np.nan, 0],
[np.nan, 1],
[np.nan, 0]], columns=['ids', 'dim'])
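# Order the rows by dim; a stable sort keeps the original row order within each dim value.
dim_array = np.array(df['dim'])
sort = dim_array.argsort(kind='stable')
# Each dim value occurs exactly 4 times in this data, so the sorted ids split into 3 groups of 4.
final = np.array([df.ids[sort]]).reshape((3, 4)).T
final_result = tf.constant(final, dtype=tf.int32) # use tf.float32 to retain nan in tensor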
print(final_result)
# <tf.Tensor: shape=(4, 3), dtype=int32, numpy=
# array([[          1,           1,           1],
#        [          3,           2,           2],
#        [-2147483648,           4,           3],
#        [-2147483648, -2147483648,           4]],
#       dtype=int32)>
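Note that an integer tensor cannot represent NaN: with dtype=tf.int32 the NaN entries are replaced by an arbitrary integer (-2147483648 in the output above). Use tf.float32 if you need to keep them as NaN.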
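If an integer tensor is required anyway, one option is to replace NaN with an explicit sentinel before casting, so the missing cells get a known value instead of an arbitrary one. The following is a minimal NumPy sketch under that assumption; MISSING, final_int and missing_mask are hypothetical names, and -1 is only safe as a sentinel if it is never a real id.

```python
import numpy as np

# Float array with missing entries, shaped like `final` in the code above.
final = np.array([[1.0, 1.0, 1.0],
                  [3.0, 2.0, 2.0],
                  [np.nan, 4.0, 3.0],
                  [np.nan, np.nan, 4.0]])

MISSING = -1  # hypothetical sentinel; must not collide with a real id

# Remember where the gaps are, since the integer array cannot store NaN.
missing_mask = np.isnan(final)

# Replace NaN explicitly, then cast; every cell now has a well-defined integer value.
final_int = np.where(missing_mask, MISSING, final).astype(np.int32)

print(final_int)
```

final_int can then be passed to tf.constant(final_int, dtype=tf.int32), and missing_mask tells you which cells were originally NaN.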
Upvotes: 1