How to remove the single quotation marks from every vector in a list variable?

Question

I'm new to this website and have just picked up Python for my project. I read many sets of vectors from a 'txt' file and appended them to the variable "vectors". After appending, the output shows single quotation marks around every set of vectors.

I want to remove every single quotation mark shown in the list while keeping each set of vectors intact. I have seen many good solutions on this website, such as using .join(), split(), map() and so on. I have tried them, but none of them gives me the result I want.

Here is the code for my programme:

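import pandas as pd

with open('output.txt', 'r') as result:
    df = pd.read_csv(result, header=None)
    grad = []
    vectors = []
    for row in range(len(df)):
        # First column: the scalar value
        grad.append(df.iloc[row, 0])
        # Remaining columns: read_csv keeps the bracketed vector as a string
        for column in range(len(df.iloc[row, 1:])):
            vectors.append(df.iloc[row, column + 1])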

The output for the variable "vectors" is as below:

In [30]:vectors
Out[30]: 
[' [-0.62338535 -0.62338535 -0.62338535 -0.62338535 -0.62338535]',
 ' [-0.6495707 -0.6495707 -0.6495707 -0.6495707 -0.6495707]',
 ' [-0.64999308 -0.64999308 -0.64999308 -0.64999308 -0.64999308]',
 ' [-0.64999989 -0.64999989 -0.64999989 -0.64999989 -0.64999989]',
 ' [-0.65 -0.65 -0.65 -0.65 -0.65]']

My ideal form for the variable should be like this:

[[-0.62338535 -0.62338535 -0.62338535 -0.62338535 -0.62338535],
 [-0.6495707 -0.6495707 -0.6495707 -0.6495707 -0.6495707],
 [-0.64999308 -0.64999308 -0.64999308 -0.64999308 -0.64999308],
 [-0.64999989 -0.64999989 -0.64999989 -0.64999989 -0.64999989],
 [-0.65 -0.65 -0.65 -0.65 -0.65]]

Here is the content from the "txt" file.

7.379024325749306, [-0.62338535 -0.62338535 -0.62338535 -0.62338535 -0.62338535]
0.1190243257493061, [-0.6495707 -0.6495707 -0.6495707 -0.6495707 -0.6495707]
0.0019198730746340876, [-0.64999308 -0.64999308 -0.64999308 -0.64999308 -0.64999308]
3.0967725290674006e-05, [-0.64999989 -0.64999989 -0.64999989 -0.64999989 -0.64999989]
4.995121929499646e-07, [-0.65 -0.65 -0.65 -0.65 -0.65]

Corralien · Accepted Answer

Use NumPy instead of Pandas:

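import numpy as np

vectors = []
with open('output.txt') as result:
    # Assumes each line holds only a bracketed vector, e.g. "[-0.65 -0.65 -0.65]"
    for line in result:
        # Drop the surrounding brackets, then parse the space-separated numbers
        vectors.append(np.fromstring(line.strip()[1:-1], sep=' '))
# Stack the 1-D arrays into a single 2-D array
vectors = np.vstack(vectors)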

>>> vectors
array([[-0.62338535, -0.62338535, -0.62338535, -0.62338535, -0.62338535],
       [-0.6495707 , -0.6495707 , -0.6495707 , -0.6495707 , -0.6495707 ],
       [-0.64999308, -0.64999308, -0.64999308, -0.64999308, -0.64999308],
       [-0.64999989, -0.64999989, -0.64999989, -0.64999989, -0.64999989],
       [-0.65      , -0.65      , -0.65      , -0.65      , -0.65      ]])

Now you have a real NumPy array.
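
The quotation marks in the question's output are only how Python displays string elements: each entry of the list is text, not numbers. If the strings are already loaded, as in the question's "vectors" list, they can be converted without rereading the file. The following is a minimal sketch, assuming every entry looks like ' [a b c]'; parse_vector is a hypothetical helper name, not part of NumPy.

```python
import numpy as np

def parse_vector(text):
    """Convert a string such as ' [-0.65 -0.65]' into a 1-D float array."""
    # Strip surrounding whitespace and the square brackets, then split on whitespace
    inner = text.strip().strip('[]')
    return np.array(inner.split(), dtype=float)

# Two entries copied from the question's output
raw = [' [-0.62338535 -0.62338535 -0.62338535 -0.62338535 -0.62338535]',
       ' [-0.65 -0.65 -0.65 -0.65 -0.65]']

vectors = np.vstack([parse_vector(s) for s in raw])
print(vectors.shape)  # (2, 5)
```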

Update

"my 'txt' file actually has another value before every vector set"

In that case, split each line at the first comma and parse the two parts separately:

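import numpy as np

vectors = []
values = []
with open('output.txt') as result:
    for line in result:
        # Split only at the first comma: scalar on the left, vector on the right
        value, vector = line.split(',', 1)
        values.append(float(value))
        # Drop the surrounding brackets, then parse the space-separated numbers
        vectors.append(np.fromstring(vector.strip()[1:-1], sep=' '))
values = np.array(values)
vectors = np.vstack(vectors)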

>>> values
array([7.37902433e+00, 1.19024326e-01, 1.91987307e-03, 3.09677253e-05,
       4.99512193e-07])

>>> vectors
array([[-0.62338535, -0.62338535, -0.62338535, -0.62338535, -0.62338535],
       [-0.6495707 , -0.6495707 , -0.6495707 , -0.6495707 , -0.6495707 ],
       [-0.64999308, -0.64999308, -0.64999308, -0.64999308, -0.64999308],
       [-0.64999989, -0.64999989, -0.64999989, -0.64999989, -0.64999989],
       [-0.65      , -0.65      , -0.65      , -0.65      , -0.65      ]])
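
If plain nested Python lists are preferred over a NumPy array, which is closer to the form shown in the question, ndarray.tolist() converts the stacked result. The sketch below is self-contained: an io.StringIO object stands in for the opened file (an assumption made so the example runs without 'output.txt'), and blank lines are skipped in case the file ends with an empty line.

```python
import io
import numpy as np

# Stand-in for open('output.txt'); two lines copied from the question's file
sample = io.StringIO(
    "7.379024325749306, [-0.62338535 -0.62338535 -0.62338535 -0.62338535 -0.62338535]\n"
    "4.995121929499646e-07, [-0.65 -0.65 -0.65 -0.65 -0.65]\n"
)

values, vectors = [], []
for line in sample:
    if not line.strip():
        continue  # skip blank lines, e.g. a trailing empty line
    value, vector = line.split(',', 1)
    values.append(float(value))
    # Remove whitespace and brackets, then convert the number strings to floats
    vectors.append(np.array(vector.strip().strip('[]').split(), dtype=float))

# Nested lists of Python floats instead of a 2-D array
vectors_as_lists = np.vstack(vectors).tolist()
print(vectors_as_lists[1])  # [-0.65, -0.65, -0.65, -0.65, -0.65]
```

Note that a list of floats is displayed with commas between the numbers; the comma-free form in the question is how NumPy prints an array, not a form a Python list can take.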
