user16454053

Reputation: 109

Reading a data frame with missing values

I am trying to read a data frame with a few columns and a few rows, where some rows have missing values. The values are also sometimes separated by an uneven number of spaces. For example, the data looks like this:

0.5 0.03   
0.1  0.2  0.3  2 
0.2  0.1   0.1  0.3
0.5 0.03  
0.1  0.2   0.3  2

Is there any way to extract only the complete rows, like this:

0.1  0.2  0.3  2 
0.2  0.1   0.1  0.3
0.1  0.2   0.3  2

Any suggestions?

Thanks.

Upvotes: 0

Views: 46

Answers (2)

Olasimbo

Reputation: 1063

You can try this:

import pandas as pd
import numpy as np

df = {
    'col1': [0.5, 0.1, 0.2, 0.5, 0.1],
    'col2': [0.03, 0.2, 0.1, 0.03, 0.2],
    'col3': [np.nan, 0.3, 0.1, np.nan, 0.3],
    'col4': [np.nan, 2, 0.3, np.nan, 2]
}

data = pd.DataFrame(df)

print(data.dropna(axis=0))

Output:

   col1  col2  col3  col4
1   0.1   0.2   0.3   2.0
2   0.2   0.1   0.1   0.3
4   0.1   0.2   0.3   2.0
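The frame above is typed in by hand. To get the same result from the whitespace-separated text itself, something like the following sketch might work. The names `SAMPLE` and `read_ragged` are made up for illustration, and the sample text is read from an in-memory buffer rather than a real file; `str.split()` with no arguments is used because it splits on any run of whitespace and ignores trailing spaces.

```python
import io

import pandas as pd

# The sample from the question, including uneven and trailing spaces.
SAMPLE = (
    "0.5 0.03   \n"
    "0.1  0.2  0.3  2 \n"
    "0.2  0.1   0.1  0.3\n"
    "0.5 0.03  \n"
    "0.1  0.2   0.3  2\n"
)


def read_ragged(handle):
    """Hypothetical helper: parse whitespace-separated floats, one row per line."""
    rows = [
        [float(token) for token in line.split()]  # split() collapses runs of spaces
        for line in handle
        if line.strip()  # skip blank lines
    ]
    # Rows shorter than the longest one are padded with NaN by pandas.
    return pd.DataFrame(rows)


data = read_ragged(io.StringIO(SAMPLE))
complete = data.dropna(axis=0)  # keep only rows without missing values
print(complete)
```

For a file on disk, the same helper could be called as `read_ragged(open('data.txt'))`, assuming that is where the data lives.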

Upvotes: 0

Corralien

Reputation: 120409

You can parse the file manually:

import re

import pandas as pd

with open('data.txt') as fp:
    # Short rows are padded with missing values, so dropna() discards them.
    df = pd.DataFrame([re.split(r'\s+', l.strip()) for l in fp]).dropna(axis=0)

Output:

>>> df
     0    1    2    3
1  0.1  0.2  0.3    2
2  0.2  0.1  0.1  0.3
4  0.1  0.2  0.3    2
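Note that the values in this frame are still strings (the last column shows `2`, not `2.0`). If numbers are needed, a conversion step along these lines could follow. This is only a sketch: it reads the question's sample from an in-memory buffer instead of `data.txt`, and the variable names `text` and `numeric` are illustrative.

```python
import io
import re

import pandas as pd

# Stand-in for the file: the sample rows from the question.
text = io.StringIO(
    "0.5 0.03   \n"
    "0.1  0.2  0.3  2 \n"
    "0.2  0.1   0.1  0.3\n"
    "0.5 0.03  \n"
    "0.1  0.2   0.3  2\n"
)

# Same parsing as above: split each stripped line on runs of whitespace,
# then drop the rows that came out shorter than the rest.
df = pd.DataFrame([re.split(r'\s+', l.strip()) for l in text]).dropna(axis=0)

# Convert the string cells to floats and renumber the rows from 0.
numeric = df.astype(float).reset_index(drop=True)
print(numeric.dtypes)
```

Dropping the incomplete rows before calling `astype(float)` is a deliberate choice here, so the conversion only ever sees numeric strings.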

Upvotes: 1
