dataminer

Reputation: 41

Saving a long list into csv in pandas

I am trying to save a list (with 7000 values) to a CSV file. But when I open the CSV, the list is truncated like this: '[1 2 3 ... 6999 7000]', and it is also stored as a string. Is there a way to store a long list in a CSV file without the values being truncated?

import numpy as np
import pandas as pd

x = []
a = np.arange(0, 7000, 1)
x.append(a)
b = np.arange(7001, 14000, 1)
x.append(b)
x

Out: [array([   0,    1,    2, ..., 6997, 6998, 6999]),
 array([ 7001,  7002,  7003, ..., 13997, 13998, 13999])]


df = pd.DataFrame({"x":x})
df.to_csv("x.csv")
df = pd.read_csv("x.csv")
df["x"][0]

Out: '[   0    1    2 ... 6997 6998 6999]'

type(df["x"][0])
Out: str

Upvotes: 0

Views: 2008

Answers (2)

King Peanut

Reputation: 116

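This happens because the string representation of a NumPy array is truncated: by default, NumPy abbreviates arrays with more than 1000 elements when converting them to a string. One fix is to convert the NumPy array to a Python list before saving it to your CSV file.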

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'long_list': [np.arange(0, 7000).tolist()]
})

df.to_csv('temp.csv')
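
The cell is still stored as text in the CSV, so reading it back gives a string unless it is parsed. A minimal sketch of one way to restore the list, assuming the column was written from a Python list as above (the temporary path and the `index=False` argument are choices made for this example, not part of the original answer):

```python
import ast
import os
import tempfile

import numpy as np
import pandas as pd

# Same data as above: one cell holding a 7000-element Python list.
df = pd.DataFrame({"long_list": [np.arange(0, 7000).tolist()]})

# Write to a temporary location (hypothetical path, for illustration only).
path = os.path.join(tempfile.mkdtemp(), "temp.csv")
df.to_csv(path, index=False)

# ast.literal_eval parses the text "[0, 1, 2, ...]" back into a Python list.
restored = pd.read_csv(path, converters={"long_list": ast.literal_eval})
values = restored["long_list"][0]
print(type(values), len(values))
```

This only works because a Python list is written with commas and without abbreviation; the truncated NumPy string from the question cannot be recovered this way.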

Upvotes: 0

Ferris

Reputation: 5601

If you want to save the data to a CSV file, convert each array to a single string (str) first.

import pandas as pd
import numpy as np

alist = []
a = np.arange(0, 7000, 1)
alist.append(a)
b = np.arange(7001, 14000, 1)
alist.append(b)
df = pd.DataFrame({"alist": alist})

# Join each array into one comma-separated string
df['alist'] = df['alist'].map(lambda x: ','.join(map(str, x)))
df.to_csv("list.csv", index=False)

To read the CSV file back and restore the lists of integers:

dfn = pd.read_csv("list.csv")
dfn['alist'] = dfn['alist'].str.split(',')
dfn['alist'] = dfn['alist'].map(lambda x: list(map(int, x)))
dfn['alist'][0]

Alternatively, consider pickle, which stores Python objects without converting them to text. An example:

# For the simplest code, use the dump() and load() functions.

import pickle

# An arbitrary collection of objects supported by pickle.
data = {
    'a': [1, 2.0, 3, 4+6j],
    'b': ("character string", b"byte string"),
    'c': {None, True, False}
}

with open('data.pickle', 'wb') as f:
    # Pickle the 'data' dictionary using the highest protocol available.
    pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)

# The following example reads the resulting pickled data.
with open('data.pickle', 'rb') as f:
    # The protocol version used is detected automatically, so we do not
    # have to specify it.
    data = pickle.load(f)
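
Applied to the DataFrame from the question, the same idea might look like the sketch below. It uses pandas' own `DataFrame.to_pickle` and `pd.read_pickle` rather than the `pickle` module directly; the file name and temporary directory are assumptions made for this example. Note that the result is a binary file, not something you can open as text like a CSV.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# The question's data: two NumPy arrays stored as cells of one column.
x = [np.arange(0, 7000, 1), np.arange(7001, 14000, 1)]
df = pd.DataFrame({"x": x})

# Hypothetical output path, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "x.pickle")
df.to_pickle(path)

# The arrays come back as NumPy arrays, not as truncated strings.
restored = pd.read_pickle(path)
first = restored["x"][0]
print(type(first), len(first))
```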

Upvotes: 1
