Reputation: 21223
I have a large text file where each line has the following form:
1255 32627 some random stuff which might have numbers 1245
1. I would like to use read_csv to give me a data frame with three columns. The first two columns should be dtype uint32 and the third should hold everything afterwards as a single string. That is, the line above should be split into 1255, 32627 and "some random stuff which might have numbers 1245". The following does not do it, but at least shows the dtypes:
pd.read_csv("foo.txt", sep=' ', header=None, dtype={0:np.uint32, 1:np.uint32, 2:np.str})
2. My second question is about the str dtype. How much RAM does it use, and if I know the maximum length of a string, can I reduce that?
Upvotes: 1
Views: 231
Reputation: 1210
Is there a reason you need to use pd.read_csv()? The code below is straightforward and lets you easily convert your column values to your requirements.
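from csv import reader
from numpy import uint32
from pandas import DataFrame

file = 'path/to/file.csv'
rows = []
with open(file, 'r', newline='') as f:
    # The file is space-separated, so override the default comma delimiter.
    r = reader(f, delimiter=' ')
    for row in r:
        column_1 = uint32(row[0])
        column_2 = uint32(row[1])
        # Rejoin everything after the second field into one string.
        column_3 = ' '.join(row[2:])
        # Collect every row instead of overwriting the previous one.
        rows.append((column_1, column_2, column_3))
frame = DataFrame(rows)
# Set the integer dtypes explicitly in case they were upcast on construction.
frame = frame.astype({0: uint32, 1: uint32})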
I don't understand the second question. Do you expect your strings to be extremely long? A 32-bit Python installation is limited to strings of roughly 2-3 GB. A 64-bit installation allows much larger strings, limited in practice only by the amount of RAM you can fit into your system.
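If the csv module's quote handling gets in the way, a plain str.split with a maximum of two splits is another option. This is only a minimal sketch: the function name read_three_columns and the in-memory sample are made up for illustration, and it assumes every non-empty line has at least three space-separated fields (a shorter line would raise a ValueError).

```python
import io

import numpy as np
import pandas as pd


def read_three_columns(handle):
    """Split each line on the first two spaces only; the remainder stays intact."""
    firsts, seconds, rests = [], [], []
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        a, b, rest = line.split(" ", 2)
        firsts.append(int(a))
        seconds.append(int(b))
        rests.append(rest)
    return pd.DataFrame({
        0: np.array(firsts, dtype=np.uint32),
        1: np.array(seconds, dtype=np.uint32),
        2: rests,
    })


# Hypothetical sample standing in for the real file.
sample = io.StringIO(
    "1255 32627 some random stuff which might have numbers 1245\n"
    "7 8 another line 9\n"
)
frame = read_three_columns(sample)
print(frame.dtypes)
```

For a real file you would pass an open file object instead of the StringIO sample.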
Upvotes: 1
Reputation: 1336
You can use the Series.str.cat method (see the pandas documentation for Series.str.cat):
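import pandas as pd

# Note: this only parses if no line has more space-separated fields than the first line.
df = pd.read_csv("foo.txt", sep=' ', header=None)
# Create a new column which concatenates all columns from the third onwards
df['new'] = df.apply(lambda row: row.iloc[2:].apply(str).str.cat(sep=' '), axis=1)
df = df[[0, 1, 'new']]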
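If the lines have a varying number of words, one alternative that stays within pandas is to load the raw lines into a Series and split each one at most twice with Series.str.split. The following is a rough sketch under assumptions: the sample data is invented, and it reads all lines into memory at once, which may or may not be acceptable for a very large file.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample standing in for foo.txt.
sample = io.StringIO(
    "1255 32627 some random stuff which might have numbers 1245\n"
    "7 8 another line 9\n"
)

# One raw line per element.
lines = pd.Series(sample.read().splitlines())

# n=2 limits the split to the first two spaces; expand=True returns a DataFrame.
parts = lines.str.split(" ", n=2, expand=True)

# The split pieces are strings, so convert the first two columns explicitly.
parts[0] = pd.to_numeric(parts[0]).astype(np.uint32)
parts[1] = pd.to_numeric(parts[1]).astype(np.uint32)

print(parts.dtypes)
```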
I'm not sure exactly what you mean by your second question, but if you want to check the size of a string in memory you can use:
import sys
print(sys.getsizeof('some string'))
Sorry, I have no idea how knowing the maximum length would help you save memory, or whether that is even possible.
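For what it's worth, here is a small sketch of how one might measure this. It compares an object-dtype pandas Series, where each element is a separate Python string, with a fixed-width NumPy bytes array sized to the longest string. The sample strings are made up and the text is assumed to be ASCII; as far as I know pandas does not keep fixed-width string dtypes inside a Series, so the fixed-width array here lives in NumPy only.

```python
import sys

import numpy as np
import pandas as pd

# Hypothetical sample strings standing in for the third column.
strings = [
    "some random stuff which might have numbers 1245",
    "another line 9",
]

# Object dtype: the Series stores pointers, the strings live elsewhere.
series = pd.Series(strings, dtype=object)
shallow = series.memory_usage(index=False)           # pointers only
deep = series.memory_usage(index=False, deep=True)   # pointers plus string objects

# Fixed-width bytes: every element takes exactly max_len bytes (assumes ASCII).
max_len = max(len(s.encode("ascii")) for s in strings)
fixed = np.array([s.encode("ascii") for s in strings], dtype=f"S{max_len}")

print("one Python str:", sys.getsizeof(strings[0]), "bytes")
print("Series shallow/deep:", shallow, deep, "bytes")
print("fixed-width array:", fixed.itemsize, "bytes per item,", fixed.nbytes, "bytes total")
```

Whether the fixed-width layout actually saves memory depends on how much the string lengths vary, since short strings are padded up to max_len.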
Upvotes: 1