Reputation: 7235
I have to add a column called 'sessions' to a dataframe called merged1. The sessions column is updated inside the loop and comes from the list y. However, the following operations don't work:
y.append(x * len(data))
merged1['sessions'] = y
Here is the code:
for i in users:
    merged1 = pd.DataFrame()
    name = "%s" % i
    y = list()
    for file in glob.glob("*.csv"):
        if os.path.isfile(file):  # make sure it's a file, not a directory entry
            if name in file:  # open file
                data = pd.read_csv(file)
                data = data.loc[[k for j, k in enumerate(data.index) if j % 10 == 0]]
                data.lat = np.round(data.lat, 6)
                merged1 = pd.concat([merged1, data], ignore_index=True)
                x = re.findall(r'(?<=_session)\d+', file)
                y.append(x * len(data))
    merged1['sessions'] = y
    if len(merged1) > 0:
        merged1 = merged1[merged1.lat > 45]
        merged1.to_csv(string, index=False)
Upvotes: 0
Views: 84
Reputation: 90899
When you do
y.append(x * len(data))
you are appending one whole list, of length len(x) * len(data), to y as a single element, so y becomes a list of lists with one entry per file rather than one entry per row.
Hence, when you do merged1['sessions'] = y and the length of y differs from the number of rows in merged1, the assignment fails.
If you are sure that x = re.findall(r'(?<=_session)\d+', file) always returns exactly one element, then you can use
y.extend(x * len(data))
instead of .append(). .extend() extends the list with the elements of the iterable that is passed to it, so y ends up with one entry per row.
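To see the difference in isolation, here is a minimal sketch that needs no CSV files. The file names and row counts in `files` are made up for illustration; each row count stands in for `len(data)` from your loop.

```python
import re

# Hypothetical file names mapped to row counts (stand-ins for len(data)).
files = {"alice_session1.csv": 3, "alice_session2.csv": 2}

appended, extended = [], []
for file, n_rows in files.items():
    x = re.findall(r'(?<=_session)\d+', file)  # e.g. ['1']
    appended.append(x * n_rows)  # adds ONE element: the whole repeated list
    extended.extend(x * n_rows)  # adds n_rows elements, one per row

print(appended)  # [['1', '1', '1'], ['2', '2']]
print(extended)  # ['1', '1', '1', '2', '2']
```

With .append() the list has one entry per file (2 here), while with .extend() it has one entry per row (5 here), which is the length a column of the concatenated dataframe would need.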
Upvotes: 2