Reputation: 285
I have an array of unix timestamps:
d = {'timestamp': [1551675611, 1551676489, 1551676511, 1551676533, 1551676554]}
df = pd.DataFrame(data=d)
timestamps = df[['timestamp']].values
That I would like to format into a concatenated string, like so:
'1551675611;1551676489;1551676511;1551676533;1551676554'
So far I have prepared this:
def format_timestamps(timestamps: np.array) -> str:
    timestamps = ";".join([f"{timestamp:f}" for timestamp in timestamps])
    return timestamps
Running:
format_timestamps(timestamps)
Gives the following error:
TypeError: unsupported format string passed to numpy.ndarray.__format__
Since I'm new to Python, I'm having trouble understanding how I can fix the error.
Upvotes: 0
Views: 371
Reputation: 1557
You're getting this error because of how you extract the 'timestamp' column values in the following line:
timestamps = df[['timestamp']].values
Selecting columns with a list of column names, as here, and then calling .values returns a two-dimensional ndarray of shape (rows, columns): each element of the outer array is itself an ndarray holding that row's values for the listed columns. This form is generally only useful when selecting multiple columns by name.
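To see the difference concretely, here is a minimal sketch comparing the two selection styles. It assumes a shortened three-row version of the DataFrame from the question; the names two_d and one_d are only illustrative.

```python
import numpy as np
import pandas as pd

# Shortened version of the DataFrame from the question (assumption: 3 rows).
df = pd.DataFrame({"timestamp": [1551675611, 1551676489, 1551676511]})

two_d = df[["timestamp"]].values  # list of names -> 2-D array, one row per DataFrame row
one_d = df["timestamp"].values    # single name   -> 1-D array of scalars

print(two_d.shape)  # (3, 1)
print(one_d.shape)  # (3,)
print(isinstance(two_d[0], np.ndarray))  # True: each element is itself an array
print(isinstance(one_d[0], np.integer))  # True: each element is a NumPy integer scalar
```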
The error is thrown by your function because each timestamp here:
";".join([f"{timestamp:f}" for timestamp in timestamps])
is an ndarray containing a single value when timestamps is defined as in your original post, whereas the f format specifier needs a single scalar number.
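The failure can be reproduced in isolation, without pandas. This is a small sketch in which row is a hypothetical stand-in for what each timestamp looks like inside the list comprehension:

```python
import numpy as np

# Hypothetical stand-in for one element of the 2-D `timestamps` array.
row = np.array([1551675611])

try:
    format(row, "f")  # same call an f-string makes for f"{row:f}"
    raised = False
except TypeError:
    raised = True

print(raised)             # True: a 1-D ndarray rejects the "f" specifier
print(format(row[0], "f"))  # the scalar inside the array formats fine
```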
To remedy this error in your code, simply replace:
timestamps = df[['timestamp']].values
With:
timestamps = df['timestamp'].values
By passing a single str to select a single column from your DataFrame, timestamps is defined as a one-dimensional ndarray holding the 'timestamp' column value for each row, which will pass through your original format_timestamps without error.
Running format_timestamps(timestamps) using the above approach and your original implementation of format_timestamps will return:
'1551675611.000000;1551676489.000000;1551676511.000000;1551676533.000000;1551676554.000000'
This is better (no errors at least) but still not quite what you want. The root of this issue is that you are passing f as the format specifier when joining the timestamp values, which formats each value as a float, when you actually want to format each value as an int (format specifier d).
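As a quick illustration of the two specifiers, here is a sketch using a plain Python int (an assumption made to keep the example free of dependencies; the answer above reports the same results for numpy.int64 values):

```python
ts = 1551675611  # plain Python int, used here only for illustration

as_float = f"{ts:f}"  # fixed-point notation, six decimal places by default
as_int = f"{ts:d}"    # decimal integer
as_default = f"{ts}"  # no specifier: same as str(ts)

print(as_float)    # 1551675611.000000
print(as_int)      # 1551675611
print(as_default)  # 1551675611
```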
You can either change your format specifier from f to d in your function definition:
def format_timestamps(timestamps: np.array) -> str:
    timestamps = ";".join([f"{timestamp:d}" for timestamp in timestamps])
    return timestamps
Or simply omit the format specifier, as the values in timestamps are already of type numpy.int64:
def format_timestamps(timestamps: np.array) -> str:
    timestamps = ";".join([f"{timestamp}" for timestamp in timestamps])
    return timestamps
Running format_timestamps(timestamps) using either definition above will return what you're after:
'1551675611;1551676489;1551676511;1551676533;1551676554'
Upvotes: 1
Reputation: 402814
Since you have pandas, why not consider a pandaic solution with str.cat:
df['timestamp'].astype(str).str.cat(sep=';')
# '1551675611;1551676489;1551676511;1551676533;1551676554'
If NaNs or invalid data are a possibility, you can handle them with pd.to_numeric:
(pd.to_numeric(df['timestamp'], errors='coerce')
.dropna()
.astype(int)
.astype(str)
.str.cat(sep=';'))
# '1551675611;1551676489;1551676511;1551676533;1551676554'
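To show what the coercion step buys you, here is a small sketch on deliberately dirty input. The Series raw and its contents are made up for illustration and are not from the question:

```python
import pandas as pd

# Hypothetical dirty input: one unparseable string and one missing value.
raw = pd.Series([1551675611, "not a number", None, 1551676489])

clean = (pd.to_numeric(raw, errors="coerce")  # invalid entries become NaN
           .dropna()                          # drop the NaN rows
           .astype(int)                       # float -> int, removes the ".0"
           .astype(str)
           .str.cat(sep=";"))

print(clean)  # 1551675611;1551676489
```

Note that the rows that cannot be parsed are silently dropped, which may or may not be what you want for your data.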
Another idea is to iterate over the list of timestamps and join:
';'.join([f'{t}' for t in df['timestamp'].tolist()])
# '1551675611;1551676489;1551676511;1551676533;1551676554'
Upvotes: 2
Reputation: 96172
It's because in your list comprehension, timestamp is a numpy.ndarray object. Just flatten first and convert to string:
>>> ";".join(timestamps.flatten().astype(str))
'1551675611;1551676489;1551676511;1551676533;1551676554'
Upvotes: 2
Reputation: 1272
A quick fix to your code would be:
def format_timestamps(timestamps: np.array) -> str:
    timestamps = ";".join([f"{timestamp[0]}" for timestamp in timestamps])
    return timestamps
Here I only replaced timestamp:f with timestamp[0], so you get each timestamp as a scalar instead of an array.
Upvotes: 1