S23

Reputation: 137

Convert times to designated time format and apply to y-axis of plotly graph

I am building a web dashboard for Formula 1 analytics using Plotly and Flask, following the article "An Interactive Web Dashboard with Plotly and Flask".

I have lap times as strings in the format MM:SS.sss (where MM is minutes, SS is seconds and sss is milliseconds). Using the Python script below, I convert them to datetime.timedelta objects so that I can graph and manipulate them (e.g. find a driver's average time over a number of laps). However, when I graph the timedelta objects in Plotly, they are displayed in microseconds.

Is it possible to create timedelta objects in the specified time format so that Plotly graphs them correctly?

from datetime import timedelta

times = ["1:23.921", "1:24.690", "1:24.790"]

# convert to timedelta object
def string_con(string_time):
    new_time = timedelta(minutes=int(string_time.split(
        ":")[0]), seconds=int((string_time.split(":")[1]).split(".")[0]),
        milliseconds=int((string_time.split(":")[1]).split(".")[1]))
    return new_time


# compute average pace using timedelta objects
def average_pace(laps):
    laps = list(map(string_con, laps))
    return (sum(laps, timedelta(0))/len(laps))

print(average_pace(times))

Upvotes: 1

Views: 2259

Answers (2)

martineau

Reputation: 123463

A timedelta stores its value internally as days, seconds, and microseconds, so you need to convert those numeric values into the time units you actually want to display.
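As a minimal sketch of that conversion (the variable names here are only illustrative), the three stored fields can be inspected directly and then split into minutes, seconds and milliseconds with integer arithmetic:

```python
from datetime import timedelta

lap = timedelta(minutes=1, seconds=23, milliseconds=921)

# timedelta normalizes everything into days, seconds and microseconds.
print(lap.days, lap.seconds, lap.microseconds)  # 0 83 921000

# total_seconds() collapses the three fields into a single float.
total = lap.total_seconds()

# Dividing by a one-millisecond timedelta gives whole milliseconds as an int,
# which can then be split into display units without float rounding.
total_ms = lap // timedelta(milliseconds=1)
minutes, rem_ms = divmod(total_ms, 60_000)
seconds, millis = divmod(rem_ms, 1000)
label = f"{minutes}:{seconds:02d}.{millis:03d}"
print(label)  # 1:23.921
```

Working in integer milliseconds is a choice made for this sketch; it avoids labels such as 23.921000000000003 that float arithmetic can produce.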

Since you say you can't convert them to strings yourself, a potential workaround is to convert them into a timedelta subclass that formats itself as a string the way you want.

One thing to keep in mind is that a timedelta can hold very large values, which must be handled somehow. Asking only for "MM:SS.sss" format ignores the fact that, in theory, days and hours could also be involved. The code below calculates them, but only displays them when they are needed.

The code below defines a new MyTimeDelta subclass and uses it. The subclass's __str__() method returns a string in the desired format and is called whenever an instance is converted to a string, while the class as a whole remains "numerical" like its base class. __str__() relies on a private helper method I also added, named _convert_units().

from datetime import timedelta

class MyTimeDelta(timedelta):
    @classmethod
    def from_another(cls, other):
        if not isinstance(other, timedelta):
            raise TypeError('unsupported type')
        return cls(days=other.days, seconds=other.seconds, microseconds=other.microseconds)

    def __str__(self):
        """ Format a timedelta into this format D:H:MM:SS.sss """
        res = []
        days, hours, minutes, seconds = self._convert_units()
        if days:
            res.append(f'{days}:')
        if hours or days:
            res.append(f'{hours}:')
        if minutes or hours or days:
            res.append(f'{minutes:02d}:')
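        # Zero-pad seconds and always show exactly three millisecond digits.
        res.append(f'{int(seconds):02d}.{self.microseconds // 1000:03d}')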
        return ''.join(res)

    def _convert_units(self):
        """ Convert a timedelta to days, hours, minutes, & seconds."""
        days = self.days
        hours, remainder = divmod(self.seconds, 3600)
        minutes, seconds = divmod(remainder, 60)
        seconds += self.microseconds / 1e6
        return days, hours, minutes, seconds

times = ["1:23.921", "1:24.690", "1:24.790"]

def string_con(string_time):
    """ Convert string_time to timedelta object. """
    split_time = string_time.split(":")
    split_secs = split_time[1].split(".")
    mins, secs, ms = map(int, (split_time[0], split_secs[0], split_secs[1]))
    return timedelta(minutes=mins, seconds=secs, milliseconds=ms)

def average_pace(laps):
    """ Compute average pace using timedelta objects. """
    laps = [string_con(lap) for lap in laps]
    return sum(laps, timedelta(0)) / len(laps)


avg = MyTimeDelta.from_another(average_pace(times))
print(f'{avg.days=}, {avg.seconds=}, {avg.microseconds=}')
print(avg)

Upvotes: 2

Rob Raymond

Reputation: 31166

  • You are right to convert from text to an analytical representation. I have used Timedelta as well, although in some ways it would be simpler to work in nanoseconds.
  • You also need to convert back for the axis ticks and hover text. I've used a utility function for this.
  • It all comes together so that you can create Plotly plots of lap times that are correct and human-readable ;-)
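To illustrate the first two points without pandas, here is a minimal sketch that keeps lap times as integer nanoseconds and converts back to text only for display. The helper names lap_to_ns and ns_to_lap are hypothetical (not part of any library), and the sketch assumes every lap is in M:SS.sss form:

```python
import re

LAP_RE = re.compile(r"(\d+):(\d{1,2})\.(\d{1,3})")

def lap_to_ns(text):
    """Parse 'M:SS.sss' into an integer number of nanoseconds."""
    match = LAP_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized lap time: {text!r}")
    minutes, seconds, frac = match.groups()
    millis = int(frac.ljust(3, "0"))  # '.9' means 900 ms, not 9 ms
    return ((int(minutes) * 60 + int(seconds)) * 1000 + millis) * 10**6

def ns_to_lap(ns):
    """Format integer nanoseconds as 'M:SS.sss', truncating below 1 ms."""
    minutes, rem = divmod(ns, 60 * 10**9)
    seconds, rem = divmod(rem, 10**9)
    return f"{minutes}:{seconds:02d}.{rem // 10**6:03d}"

times = ["1:23.921", "1:24.690", "1:24.790"]
laps_ns = [lap_to_ns(t) for t in times]
average_ns = sum(laps_ns) // len(laps_ns)
print(ns_to_lap(average_ns))  # 1:24.467
```

Plain integers can be averaged, ranked and plotted like any other number, so only the labels need special handling.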
import requests
import pandas as pd
import plotly.express as px

# get some lap timing data
df = pd.concat([
        pd.json_normalize(requests.get(f"https://ergast.com/api/f1/2021/7/laps/{l}.json").json()
                          ["MRData"]["RaceTable"]["Races"][0]["Laps"][0]["Timings"]
        ).assign(lap=l)
        for l in range(1, 25)
    ]).reset_index(drop=True)
# convert to timedelta...
df["time"] = (
    df["time"]
    .str.extract(r"(?P<minute>[0-9]+):(?P<sec>[0-9]+)\.(?P<milli>[0-9]+)")
    .apply(
        lambda r: pd.Timestamp(year=1970,month=1,day=1,
                               minute=int(r.minute),second=int(r.sec),microsecond=int(r.milli) * 10 ** 3,
        ),
        axis=1,
    )
    - pd.to_datetime("1-jan-1970").replace(hour=0, minute=0, second=0, microsecond=0)
)

# utility: build a display string from a duration in nanoseconds
def strfdelta(t, fmt="{minutes:02d}:{seconds:02d}.{milli:03d}"):
    d = {}
    d["minutes"], rem = divmod(t, 10 ** 9 * 60)
    d["seconds"], d["milli"] = divmod(rem, 10 ** 9)
    d["milli"] = d["milli"] // 10**6
    return fmt.format(**d)

# build a figure with lap times data...  NB use of hover_name for formatted time
fig = px.scatter(
    df,
    x="lap",
    y="time",
    color="driverId",
    hover_name=df["time"].astype(int).apply(strfdelta),
    hover_data={"time":False},
    size=df.groupby("lap")["time"].transform(
        lambda s: s.rank(ascending=True).eq(1).astype(int)
    ),
)
# make figure more interesting... add best/worst and mean lap times...
fig.add_traces(
    px.line(
        df.groupby("lap")
        .agg(
            avg=("time", lambda s: s.mean()),
            min=("time", lambda s: s.min()),
            max=("time", lambda s: s.max()),
        )
        .reset_index(),
        x="lap",
        y=["avg", "min", "max"],
    ).data
)

# fix up tick labels
ticks = pd.Series(range(df["time"].astype(int).min() - 10 ** 10,df["time"].astype(int).max(),10 ** 10,))
fig.update_layout(
    yaxis={
        "range": [
            df["time"].astype(int).min() - 10 ** 10,
            df["time"].astype(int).max(),
        ],
        "tickmode": "array",
        "tickvals": ticks,
        "ticktext": ticks.apply(strfdelta)
    }
)
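The tick fix-up above can also be sketched on its own, independent of the data source. This is an illustrative version that assumes the y values are integer milliseconds rather than the nanoseconds used above; format_lap and lap_ticks are hypothetical helper names:

```python
def format_lap(ms):
    """Format integer milliseconds as 'M:SS.sss'."""
    minutes, rem = divmod(ms, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{minutes}:{seconds:02d}.{millis:03d}"

def lap_ticks(values_ms, step_ms=500):
    """Return (tickvals, ticktext) on multiples of step_ms covering values_ms."""
    low = min(values_ms) // step_ms * step_ms        # round down to a step
    high = -(-max(values_ms) // step_ms) * step_ms   # round up to a step
    tickvals = list(range(low, high + 1, step_ms))
    return tickvals, [format_lap(v) for v in tickvals]

laps_ms = [83_921, 84_690, 84_790]
tickvals, ticktext = lap_ticks(laps_ms)
print(tickvals)  # [83500, 84000, 84500, 85000]
print(ticktext)  # ['1:23.500', '1:24.000', '1:24.500', '1:25.000']

# With a plotly figure `fig` whose y values are these millisecond integers:
# fig.update_yaxes(tickmode="array", tickvals=tickvals, ticktext=ticktext)
```

Snapping the ticks to round multiples of the step keeps the labels tidy; the 500 ms default is arbitrary and would need widening for a full race.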




Upvotes: 1
