Reputation: 160
I have a huge excel file that I'm trying to sort by extracting the last value (column D) collected at the end of each day (column B). These data are coming from a rain gauge. Values in column D always increases during the day (but can also stay at 0) and goes back to 0 after midnight. Data look like this:
nb | date | mm | avg |
---|---|---|---|
4669 | 23:39 23/9/2020 | 0.0 | 2.3 |
4670 | 23:44 23/9/2020 | 0.8 | 2.6 |
4671 | 23:49 23/9/2020 | 1.6 | 2.9 |
4672 | 23:54 23/9/2020 | 0.8 | 2.9 |
4673 | 23:59 23/9/2020 | 0.0 | 2.9 |
4674 | 0:04 24/9/2020 | 0.0 | 0.0 |
4675 | 0:09 24/9/2020 | 0.0 | 0.0 |
4769 | 23:39 24/9/2020 | 0.0 | 12.3 |
4770 | 23:44 24/9/2020 | 1.8 | 12.6 |
4771 | 23:49 24/9/2020 | 3.6 | 12.9 |
4772 | 23:54 24/9/2020 | 1.8 | 12.9 |
4773 | 23:59 24/9/2020 | 0.0 | 12.9 |
4774 | 0:04 25/9/2020 | 0.0 | 0.0 |
4775 | 0:09 25/9/2020 | 0.0 | 0.0 |
How can I do to simplify my table with only 1 value per day (last or maximum)?
Thank you
Upvotes: 0
Views: 308
Reputation: 160
I actually found a way, thanks to @JvdV AND @MrRightSA for their recommendations. This is what I did:
I don't think I would have found without your help. And I learned new stuff! I'm not giving up with Excel :) Thanks again.
Upvotes: 0
Reputation: 90
I would put this through PowerQuery.
You now have one value, the latest update, for each date.
The reason for the index column is because if you don't do this, the data will not remove duplicates based on the order. This forces the cache to update and store the new order of highest to lowest in step 4.
e: data posted instead of picture so here is the actual PowerQuery of it;
let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"nb", Int64.Type}, {"date", type date}, {"mm", type number}, {"avg", type number}}),
#"Sorted Rows" = Table.Sort(#"Changed Type",{{"nb", Order.Descending}}),
#"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1),
#"Removed Duplicates" = Table.Distinct(#"Added Index", {"date"}),
#"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates",{"Index"})
in
#"Removed Columns"
To use this;
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
this line will change depending on what table this data is. So effectively just keep your line of this, not mine.
If this does not work due to the date not being a date, try the below code as it splits by space as a delimiter and then assigns the column as a date format;
let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Split Column by Delimiter" = Table.SplitColumn(Table.TransformColumnTypes(Source, {{"date", type text}}, "en-GB"), "date", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"date.1", "date.2"}),
#"Renamed Columns" = Table.RenameColumns(#"Split Column by Delimiter",{{"date.2", "date"}}),
#"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"date", type date}, {"nb", Int64.Type}, {"mm", type number}, {"avg", type number}}),
#"Sorted Rows" = Table.Sort(#"Changed Type",{{"nb", Order.Descending}}),
#"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1),
#"Removed Duplicates" = Table.Distinct(#"Added Index", {"date"}),
#"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates",{"Index", "date.1"})
in
#"Removed Columns"
Upvotes: 2
Reputation: 75840
I concur that PowerQuery could be a very nice go-to solution for you. But it's also possible with formulas, for example:
Formula in F2
:
=FILTER(A2:D15,B2:B15=XLOOKUP(INT(B2:B15),INT(B2:B15),B2:B15,,0,-1))
Upvotes: 1