Reputation: 13
I have two lists in Python. Each element of the first is a tuple of a string and an integer:
delta[0:10]
[('conhecimento', 17),
('ciência', 14),
('interdisciplinaridade', 13),
('saber', 10),
('objeto', 10),
('pode', 10),
('processo', 9),
('conceito', 9),
('assim', 8),
('mundo', 8)]
Each element of the second list is a tuple of a string and a list of integers:
echo[0:10]
[('mundo', [2024]),
('assim', [2022]),
('conceito', [1599, 1602, 1862, 1865]),
('processo', [1949, 1963, 1972]),
('pode', [2018]),
('objeto', [1566, 1605]),
('saber', [2016]),
('interdisciplinaridade', [2014]),
('ciência', [2013,756]),
('conhecimento', [2011, 2223])]
Both lists have the same length because they were built from the same dataset, so they contain the same strings.
len(echo)
1398
len(delta)
1398
Every string appears in both lists, but in a different order. I need to build a third list in which each element holds the string, followed by its integer from the first list and its list of values from the second list. The final merged list should look like this:
final[0:4]
[('conhecimento', 17, [2011, 2223]),
('ciência', 14, [2013,756]),
('interdisciplinaridade', 13, [2014]),
('saber', 10, [2016])]
If possible, I would also like a way to sort the final list by the second element (the integer), and another way to sort it by the highest value in the third element (the list).
Thanks in advance!
Upvotes: 1
Views: 886
Reputation: 31
The pandas package handles this easily.
First, set up a DataFrame from each list:
import pandas as pd
# create both dataframes
# transform delta values into two columns dataframe
delta = [('conhecimento', 17),
('ciência', 14),
('interdisciplinaridade', 13),
('saber', 10),
('objeto', 10),
('pode', 10),
('processo', 9),
('conceito', 9),
('assim', 8),
('mundo', 8)]
delta_ids = [i[0] for i in delta]
delta_values = [i[1] for i in delta]
df1 = pd.DataFrame(
{'id': delta_ids, 'delta': delta_values}
)
echo = [('mundo', [2024]),
('assim', [2022]),
('conceito', [1599, 1602, 1862, 1865]),
('processo', [1949, 1963, 1972]),
('pode', [2018]),
('objeto', [1566, 1605]),
('saber', [2016]),
('interdisciplinaridade', [2014]),
('ciência', [2013,756]),
('conhecimento', [2011, 2223])]
echo_ids = [i[0] for i in echo]
echo_values = [i[1] for i in echo]
df2 = pd.DataFrame(
{'id': echo_ids, 'echo': echo_values}
)
Then merge both DataFrames on "id":
df_merged = df1.merge(df2, how="left", on="id")
# sort merged dataframe based on descending delta value
df_merged = df_merged.sort_values(by="delta", ascending=False)
# output the final dataframe
print(df_merged)
# output in list form
output = df_merged.values.tolist()
print(output)
Output:
                      id  delta                      echo
0           conhecimento     17              [2011, 2223]
1                ciência     14               [2013, 756]
2  interdisciplinaridade     13                    [2014]
3                  saber     10                    [2016]
4                 objeto     10              [1566, 1605]
5                   pode     10                    [2018]
6               processo      9        [1949, 1963, 1972]
7               conceito      9  [1599, 1602, 1862, 1865]
8                  assim      8                    [2022]
9                  mundo      8                    [2024]
[['conhecimento', 17, [2011, 2223]], ['ciência', 14, [2013, 756]], ['interdisciplinaridade', 13, [2014]], ['saber', 10, [2016]], ['objeto', 10, [1566, 1605]], ['pode', 10, [2018]], ['processo', 9, [1949, 1963, 1972]], ['conceito', 9, [1599, 1602, 1862, 1865]], ['assim', 8, [2022]], ['mundo', 8, [2024]]]
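For the second sort order requested in the question (by the highest value in each list), one possible sketch in pandas is to derive a helper column first. The column name `echo_max` is an assumption introduced here, the small frame below only stands in for `df_merged`, and the approach assumes every row actually holds a non-empty list (a left merge would leave NaN for an unmatched id, and `max` would then fail):

```python
import pandas as pd

# Small stand-in for df_merged with the same three columns.
df_merged = pd.DataFrame(
    {
        "id": ["conhecimento", "ciência", "saber"],
        "delta": [17, 14, 10],
        "echo": [[2011, 2223], [2013, 756], [2016]],
    }
)

# Highest value of each list in "echo" (helper column, hypothetical name).
df_merged["echo_max"] = df_merged["echo"].map(max)

# Sort by the helper column, largest first, then drop it again.
by_echo_max = (
    df_merged.sort_values(by="echo_max", ascending=False)
    .drop(columns="echo_max")
    .reset_index(drop=True)
)

print(by_echo_max.values.tolist())
```

Keeping the helper column instead of dropping it is equally reasonable if the maximum is useful later on.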
Upvotes: 1
Reputation: 38502
You can do this by converting one list of tuples into a dict for lookups,
then iterating over the other list and appending each combined tuple, as below:
delta = [
("conhecimento", 17),
("ciência", 14),
("interdisciplinaridade", 13),
("saber", 10),
("objeto", 10),
("pode", 10),
("processo", 9),
("conceito", 9),
("assim", 8),
("mundo", 8),
]
echo = [
("mundo", [2024]),
("assim", [2022]),
("conceito", [1599, 1602, 1862, 1865]),
("processo", [1949, 1963, 1972]),
("pode", [2018]),
("objeto", [1566, 1605]),
("saber", [2016]),
("interdisciplinaridade", [2014]),
("ciência", [2013, 756]),
("conhecimento", [2011, 2223]),
]
final = []
lookup = dict(echo)
for a, b in delta:
    final.append((a, b, lookup.get(a)))
print(final)
Output:
[
("conhecimento", 17, [2011, 2223]),
("ciência", 14, [2013, 756]),
("interdisciplinaridade", 13, [2014]),
("saber", 10, [2016]),
("objeto", 10, [1566, 1605]),
("pode", 10, [2018]),
("processo", 9, [1949, 1963, 1972]),
("conceito", 9, [1599, 1602, 1862, 1865]),
("assim", 8, [2022]),
("mundo", 8, [2024]),
]
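The question also asks for two sort orders, which the loop above does not cover. A minimal sketch, assuming `final` has the `(word, count, positions)` shape shown in the output and that every positions list is non-empty (`max` raises `ValueError` on an empty list); the names `by_count` and `by_max_position` are only illustrative:

```python
# Small sample in the same shape as `final` above.
final = [
    ("conhecimento", 17, [2011, 2223]),
    ("ciência", 14, [2013, 756]),
    ("interdisciplinaridade", 13, [2014]),
    ("saber", 10, [2016]),
]

# Sort by the integer (second element), largest first.
by_count = sorted(final, key=lambda item: item[1], reverse=True)

# Sort by the highest value in the list (third element), largest first.
by_max_position = sorted(final, key=lambda item: max(item[2]), reverse=True)

print([word for word, _, _ in by_max_position])
```

`sorted` returns a new list and leaves `final` untouched; `final.sort(key=...)` with the same arguments would sort in place instead. Both are stable, so elements with equal keys keep their original relative order.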
Upvotes: 1
Reputation: 1491
As mentioned above, pandas is a natural fit for this kind of task. Here is another approach:
import pandas as pd
res = (pd.concat([pd.Series(dict(delta)),pd.Series(dict(echo))],axis=1)
.reset_index().values.tolist())
>>> res
[['conhecimento', 17, [2011, 2223]],
['ciência', 14, [2013, 756]],
['interdisciplinaridade', 13, [2014]],
['saber', 10, [2016]],
['objeto', 10, [1566, 1605]],
['pode', 10, [2018]],
['processo', 9, [1949, 1963, 1972]],
['conceito', 9, [1599, 1602, 1862, 1865]],
['assim', 8, [2022]],
['mundo', 8, [2024]]]
This works as long as all string elements are present in both lists, as the question states; any string missing from one list gets NaN in that column.
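That precondition can be checked before merging. A rough sketch using plain sets, with small stand-in lists and one key left out on purpose; `only_in_delta` and `only_in_echo` are illustrative names, not part of any answer above:

```python
delta = [("conhecimento", 17), ("ciência", 14), ("saber", 10)]
# "ciência" is deliberately missing here to show a mismatch.
echo = [("saber", [2016]), ("conhecimento", [2011, 2223])]

delta_keys = {word for word, _ in delta}
echo_keys = {word for word, _ in echo}

# Words that appear in one list but not in the other.
only_in_delta = delta_keys - echo_keys
only_in_echo = echo_keys - delta_keys

print(only_in_delta)
print(only_in_echo)
```

If both sets come back empty, the two lists share exactly the same strings. Note also that `dict(delta)` and `dict(echo)` keep only the last value for a repeated string, so duplicates within one list would need separate handling.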
Upvotes: 1