Reputation: 2419
I am looking at building lists of lists within a dictionary from an Excel spreadsheet.
My spreadsheet looks like this:
source_item_id | target_item_id | find_string | replace_string |
---|---|---|---|
source_id1 | target_id1 | abcd1 | efgh1 |
source_id1 | target_id1 | ijkl1 | mnop1 |
source_id1 | target_id2 | abcd2 | efgh2 |
source_id1 | target_id2 | ijkl2 | mnop2 |
source_id2 | target_id3 | qrst | uvwx |
source_id2 | target_id3 | yzab | cdef |
source_id2 | target_id4 | ghij | klmn |
source_id2 | target_id4 | opqr | stuv |
My output dictionary should look like this:
{
    "source_id1": [
        {"target_id1": [
            {"find_string": "abcd1", "replace_string": "efgh1"},
            {"find_string": "ijkl1", "replace_string": "mnop1"}
        ]},
        {"target_id2": [
            {"find_string": "abcd2", "replace_string": "efgh2"},
            {"find_string": "ijkl2", "replace_string": "mnop2"}
        ]}
    ],
    "source_id2": [
        {"target_id3": [
            {"find_string": "qrst", "replace_string": "uvwx"},
            {"find_string": "yzab", "replace_string": "cdef"}
        ]},
        {"target_id4": [
            {"find_string": "ghij", "replace_string": "klmn"},
            {"find_string": "opqr", "replace_string": "stuv"}
        ]}
    ]
}
With the following code I only get the last values in each of the lists:
import xlrd

xls_path = r"C:\data\ItemContent.xlsx"
book = xlrd.open_workbook(xls_path)
sheet_find_replace = book.sheet_by_index(1)
find_replace_dict = dict()
for line in range(1, sheet_find_replace.nrows):
    source_item_id = sheet_find_replace.cell(line, 0).value
    target_item_id = sheet_find_replace.cell(line, 1).value
    find_string = sheet_find_replace.cell(line, 2).value
    replace_sting = sheet_find_replace.cell(line, 3).value
    find_replace_list = [{"find_string": find_string, "replace_sting": replace_sting}]
    find_replace_dict[source_item_id] = [target_item_id]
    find_replace_dict[source_item_id].append(find_replace_list)
print(find_replace_dict)
--> result
{
    "source_id1": ["target_id2", [{"find_string": "ijkl2", "replace_sting": "mnop2"}]],
    "source_id2": ["target_id4", [{"find_string": "opqr", "replace_sting": "stuv"}]]
}
Upvotes: 0
Views: 89
Reputation: 21269
Your problem is complicated by the fact that the value for each source id is a list of single-key dictionaries. You can still follow a simple pattern: parse each line for the relevant items, then use them to find where to append, creating a new list or dictionary when none exists yet:
from typing import List, Tuple

import xlrd  # Note: xlrd 2.0+ reads only .xls; .xlsx needs an older xlrd

def process_line(line: int) -> Tuple[str, str, dict]:
    # Read one row; relies on the module-level sheet_find_replace defined below
    source_item_id = sheet_find_replace.cell(line, 0).value
    target_item_id = sheet_find_replace.cell(line, 1).value
    find_string = sheet_find_replace.cell(line, 2).value
    replace_string = sheet_find_replace.cell(line, 3).value
    return source_item_id, target_item_id, {
        "find_string": find_string,
        "replace_string": replace_string,
    }

def find_target(target: str, ls: List[dict]) -> int:
    # Find the index of the dictionary keyed by the target id
    for i, item in enumerate(ls):
        if target in item:
            return i
    return -1  # Marker for "not found"

xls_path = r"C:\data\ItemContent.xlsx"
book = xlrd.open_workbook(xls_path)
sheet_find_replace = book.sheet_by_index(1)
result_dict = dict()
for line in range(1, sheet_find_replace.nrows):
    source, target, replacer = process_line(line)
    # You can check here that the above three are correct
    source_list = result_dict.get(source, [])  # Leverage the default value of the get function
    target_idx = find_target(target, source_list)
    target_dict = source_list[target_idx] if target_idx >= 0 else {}
    replace_list = target_dict.get(target, [])
    replace_list.append(replacer)
    target_dict[target] = replace_list
    if target_idx >= 0:
        source_list[target_idx] = target_dict
    else:
        source_list.append(target_dict)
    result_dict[source] = source_list
print(result_dict)
I would note that if each source id pointed to a dictionary rather than a list, this could be radically simplified: we would not need to search the list for an already-existing item and then awkwardly replace or append as needed. If you can change this constraint (remember, you can always convert a dictionary to a list downstream), I would consider doing that.
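As a rough sketch of that simplification, assuming the rows have already been read into plain tuples: the `rows` list below is a hypothetical stand-in for the spreadsheet, and `build_nested` and `to_list_shape` are names made up for illustration. Grouping into nested dictionaries with `dict.setdefault` avoids the index search entirely, and the list-of-single-key-dictionaries shape can be produced in one pass afterwards.

```python
# Hypothetical in-memory rows standing in for the spreadsheet contents
rows = [
    ("source_id1", "target_id1", "abcd1", "efgh1"),
    ("source_id1", "target_id1", "ijkl1", "mnop1"),
    ("source_id1", "target_id2", "abcd2", "efgh2"),
    ("source_id2", "target_id3", "qrst", "uvwx"),
    ("source_id2", "target_id4", "ghij", "klmn"),
]


def build_nested(rows):
    """Group rows as {source: {target: [replacement, ...]}}."""
    nested = {}
    for source, target, find_string, replace_string in rows:
        # setdefault returns the existing value or inserts the default first
        nested.setdefault(source, {}).setdefault(target, []).append(
            {"find_string": find_string, "replace_string": replace_string}
        )
    return nested


def to_list_shape(nested):
    """Convert to {source: [{target: [...]}, ...]} as asked for in the question."""
    return {
        source: [{target: replacements} for target, replacements in targets.items()]
        for source, targets in nested.items()
    }


nested = build_nested(rows)
result = to_list_shape(nested)
print(result)
```

With the real sheet, the tuples could come from the same `sheet_find_replace.cell(line, col).value` reads used above. Since dictionaries preserve insertion order in Python 3.7+, the targets come out in the order they first appear in the sheet.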
Upvotes: 1