Shenanigator
Shenanigator

Reputation: 1066

Pandas dtypes.to_dict giving too many values to unpack

I am using pandas to send data to a Redshift and in order to make sure that pandas is using the right VARCHAR lengths I am trying to create an auto mapping routine. I am doing this because without it I am getting an error that there is a string too long for VARCHAR(256).

here is my code to create the dict:

dtypes = df.dtypes.apply(lambda x: x.name).to_dict()

Here is the dict output:

{'IssueId': 'object', 'Title': 'object', 'Status': 'object', 'Severity': 'float64', 'NextStepAction': 'object', 'AssignedFolderLabel': 'object', 'Labels': 'object', 'RequesterIdentity': 'object', 'CreateDate': 'datetime64[ns, UTC]', 'LastUpdatedDate': 'datetime64[ns, UTC]', 'ResolvedDate': 'datetime64[ns, UTC]', 'AssigneeIdentity': 'object', 'Priority': 'float64', 'Rank': 'float64', 'PlanningEstimate': 'float64', 'EstimatedStartDate': 'datetime64[ns, UTC]', 'EstimatedCompletionDate': 'datetime64[ns, UTC]', 'ActualStartDate': 'datetime64[ns, UTC]', 'ActualCompletionDate': 'datetime64[ns, UTC]', 'NeedByDate': 'datetime64[ns, UTC]', 'IssueUrl': 'object', 'SubmitterIdentity': 'object', 'ResolvedByIdentity': 'object', 'region': 'object', 'cluster': 'object', 'az': 'object', 'site_id': 'object', 'requesting_org': 'object', 'request_type': 'object', 'requested_need_by_date': 'datetime64[ns, UTC]', 'fe_discipline': 'object', 'related_sim_link': 'object', 'playbook_link': 'object', 'fbn': 'object', 'design_sow_complete_date': 'datetime64[ns, UTC]', 'po_sent_to_vendor_date': 'datetime64[ns, UTC]', 'design_qc_checklist_uploaded': 'float64', 'peer_review_alias': 'object', 'as_built_submittal_rfi_uploaded_to_ds_ticket_link': 'object', 'total_number_of_positions': 'float64', 'reason_sla_was_not_met': 'object', 'will_this_feasibility_study_require_an_fe_design': 'float64', 'initial_kva_total': 'float64', 'fefr_fsb': 'object', 'eam_asset_id': 'float64', 'eam_equipment_class': 'object', 'internal_or_external': 'object', 'if_internal_select_one': 'object', 'if_external_is_the_vendor_onboarded': 'object', 'is_the_training_in_person_or_online': 'object', 'Snapshot': 'datetime64[ns]'}

Here is the code I was attempting to use to only keep the object types:

for k,v in dtypes:
    if v != 'object':
        dtypes.pop(k)

When I run the above it gives me ValueError: too many values to unpack (expected 2)

I can't figure out why, is it because of the date columns with the [] in them? I tried dtypes = df.dtypes.to_dict() as well and it is the same error.

Upvotes: 1

Views: 179

Answers (2)

yatu
yatu

Reputation: 88305

Seems like you could simplify the above to:

  • Keeping only Object type columns with select_dtypes
  • Building a dict with the desired structure with dtypes.to_dict()

df.select_dtypes('O').dtypes.to_dict()

Upvotes: 3

Poojan
Poojan

Reputation: 3519

  • You need to use .items method to get key, value pair.
for k,v in dtypes.items():
    if v != 'object':
        dtypes.pop(k)

Upvotes: 2

Related Questions