holys

Reputation: 14769

Confusing output from Python yaml dump

Say I have a JSON file like the one below, called src.json.

{
    "main": {
        "contenttype": "Document"
    },
    "dublin": {
        "title": "ダウンロード",
        "description": "",
        "creators": [
            "池田大作"
         ],
        "created": "2012-04-23 10:09:34.642940"
    }
}

I want to dump the contents of dublin to dst.yaml using the yaml module in Python.

Here is my code:

import json
import yaml

with open('src.json') as f:
    data = json.load(f)

dump = {'title': data.get('dublin', {}).get('title', ''),
        'description': data.get('dublin', {}).get('description', ''),
        'creator': data.get('dublin', {}).get('creators',[''])[0],
        'created': data.get('dublin', {}).get('created', '')
       }
with open('dst.yaml', 'w') as stream:
    yaml.safe_dump(dump, stream, allow_unicode=True)

However, I am not satisfied with the result:

{created: '2010-03-26 09:26:44.002029', creator: 池田大作, description: '    ', title: ダウンロード}     

The output I want is:

created: 2010-03-26 09:26:44.002029
creator: 池田大作
description: ''
title: ダウンロード

So my questions are:

  1. Why does '2010-03-26 09:26:44.002029' have single quotes while title and creator do not? How can I remove the single quotes around the date?
  2. Where are the line breaks? I expected the YAML dump to put each key on its own line.

Can anyone help?

Upvotes: 4

Views: 4543

Answers (1)

Amber

Reputation: 526553

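  1. Because the value is a string that looks like a YAML timestamp. Written without quotes, it would be read back as a timestamp rather than a string, so the dumper quotes it to keep it a string. A space alone does not cause quoting. To drop the quotes, dump a datetime object instead of a string.
  2. Pass default_flow_style=False to yaml.safe_dump() to get block style, with one key per line. PyYAML 5.1 and later use this as the default.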

Also, you don't need to do all of that reconstruction; you could just dump the existing dublin dict directly:

import json
import yaml

with open('src.json') as f:
    data = json.load(f)

with open('dst.yaml', 'w') as stream:
    yaml.safe_dump(data.get('dublin', {}), stream, allow_unicode=True,
                   default_flow_style=False)
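
If you also want the date without quotes, one option is to parse it into a datetime before dumping. This is only a sketch: the to_yaml helper is a name made up here, the inline JSON stands in for src.json, and the format string assumes created always looks like the sample value. Note that the value then loads back as a datetime, not a string.

```python
import json
from datetime import datetime

import yaml

# Inline stand-in for the "dublin" section of src.json (same sample data).
src = json.loads('''
{
    "dublin": {
        "title": "ダウンロード",
        "description": "",
        "creators": ["池田大作"],
        "created": "2012-04-23 10:09:34.642940"
    }
}
''')


def to_yaml(dublin):
    """Hypothetical helper: dump a dublin dict as block-style YAML."""
    doc = dict(dublin)  # shallow copy so the caller's dict is not modified
    if doc.get('created'):
        # Assumption: the timestamp always has this exact format.
        # A datetime is emitted as an unquoted YAML timestamp.
        doc['created'] = datetime.strptime(doc['created'],
                                           '%Y-%m-%d %H:%M:%S.%f')
    return yaml.safe_dump(doc, allow_unicode=True, default_flow_style=False)


text = to_yaml(src['dublin'])
print(text)
```

To write the result to dst.yaml, pass the open stream as the second argument to yaml.safe_dump, as in the snippet above.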

Upvotes: 6
