cssidy

How To Migrate RichTextField to StreamField?

In my blog post model I am trying to migrate a RichTextField to a StreamField. I have followed the Wagtail docs "Migrating RichTextFields to StreamField" including the section on migrating blog posts with revisions. They were not effective. How do I turn a RichTextField into a StreamField?

This is for a blog using Django 1.11.13, Wagtail 2.1 and PostgreSQL. I have over 200 blog posts, many of them with the Live+Draft status, meaning they have unpublished revisions. I inspected the blog posts in the database, and their body fields appear to be stored as HTML.

I copied the code from the docs and changed all references to match my own project. Upon running migrate, I got an AttributeError saying that "raw_text" was not found, so I added an exception handler to skip over it. I applied the migration and it completed with an OK.

Then in models.py I changed my class's body attribute from a RichTextField to a StreamField with a RichTextBlock. I also changed its content panel from a FieldPanel to a StreamFieldPanel. I applied this migration and it completed with an OK.

When I viewed some posts in Wagtail admin, all the posts with a Live+Draft status had been converted to RichTextBlocks inside StreamFields, but their content was wrapped inside a JSON object of the form {'rich_text': ''}. The JSON object was not styled like the rest of the text inside the editor. When I viewed those posts live, no data showed up, I assume because the template couldn't read JSON. All the blog posts with a Live status also had their RichTextField converted to a StreamField, but their content was empty: the data was gone from the editor, and when I viewed them live they were blank. However, when I inspect them in the database, their body fields still contain the HTML I saw previously.

[Screenshot: a Live+Draft post in admin]

[Screenshot: a Live post in admin]

After running the two migrations and seeing the odd data, I tried restoring a fresh copy of the database, but that didn't improve things.

template.html:

<section>
    {{ page.body }}
</section>

models.py before I ran the conversion migration:

class BlogPost(Page):

    body = RichTextField(blank=True)

    content_panels = Page.content_panels + [
            FieldPanel('body'),
        ]

migration.py (I added an exception handler for the AttributeError inside the page_to_streamfield() function because raw_text was not found):

# -*- coding: utf-8 -*-
# Generated by Django 1.11.13 on 2019-05-01 13:46
from __future__ import unicode_literals

import json

from django.core.serializers.json import DjangoJSONEncoder
from django.db import migrations, models

from wagtail.core.rich_text import RichText


def page_to_streamfield(page):
    changed = False
    try:
        if page.body.raw_text and not page.body:
            page.body = [('rich_text', {'rich_text': RichText(page.body.raw_text)})]
            changed = True
    except AttributeError:
        pass
    return page, changed


def pagerevision_to_streamfield(revision_data):
    changed = False
    body = revision_data.get('body')
    if body:
        try:
            json.loads(body)
        except ValueError:
            revision_data['body'] = json.dumps(
                [{
                    "value": {"rich_text": body},
                    "type": "rich_text"
                }],
                cls=DjangoJSONEncoder)
            changed = True
        else:
            # It's already valid JSON. Leave it.
            pass
    return revision_data, changed


def page_to_richtext(page):
    changed = False
    if page.body.raw_text is None:
        raw_text = ''.join([
            child.value['rich_text'].source for child in page.body
            if child.block_type == 'rich_text'
        ])
        page.body = raw_text
        changed = True
    return page, changed


def pagerevision_to_richtext(revision_data):
    changed = False
    body = revision_data.get('body', 'definitely non-JSON string')
    if body:
        try:
            body_data = json.loads(body)
        except ValueError:
            # It's not apparently a StreamField. Leave it.
            pass
        else:
            raw_text = ''.join([
                child['value']['rich_text'] for child in body_data
                if child['type'] == 'rich_text'
            ])
            revision_data['body'] = raw_text
            changed = True
    return revision_data, changed


def convert(apps, schema_editor, page_converter, pagerevision_converter):
    BlogPage = apps.get_model("blog", "BlogPost")
    for page in BlogPage.objects.all():

        page, changed = page_converter(page)
        if changed:
            page.save()

        for revision in page.revisions.all():
            revision_data = json.loads(revision.content_json)
            revision_data, changed = pagerevision_converter(revision_data)
            if changed:
                revision.content_json = json.dumps(revision_data, cls=DjangoJSONEncoder)
                revision.save()


def convert_to_streamfield(apps, schema_editor):
    return convert(apps, schema_editor, page_to_streamfield, pagerevision_to_streamfield)


def convert_to_richtext(apps, schema_editor):
    return convert(apps, schema_editor, page_to_richtext, pagerevision_to_richtext)


class Migration(migrations.Migration):

    dependencies = [
        # leave the dependency line from the generated migration intact!
        ('blog', 'previous_migration'),
    ]

    operations = [
        migrations.RunPython(
            convert_to_streamfield,
            convert_to_richtext,
        ),
    ]

models.py after running the previous migration (I manually changed the field to a StreamField and ran a second migration for just this change):

class BlogPost(Page):

    body = StreamField([
            ('rich_text', blocks.RichTextBlock())
        ], blank=True)

    content_panels = Page.content_panels + [
            StreamFieldPanel('body'),
        ]

I expected to see a blog post's data inside a StreamField within Wagtail admin, but instead it was blank or wrapped in a JSON object.


Answers (1)

cssidy

I was able to migrate a RichTextField to a StreamField with a RichTextBlock using the script below (it assumes a schema like the one built in the first three chapters of the Wagtail Getting Started tutorial). I found the process easier to think about as distinct steps: restore a fresh database from backup (or make a backup), schema migration, data migration, and admin/template alterations. I needed to loop through each blog post and all of its associated PageRevision objects. Editing the live published data was straightforward, but the drafts are stored as serialized JSON two levels deep, which made them tricky to work with. Hopefully this script helps others. Note: this script doesn't migrate in reverse.
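To illustrate what "two levels deep" means for drafts, here is a minimal sketch in plain Python with no Wagtail imports. It assumes a revision's content_json is a JSON string whose body entry is either raw HTML (before conversion) or a second JSON string holding a list of blocks (after conversion). The helper name wrap_body_as_stream and the sample data are made up for illustration. Unlike the migration, it skips bodies that already look converted.

```python
import json


def wrap_body_as_stream(content_json, block_type="paragraph"):
    """Hypothetical helper: re-encode a revision's HTML body as stream JSON."""
    data = json.loads(content_json)      # level 1: the page snapshot
    body = data.get("body")
    if not body:
        return content_json              # nothing to convert
    try:
        parsed = json.loads(body)        # level 2: is body already stream JSON?
    except ValueError:
        parsed = None                    # plain HTML is not valid JSON
    if isinstance(parsed, list):
        return content_json              # already a list of blocks, leave it
    # The value of a rich text block is the HTML string itself, not a dict.
    data["body"] = json.dumps([{"type": block_type, "value": body}])
    return json.dumps(data)


# Made-up sample revision payload.
old = json.dumps({"title": "Hello", "body": "<p>Hi there</p>"})
new = wrap_body_as_stream(old)
print(json.loads(json.loads(new)["body"]))
```

Calling the helper a second time on its own output returns the input unchanged, which makes it safer to re-run against a partially converted database.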

0004_convert_data.py

import json

from django.db import migrations
import wagtail.core.blocks  # referenced explicitly in the AlterField operation
import wagtail.core.fields
from wagtail.core.rich_text import RichText


def convert_data(apps, schema_editor):
    blog_page = apps.get_model('blog', 'BlogPage')
    for post in blog_page.objects.all():
        print('\n', post.title)
        # edit the live post: raw_text holds the old HTML when the stored
        # value could not be parsed as StreamField JSON
        if post.body.raw_text and not post.body:
            post.body = [('paragraph', RichText(post.body.raw_text))]
            print('Updated ' + post.title)
            post.save()

        # edit drafts associated with post
        if post.has_unpublished_changes:
            print(post.title + ' has drafts...')
            for rev in post.revisions.all():
                data = json.loads(rev.content_json)
                body = data['body']
                print(body)

                print('This is current JSON:', data, '\n')
                data['body'] = json.dumps([{
                    "type": "paragraph",
                    "value": body
                }])
                rev.content_json = json.dumps(data)
                print('This is updated JSON:', rev.content_json, '\n')

                rev.save()

        print('Completed ' + post.title + '.' + '\n')


class Migration(migrations.Migration):

    dependencies = [
        ('blog', '0003_blogpage_stream'),
    ]

    operations = [
        migrations.AlterField(
            model_name='blogpage',
            name='body',
            field=wagtail.core.fields.StreamField([('paragraph', wagtail.core.blocks.RichTextBlock())], blank=True),
        ),

        migrations.RunPython(convert_data),
    ]
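
The migration above has no reverse function. As a rough sketch of what the revision half of a reverse step might look like, the hypothetical helper below (unwrap_stream_body is not part of Wagtail or of the script above) joins the HTML of all blocks of one type back into a single string. It assumes every block value of that type is a plain HTML string, and it silently drops blocks of any other type, so treat it as a starting point rather than a tested reverse migration.

```python
import json


def unwrap_stream_body(content_json, block_type="paragraph"):
    """Hypothetical helper: turn a stream JSON body back into plain HTML."""
    data = json.loads(content_json)
    body = data.get("body")
    if not body:
        return content_json              # nothing to convert
    try:
        blocks = json.loads(body)
    except ValueError:
        return content_json              # body is already plain HTML
    if not isinstance(blocks, list):
        return content_json              # not a list of blocks, leave it
    # Concatenate the HTML of the matching blocks in their original order.
    data["body"] = "".join(
        block["value"] for block in blocks
        if isinstance(block, dict) and block.get("type") == block_type
    )
    return json.dumps(data)


# Made-up sample: a converted revision with two paragraph blocks.
stream = json.dumps({"body": json.dumps([
    {"type": "paragraph", "value": "<p>a</p>"},
    {"type": "paragraph", "value": "<p>b</p>"},
])})
print(json.loads(unwrap_stream_body(stream))["body"])
```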

models.py

from django.db import models

from wagtail.core.models import Page
from wagtail.core import blocks
from wagtail.core.fields import RichTextField, StreamField
from wagtail.admin.edit_handlers import FieldPanel, StreamFieldPanel
from wagtail.images.blocks import ImageChooserBlock
from wagtail.search import index


class BlogIndexPage(Page):
    intro = RichTextField(blank=True)

    content_panels = Page.content_panels + [
        FieldPanel('intro', classname="full")
    ]


class BlogPage(Page):
    date = models.DateField("Post date")
    intro = models.CharField(max_length=250)
    # body = RichTextField(blank=True)
    body = StreamField([
        ('paragraph', blocks.RichTextBlock()),
    ], blank=True)
    stream = StreamField([
        ('heading', blocks.CharBlock(classname="full title")),
        ('paragraph', blocks.RichTextBlock()),
        ('image', ImageChooserBlock()),
    ], blank=True)

    search_fields = Page.search_fields + [
        index.SearchField('intro'),
        index.SearchField('body'),
    ]

    content_panels = Page.content_panels + [
        FieldPanel('date'),
        FieldPanel('intro'),
        StreamFieldPanel('body'),
        StreamFieldPanel('stream'),
    ]


templates/blog/blog_page.html

{% extends "base.html" %}

{% load wagtailcore_tags %}

{% block body_class %}template-blogpage{% endblock %}

{% block content %}
    <h1>{{ page.title }}</h1>
    <p class="meta">{{ page.date }}</p>

    <div class="intro">{{ page.intro }}</div>

    {{ page.body }}

    <p><a href="{{ page.get_parent.url }}">Return to blog</a></p>

{% endblock %}
