Delphine Massoud
Delphine Massoud

Reputation: 1

Use component tExtractJSONFields in Talend

I'm a beginner with Talend. I try to create a new job where I read JSON data by tREST component. Then I want to extract data to insert into a MySql database.

My component tREST give me out the following data in JSON format (validated on jsonlint)

{
"expand": "names,schema",
"startAt": 0,
"maxResults": 50,
"total": 1,
"issues": [
    {
        "expand": "editmeta,renderedFields,transitions,changelog,operations",
        "id": "97777",
        "self": "https://issuetracker.aaaaa.com/jira/rest/api/2/issue/97777",
        "key": "MMMMMM-888",
        "fields": {
            "progress": {
                "progress": 7200,
                "total": 7200,
                "percent": 100
            },
            "summary": "Generate package for 88888888 1.0.3",
            "customfield_10083": "1_*:*_1_*:*_363000_*|*_10000_*:*_1_*:*_10000_*|*_5_*:*_1_*:*_0",
            "customfield_10082": null,
            "customfield_10360": null,
            "issuetype": {
                "self": "https://issuetracker.aaaaa.com/jira/rest/api/2/issuetype/3",
                "id": "3",
                "description": "A task that needs to be done.",
                "iconUrl": "https://issuetracker.aaaaa.com/jira/images/icons/issuetypes/task.png",
                "name": "Task",
                "subtask": false
            },
            "customfield_10110": null,
            "customfield_10260": null,
            "customfield_10261": null,
            "resolution": {
                "self": "https://issuetracker.aaaaa.com/jira/rest/api/2/resolution/1",
                "id": "1",
                "description": "A fix for this issue is checked into the tree and tested.",
                "name": "Fixed"
            },
            "fixVersions": [],
            "resolutiondate": "2014-12-05T17:24:01.000+0100",
            "timespent": 7200,
            "reporter": {
                "self": "https://issuetracker.aaaaa.com/jira/rest/api/2/user?username=ext_aaaaaa_dupont",
                "name": "ext_aaaaaa_dupont",
                "emailAddress": "[email protected] ",
                "avatarUrls": {
                    "16x16": "https://issuetracker.aaaaa.com/jira/secure/useravatar?size=small&avatarId=10122",
                    "48x48": "https://issuetracker.aaaaa.com/jira/secure/useravatar?avatarId=10122"
                },
                "displayName": "Jean-Philippe Dupont",
                "active": true
            },
            "aggregatetimeoriginalestimate": null,
            "updated": "2014-12-05T17:24:01.000+0100",
            "created": "2014-12-05T17:17:48.000+0100",
            "description": null,
            "priority": {
                "self": "https://issuetracker.aaaaa.com/jira/rest/api/2/priority/7",
                "iconUrl": "https://issuetracker.aaaaa.com/jira/images/icons/priority_normal.gif",
                "name": "Normal",
                "id": "7"
            },
            "customfield_10120": null,
            "duedate": null,
            "issuelinks": [],
            "watches": {
                "self": "https://issuetracker.aaaaa.com/jira/rest/api/2/issue/MMMMMM-815/watchers",
                "watchCount": 0,
                "isWatching": false
            },
            "customfield_10101": "0",
            "subtasks": [],
            "customfield_10100": null,
            "status": {
                "self": "https://issuetracker.aaaaa.com/jira/rest/api/2/status/5",
                "description": "A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.",
                "iconUrl": "https://issuetracker.aaaaa.com/jira/images/icons/statuses/resolved.png",
                "name": "Resolved",
                "id": "5"
            },
            "customfield_10090": {
                "self": "https://issuetracker.aaaaa.com/jira/rest/api/2/customFieldOption/10071",
                "value": "Maintenance",
                "id": "10071"
            },
            "labels": [],
            "workratio": -1,
            "assignee": {
                "self": "https://issuetracker.aaaaa.com/jira/rest/api/2/user?username=ext_aaaaaa_dupont",
                "name": "ext_aaaaaa_dupont",
                "emailAddress": "[email protected] ",
                "avatarUrls": {
                    "16x16": "https://issuetracker.aaaaa.com/jira/secure/useravatar?size=small&avatarId=10122",
                    "48x48": "https://issuetracker.aaaaa.com/jira/secure/useravatar?avatarId=10122"
                },
                "displayName": "Jean-Philippe Dupont",
                "active": true
            },
            "aggregatetimeestimate": 0,
            "project": {
                "self": "https://issuetracker.aaaaa.com/jira/rest/api/2/project/MMMMMM",
                "id": "10865",
                "key": "MMMMMM",
                "name": "Third Part Maintenance",
                "avatarUrls": {
                    "16x16": "https://issuetracker.aaaaa.com/jira/secure/projectavatar?size=small&pid=10865&avatarId=10299",
                    "48x48": "https://issuetracker.aaaaa.com/jira/secure/projectavatar?pid=10865&avatarId=10299"
                }
            },
            "versions": [],
            "customfield_10170": null,
            "environment": null,
            "timeestimate": 0,
            "aggregateprogress": {
                "progress": 7200,
                "total": 7200,
                "percent": 100
            },
            "lastViewed": null,
            "customfield_10464": null,
            "customfield_10463": null,
            "components": [],
            "customfield_10462": null,
            "timeoriginalestimate": null,
            "customfield_10461": null,
            "customfield_10460": null,
            "aggregatetimespent": 7200
        }
    }
]

}

I get the following error on the tExtractJSONFields component:

Error on line 1 of document : Le contenu des éléments doit inclure un balisage ou des caractères au format correct. Nested exception: Le contenu des éléments doit inclure un balisage ou des caractères au format correct.

with the following settings:

JSON Field : Body
Loop XPath Query : "/issues"
Mapping | Column | XPath query
    | id     |  "./id"

I don't see what's wrong with my setup ...

1

Upvotes: 0

Views: 5858

Answers (3)

Vinayak Bhat
Vinayak Bhat

Reputation: 1

  1. Check if the input row has Body as JSON field

  2. Change the 'Loop Xpath Query' to "/issues[*]"

  3. Add a tLogRow component after tREST component and filter only the Body, i.e block the ERROR_CODE which is generated and added by tREST component.

Upvotes: 0

Charudatta
Charudatta

Reputation: 45

Please find below the JIRA link from TalendForge tExtractJsonFields can't process Unicode

There is an issue with how tExtractJsonFields parses Json when encountered with the invalid xml characters.They have provided json-path in version 5.6.2, this can be used instead of Xpath Or else you can go with custom java code in Talend.Here is the link from stack overflow Json Parsing using custom routine in talend

Upvotes: 0

user4542931
user4542931

Reputation:

"16x16": "https://issuetracker.aaaaa.com/jira/secure/projectavatar?size=small&pid=10865&avatarId=10299",
                "48x48": "https://issuetracker.aaaaa.com/jira/secure/projectavatar?pid=10865&avatarId=10299"
            }

There are places where such tags appear "16x16" and "48x48". These tags may be valid in json but not in xml and the way things work in talend, the json will be first converted to xml and then extracted. These tags are then considered invalid in their xml format and cause issues.

Upvotes: 1

Related Questions