diborbi
diborbi

Reputation: 83

MySQL insert data from XML file

I have one xml file like this:

<dbReference type="PM" id="17224074"/>
<dbReference type="DOI" id="10.1186/bcr1637"/>
</citation>
<scope>VARIANTS ILE-282 AND ASN-777</scope>
</reference>
<comment type="function">
<text evidence="24">Calcium.</text>
</comment>
<comment type="function">
<text evidence="24">Has a strong inhibitory effect on APP C99 and C83 production.</text>
</comment>
<comment type="subunit">
<text evidence="5 13">Homodimer; disulfide-linked.</text>
</comment>
<comment type="interaction">
<interactant intactId="EBI-727477"/>
<interactant intactId="EBI-7644904">
<id>Q9JIY2</id>
<label>Cbll1</label>
</interactant>
<organismsDiffer>true</organismsDiffer>
<experiments>21</experiments>
</comment>

I want to extract only the information in

<comment type="function">...</comment>

In this example being: 'Calcium.' AND 'as a strong inhibitory effect on APP C99 and C83 production.'

I have this table where I want to save the data:

CREATE TABLE IF NOT EXISTS INFORMATION (id varchar(255) NOT NULL, name varchar(255), entry varchar(255), comment longtext, PRIMARY KEY (id)); 

where I would save 'Calcium.' AND 'as a strong inhibitory effect on APP C99 and C83 production.' in column named 'comment'. I thought that I could insert this info directly from the xml to the table with LOAD XML but my xml file has too many different fields. How do I do this? Do I have to first extract the data from the xml and then insert into the table?

Upvotes: 0

Views: 87

Answers (2)

wchiquito
wchiquito

Reputation: 16551

One option that can be useful:

File: '/path/to/xml/file/xmlfile.xml'

<dbReference type="PM" id="17224074"/>
<dbReference type="DOI" id="10.1186/bcr1637"/>
</citation>
<scope>VARIANTS ILE-282 AND ASN-777</scope>
</reference>
<comment type="function">
    <text evidence="24">Calcium.</text>
</comment>
<comment type="function">
    <text evidence="24">Has a strong inhibitory effect on APP C99 and C83 production.</text>
</comment>
<comment type="subunit">
    <text evidence="5 13">Homodimer; disulfide-linked.</text>
</comment>
<comment type="interaction">
    <interactant intactId="EBI-727477"/>
    <interactant intactId="EBI-7644904">
        <id>Q9JIY2</id>
        <label>Cbll1</label>
    </interactant>
    <organismsDiffer>true</organismsDiffer>
    <experiments>21</experiments>
</comment>

MySQL Command-line:

mysql> DROP TEMPORARY TABLE IF EXISTS `temp_information`;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> DROP TABLE IF EXISTS `information`;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> CREATE TABLE IF NOT EXISTS `information` (
    -> `id` VARCHAR(255) NOT NULL,
    -> `name` VARCHAR(255),
    -> `entry` VARCHAR(255),
    -> `comment` LONGTEXT,
    -> PRIMARY KEY (`id`)
    -> );
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TEMPORARY TABLE IF NOT EXISTS `temp_information` (
    -> `id` INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    -> `type` VARCHAR(20),
    -> `text` TEXT,
    -> `evidence` VARCHAR(20)
    -> );
Query OK, 0 rows affected (0.00 sec)

mysql> LOAD XML INFILE '/path/to/xml/file/xmlfile.xml'
    -> INTO TABLE `temp_information`
    -> ROWS IDENTIFIED BY '<comment>';
Query OK, 4 rows affected (0.00 sec)
Records: 4  Deleted: 0  Skipped: 0  Warnings: 0

mysql> INSERT INTO `information` (`id`, `comment`)
    -> SELECT UUID(), GROUP_CONCAT(`text` ORDER BY `id` SEPARATOR ' ')
    -> FROM
    -> `temp_information`
    -> WHERE
    -> `type` = 'function'
    -> GROUP BY `evidence`;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> SELECT
    -> `id`,
    -> `name`,
    -> `entry`,
    -> `comment`
    -> FROM
    -> `information`;
+--------------------------------------+------+-------+------------------------------------------------------------------------+
| id                                   | name | entry | comment                                                                |
+--------------------------------------+------+-------+------------------------------------------------------------------------+
| e720d259-fcde-11e5-be3f-a4badbf9ce21 | NULL | NULL  | Calcium. Has a strong inhibitory effect on APP C99 and C83 production. |
+--------------------------------------+------+-------+------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> DROP TEMPORARY TABLE IF EXISTS `temp_information`;
Query OK, 0 rows affected (0.00 sec)

Check:

Upvotes: 2

Paul Ostrowski
Paul Ostrowski

Reputation: 1966

use an XML parser (such as SAXParser) to parse the file first, then iterate over nodes, looking for comment nodes. For each one that has a type of "function", put the node text into the DB.

Upvotes: 1

Related Questions