Andrija
Andrija

Reputation: 1017

How can I store large amount of data from a database to XML (speed problem, part three)?

After getting some responses, the current situation is that I'm using this tip: http://www.ibm.com/developerworks/xml/library/x-tipbigdoc5.html (Listing 1. Turning ResultSets into XML), and XMLWriter for Java from http://www.megginson.com/downloads/ . Basically, it reads date from the database and writes them to a file as characters, using column names to create opening and closing tags. While doing so, I need to make two changes to the input stream, namely to the dates and numbers.

// Iterate over the set
        while (rs.next()) {
            w.startElement("row");
            for (int i = 0; i < count; i++) {
                Object ob = rs.getObject(i + 1);
                if (rs.wasNull()) {
                    ob = null;
                }
                String colName = meta.getColumnLabel(i + 1);
                if (ob != null ) {
                    if (ob instanceof Timestamp) {
                        w.dataElement(colName, Util.formatDate((Timestamp)ob, dateFormat)); 
                    }
                    else if (ob instanceof BigDecimal){
                        w.dataElement(colName, Util.transformToHTML(new Integer(((BigDecimal)ob).intValue())));
                    }
                    else {
                        w.dataElement(colName, ob.toString());
                    }

                } else {
                    w.emptyElement(colName);
                }

            }
            w.endElement("row");
        }

The SQL that gets the results has the to_number command (e.g. to_number(sif.ID) ID ) and the to_date command (e.g. TO_DATE (sif.datum_do, 'DD.MM.RRRR') datum_do). The problems are that the returning date is a timestamp, meaning I don't get 14.02.2010 but rather 14.02.2010 00:00:000 so I have to format it to the dd.mm.yyyy format. The second problem are the numbers; for some reason, they are in database as varchar2 and can have leading zeroes that need to be stripped; I'm guessing I could do that in my SQL with the trim function so the Util.transformToHTML is unnecessary (for clarification, here's the method):

public static String transformToHTML(Integer number) {
String result = "";
try {
  result = number.toString();
} catch (Exception e) {}
return result;

}

What I'd like to know is a) Can I get the date in the format I want and skip additional processing thus shortening the processing time? b) Is there a better way to do this? We're talking about XML files that are in the 50 MB - 250 MB filesize category.

Upvotes: 1

Views: 358

Answers (4)

ng.
ng.

Reputation: 7189

If you have control over the format of the XML you can use Simple XML. It offers a very high performance Java POJO to XML serialization and deserialization framework. You can also intercept the deserialization of various types like the dates you mention. By implementing a transform you can simply convert the date formats. The tutorial illustrates what is possible here Simple.

Upvotes: 0

Samuel
Samuel

Reputation: 2490

Have you tried SQLXML or dbms_xmlgen? It's probably the faster way, and you can get the result as a Blob. The output of dbms_xmlgen seems very similar to your output. Examples(source: http://awads.net/wp/2005/12/19/producing-xml-from-sql-using-cursor-expressions/) :

select dbms_xmlgen.getxml('
  2          select department_name
  3          from departments
  4          where department_id in (20,110)
  5          ') xml
  6  from dual
  7  /

XML
--------------------------------------------------
<?xml version="1.0"?>
<ROWSET>
 <ROW>
  <DEPARTMENT_NAME>Marketing</DEPARTMENT_NAME>
 </ROW>
 <ROW>
  <DEPARTMENT_NAME>Accounting</DEPARTMENT_NAME>
 </ROW>
</ROWSET>

select  xmlelement ("company",
           xmlagg (xmlelement ("department",
           xmlelement ("name", dept.department_name),
           xmlelement ("emps",
             (select xmlagg (xmlelement ("employee",
               xmlattributes (emp.employee_id as "id"),
               xmlforest (emp.first_name as "first_name")))
             from    employees emp
             where emp.department_id = dept.department_id)
                       )
                   )
           )
       ).getClobVal() xml
from departments dept

<company>
    <department>
        <name>Marketing</name>
        <emps>
            <employee id = "201">
                <first_name>Michael</first_name>
            </employee>
            <employee id = "202">
                <first_name>Pat</first_name>
            </employee>
        </emps>
    </department>
    <department>
        <name>Accounting</name>
        <emps>
            <employee id = "205">
                <first_name>Shelley</first_name>
            </employee>
            <employee id = "206">
                <first_name>William</first_name>
            </employee>
        </emps>
    </department>
</company>

As you see, you can get the output as a clob with getclobval().

Formatting the date is pretty trivial with to_char and to_number() know that leading zeroes are meaningless:

  > select to_char(sysdate, 'ddmmyyyy') from dual;

TO_CHAR(
--------
08062010
> select to_number('0003') from dual;

TO_NUMBER('0003')
-----------------
                3

Upvotes: 3

Matschie
Matschie

Reputation: 1247

If you're using MSSQL you should try the following for you date issue:

SELECT CONVERT(VARCHAR(10), sif.datum_do, 104) AS datum_do_format

For your second problem I don't have a solution right now, I'm sorry.

Upvotes: 0

cyphorious
cyphorious

Reputation: 829

I'm sorry, if I don't give an exact answer to your question, but you should have a look at the castor project. It provides a very good API for transferring xml-data to java objects to database persistence and vice versa.

Upvotes: 0

Related Questions