Reputation: 1574
I am using lxml to create an XML file. My sample program is:
from lxml import etree
import datetime
dt=datetime.datetime(2013,11,30,4,5,6)
dt=dt.strftime('%Y-%m-%d')
page=etree.Element('html')
doc=etree.ElementTree(page)
dateElm=etree.SubElement(page,dt)
outfile=open('somefile.xml','w')
doc.write(outfile)
I get the following error output:
dateElm=etree.SubElement(page,dt)
File "lxml.etree.pyx", line 2899, in lxml.etree.SubElement (src/lxml/lxml.etree.c:62284)
File "apihelpers.pxi", line 171, in lxml.etree._makeSubElement (src/lxml/lxml.etree.c:14296)
File "apihelpers.pxi", line 1523, in lxml.etree._tagValidOrRaise (src/lxml/lxml.etree.c:26852)
ValueError: Invalid tag name u'2013-11-30'
I thought it was a Unicode error, so I tried changing the encoding of 'dt' with code like
str(dt)
unicode(dt).encode('unicode_escape')
dt.encode('ascii','ignore')
dt.encode('ascii','decode')
and some others, but none of them worked; the same error message was generated each time.
Upvotes: 6
Views: 19141
Reputation: 414565
It is not about Unicode. There is no <2013-11-30> tag in HTML. You could use the <time> tag instead:
#!/usr/bin/env python
from datetime import date
from lxml.html import tostring
from lxml.html.builder import E
datestr = date(2013, 11, 30).strftime('%Y-%m-%d')
page = E.html(
E.title("date demo"),
E('time', "some value", datetime=datestr))
with open('somefile.html', 'wb') as file:
file.write(tostring(page, doctype='<!doctype html>', pretty_print=True))
Upvotes: 1
Reputation: 51002
You get the error because element names are not allowed to begin with a digit in XML. See http://www.w3.org/TR/xml/#sec-common-syn and http://www.w3.org/TR/xml/#sec-starttags. The first character of a name must be a NameStartChar, which excludes digits.
An element such as <2013-11-30>...</2013-11-30> is invalid.
An element such as <D2013-11-30>...</D2013-11-30> is OK.
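To make the rule concrete, here is a rough sketch of a name check. It is an ASCII-only approximation that I am assuming for brevity: the real NameStartChar and NameChar productions also allow many non-ASCII characters, and the colon is left out here because it is reserved for namespace prefixes. The helper name is_ascii_xml_name is made up for this example and is not part of lxml.

```python
import re

# ASCII-only approximation of the XML Name rule (an assumption for brevity;
# the full productions in the XML spec also allow many non-ASCII characters).
# First character: letter or underscore. Later characters: also digits, '.', '-'.
_ASCII_XML_NAME = re.compile(r'[A-Za-z_][A-Za-z0-9_.-]*')


def is_ascii_xml_name(name):
    """Return True if name is a valid XML name within the ASCII subset."""
    return _ASCII_XML_NAME.fullmatch(name) is not None


print(is_ascii_xml_name('2013-11-30'))   # False: starts with a digit
print(is_ascii_xml_name('D2013-11-30'))  # True: starts with a letter
```

A check like this could be run before calling SubElement if tag names come from data, although storing the data in text or an attribute is usually simpler.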
If your program is changed to use ElementTree instead of lxml (from xml.etree import ElementTree as etree instead of from lxml import etree), there is no error. But I would consider that a bug. lxml does the right thing, ElementTree does not.
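One possible way around the problem, sketched under my own assumptions, is to keep the tag name fixed and valid and put the date into the element's text or an attribute. The tag name 'date' and the attribute name 'value' are made up for this example. It uses the standard library's ElementTree so that it runs without extra packages; Element, SubElement and tostring are also available from lxml.etree, so swapping the import should behave the same way for this snippet.

```python
import datetime
import xml.etree.ElementTree as etree  # stdlib; lxml.etree offers the same calls

dt = datetime.datetime(2013, 11, 30, 4, 5, 6).strftime('%Y-%m-%d')

page = etree.Element('html')
# Fixed, valid tag name ('date' is a hypothetical choice); the date itself
# is stored as an attribute and as the element text, not as the tag name.
date_elm = etree.SubElement(page, 'date', value=dt)
date_elm.text = dt

# tostring() returns bytes, so write them to a file opened in binary mode.
xml_bytes = etree.tostring(page)
print(xml_bytes.decode('ascii'))
# <html><date value="2013-11-30">2013-11-30</date></html>
```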
Upvotes: 10