Python - Read and write special characters in XML with minidom
I'm trying to write and read a set of strings in elements called objects. Each object has two attributes: name (a simple string) and body. The body is a string containing special characters such as "\n" and "\". I'm using the following code for writing the XML file:
    from xml.dom.minidom import Document

    doc = Document()
    root = doc.createElement('data')
    doc.appendChild(root)

    # create scene
    scene = doc.createElement('scene')
    root.appendChild(scene)

    # add object element
    object = doc.createElement('object')
    object.setAttribute('name', 'obj1')
    txt = 'text\nsome text\nanother one\\and on\n'
    object.setAttribute('body', txt)
    scene.appendChild(object)

    # write file
    file_handle = open("filename.xml", "wb")
    file_handle.write(bytes(doc.toprettyxml(indent='\t'), 'utf-8'))
    file_handle.close()
It produces this file:
    <?xml version="1.0" ?>
    <data>
        <scene>
            <object body="text
    some text
    another one\and on
    " name="obj1"/>
        </scene>
    </data>
And for parsing:
    from xml.dom import minidom

    filepath = 'filename.xml'
    dom = minidom.parse(filepath)
    scenes = dom.getElementsByTagName('scene')
    for scene in scenes:
        txt_objs = scene.getElementsByTagName('object')
        for obj in txt_objs:
            obj_name = obj.getAttribute('name')
            obj_body = obj.getAttribute('body')
            print(obj_name, " ", obj_body)
The output of the parser is not the same as what was stored: the newline special characters are lost. How can I keep the output the same as the input?
    # parser output
    obj1   text some text another one\and on
What is the proper way of storing and retrieving a string with special characters?
The behavior demonstrated by minidom is in line with the W3C recommendation. See the following discussion: "Are line breaks in XML attribute values valid?". The answer by @jancetkovsky is quoted here for easy reference:
It is valid, however according to the W3C recommendation your XML parser should normalize all whitespace characters to space (0x20) - so the output of your examples will differ (you should have a new line on the output for "&#13;", but only a space in the first case). [source]
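To see this normalization in isolation, here is a minimal sketch that parses two hand-written documents with minidom: one with a literal line break inside the attribute value, and one with the line break written as the character reference `&#10;`. The element name `o`, the variable names and the sample strings are made up for illustration and are not part of the question's data.

```python
from xml.dom import minidom

# Literal newline inside the attribute value:
# the parser normalizes it to a single space.
literal = minidom.parseString('<o body="line 1\nline 2"/>')
literal_body = literal.documentElement.getAttribute('body')

# Newline written as a character reference:
# the parser keeps it as a real newline character.
escaped = minidom.parseString('<o body="line 1&#10;line 2"/>')
escaped_body = escaped.documentElement.getAttribute('body')

print(repr(literal_body))  # 'line 1 line 2'
print(repr(escaped_body))  # 'line 1\nline 2'
```

So the line breaks are lost when the file is read, which is why a file that looks correct in an editor can still come back with spaces instead of newlines.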
If you have control over the XML document structure (it seems you do, since you constructed the XML yourself), put the text in an XML element value instead of an XML attribute value:
    .....
    # add object element
    obj = doc.createElement('object')
    obj.setAttribute('name', 'obj1')
    txt = 'text\nsome text\nanother one\\and on\n'
    txt_node = doc.createTextNode(txt)
    obj.appendChild(txt_node)
    scene.appendChild(obj)
    .....
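For completeness, here is a rough round-trip sketch of this approach, kept in memory rather than written to a file. It is only an illustration under a few assumptions: the helper name `get_text` is hypothetical (minidom has no such function), and the document is serialized with `toxml()` so that no indentation is added around the text.

```python
from xml.dom import minidom
from xml.dom.minidom import Document

txt = 'text\nsome text\nanother one\\and on\n'

# Build <data><scene><object name="obj1">...text...</object></scene></data>
doc = Document()
root = doc.createElement('data')
doc.appendChild(root)
scene = doc.createElement('scene')
root.appendChild(scene)
obj = doc.createElement('object')
obj.setAttribute('name', 'obj1')
obj.appendChild(doc.createTextNode(txt))
scene.appendChild(obj)

# Serialize without pretty-printing.
xml_text = doc.toxml()


def get_text(element):
    """Join the data of all direct text-node children (hypothetical helper)."""
    return ''.join(node.data for node in element.childNodes
                   if node.nodeType == node.TEXT_NODE)


# Parse the serialized document back and read the body from the element text.
parsed = minidom.parseString(xml_text)
parsed_obj = parsed.getElementsByTagName('object')[0]
body = get_text(parsed_obj)

print(parsed_obj.getAttribute('name'))
print(repr(body))
```

Reading the body now means reading the element's text children instead of calling `getAttribute('body')`.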