I have a Python script that reads a CSV file and writes to an XML file. I have been hitting a wall trying to find out how to read special characters such as ç, á, é, í, etc. The script runs perfectly fine when the input contains no special characters. This is the script header:
# coding=utf-8
'''
@modified by: Julierme Pinheiro
'''
import os
import sys
import unittest
from unittest import skip
import csv
import uuid
import xml
import xml.dom.minidom as minidom
import owslib
from owslib.iso import *
import pyproj
from decimal import *
import logging
The way I retrieve information from the CSV file is shown below:
# add the title
title = data[1]
titleElement = identificationInfo[0].getElementsByTagName('gmd:title')[0]
titleNode = record.createTextNode(title)
titleElement.childNodes[1].appendChild(titleNode)
print "Title:" + title
Note: If data[1] (the second column in the CSV file) contains a special character, as in "Navegação", the script fails: it does not write anything to the XML file.
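A possible explanation, assuming the script runs on Python 2 (the print statements suggest it does): there, csv.reader yields byte strings, and minidom may raise a UnicodeDecodeError when a non-ASCII byte string in a text node is serialized with an encoding. The sketch below decodes the value to unicode before it reaches createTextNode. The helper to_text, the stand-in raw_title, and the tiny document are hypothetical and only illustrate the idea; they are not part of the original script.

```python
# -*- coding: utf-8 -*-
import xml.dom.minidom as minidom


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: decode byte strings, pass unicode text through."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Stand-in for data[1]: raw UTF-8 bytes, as Python 2's csv.reader would yield.
# Assumption: the CSV file itself is saved as UTF-8.
raw_title = u"Navegação".encode("utf-8")

# Minimal stand-in for the template record used in the script.
record = minidom.parseString("<root><title/></root>")
title_element = record.getElementsByTagName("title")[0]

# Decode first, then build the text node from unicode text.
title_node = record.createTextNode(to_text(raw_title))
title_element.appendChild(title_node)

# With an encoding given, toprettyxml returns encoded bytes.
xml_bytes = record.toprettyxml(newl="", encoding="utf-8")
```

If the CSV was saved in another encoding (for example by Excel on Windows), the encoding argument would need to match it; that detail is not stated in the question.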
The way a new XML file is created from a blank template XML is shown below:
    # write out the gemini record
    filename = '../output/%s.xml' % fileId
    with open(filename,'w') as test_xml:
        test_xml.write(record.toprettyxml(newl="", encoding="utf-8"))
except:
    e = sys.exc_info()[1]
    logging.debug("Import failed for entry %s" % data[0])
    logging.debug("Specific error: %s" % e)
@skip('')
def testOWSMetadataImport(self):
    raw_data = []
    with open('../input/metadata_cartapapel.csv') as csvfile:
        reader = csv.reader(csvfile, dialect='excel')
        for columns in reader:
            raw_data.append(columns)
    md = MD_Metadata(etree.parse('gemini-template.xml'))
    md.identification.topiccategory = ['farming','environment']
    print md.identification.topiccategory
    outfile = open('mdtest.xml','w')
    # crap, can't update the model and write back out - this is badly needed!!
    outfile.write(md.xml)

if __name__ == "__main__":
    unittest.main()
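On the write-out step: the file is opened with 'w' before the record is serialized, so if serialization raises, an empty file is left behind, and the bare except only logs at DEBUG level, which may not be shown. A minimal sketch of one alternative is below: serialize first, write the bytes in binary mode, and log the full traceback. The function write_record, the output path, and the tiny document are hypothetical stand-ins, not the original script.

```python
# -*- coding: utf-8 -*-
import io
import logging
import os
import tempfile
import xml.dom.minidom as minidom

logging.basicConfig(level=logging.DEBUG)


def write_record(record, filename):
    """Hypothetical helper: serialize a minidom document, then write it as bytes."""
    try:
        # Serialize before opening the file, so a failure here does not
        # leave an empty output file behind.
        payload = record.toprettyxml(newl="", encoding="utf-8")
        with io.open(filename, "wb") as out:
            out.write(payload)
        return True
    except Exception:
        # logging.exception logs at ERROR level and includes the traceback.
        logging.exception("Import failed for %s", filename)
        return False


# Minimal stand-in for the record built from the template.
record = minidom.parseString("<root/>")
record.documentElement.appendChild(record.createTextNode(u"Navegação"))

# Assumed output location, for illustration only.
out_path = os.path.join(tempfile.mkdtemp(), "example.xml")
ok = write_record(record, out_path)
```

With the traceback visible, the "Specific error" would show whether the failure is a UnicodeDecodeError or something else.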
Could someone help me solve this issue, please?
Thank you in advance for your time.