Getting Element Tree to read a SASXML file
Element Tree is a very nice and simple-to-use Python library for loading, reading, and writing XML files. However, there is one little trick that caught me out because I didn’t understand enough about XML and namespaces. Having got some data in SAS XML format, it took me a long while to work out why I couldn’t get the items tagged <Q> out. The simple problem was that I had to ask for the items tagged {cansas1d/1.0}Q. The wibbly brackets give the namespace for the tag, which in retrospect is obviously needed for properly parsing and dealing with XML.
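As a minimal sketch of the namespace behaviour described above: the inline XML string here is a made-up miniature of a canSAS file, used only for illustration. Searching for the bare tag finds nothing, while the namespace-qualified tag finds the element.

```python
import xml.etree.ElementTree as ET

# A made-up miniature document that declares the same default namespace
# as the data file; it is for illustration only.
SAMPLE = (
    '<SASroot xmlns="cansas1d/1.0">'
    '<Idata><Q unit="1/A"> 0.009000 </Q><I unit="1/cm"> 0.44416E+01 </I></Idata>'
    '</SASroot>'
)

root = ET.fromstring(SAMPLE)

# ElementTree stores each tag as "{namespace}localname", so the bare
# name "Q" does not match anything in a namespaced document.
bare_matches = list(root.iter("Q"))
qualified_matches = list(root.iter("{cansas1d/1.0}Q"))

print(root.tag)                                   # {cansas1d/1.0}SASroot
print(len(bare_matches), len(qualified_matches))  # 0 1
```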
Jason Winget also has a nice example of using Element Tree up.
A little of the data file looks like this:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="cansasxml-html.xsl" ?>
<SASroot version="1.0"
xmlns="cansas1d/1.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="cansas1d/1.0 http://svn.smallangles.net/svn/canSAS/1dwg/trunk/cansas1d.xsd">
<SASentry name="Workspace_1">
<Title> GluR0 + Gln 100 % D2O_SA </Title>
<Run> 48664 </Run>
<SASdata>
<Idata><Q unit="1/A"> 0.009000 </Q><I unit="1/cm"> 0.44416E+01 </I><Idev unit="1/cm"> 0.14E+00 </Idev><Qdev unit="1/A"> 0.00E+00 </Qdev></Idata>
<Idata><Q unit="1/A"> 0.011000 </Q><I unit="1/cm"> 0.37120E+01 </I><Idev unit="1/cm"> 0.86E-01 </Idev><Qdev unit="1/A"> 0.00E+00 </Qdev></Idata>
The source for the loader looks like this. xml.etree.ElementTree is imported as ‘ET’ at the top of the whole codeset:
import xml.etree.ElementTree as ET

def loadsasxml(file):
    """Loader for SASxml 1.0 format data.

    The loader uses xml.etree.ElementTree to parse the xml file and
    then searches through the file to find the {cansas1d/1.0}Q and
    {cansas1d/1.0}I tags and then extracts the text attribute from each
    of these. The list is then converted from text to floats and the Q
    and I lists passed to a new SasData object. Currently nothing else
    from the sas xml file is loaded. ElementTree is imported as 'ET'.
    """
    # Check that file is a sasxml file
    # assert (first line of file is what it should be) is True

    # Parse the xml file passed in and find the root element
    tree = ET.parse(file)
    elem = tree.getroot()

    # iterate over all the <Q> tags and get the Q values
    # (iter() replaces getiterator(), which was removed in Python 3.9)
    q_list = []
    for element in elem.iter("{cansas1d/1.0}Q"):
        q_list.append(float(element.text))  # need to convert text to float

    # then do the same for the <I> tags and values
    i_list = []
    for element in elem.iter("{cansas1d/1.0}I"):
        i_list.append(float(element.text))

    # check everything is ok with q_list and i_list
    assert len(q_list) == len(i_list), 'different number of q and i values?'
    assert len(q_list) != 0, 'appear to be no q values'
    assert len(i_list) != 0, 'appear to be no i values'
    assert q_list[0] < q_list[-1], 'q values not in order?'

    # generate and return a SasData object
    return ExpSasData(q_list, i_list)
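An alternative to writing out {cansas1d/1.0} in every tag is ElementTree's namespaces argument, which maps a short prefix to the namespace. The following is only a sketch under assumptions: the function name read_q_and_i, the prefix "sas" and the closed-off sample string are all invented for illustration, and it reads each <Idata> row as a unit so that Q and I stay paired.

```python
import xml.etree.ElementTree as ET

# The prefix "sas" is an arbitrary label chosen for this sketch; only the
# namespace URI has to match the one declared in the file.
NAMESPACES = {"sas": "cansas1d/1.0"}

def read_q_and_i(xml_text):
    """Return (q_list, i_list) read from the <Idata> rows of a canSAS string."""
    root = ET.fromstring(xml_text)
    q_list, i_list = [], []
    for row in root.iterfind(".//sas:Idata", NAMESPACES):
        # findtext returns the text of the first matching child element
        q_list.append(float(row.findtext("sas:Q", namespaces=NAMESPACES)))
        i_list.append(float(row.findtext("sas:I", namespaces=NAMESPACES)))
    return q_list, i_list

# The two data rows from the excerpt above, wrapped in closing tags so
# that the string parses on its own.
SAMPLE = (
    '<SASroot version="1.0" xmlns="cansas1d/1.0"><SASentry><SASdata>'
    '<Idata><Q unit="1/A"> 0.009000 </Q><I unit="1/cm"> 0.44416E+01 </I></Idata>'
    '<Idata><Q unit="1/A"> 0.011000 </Q><I unit="1/cm"> 0.37120E+01 </I></Idata>'
    '</SASdata></SASentry></SASroot>'
)

q_values, i_values = read_q_and_i(SAMPLE)
print(q_values, i_values)
```

Looping over rows rather than collecting all Q tags and all I tags separately means a row with a missing value fails straight away instead of silently shifting the two lists out of step.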
Getting the code to display nicely
In the end it should have been obvious, and as Neil and @plausible pointed out, the answer was there all the time, spelled out nicely at http://support.wordpress.com/code, although actually finding that page didn’t seem to be too straightforward at the time. So the answer for generating nicely formatted code in wordpress.com blogs is to wrap the code up in the sourcecode shortcode tags, for example for Python:
[sourcecode language='python']
your code goes here
[/sourcecode]
Problems with creating a new matplotlib ScaleBase class
So I have managed to create a new scale type in matplotlib using the ScaleBase class, which is very useful and rather powerful. The scale subjects an axis to a squared transform, meaning that, like a log plot, the x-values are labelled with their actual values but positioned along the axis according to the square of those values. This is particularly useful for Guinier plots, which are usually plotted as the natural log of intensity against the square of Q.
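To make the mapping concrete, here is a small sketch of the coordinates a Guinier plot uses. The function name guinier_coordinates is made up for this example, and the two (Q, I) points are the first two rows of a SAS XML data file. A tick labelled with the value Q ends up at position Q squared on the axis.

```python
import math

def guinier_coordinates(q_values, intensities):
    """Return (Q squared, ln I) pairs, the coordinates of a Guinier plot."""
    return [(q ** 2, math.log(i)) for q, i in zip(q_values, intensities)]

# Two (Q, I) points: Q in 1/A, I in 1/cm.
points = guinier_coordinates([0.009, 0.011], [4.4416, 3.7120])
for q_squared, ln_i in points:
    print(q_squared, ln_i)
```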
At first the problem was that it threw errors all over the place when I attempted to move the view of the plot. Guessing that this was due to moving across the zero point of the axis, and therefore taking the square root of a negative number in the inverse scale transform, I first tried modifying the inverse transform so that it multiplied negative numbers by minus one before taking the square root. Fiddling with this in a variety of ways didn’t seem to help.
The next approach was to figure out how to set the scale defaults so that it stops you going to negative values on a squared scale. This seems fine, and now when re-scaling using the rectangle magnifier it no longer tries to expand the x-axis beyond zero. But now, when using the cross hairs to move the data around in the plot, it throws a segmentation fault when you attempt to move across the zero point on the x-axis. I’m not sure whether this is a real bug I’ve uncovered or whether I’m just doing something stupid, but nonetheless, here is the code.
#!/usr/bin/env python
import numpy as np
import matplotlib.scale as mscale
import matplotlib.transforms as mtransforms
from matplotlib.ticker import (AutoLocator, ScalarFormatter,
                               NullLocator, NullFormatter)


class SquaredScale(mscale.ScaleBase):
    """ScaleBase class for generating x axis of Guinier plots.

    Uses the built in scalebase to generate a transformed axis type
    called SquaredScale which can be called using ax.set_xscale('q_squared').
    Currently uses the default ticker and scale setup which will need
    to be changed in the future.

    The class requires the import of matplotlib.ticker (AutoLocator,
    ScalarFormatter, NullLocator and NullFormatter), matplotlib.transforms
    (imported as mtransforms, required for the mtransforms.Transform class),
    matplotlib.scale (imported as mscale, required for the mscale.ScaleBase
    class for inheritance of the scale) and numpy (imported as np, for sqrt).
    """

    name = 'q_squared'

    def __init__(self, axis, **kwargs):
        # Recent matplotlib releases expect the axis to be passed on.
        mscale.ScaleBase.__init__(self, axis)

    def set_default_locators_and_formatters(self, axis):
        """
        Set the locators and formatters to reasonable defaults for
        scaling. Not really too sure what these do at the moment.
        """
        axis.set_major_locator(AutoLocator())
        axis.set_major_formatter(ScalarFormatter())
        axis.set_minor_locator(NullLocator())
        axis.set_minor_formatter(NullFormatter())

    def limit_range_for_scale(self, vmin, vmax, minpos):
        # Never let the axis extend below zero.
        return 0, vmax

    class SquaredTransform(mtransforms.Transform):
        input_dims = 1
        output_dims = 1
        is_separable = True

        def transform_non_affine(self, a):
            # Custom transforms in recent matplotlib releases override
            # transform_non_affine rather than transform.
            return a**2

        def inverted(self):
            return SquaredScale.InvertedSquaredTransform()

    class InvertedSquaredTransform(mtransforms.Transform):
        input_dims = 1
        output_dims = 1
        is_separable = True

        def transform_non_affine(self, a):
            return np.sqrt(a)

        def inverted(self):
            return SquaredScale.SquaredTransform()

    def get_transform(self):
        """Set the actual transform for the axis coordinates.
        """
        return self.SquaredTransform()


mscale.register_scale(SquaredScale)
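One idea that might be worth trying for the negative-number problem, shown here as a plain-Python sketch rather than as a matplotlib fix: make both the forward and the inverse transform keep the sign of their input. Changing only the inverse means the pair are no longer true inverses of each other, because a negative value squares to a positive one and then comes back positive. The function names signed_square and signed_sqrt are made up for this example, and whether this approach cures the segmentation fault is untested here.

```python
import math

def signed_square(a):
    """Forward transform: square the magnitude but keep the sign."""
    return math.copysign(a * a, a)

def signed_sqrt(x):
    """Inverse transform: square root of the magnitude, sign kept."""
    return math.copysign(math.sqrt(abs(x)), x)

# Round trip on both sides of zero: each value should come back unchanged.
for value in (-2.0, -0.5, 0.0, 0.5, 3.0):
    print(value, signed_square(value), signed_sqrt(signed_square(value)))
```

Because signed_square is monotonic across zero, panning over the origin never asks for the square root of a negative number.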