DataFrame to XML

Overview:

  • XML is a text-based data format in which data is enclosed in tags.
  • The tags describe the data they contain. For example, a tag can indicate the style of the text inside it, or it can act as a column name, much like a column in the schema of an RDBMS relation. A tag can also carry several attributes that further describe its data.
  • Where performance matters, XML can also be represented in a binary format.
  • XML can be used to define schemas, giving rise to data definition formats specific to particular application domains.
  • An XSLT stylesheet, which is itself an XML document, can be applied to an XML document by an XSLT processor to transform it.
  • XSLT can produce another XML document or a different file format by selecting portions of the source XML, or by selectively repeating portions of it alongside other selected portions.
  • In a document-based view, an XML document holds its data in elements (tags) that start at the root element and branch out as a tree.
  • In an event-based view, XML data flows as a stream of elements, and an event is generated for each element. A program can subscribe to specific events and process them.
  • Because of its simplicity and extensibility, XML is widely used as a serialisation format in many application areas.
  • The pandas library supports reading XML data into a DataFrame and writing a DataFrame out as XML.
  • The to_xml() method of DataFrame converts the DataFrame contents to XML. When a file path or buffer is given, the XML is written there; when none is given, the method returns the XML as a string.
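
The document-based and event-based views described above can be sketched with Python's standard xml.etree.ElementTree module. This is a minimal illustration, not the only way to process XML: the sample document and the names xml_text, tree_names, tree_symbols and stream_names are made up for this sketch, and iterparse() stands in here for event-based processing in general.

```python
import io
import xml.etree.ElementTree as ET

# A small, made-up XML document: each <gas> tag carries a "symbol"
# attribute and a nested <name> element.
xml_text = """<gases>
  <gas symbol="He"><name>Helium</name></gas>
  <gas symbol="Ne"><name>Neon</name></gas>
</gases>"""

# Document-based view: parse the whole document into a tree
# and navigate it starting from the root element.
root = ET.fromstring(xml_text)
tree_names = [gas.findtext("name") for gas in root.findall("gas")]
tree_symbols = [gas.get("symbol") for gas in root.findall("gas")]

# Event-based view: iterparse() yields an (event, element) pair
# each time the parser finishes reading an element.
stream_names = []
source = io.BytesIO(xml_text.encode("utf-8"))
for event, elem in ET.iterparse(source, events=("end",)):
    if elem.tag == "name":
        stream_names.append(elem.text)

print(tree_names)
print(tree_symbols)
print(stream_names)
```

Both approaches read the same names. The tree form keeps the whole document in memory, while the streaming form handles one element at a time.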

Example:

# Example Python program that exports the contents
# of a pandas DataFrame to an XML string
import pandas

# A Python dictionary containing data
nobleGases = {"Helium":["He", 2],
              "Neon":["Ne", 10],
              "Argon":["Ar", 18],
              "Krypton":["Kr", 36],
              "Xenon":["Xe", 54],
              "Radon":["Rn", 86],
              "Oganesson":["Og", 118]}
cols = ["Symbol", "Electron Count"]

# DataFrame creation
df = pandas.DataFrame(data=nobleGases, index=cols)
print("Data as a pandas DataFrame:")
print(df)

# Export to an XML string (no path is given, so to_xml() returns the string)
xmlString = df.to_xml()
print("Data as an XML string:")
print(xmlString)

Output:

Data as a pandas DataFrame:
               Helium Neon Argon Krypton Xenon Radon Oganesson
Symbol             He   Ne    Ar      Kr    Xe    Rn        Og
Electron Count      2   10    18      36    54    86       118
Data as an XML string:
<?xml version='1.0' encoding='utf-8'?>
<data>
  <row>
    <index>Symbol</index>
    <Helium>He</Helium>
    <Neon>Ne</Neon>
    <Argon>Ar</Argon>
    <Krypton>Kr</Krypton>
    <Xenon>Xe</Xenon>
    <Radon>Rn</Radon>
    <Oganesson>Og</Oganesson>
  </row>
  <row>
    <index>Electron Count</index>
    <Helium>2</Helium>
    <Neon>10</Neon>
    <Argon>18</Argon>
    <Krypton>36</Krypton>
    <Xenon>54</Xenon>
    <Radon>86</Radon>
    <Oganesson>118</Oganesson>
  </row>
</data>
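
The example above returns the XML as a string. As a further sketch, the same method can write to a file and rename the root and row elements. The DataFrame contents, the file name noble_gases.xml and the element names nobleGases and gas are assumptions chosen for illustration. parser="etree" is passed so that the sketch relies on the standard library parser rather than lxml, which is the default parser of to_xml().

```python
import os
import tempfile
import xml.etree.ElementTree as ET

import pandas

# A small DataFrame with one row per noble gas (illustrative data)
df = pandas.DataFrame(
    {"Symbol": ["He", "Ne", "Ar"], "AtomicNumber": [2, 10, 18]},
    index=["Helium", "Neon", "Argon"],
)

# With no path, to_xml() returns the XML document as a string.
# index=False leaves the index out; root_name and row_name
# replace the default <data> and <row> element names.
xml_text = df.to_xml(parser="etree", index=False,
                     root_name="nobleGases", row_name="gas")

# With a path, to_xml() writes the file and returns None.
xml_path = os.path.join(tempfile.mkdtemp(), "noble_gases.xml")
result = df.to_xml(xml_path, parser="etree", index=False,
                   root_name="nobleGases", row_name="gas")

# Read the file back with ElementTree to check what was written
root = ET.parse(xml_path).getroot()
symbols = [gas.findtext("Symbol") for gas in root.findall("gas")]

print(result)
print(root.tag)
print(symbols)
```

Omitting the index suits data where the row labels carry no extra information. Keeping it (the default) adds an index element to every row, as in the output shown earlier.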

 


Copyright 2024 © pythontic.com