Elm  2
ELM is a library providing generic data structures, an OS-independent interface, plugins and XML support.
Serializer Class Reference

#include <elm/xom/Serializer.h>

Public Member Functions

 Serializer (io::OutStream &out_stream)
 
 Serializer (io::OutStream &out, string encoding)
 
const string & getEncoding (void) const
 
int getIndent (void) const
 
const string & getLineSeparator (void) const
 
int getMaxLength (void) const
 
bool getPreserveBaseURI (void) const
 
bool getUnicodeNormalizationFormC () const
 
void setIndent (int indent)
 
void setLineSeparator (string line_separator)
 
void setMaxLength (int max_length)
 
void setOutputStream (io::OutStream &out)
 
void setPreserveBaseURI (bool preserve)
 
void setUnicodeNormalizationFormC (bool normalize)
 
virtual void write (Document *doc)
 
void flush (void)
 

Protected Member Functions

int getColumnNumber (void)
 
virtual void breakLine (void)
 
virtual void write (Attribute *attribute)
 
virtual void write (Comment *comment)
 
virtual void write (DocType *doctype)
 
virtual void write (Element *element)
 
virtual void write (ProcessingInstruction *instruction)
 
virtual void write (Text *text)
 
virtual void writeAttributes (Element *element)
 
virtual void writeAttributeValue (String value)
 
virtual void writeChild (Node *node)
 
virtual void writeEmptyElementTag (Element *element)
 
virtual void writeEndTag (Element *element)
 
virtual void writeEscaped (String text)
 
virtual void writeNamespaceDeclaration (const string &prefix, const string &uri)
 
virtual void writeNamespaceDeclarations (Element *element)
 
virtual void writeRaw (String text, int length=-1)
 
virtual void writeStartTag (Element *element)
 
virtual void writeXMLDeclaration (void)
 

Detailed Description

Outputs a Document object in a specific encoding using various options for controlling white space, normalization, indenting, line breaking, and base URIs. However, in general these options do affect the document's infoset. In particular, if you set either the maximum line length or the indent size to a positive value, then the serializer will not respect input white space. It may trim leading and trailing space, condense runs of white space to a single space, convert carriage returns and linefeeds to spaces, add extra space where none was present before, and otherwise muck with the document's white space. The defaults, however, preserve all significant white space including ignorable white space and boundary white space.
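
The white space handling described above can be illustrated with a small standalone sketch. It is not part of ELM: `normalizeWhiteSpace` is a hypothetical helper, and it assumes that "white space" means space, tab, carriage return and line feed. It trims both ends and condenses every interior run to a single space.

```cpp
#include <string>

// Hypothetical helper, not part of ELM: trims leading and trailing white
// space and condenses each interior run (spaces, tabs, CR, LF) to one space.
std::string normalizeWhiteSpace(const std::string &text) {
    std::string out;
    bool pendingSpace = false;
    for (char c : text) {
        if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
            // Remember the run, but only if something was already written,
            // so that leading white space is dropped.
            pendingSpace = !out.empty();
        } else {
            if (pendingSpace)
                out += ' ';
            pendingSpace = false;
            out += c;
        }
    }
    // A pending space at the end is never flushed: trailing space is trimmed.
    return out;
}
```

With this sketch, an input such as two spaces, `a`, a CR/LF pair surrounded by spaces, `b` and two trailing spaces yields `a b`.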

Warning
This is a very limited version of the serializer:
  • supports only UTF-8 encoding,
  • no indentation, space, newline support.
Author
H. Cassé (casse@irit.fr)

Constructor & Destructor Documentation

◆ Serializer() [1/2]

Serializer ( io::OutStream & out_stream)

Create a new serializer that uses the UTF-8 encoding.

Parameters
out_stream    the output stream to write the document on

◆ Serializer() [2/2]

Serializer ( io::OutStream & out,
string  encoding 
)

Create a new serializer that uses the specified encoding. The encoding must be recognized by libxml.

Parameters
out    the output stream to write the document on
encoding    the character encoding for the serialization

Member Function Documentation

◆ breakLine()

void breakLine ( void  )
protected virtual

Writes the current line break string onto the underlying output stream and indents as specified by the current level and the indent property.

Referenced by Serializer::writeEmptyElementTag(), Serializer::writeEndTag(), Serializer::writeStartTag(), and Serializer::writeXMLDeclaration().

◆ flush()

void flush ( void  )

Flush the out stream.

References Output::flush().

Referenced by XOMSerializer::~XOMSerializer().

◆ getColumnNumber()

int getColumnNumber ( void  )
protected

Returns the current column number of the output stream. This method is useful for subclasses that implement their own pretty printing strategies by inserting white space and line breaks at appropriate points. Columns are counted based on Unicode characters, not UTF-8 chars. A surrogate pair counts as one character in this context, not two. However, a character followed by a combining character (e.g. e followed by combining accent acute) counts as two characters. This latter choice (treating combining characters like regular characters) is under review, and may change in the future if it's not too big a performance hit.

Returns
the current column number
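
The counting rule described above can be sketched independently of ELM. The helper below is hypothetical (`countColumns` is not a library function) and assumes well-formed UTF-8 input: it counts one column per Unicode code point by skipping UTF-8 continuation bytes, so a combining character still adds a column of its own.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of ELM: counts the columns taken by a
// UTF-8 encoded string, one column per Unicode code point.
std::size_t countColumns(const std::string &utf8) {
    std::size_t columns = 0;
    for (unsigned char byte : utf8) {
        // Continuation bytes have the bit pattern 10xxxxxx; every other
        // byte starts a new code point.
        if ((byte & 0xC0) != 0x80)
            ++columns;
    }
    return columns;
}
```

Under this rule the precomposed "é" (bytes `C3 A9`) counts as one column, while "e" followed by the combining acute accent U+0301 (bytes `65 CC 81`) counts as two.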

◆ getEncoding()

const string& getEncoding ( void  ) const
inline

◆ getIndent()

int getIndent ( void  ) const
inline

◆ getLineSeparator()

const string& getLineSeparator ( void  ) const
inline

◆ getMaxLength()

int getMaxLength ( void  ) const
inline

◆ getPreserveBaseURI()

bool getPreserveBaseURI ( void  ) const
inline

◆ getUnicodeNormalizationFormC()

bool getUnicodeNormalizationFormC ( ) const
inline

◆ setIndent()

void setIndent ( int  indent)
inline

◆ setLineSeparator()

void setLineSeparator ( string  line_separator)
inline

◆ setMaxLength()

void setMaxLength ( int  max_length)
inline

◆ setOutputStream()

void setOutputStream ( io::OutStream & out)
inline

References Output::setStream().

◆ setPreserveBaseURI()

void setPreserveBaseURI ( bool  preserve)
inline

◆ setUnicodeNormalizationFormC()

void setUnicodeNormalizationFormC ( bool  normalize)
inline

◆ write() [1/7]

void write ( Attribute * attribute)
protected virtual

Writes an attribute in the form name="value". Characters in the attribute value are escaped as necessary.

Parameters
attribute    the Attribute to write

References Attribute::getLocalName(), Attribute::getValue(), Serializer::writeAttributeValue(), and Serializer::writeRaw().

◆ write() [2/7]

void write ( Comment * comment)
protected virtual

Writes a comment onto the output stream using the current options. Since character and entity references are not resolved in comments, comments can only be serialized when all characters they contain are available in the current encoding.

Parameters
comment    the Comment to serialize

References Comment::getText(), and Serializer::writeRaw().

◆ write() [3/7]

void write ( DocType *  doctype)
protected virtual

Writes a DocType object onto the output stream using the current options.

Parameters
doctype    the document type declaration to serialize

◆ write() [4/7]

void write ( Document * doc)
virtual

Serializes a document onto the output stream using the current options.

Parameters
doc    the Document to serialize

References Document::getRootElement(), and Serializer::writeXMLDeclaration().

Referenced by Serializer::writeAttributes(), Serializer::writeChild(), and XOMSerializer::~XOMSerializer().

◆ write() [5/7]

void write ( Element * element)
protected virtual

Serializes an element onto the output stream using the current options. The result is guaranteed to be well-formed. If element does not have a parent element, the output will also be namespace well-formed.

If the element is empty, this method invokes writeEmptyElementTag. If the element is not empty, then:

  1. It calls writeStartTag.
  2. It passes each of the element's children to writeChild in order.
  3. It calls writeEndTag.

It may break lines or add white space if the serializer has been configured to indent or use a maximum line length.

References ParentNode::getChild(), ParentNode::getChildCount(), Serializer::writeChild(), Serializer::writeEmptyElementTag(), Serializer::writeEndTag(), and Serializer::writeStartTag().
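
The control flow described above can be sketched with a stand-in type. `MiniElement` and `writeElement` are hypothetical names used only for illustration, not the `elm::xom` classes, and the sketch ignores attributes, namespaces, text children and line breaking.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an XML element; not the elm::xom::Element class.
struct MiniElement {
    std::string name;
    std::vector<MiniElement> children;
};

// Appends the serialized form of `element` to `out`.
void writeElement(const MiniElement &element, std::string &out) {
    if (element.children.empty()) {
        // Empty element: a single empty-element tag.
        out += "<" + element.name + "/>";
        return;
    }
    out += "<" + element.name + ">";             // start-tag
    for (const MiniElement &child : element.children)
        writeElement(child, out);                // children, in order
    out += "</" + element.name + ">";            // end-tag
}
```

For a `root` element holding an empty `a` and a `b` that contains an empty `c`, the sketch produces `<root><a/><b><c/></b></root>`.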

◆ write() [6/7]

void write ( ProcessingInstruction *  instruction)
protected virtual

Writes a processing instruction onto the output stream using the current options. Since character and entity references are not resolved in processing instructions, processing instructions can only be serialized when all characters they contain are available in the current encoding.

Parameters
instruction    the ProcessingInstruction to serialize

◆ write() [7/7]

void write ( Text * text)
protected virtual

Writes a Text object onto the output stream using the current options. Reserved characters such as <, > and " are escaped using the standard entity references such as &lt;, &gt;, and &quot;.

Characters which cannot be encoded in the current character set (for example, Ω in ISO-8859-1) are encoded using character references.

Parameters
text    the Text to serialize

References Text::getValue(), and Serializer::writeEscaped().

◆ writeAttributes()

void writeAttributes ( Element * element)
protected virtual

Writes all the attributes of the specified element onto the output stream, one at a time, separated by white space. If preserveBaseURI is true, and it is necessary to add an xml:base attribute to the element in order to preserve the base URI, then that attribute is also written here. Each individual attribute is written by invoking write(Attribute).

Parameters
element    the Element whose attributes are written

References Element::getAttribute(), Element::getAttributeCount(), and Serializer::write().

Referenced by Serializer::writeEmptyElementTag(), and Serializer::writeStartTag().

◆ writeAttributeValue()

void writeAttributeValue ( String  value)
protected virtual

Writes a string onto the underlying output stream. Non-ASCII characters that are not available in the current character set are escaped using hexadecimal numeric character references. Carriage returns, line feeds, and tabs are also escaped using hexadecimal numeric character references in order to ensure their preservation on a round trip. The four reserved characters <, >, &, and " are escaped using the standard entity references &lt;, &gt;, &amp;, and &quot;. The single quote is not escaped.

Parameters
value    the attribute value to serialize

References elm::xom::escapeSimple(), elm::xom::isAttrEscape(), and Serializer::writeRaw().

Referenced by Serializer::write().
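
A minimal sketch of these escaping rules, written as a free function rather than the library's own code, might look as follows. `escapeAttributeValue` is a hypothetical name, the exact spelling of the numeric references (`&#x09;`, `&#x0A;`, `&#x0D;`) is an assumption, and characters outside the target encoding are not handled.

```cpp
#include <string>

// Hypothetical helper, not part of ELM: escapes an attribute value so it can
// be placed between double quotes.
std::string escapeAttributeValue(const std::string &value) {
    std::string out;
    for (char c : value) {
        switch (c) {
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '&':  out += "&amp;";  break;
        case '"':  out += "&quot;"; break;
        // White space that attribute-value normalization would otherwise
        // alter is written as hexadecimal character references (assumed form).
        case '\t': out += "&#x09;"; break;
        case '\n': out += "&#x0A;"; break;
        case '\r': out += "&#x0D;"; break;
        default:   out += c;        break;  // single quotes pass through
        }
    }
    return out;
}
```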

◆ writeChild()

void writeChild ( Node * node)
protected virtual

Writes a child node onto the output stream using the current options. It is invoked when walking the tree to serialize the entire document. It is not called, and indeed should not be called, for either the Document node or for attributes.

Parameters
node    the Node to serialize

References Node::COMMENT, Node::DOCUMENT, Node::ELEMENT, Node::kind(), Node::TEXT, and Serializer::write().

Referenced by Serializer::write().

◆ writeEmptyElementTag()

void writeEmptyElementTag ( Element * element)
protected virtual

Writes an empty-element tag for the element including all its namespace declarations and attributes.

The writeAttributes method is called to write all the non-namespace-declaration attributes. The writeNamespaceDeclarations method is called to write all the namespace declaration attributes.

If subclasses don't wish empty-element tags to be used, they can override this method to simply invoke writeStartTag followed by writeEndTag.

Parameters
element    the element whose empty-element tag is written

References Serializer::breakLine(), Element::getLocalName(), Serializer::writeAttributes(), and Serializer::writeRaw().

Referenced by Serializer::write().
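
The override suggested above for subclasses that want to avoid empty-element tags can be sketched with a simplified mock. `MiniSerializer` and `NoEmptyTagSerializer` are hypothetical classes that only mirror the method names of this page; they take a tag name instead of an `Element *` and write into a string instead of an output stream.

```cpp
#include <string>

// Hypothetical, simplified stand-in for the serializer; not elm::xom::Serializer.
class MiniSerializer {
public:
    virtual ~MiniSerializer() = default;
    const std::string &output() const { return out_; }
    // Public entry point so the protected hook can be exercised.
    void writeEmpty(const std::string &name) { writeEmptyElementTag(name); }

protected:
    virtual void writeStartTag(const std::string &name) { out_ += "<" + name + ">"; }
    virtual void writeEndTag(const std::string &name) { out_ += "</" + name + ">"; }
    virtual void writeEmptyElementTag(const std::string &name) { out_ += "<" + name + "/>"; }

private:
    std::string out_;
};

// Subclass that never emits empty-element tags.
class NoEmptyTagSerializer : public MiniSerializer {
protected:
    void writeEmptyElementTag(const std::string &name) override {
        writeStartTag(name);
        writeEndTag(name);
    }
};
```

In this mock, `writeEmpty("br")` yields `<br/>` on the base class and `<br></br>` on the subclass.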

◆ writeEndTag()

void writeEndTag ( Element * element)
protected virtual

Writes the end-tag for an element in the form </name>.

Parameters
element    the element whose end-tag is written

References Serializer::breakLine(), Node::ELEMENT, String::free(), ParentNode::getChild(), ParentNode::getChildCount(), Element::getQualifiedName(), Node::kind(), and Serializer::writeRaw().

Referenced by Serializer::write().

◆ writeEscaped()

void writeEscaped ( String  text)
protected virtual

Writes a string onto the underlying output stream. Non-ASCII characters that are not available in the current character set are encoded with numeric character references. The three reserved characters <, >, and & are escaped using the standard entity references &lt;, &gt;, and &amp;. Double and single quotes are not escaped.

Parameters
text    the parsed character data to serialize

References CString::chars(), elm::xom::escapeSimple(), elm::xom::isTextEscape(), CString::length(), and Serializer::writeRaw().

Referenced by Serializer::write().
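
For comparison with attribute values, the lighter rule for character data can be sketched as below. `escapeText` is a hypothetical free function, not the library's implementation, and it leaves out the numeric character references for characters missing from the target encoding.

```cpp
#include <string>

// Hypothetical helper, not part of ELM: escapes parsed character data.
// Only <, > and & are replaced; both kinds of quotes are left untouched.
std::string escapeText(const std::string &text) {
    std::string out;
    for (char c : text) {
        switch (c) {
        case '<': out += "&lt;";  break;
        case '>': out += "&gt;";  break;
        case '&': out += "&amp;"; break;
        default:  out += c;       break;
        }
    }
    return out;
}
```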

◆ writeNamespaceDeclaration()

void writeNamespaceDeclaration ( const string & prefix,
const string & uri 
)
protected virtual

Writes a namespace declaration in the form xmlns:prefix="uri" or xmlns="uri". It does not write the spaces on either side of the namespace declaration. These are written by writeNamespaceDeclarations.

Parameters
prefix    the namespace prefix; the empty string for the default namespace
uri    the namespace URI
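
The two output forms can be illustrated with a small standalone function. `namespaceDeclaration` is a hypothetical helper that returns the text instead of writing it to a stream, and it assumes the URI contains no characters that need escaping.

```cpp
#include <string>

// Hypothetical helper, not part of ELM: builds the text of one namespace
// declaration, without surrounding spaces.
std::string namespaceDeclaration(const std::string &prefix, const std::string &uri) {
    if (prefix.empty())
        return "xmlns=\"" + uri + "\"";               // default namespace
    return "xmlns:" + prefix + "=\"" + uri + "\"";    // prefixed namespace
}
```

Called with the prefix `xsl` and the URI `http://www.w3.org/1999/XSL/Transform`, the sketch returns `xmlns:xsl="http://www.w3.org/1999/XSL/Transform"`; with an empty prefix it returns the `xmlns="..."` form.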

◆ writeNamespaceDeclarations()

void writeNamespaceDeclarations ( Element * element)
protected virtual

Writes all the namespace declaration attributes of the specified element onto the output stream, one at a time, separated by white space. Each individual declaration is written by invoking writeNamespaceDeclaration.

Parameters
element    the Element whose namespace declarations are written

◆ writeRaw()

void writeRaw ( String  text,
int  length = -1 
)
protected virtual

Writes a string onto the underlying output stream without escaping any characters. Non-ASCII characters that are not available in the current character set cause an IOException.

Parameters
text    the String to serialize
length    length of the string (optional)

References CString::chars(), CString::length(), Output::stream(), and OutStream::write().

Referenced by Serializer::write(), Serializer::writeAttributeValue(), Serializer::writeEmptyElementTag(), Serializer::writeEndTag(), Serializer::writeEscaped(), Serializer::writeStartTag(), and Serializer::writeXMLDeclaration().

◆ writeStartTag()

void writeStartTag ( Element * element)
protected virtual

Writes the start-tag for the element including all its namespace declarations and attributes.

The writeAttributes method is called to write all the non-namespace-declaration attributes. The writeNamespaceDeclarations method is called to write all the namespace declaration attributes.

Parameters
element    the element whose start-tag is written

References Serializer::breakLine(), Node::ELEMENT, String::free(), ParentNode::getChild(), ParentNode::getChildCount(), Element::getNamespaceDeclarationCount(), Element::getNamespacePrefix(), Element::getNamespaceURI(), Element::getQualifiedName(), Node::kind(), Serializer::writeAttributes(), and Serializer::writeRaw().

Referenced by Serializer::write().

◆ writeXMLDeclaration()

void writeXMLDeclaration ( void  )
protected virtual

Writes the XML declaration onto the output stream, followed by a line break.

References Serializer::breakLine(), String::toCString(), and Serializer::writeRaw().

Referenced by Serializer::write().

