xmlformat API

Notation is Ruby-like. The API for the Perl version is similar.

CHANGES TO MAKE:
- Hide the option hash representation by replacing it with accessor methods.

module XMLFormat

Module methods:

warn(arg, ...)
  Print the message given by the arguments to stderr.

die(arg, ...)
  Print the message given by the arguments to stderr and exit(1).

class XMLFormatter

Class methods:

obj = new
  Generate a new XMLFormatter object and set up the initial formatting
  options hash.

read_config(filename)
  Read a configuration file containing formatting options.

val, err_msg = check_option(opt_name, opt_val) (private)
  Check an option name/value pair for legality. Return the (possibly
  type-converted) option value and an error message. If err_msg is nil,
  the option is legal. If err_msg is not nil, the option is illegal and
  err_msg contains a string indicating the problem.

opts = get_opts(elt_name) (private)
  Look up the formatting options for an element and return them. This
  never fails: if no options are known for the given element name, it
  returns the default options, which are guaranteed to be defined.

display_config
  Display the configuration (formatting options).

display_unconfigured_elements
  Produce a report of the elements that are named in the input document
  but for which no formatting options were given in the configuration
  file.

shallow_parse(xml_document)
  Parse an XML document (specified in the form of a string) into an
  array of tokens and store the array internally.

array = tokens
  Accessor method that returns the token list.

name = extract_tag_name(tag) (private)
  Given a tag (an angle-bracket string), extract the tag name and return
  it.

assign_line_numbers (private)
  Assign an input line number to each token (for use in error messages).

err_count = report_errors
  Check the internal token list for errors, print information on bad
  tokens, and return an error count. The count is zero if no errors are
  found.

tokens_to_tree
  Convert the internal token list to tree form and store the tree.
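
The token-related methods above (shallow_parse, tokens, extract_tag_name) can be illustrated with a minimal Ruby sketch. This is not the xmlformat implementation: the names shallow_parse_sketch, extract_tag_name_sketch, and TOKEN_RE are hypothetical, and the simplified pattern is an assumption that ignores some XML constructs (for example, a DOCTYPE declaration with an internal subset).

```ruby
# Hypothetical sketch: split an XML string into markup and text tokens.
# Alternatives are tried in order: comment, CDATA section, processing
# instruction, any other angle-bracket tag, then a run of text.
TOKEN_RE = /<!--.*?-->|<!\[CDATA\[.*?\]\]>|<\?.*?\?>|<[^>]*>|[^<]+/m

# Return an array of token strings in document order.
def shallow_parse_sketch(xml_document)
  xml_document.scan(TOKEN_RE)
end

# Given an angle-bracket string, return the tag name, or nil if the
# string does not look like an opening or closing tag.
def extract_tag_name_sketch(tag)
  m = %r{\A</?\s*([^\s/>]+)}.match(tag)
  m && m[1]
end

tokens = shallow_parse_sketch("<a><b>hi</b><!-- c --></a>")
names = tokens.map { |t| extract_tag_name_sketch(t) }
```

Because every character of the input lands in exactly one token, joining the tokens gives back the original string, which is the same property the API describes for tree_stringify on an uncanonized tree.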
hash = node(type, content) (private)
hash = text_node(content) (private)
hash = comment_node(content) (private)
hash = pi_node(content) (private)
hash = doctype_node(content) (private)
hash = cdata_node(content) (private)
hash = element_node(open_tag, close_tag, children) (private)
  Tree node generators.

str = tree_stringify(children = @tree)
  Convert the node list back to a string and return the string. If the
  argument is missing, use the entire tree. In this case, you get back
  the original input document.

tree_canonize
  Canonize the document tree to remove extraneous all-whitespace nodes
  and normalize text nodes.

tree_canonize2(children, par_name = "*DOCUMENT") (private)
  Helper function for tree_canonize. Canonize a document subtree and
  return the modified subtree.

bool = is_normalized_elt(node) (private)
  Return true/false to indicate whether the node is a normalized
  element.

tree_format(par_name = "*DOCUMENT", children = @tree, indent = 0)
  Format the tree or a subtree to produce a string representing the
  reformatted XML document. Store the string in the @out_doc instance
  variable. If the children argument is missing, use the entire tree.

flush_pending(indent) (private)
  Flush pending text, using indent if the text is line-wrapped.
  Side effect: advances the break type to element-break.

array = line_wrap(str, first_indent, rest_indent, max_len) (private)
  Perform line-wrapping on a string and return the result as an array of
  lines.
    str          = the string to wrap
    first_indent = indent for the first line
    rest_indent  = indent for any subsequent lines
    max_len      = maximum allowed length of lines (including indent)

emit_break(indent) (private)
  Put out a break: the number of newlines appropriate for the current
  break type (entry-break, element-break, or exit-break). If the break
  count is greater than zero and indent is greater than zero, put out
  that many spaces as well.
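
The line_wrap contract described above can be sketched as a simple greedy wrapper. This is an illustration under stated assumptions, not the xmlformat code: the name line_wrap_sketch is hypothetical, words are assumed to be separated by whitespace, and a word too long to fit within max_len is assumed to go on a line of its own rather than being split.

```ruby
# Hypothetical greedy line wrapper following the documented signature.
#   str          = the string to wrap
#   first_indent = indent (in spaces) for the first line
#   rest_indent  = indent (in spaces) for any subsequent lines
#   max_len      = maximum allowed line length, including the indent
# Returns an array of lines. A single word longer than the available
# width is emitted on its own line and may exceed max_len (assumption).
def line_wrap_sketch(str, first_indent, rest_indent, max_len)
  lines = []
  indent = first_indent
  current = ""
  str.split.each do |word|
    if current.empty?
      current = word
    elsif indent + current.length + 1 + word.length <= max_len
      # The word still fits on the current line after a single space.
      current = "#{current} #{word}"
    else
      # Close the current line and start a new one with rest_indent.
      lines << (" " * indent) + current
      indent = rest_indent
      current = word
    end
  end
  lines << (" " * indent) + current unless current.empty?
  lines
end

wrapped = line_wrap_sketch("aa bb cc dd", 2, 4, 8)
```

With these arguments, the first line uses a two-space indent and holds two words, while each later line uses the four-space indent and has room for only one word.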