
XML Parsing

HonoredMule
Posted: 31 Jan 2011 at 16:16
I have some big plans for new services and tools for Illyriad players, the most prominent of which is HarmlessButler 2.0 for the next interface.  The new version won't use a heavy, clunky local database like the current one does, but that means I need an online database HB can query.  That in turn means I might as well make full use of that database with all kinds of other public tools and services, and believe me, I've got lots of ideas.

However, before doing any server-side stuff, I'm first hell-bent on finishing a very dear personal project on which I wish to depend for many of these services.  Any of these new goodies (including HB 2.0, unfortunately) are at least a few weeks away.  In the meantime, I thought I'd share an important puzzle piece with which other boffins no doubt struggle: how to usefully parse through those big XML files.  Hopefully this will help other developers build their own little tools quickly and easily.

The "nice" XML handling tools run out of memory on files this large, and the stream-based parsers that can cope are pretty ugly and hard to use.  So I wrote an abstracted class-based system specifically for Illyriad's data files that anyone can use.  It isn't a finished but inflexible solution: it's up to you to write the classes that read specific information and post it to whatever data store you like using whatever structure you like.  In this manner, it supports all past, present, and likely future formats at any file size while remaining extremely simple to use.  The classes can be as complete or as brief as desired.
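To show what "stream-based" means in practice, here is a minimal sketch using PHP's built-in expat-style parser functions (xml_parser_create, xml_set_element_handler, xml_parse).  It is not taken from IllyriadParser; the function name countElements and the sample XML are made up for illustration and do not reflect the real data file layout.  The point is only that the file is fed in small chunks, so memory use stays flat no matter how big the file is.

```php
<?php
// Count elements with a given name while reading the input in small chunks.
// countElements() is a hypothetical helper, not part of IllyriadParser.
function countElements($stream, string $name): int
{
    $count = 0;
    $parser = xml_parser_create();
    // Keep element names as written; PHP upper-cases them by default.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        // Called once for every opening (or self-closing) tag.
        function ($p, string $tag, array $attrs) use (&$count, $name) {
            if ($tag === $name) {
                $count++;
            }
        },
        // Closing tags are not needed for counting.
        function ($p, string $tag) {
        }
    );

    while (!feof($stream)) {
        $chunk = (string) fread($stream, 8192);
        // The third argument tells the parser whether this is the last chunk.
        if (!xml_parse($parser, $chunk, feof($stream))) {
            throw new RuntimeException(
                xml_error_string(xml_get_error_code($parser))
            );
        }
    }
    return $count;
}

// Usage with an in-memory stream; the XML here is an invented sample.
$stream = fopen('php://memory', 'w+');
fwrite($stream, '<alliances><alliance id="1"><allianceticker>ABC</allianceticker></alliance><alliance id="2"/></alliances>');
rewind($stream);
echo countElements($stream, 'alliance'), "\n"; // prints 2
```

For a real data file you would pass fopen('path/to/file.xml', 'r') instead of the memory stream.  The ugliness mentioned above shows up as soon as you need more than a count: every handler has to track where in the document it currently is.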

Each class you write (like AllianceParser extends RowParser) handles a single table, and each instance of the class handles one row.  Each protected method you write (like parse_allianceticker($data, $attrs)) receives the data specific to that node, which you can compile as desired into $this->data using whatever structure and format you like.  When the information within the handled row is complete, the method processData() will be called.  This is the only method you must implement.  From there, you can add information from parent classes and push $this->data wherever you like.  Because YOU defined the data layout, query builders can easily and automatically submit the data to database tables YOU defined, and other software you write can work with it easily as well, having just what it needs and where it wants it.  For added convenience, when you run your PHP through a web server, the classes will spit out lists of the nodes that were found and not parsed, serving as a guide while you're still writing your classes.
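A rough sketch of that pattern follows.  The RowParser below is a tiny stand-in written for this example, not the real base class from the download, so handleNode(), finish(), the $unparsed list, and the static $rows store are all assumptions.  Only the names AllianceParser, RowParser, parse_allianceticker, processData, and $this->data come from the description above.

```php
<?php
// Hypothetical stand-in for the real RowParser, for illustration only.
abstract class RowParser
{
    protected $data = [];
    // Node names seen without a matching parse_ method (assumed feature shape).
    public $unparsed = [];

    // Route one node to parse_<nodename>() if the subclass defines it.
    public function handleNode(string $name, string $data, array $attrs): void
    {
        $method = 'parse_' . strtolower($name);
        if (method_exists($this, $method)) {
            $this->$method($data, $attrs);
        } else {
            $this->unparsed[$name] = true;
        }
    }

    // Called when the row's closing tag is reached.
    public function finish(): void
    {
        $this->processData();
    }

    // The one method every subclass must implement.
    abstract protected function processData();
}

class AllianceParser extends RowParser
{
    // Stand-in data store; a real class might run a database insert instead.
    public static $rows = [];

    protected function parse_allianceticker($data, $attrs)
    {
        // The 'ticker' key is our own choice of layout.
        $this->data['ticker'] = trim($data);
    }

    protected function processData()
    {
        self::$rows[] = $this->data;
    }
}

// Usage: feed one row by hand, as a stream parser's callbacks would.
$row = new AllianceParser();
$row->handleNode('allianceticker', ' ABC ', []);
$row->handleNode('somethingelse', 'ignored', []); // invented node name
$row->finish();

print_r(AllianceParser::$rows);      // one row: ['ticker' => 'ABC']
print_r(array_keys($row->unparsed)); // ['somethingelse']
```

Dispatching on the method name is what lets a class be as brief as you like: a node without a parse_ method is simply recorded as unparsed and skipped.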

The parser classes can be nested as well--for example, RoleParser will parse the roles in an alliance, and has access to the information from its parent AllianceParser.  The classes can stack to any depth and still find a particular parent by node name, so all the structural/relational information you need will be preserved.  The system is also clever enough to know an alliance node within an alliance node isn't really an alliance within an alliance--the outer one is an alliance object and the inner one is some of the important information.  So, when quirks like that are encountered, only one AllianceParser is created, with the inner data-rich node getting handled by parse_alliance($data, $attrs) and the encapsulating one by parse_outer_alliance($data, $attrs).
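Finding a parent by node name at any depth can be pictured with a small sketch like the one below.  NodeContext and findParent() are hypothetical names invented here, and the 'roles' wrapper node is an invented example; the real classes may expose parent lookup quite differently.

```php
<?php
// Hypothetical illustration of parent lookup by node name.
class NodeContext
{
    public $nodeName;
    public $data = [];
    private $parent;

    public function __construct(string $nodeName, ?NodeContext $parent = null)
    {
        $this->nodeName = $nodeName;
        $this->parent = $parent;
    }

    // Walk up the chain until an ancestor with the given node name is found.
    public function findParent(string $nodeName): ?NodeContext
    {
        for ($p = $this->parent; $p !== null; $p = $p->parent) {
            if ($p->nodeName === $nodeName) {
                return $p;
            }
        }
        return null;
    }
}

// Usage: a role two levels below its alliance can still reach it.
$alliance = new NodeContext('alliance');
$alliance->data['ticker'] = 'ABC';
$roles = new NodeContext('roles', $alliance);
$role = new NodeContext('role', $roles);

echo $role->findParent('alliance')->data['ticker'], "\n"; // prints ABC
var_dump($role->findParent('town'));                      // NULL, no such ancestor
```

Because the lookup goes by name rather than by a fixed number of levels, a child parser keeps working even if extra wrapper nodes sit between it and the parent it cares about.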

Developers can fetch the code and a usage example at http://illyriad.honoredsoft.com/wiki/Tool:IllyriadParser