Thursday, December 24, 2009

CF: Parsing Complex Fixed (Flat) Files

Despite many years of XML, the use of simple and complex flat files as a means of exchanging information between systems is still fairly common.
Parsing simple, tabular flat files like the one below poses little challenge, since both the OS (text drivers) and CF (CFHTTP) provide tools to read them in without much fanfare.

Example 1: Tabular Fixed File
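
To make the "simple" case concrete, here is a minimal sketch in Python (chosen only for illustration; it is not what the text drivers or CFHTTP do internally). The column layout, field names, and sample rows are made up for this sketch. Every row has the same layout, so parsing is just slicing each line at fixed offsets.

```python
# Hypothetical column layout: (field name, start offset, width).
LAYOUT = [("id", 0, 5), ("name", 5, 10), ("amount", 15, 8)]


def parse_tabular_line(line, layout=LAYOUT):
    """Slice one fixed-width line into a dict of trimmed strings."""
    return {
        name: line[start:start + width].strip()
        for name, start, width in layout
    }


# Sample rows are built with ljust() so the padding is explicit;
# in a real file the padding would already be present on each line.
sample = [
    "00001" + "Alice".ljust(10) + "00012.50",
    "00002" + "Bob".ljust(10) + "00007.00",
]

rows = [parse_tabular_line(line) for line in sample]
for row in rows:
    print(row)
```

Because every line shares one layout, a single list of offsets is enough, which is why off-the-shelf tools handle this case well.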

However, real-world files are often not that simple. The layout may differ from row to row, relationships may exist between lines of data, and so on.

Example 2: Complex Fixed File

For complex files we normally write specialized parsers or transformations. A whole industry exists around Extract, Transform, Load (ETL) processes. Either choice requires considerable programming effort.
After writing a parser for the umpteenth fixed file that someone wanted loaded, I thought there had to be a better way. The short of it is that I created a more generic component to deal with this in native ColdFusion. The solution was to separate the code from the file definition.

Thus, to parse a complex file, you provide an XML definition file that describes the flat file, then call the component to do the heavy lifting.

Example 3: Flat File Definition XML

The outcome of such processing is a set of standard ColdFusion objects that are easier to deal with and require less programmatic effort to work with.
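
The idea of separating the file definition from the parsing code can be sketched as follows. This is a Python illustration under stated assumptions, not the FixedFileReader component or its actual definition schema: the XML element and attribute names (`file`, `record`, `field`, `idStart`, `parent`, and so on), the record layouts, and the sample lines are all hypothetical. Each line starts with a record identifier; the definition says which fields each record type carries and which record type it belongs under.

```python
import xml.etree.ElementTree as ET

# Hypothetical definition format, invented for this sketch.
DEFINITION_XML = """
<file idStart="0" idLength="1">
  <record id="H" name="order">
    <field name="orderNo" start="1" length="6"/>
    <field name="customer" start="7" length="10"/>
  </record>
  <record id="D" name="item" parent="order">
    <field name="sku" start="1" length="8"/>
    <field name="qty" start="9" length="3"/>
  </record>
</file>
"""


def load_definition(xml_text):
    """Turn the XML definition into plain Python structures."""
    root = ET.fromstring(xml_text.strip())
    records = {}
    for rec in root.findall("record"):
        fields = [
            (f.get("name"), int(f.get("start")), int(f.get("length")))
            for f in rec.findall("field")
        ]
        records[rec.get("id")] = {
            "name": rec.get("name"),
            "parent": rec.get("parent"),  # None for top-level records
            "fields": fields,
        }
    return int(root.get("idStart")), int(root.get("idLength")), records


def parse_lines(lines, xml_text=DEFINITION_XML):
    """Parse lines into nested records as described by the definition."""
    id_start, id_length, records = load_definition(xml_text)
    result = []     # top-level records, in file order
    last_seen = {}  # most recent record of each type, keyed by name
    for line in lines:
        spec = records.get(line[id_start:id_start + id_length])
        if spec is None:
            continue  # skip lines the definition does not describe
        record = {
            "type": spec["name"],
            "fields": {n: line[s:s + l].strip() for n, s, l in spec["fields"]},
            "children": [],
        }
        parent = last_seen.get(spec["parent"])
        if parent is not None:
            parent["children"].append(record)  # attach to latest parent
        else:
            result.append(record)  # top-level, or child with no parent yet
        last_seen[spec["name"]] = record
    return result


sample = [
    "H" + "000123" + "ACME".ljust(10),
    "D" + "SKU-0001" + "002",
    "D" + "SKU-0002" + "010",
    "H" + "000124" + "Globex".ljust(10),
    "D" + "SKU-0003" + "001",
]
orders = parse_lines(sample)
for order in orders:
    print(order["fields"], [c["fields"] for c in order["children"]])
```

Supporting a new file layout then means writing a new definition rather than a new parser, which is the time saving the approach aims for.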

I posted this project on RIAForge (FixedFileReader). It contains examples for different scenarios of varying complexity. I added skeletons for EDI X.12 and VCF4 formats.

It may not work for every file, but where it does, it will likely save a lot of time.



suspiria said...

thanks for this! i was actually going to develop something along these lines myself but no need now :-)

suspiria said...

please excuse this question/request as i haven't played around with the package yet but was wondering if this also handles the WRITING of files based on a definition file? if not, that would be a straightforward feature addition for developers involved with enterprise apps concerned with data exchange between online and financial backend systems...

bman said...

thanks for your feedback.
The component is meant for reading files only. I have not found a good way to provide definitions and data that describe the files in a generic format and that is faster than writing the files directly.