Is there an smart way to write a fixed length flat file?

Mikhas picture Mikhas · May 4, 2011 · Viewed 36.5k times · Source

Is there any framework/library to help writing fixed length flat files in java?

I want to write a collection of beans/entities into a flat file without worrying with convertions, padding, alignment, fillers, etcs

For example, I'd like to parse a bean like:

public class Entity{
    String name = "name"; // length = 10; align left; fill with spaces
    Integer id = 123; // length = 5; align left; fill with spaces
    Integer serial = 321 // length = 5; align to right; fill with '0'
    Date register = new Date();// length = 8; convert to yyyyMMdd
}

... into ...

name      123  0032120110505
mikhas    5000 0122120110504
superuser 1    0000120101231

...

Answer

Dave Smith picture Dave Smith · May 6, 2011

You're not likely to encounter a framework that can cope with a "Legacy" system's format. In most cases, Legacy systems don't use standard formats, but frameworks expect them. As a maintainer of legacy COBOL systems and Java/Groovy convert, I encounter this mismatch frequently. "Worrying with conversions, padding, alignment, fillers, etcs" is primarily what you do when dealing with a legacy system. Of course, you can encapsulate some of it away into handy helpers. But most likely, you'll need to get real familiar with java.util.Formatter.

For example, you might use the Decorator pattern to create decorators to do the conversion. Below is a bit of groovy (easily convertible into Java):

class Entity{
    String name = "name"; // length = 10; align left; fill with spaces
    Integer id = 123; // length = 5; align left; fill with spaces
    Integer serial = 321 // length = 5; align to right; fill with '0'
    Date register = new Date();// length = 8; convert to yyyyMMdd
}

class EntityLegacyDecorator {
     Entity d
     EntityLegacyDecorator(Entity d) { this.d = d }

     String asRecord() {
         return String.format('%-10s%-5d%05d%tY%<tm%<td',
                               d.name,d.id,d.serial,d.register)
   }
 }

def e = new Entity(name: 'name', id: 123, serial: 321, register: new Date('2011/05/06'))

assert new EntityLegacyDecorator(e).asRecord() == 'name      123  0032120110506'

This is workable if you don't have too many of these and the objects aren't too complex. But pretty quickly the format string gets intolerable. Then you might want decorators for Date, like:

class DateYMD {
     Date d
     DateYMD(d) { this.d = d }
     String toString() { return d.format('yyyyMMdd') }
 }

so you can format with %s:

    String asRecord() {
         return String.format('%-10s%-5d%05d%s',
                               d.name,d.id,d.serial,new DateYMD(d.register))
   }

But for significant number of bean properties, the string is still too gross, so you want something that understands columns and lengths that looks like the COBOL spec you were handed, so you'll write something like this:

 class RecordBuilder {

    final StringBuilder record 

    RecordBuilder(recordSize) {
        record = new StringBuilder(recordSize)
        record.setLength(recordSize)
     }

    def setField(pos,length,String s) { 
       record.replace(pos - 1, pos + length, s.padRight(length))
    }

    def setField(pos,length,Date d) { 
      setField(pos,length, new DateYMD(d).toString())
    }

   def setField(pos,length, Integer i, boolean padded) { 
       if (padded) 
          setField(pos,length, String.format("%0" + length + "d",i))
       else 
          setField(pos,length, String.format("%-" + length + "d",i))
    }

    String toString() { record.toString() }
}

class EntityLegacyDecorator {

     Entity d

     EntityLegacyDecorator(Entity d) { this.d = d }

     String asRecord() {
         RecordBuilder record = new RecordBuilder(28)
         record.setField(1,10,d.name)
         record.setField(11,5,d.id,false)
         record.setField(16,5,d.serial,true)
         record.setField(21,8,d.register)
         return record.toString()
     }

}

After you've written enough setField() methods to handle you legacy system, you'll briefly consider posting it on GitHub as a "framework" so the next poor sap doesn't have to to it again. But then you'll consider all the ridiculous ways you've seen COBOL store a "date" (MMDDYY, YYMMDD, YYDDD, YYYYDDD) and numerics (assumed decimal, explicit decimal, sign as trailing separate or sign as leading floating character). Then you'll realize why nobody has produced a good framework for this and occasionally post bits of your production code into SO as an example... ;)