Like any large concern Port of Seattle has some very complex data recording and billing systems. One of these is the billing system for crane usage. Large cranes are used to move containers from the numerous cargo ships that visit the Port daily. These cranes are billed based on any number of criteria specific to the user of that crane. Some may bill by time, others by count or a combination of these.
Originally this data was gathered by had, proofed by hand, audited by numerous spreadsheets, billed, then audited again. This was a very complex and time consuming process. To make matters even more difficult the contract terms for each crane user were complex and could change based on quantity discounts, over usage or time of year.
Canfield Research Group analyzed the process and created flowcharts documenting how the data moved, who contributed to and who used the results of that data. A set of recommendations were then made to inclemently automate the system. The migration recommendation was slow, methodical and structured to show an excellent return on investment. For large projects we like to show an ROI of under 12 months, this often includes the time it took to develop the application.
Billing calculation and database
The first step was to automate the calculation of the data. A special calculation engines was written that used a configuration period to define how cranes were build, who was build and how the billing changes over time using an inheritance model. The database was an SQL based system. All billing related changes to data were kept in a transaction log for audits or recovery. This system was extensively tested and then implemented.
After a number of months of flawless calculations everyone was satisfied with its accuracy over an extended period the next was made. That involved automating the billing and integrating the system with the current back-end billing package.
Individuals or groups could be given access to parts of the visit data, while leaving the other areas as read only or hidden. This allowed for multiple departments to reliable contribute data to the system and have it all kept as a single entity. The web interface was designed for high speed entry and the total time to enter data dropped as compared to the old method of using a complex Microsoft Access application with multiple forms.
The server worked with a data hierarchically. This allowed multiple rows from multiple tables to be manipulated as a single transaction during viewing and editing. All data entry access to the SQL database was through this hierarchical system providing reliability, consistency, audit trails and easy data manipulation.
Finally the market research system, previously a Microsoft Access database was integrated into the system. For the first time all elements of a visit could be viewed and edited on a single screen.
A number of reports were required of the system. The reports are built internally as XML and the output as HTML, PDF or CVS (for import into spreadsheets). Reports automatically rotate to landscape if they are too long for portrait mode, ignore summary data for CVS and perform a number of other optimizations making them powerful, flexible and easy to maintain.
One of the hardest parts of the development was the migration of old data into the system. Data came from multiple sources none of which had reliable shared elements.
- Vessel names were often entered manually they were misspelled, especially foreign names
- Arrival times of vessels were record differently sometimes by more than a week
- Vessels had been renamed at various times as had the shipping lines
Canfield Research Group developed a set of algorithms that scanned all the data and linked the data based on a closest fit. A custom string comparison algorithm based on differential text (soundex failed utterly for this application), a correction dictionary and synonyms for vessel names was developed. This combined with date ranges provided 98+% data match. The remaining data was also the oldest (10 years or more back) and was left as separate elements so it would appear as totals but might appear to the system as two or more separate visits. An audit trail of every merge was kept and used for research purposes for a considerable time. The string comparison code was migrated to the application allow for auto correction of vessel names during submit processing.
The system has performed for over 5 years with no miscalculations, and no downtime related to the application. System maintenance has been very minimal.
Some additional details
- All code is based on GPL’d software. This provided maximum flexibility to the Port in deploying the application
- The PDF is generated from XML by translating the XML into Latex
- The web server is Apache
- The code is written in Perl as a number of object oriented modules with regression tests
- The calculation can audit itself so it can detect if there is a calculation error when it is modified
- Any application errors, hard or soft, are captured and reported to the system administrator as an email. A full trace log with CGI data is provided with password data automatically masked.
Contact Canfield Research Group to see how we can help with your data migration, merging an analysis needs.