2008-05-26

Fighting with duplication - attack of the clones

Last week I continued cleanup work on a project I'm currently working on. It consists of a large number of JSP-generated reports. Some time ago I set up an initial environment for such reports, so that even analysts without special JSP or Java knowledge could write some reports. It was meant as a temporary solution before moving to a more enterprise-grade tool. The results were not bad - new reports were added very quickly. Everything worked and was usable, so the enterprise tool was quietly forgotten.

Over two years the reports outgrew the rest of the code base, and many of our issues were related to them. In the harder cases I had to dig into the code, and what I saw was massive duplication, in every possible form. So when we found a bug in one section of code, there was a high probability that the same problem was duplicated elsewhere. It was even worse: instead of adding new features to a base report, people created new versions of the same report with new parameters or features, and those reports diverged over time. So you can imagine how much maintenance started to cost at some point.

I especially like to stick to one coding rule: do not create new duplication, and remove what already exists.

Duplication starts when some functional block of code appears in at least two places in the code. Sometimes it is just a simple one-liner - but even then you should consider the pros and cons of wrapping that piece of code in a procedure.
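As a minimal sketch of that trade-off, assume the repeated one-liner is a number format used on many report pages. The class name `ReportFormat`, the method `amount` and the format itself are hypothetical, made up for illustration and not taken from the real reports.

```java
import java.util.Locale;

// Hypothetical helper: one home for a one-liner that would otherwise be
// pasted into every report page.
final class ReportFormat {
    private ReportFormat() {
        // static helpers only, no instances
    }

    // Formats an amount with thousands separators and two decimals.
    // The format (US locale, two decimals) is an assumption for this sketch.
    static String amount(double value) {
        return String.format(Locale.US, "%,.2f", value);
    }

    public static void main(String[] args) {
        System.out.println(amount(1234.5));
    }
}
```

The cost is one more name to learn and one extra hop when reading a report. The gain is that a change to the format, or a bug fix in it, happens in exactly one place.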

Let's get back to the vicious mechanism behind an avalanche of duplicated code in a non-programmer environment, where people don't know good "coding" practices. One of the reports (let's call it A) was copied with some functional difference (as B). Now we have two almost identical files. Then comes another feature that seems to interfere with the previous one. The ticket for the new feature was assigned to only one of those reports (after some time everybody sees two reports and thinks of them as two separate reports), the feature seems so "new", and the practice so far was to create a separate report for each function (as a "clean solution"), so a new report C is derived from A. After some time somebody realizes that report B lacks that feature. Continuing the same process, we now have four reports, including a D version. Then comes a major change: a new module is added that shows new values in a similar manner, so the base reports A, B, C and D are copied as 2A, 2B, 2C and 2D, with some values and layout changed. Almost pure copy'n'paste coding.

Instead of one simple report with the additional features parametrized, there are now eight versions of duplicated code with less than 10% difference between them. The cost of fixing a bug or implementing a new feature is about eight times higher than for a single, more complicated file. And that does not include the effects of further "extension" - the avalanche just keeps gaining mass at an exponential rate.
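The single parametrized report could be sketched roughly like this. It is only an illustration: the names `ReportOptions` and `ReportColumns`, the three flags and the column labels are all hypothetical, and the real reports are JSP pages with SQL behind them rather than a list of strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical switches, one per reason a copy was made in the story above.
final class ReportOptions {
    final boolean featureB;  // the difference that produced copy B
    final boolean featureC;  // the feature that produced copy C
    final boolean newModule; // the module that produced 2A..2D

    ReportOptions(boolean featureB, boolean featureC, boolean newModule) {
        this.featureB = featureB;
        this.featureC = featureC;
        this.newModule = newModule;
    }
}

// One report definition instead of eight copies.
final class ReportColumns {
    private ReportColumns() {
    }

    static List<String> columns(ReportOptions options) {
        List<String> cols = new ArrayList<>();
        cols.add("base"); // shared part: a bug fixed here is fixed for every variant
        if (options.featureB) {
            cols.add("featureB");
        }
        if (options.featureC) {
            cols.add("featureC");
        }
        if (options.newModule) {
            cols.add("moduleValues");
        }
        return cols;
    }

    public static void main(String[] args) {
        // Report D from the story: both features, old module.
        System.out.println(columns(new ReportOptions(true, true, false)));
    }
}
```

Three on/off switches give 2 x 2 x 2 = 8 combinations, which covers A, B, C, D and 2A, 2B, 2C, 2D with a single file. The file is more complicated than any one copy, but the shared part exists only once.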

You may wonder what kind of procedures we have that allow such practices. OK - the team of report analysts was told to eliminate visible duplication, but it wasn't enough. Somebody concentrating on a complex analytical problem doesn't think much about removing duplication, even when the report is complex and the amount of code and SQL is enormous. It's just an additional burden that doesn't seem to help with solving the analytical problem.

What could help? A quick course on coding practices and techniques in the context of that environment: the basics of how to write reusable pieces of code, avoid common pitfalls and reduce the code base to ease development and maintenance. And more practical, explanatory examples. Sometimes a duplication problem is visible but there is no simple solution - then pairing with a more experienced programmer would help. It's just an organizational issue.

So now the team is fighting with report bugs and feature requests. And the biggest impact we are getting now comes not from touching those new features or bugs, but from merging and removing duplicates. It's ironic, but the number of bugs is now decreasing exponentially.

2008-05-19

CHDK - second life for Canon point and shoot digital cameras

When I bought my Canon digicam, I heard rumors about additional software that could be loaded onto Canon cameras. But last year there was still no support for my A570 model. Last week I found an article on Lifehacker about hacking Canon firmware.

I loaded the CHDK software using an SD card reader, and I'm now playing with the new features. There are a lot of interesting things, like shooting RAW pictures, overriding the default range of exposure settings and bracketing, plus lots of cool tools like a live histogram and marking of under- and overexposed regions.

It feels like I've got a new camera - it's indeed a second life for that piece of hardware.

I'm now trying to take some HDR pictures using the RAW format and the auto bracketing features.

So, happy shooting!

2008-05-12

JavaOne 2008 - Java+You

Last week Sun organized the JavaOne 2008 conference in San Francisco. I wasn't there, but there is a lot of interesting information on the conference web site.
There were tracks for each of the Java technologies and for applications in web and SOA, and some new ideas were introduced, like JavaFX - a new (?) rich interface and scripting platform.

I invite you to read it and see what is being proposed for the future of Java.

2008-05-05

Not too much sports

I'm just back from a weekend bike trip. I had such muscle aches that I could barely sleep. I definitely have to do something about my fitness.

It's hard for me to schedule activities like these trips, so I'm thinking about another kind of sport. Maybe jogging - it takes less time and is more exhausting.

Maybe I should think about some home equipment, like a treadmill or an exercise bike.

Do you have any good ideas on how to combine keyboard time, some kind of sport and tight deadlines?