Test Data Generation Techniques

A developer came onto the project and made some sweeping changes to the software. Then he left when funding problems arose, and I was left to deal with the mess. My goal was to test the changes to see if anything got broken. I found myself lacking the data to run all the regression tests.

I dreaded creating data from scratch. Each record has so many fields, with varying formats and rules. Plus the data is interrelated. Ouch. There was no time for this. Then I read a somewhat related article on generating new data when you already have some.

The idea is to take some existing data and distort it to create new data sets. Specifically, I saw this applied to image data. You stretch an image in a non-uniform fashion. The result is a new image that can be used for testing. The key is that you cannot stretch each point of the image at random. The stretching of each point must be related to the stretching of the points around it.
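Here is a minimal sketch of what that locally related stretching could look like. It is not the method from the article I read, just one simple way to get the effect. The function names, the `cell` and `strength` parameters, and the list-of-lists image format are all my own assumptions. It picks random offsets on a coarse grid and blends them for every pixel in between, so neighboring pixels always move in nearly the same direction.

```python
import random

def smooth_offsets(width, height, cell=8, strength=2.0, seed=0):
    """Return per-pixel (dx, dy) offsets blended from a coarse random grid.

    Hypothetical helper: `cell` is the grid spacing in pixels and
    `strength` is the largest offset allowed at a grid node.
    """
    rng = random.Random(seed)
    gw, gh = width // cell + 2, height // cell + 2
    # One random displacement per coarse grid node.
    grid = [[(rng.uniform(-strength, strength), rng.uniform(-strength, strength))
             for _ in range(gw)] for _ in range(gh)]
    offsets = []
    for y in range(height):
        gy, fy = divmod(y / cell, 1)
        gy = int(gy)
        row = []
        for x in range(width):
            gx, fx = divmod(x / cell, 1)
            gx = int(gx)
            d = []
            for k in (0, 1):
                # Bilinear blend of the four surrounding nodes,
                # so nearby pixels get nearly the same offset.
                top = grid[gy][gx][k] * (1 - fx) + grid[gy][gx + 1][k] * fx
                bot = grid[gy + 1][gx][k] * (1 - fx) + grid[gy + 1][gx + 1][k] * fx
                d.append(top * (1 - fy) + bot * fy)
            row.append((d[0], d[1]))
        offsets.append(row)
    return offsets

def warp(image, offsets):
    """Build a new image by sampling each pixel from its displaced position."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = offsets[y][x]
            # Nearest-neighbour sampling, clamped to the image border.
            sx = min(max(int(round(x + dx)), 0), width - 1)
            sy = min(max(int(round(y + dy)), 0), height - 1)
            row.append(image[sy][sx])
        out.append(row)
    return out

# A made-up 32x32 grayscale gradient standing in for a real test image.
image = [[(x + y) % 256 for x in range(32)] for y in range(32)]
warped = warp(image, smooth_offsets(32, 32, seed=1))
```

Changing the seed gives a different distortion of the same source image, so one good image can turn into many test images.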

There is a whole science behind image distortion. I did not dig deep. Still, the idea let me implement something similar in my own development database. I first identified a really good record. Then I wrote some routines to replicate that record, distorting the fields that could not be the same from one copy to the next. The result was a quick way to generate tons of data.
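The replication routines can be sketched roughly like this. My real routines ran against the database, so treat this as an illustration only. The record layout, the field names (`customer_id`, `email`, `orders`) and the `replicate` function are made up for the example. The point is that only the fields that must be unique get changed, and related rows are updated together so the copies stay consistent.

```python
import copy

def replicate(template, count, start_id=1000):
    """Clone a known-good record `count` times, changing only the fields
    that cannot repeat. All field names here are hypothetical."""
    clones = []
    for i in range(count):
        rec = copy.deepcopy(template)   # deep copy so child rows are not shared
        new_id = start_id + i
        rec["customer_id"] = new_id                  # primary key must differ
        rec["email"] = f"test{new_id}@example.com"   # unique column must differ
        # Child rows point back at the parent, so they follow the new key.
        # Order ids stay unique as long as a record has fewer than 100 orders.
        for j, order in enumerate(rec["orders"]):
            order["customer_id"] = new_id
            order["order_id"] = new_id * 100 + j
        clones.append(rec)
    return clones

# A made-up "really good record" with two related child rows.
template = {
    "customer_id": 1,
    "name": "Pat Example",
    "email": "pat@example.com",
    "orders": [
        {"order_id": 10, "customer_id": 1, "total": "19.99"},
        {"order_id": 11, "customer_id": 1, "total": "5.00"},
    ],
}
batch = replicate(template, 500)
```

Every other field is copied as-is, which is why it pays to start from a record you already trust.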

I am now a happy tester. The icing on the cake was that the changes made by the long-gone developer all seem to work just like the original code. I am going to use the distortion technique to cook up a batch of test data for my own new changes to the system.