Generating Test/Bench Data for Data-Focused Apps (Part 1)

Several database tools can generate records for a table T and fill them with random data. Usually these tools can:

  • generate the test data itself;
  • format the data to reproduce a specific bug.

Yes, both are very useful. But as speed junkies and test pilots, we also want to use this feature to

  • generate data for use in benchmarks

The difference between test and bench data is that for benchmarking today, tomorrow, and months or years later, we must generate exactly the same records in a table. Otherwise, how can we compare benchmark results as computer scientists? For tests it is okay to use random values in records, but benchmarks require exact repeatability.
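To make the distinction concrete, here is a minimal Python sketch (not Valentina code). The function names `random_rows`, `bench_rows`, and `fingerprint`, and the multiplier and modulus inside `bench_rows`, are hypothetical choices made only for illustration. The point is that bench data computed purely from the row index is identical on every run, while unseeded random data is not.

```python
import hashlib
import random


def random_rows(n):
    """Test-style data: values differ from run to run (unseeded RNG)."""
    return [(i, random.randint(0, 1_000_000)) for i in range(1, n + 1)]


def bench_rows(n):
    """Bench-style data: every value is a pure function of the row index."""
    # The constants are arbitrary; any fixed formula of i would do.
    return [(i, (i * 7919) % 1_000_003) for i in range(1, n + 1)]


def fingerprint(rows):
    """Hash the rows so two generated data sets can be compared cheaply."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


if __name__ == "__main__":
    # Two independent generations of bench data give the same fingerprint.
    print(fingerprint(bench_rows(1000)) == fingerprint(bench_rows(1000)))
    # Two generations of random test data will almost certainly differ.
    print(fingerprint(random_rows(1000)) == fingerprint(random_rows(1000)))
```

Storing such a fingerprint next to the benchmark results is one way to confirm, months later, that a run really used the same records.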

We were going to add such a feature to Valentina Studio, but then we started thinking about benchmarking the Valentina engine (written in C++). It became clear that we need such a feature right in the engine. So how should we implement it?

  • Implementation of some C++ classes?
  • SQL using some new functions?
  • SQL – sequences?
  • SQL – some new commands?
  • SQL – using Stored Procedures?

We want a solution that lets the writer of tests or benchmarks describe the needed records in as few lines of code as possible, and in a declarative way. Something like:

fld_bool = $i mod 2,  NULLs each 3rd row
fld_byte = $i mod 256,  NULLs each 3rd row
fld_short = $i,  NULLs each 4th row
fld_short = -$i,  NULLs each 4th row

fld_float = $i * 3.1415,  NULLs each 6th row
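As a rough illustration of what such a declarative description means, the following Python sketch (not Valentina SQL, and not the solution promised for the next part) expresses the same field rules as data. It assumes that `$i` is a row index starting at 1 and that "NULLs each 3rd row" means rows 3, 6, 9, and so on. `FieldSpec`, `field_value`, and `generate_rows` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class FieldSpec:
    name: str
    expr: Callable[[int], Any]         # value as a function of the row index $i
    null_every: Optional[int] = None   # every k-th row becomes NULL (None)


# The field list from the text, one spec per line (names kept as written).
SPECS = [
    FieldSpec("fld_bool", lambda i: bool(i % 2), null_every=3),
    FieldSpec("fld_byte", lambda i: i % 256, null_every=3),
    FieldSpec("fld_short", lambda i: i, null_every=4),
    FieldSpec("fld_short", lambda i: -i, null_every=4),
    FieldSpec("fld_float", lambda i: i * 3.1415, null_every=6),
]


def field_value(spec, i):
    """Return None on every k-th row, otherwise the computed value."""
    if spec.null_every and i % spec.null_every == 0:
        return None
    return spec.expr(i)


def generate_rows(specs, n):
    """Build n rows as tuples; row indexes run from 1 to n (an assumption)."""
    return [tuple(field_value(s, i) for s in specs) for i in range(1, n + 1)]


if __name__ == "__main__":
    for row in generate_rows(SPECS, 6):
        print(row)
```

Because every value depends only on the row index, calling `generate_rows(SPECS, n)` twice yields the same rows, which is the property benchmarks need.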

Are Valentina SQL and Stored Procedures enough to solve this task?

Can we write a stored procedure that takes a ‘format string’ parameter, parses it, and then, in a loop, adds N records to T using the expressions specified in the format string?
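The parse-then-loop idea can be sketched outside the database first. The Python below is only an illustration under stated assumptions, not a stored procedure and not the engine's implementation. It assumes one field per line in the form `name = expression,  NULLs each Kth row`, and it supports only `$i`, an optional leading minus, and one operator (`mod`, `*`, `+`, `-`) with a numeric literal. `parse_format`, `compile_expr`, and `add_records` are hypothetical names, and a plain list stands in for table T.

```python
import operator
import re

# One field per line: "name = expr,  NULLs each Kth row" (NULL part optional).
LINE_RE = re.compile(
    r"^\s*(?P<name>\w+)\s*=\s*(?P<expr>[^,]+?)\s*"
    r"(?:,\s*NULLs each (?P<k>\d+)(?:st|nd|rd|th) row)?\s*$"
)
# Deliberately tiny expression grammar: [-]$i [op number]
EXPR_RE = re.compile(
    r"^(?P<neg>-)?\s*\$i(?:\s*(?P<op>mod|\*|\+|-)\s*(?P<num>\d+(?:\.\d+)?))?$"
)
OPS = {"mod": operator.mod, "*": operator.mul,
       "+": operator.add, "-": operator.sub}


def compile_expr(text):
    """Turn an expression such as '$i mod 2' into a function of i."""
    m = EXPR_RE.match(text)
    if m is None:
        raise ValueError(f"unsupported expression: {text!r}")
    sign = -1 if m.group("neg") else 1
    if m.group("op") is None:
        return lambda i: sign * i
    op = OPS[m.group("op")]
    raw = m.group("num")
    num = float(raw) if "." in raw else int(raw)
    return lambda i: op(sign * i, num)


def parse_format(fmt):
    """Parse the format string into (name, function, null_every) triples."""
    fields = []
    for line in fmt.splitlines():
        if not line.strip():
            continue  # blank lines between fields are allowed
        m = LINE_RE.match(line)
        if m is None:
            raise ValueError(f"cannot parse line: {line!r}")
        k = int(m.group("k")) if m.group("k") else None
        fields.append((m.group("name"), compile_expr(m.group("expr")), k))
    return fields


def add_records(table, fmt, n):
    """Append n rows to table (a list here), with row indexes 1..n."""
    fields = parse_format(fmt)
    for i in range(1, n + 1):
        table.append(tuple(
            None if k and i % k == 0 else expr(i)
            for _name, expr, k in fields
        ))


if __name__ == "__main__":
    t = []
    add_records(t, "fld_byte = $i mod 256,  NULLs each 3rd row\n"
                   "fld_short = -$i,  NULLs each 4th row", 4)
    print(t)
```

Restricting the grammar to a fixed pattern avoids evaluating arbitrary text; a real implementation inside the engine would presumably reuse its own SQL expression parser instead.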

In the next part, we will show our solution.

Published by

Ruslan Zasukhin

VP Engineering and New Technology, Paradigma Software, Inc.