v5.6 Builds Index of String/VarChar Field 10-15 Times Faster!

Bench Description

Prepare Steps

  • We have a table T1 with a single field of type String[120] or VarChar.
  • We add 1 million records with unique values using METHOD( ‘concat( rand_string(80), RecID )’ ).

 So we get a table with only one field and 1 million records, with values about 80-90 characters long.
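To make the shape of the test data concrete, here is a minimal Python sketch of the same idea: 80 random characters followed by the record number. The helper names rand_string and make_values, the alphabet, and the fixed seed are assumptions for illustration only; this is not Valentina's METHOD implementation.

```python
import random
import string


def rand_string(length, rng):
    """Hypothetical stand-in for rand_string(): random ASCII letters and digits."""
    alphabet = string.ascii_letters + string.digits
    return "".join(rng.choices(alphabet, k=length))


def make_values(count, seed=42):
    """Yield `count` values of the form <80 random chars><record id>.

    The random part always has the same length, so two values with
    different record ids can never be equal: uniqueness is guaranteed.
    """
    rng = random.Random(seed)
    for rec_id in range(1, count + 1):
        yield rand_string(80, rng) + str(rec_id)


# A small sample; the bench itself uses 1 million records.
values = list(make_values(1000))
print(values[0], len(values[0]))
```

With 1 million records the record number adds 1 to 7 digits, so values are 81 to 87 characters long, which matches the 80-90 range above.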

Benches

  • Build an index for this field.

Bench Hardware

MacBook Pro 2.2 GHz Intel Core i7 (Early 2011),
RAM 8GB 1333 MHz,
HDD 500GB 7200 rpm, about 45-50 MB/sec.

 

Bench using v5.5.8

=========================================================
Build Index about 120 sec.
=========================================================

Bench using v5.6b10

We implemented an optimization, which improved the time to:
=========================================================
Build Index about 75 sec.
=========================================================

At this point we benchmarked SQLite, after an export/import of the same data.

* SQLite, which uses single-byte encoding, finished the bench in 35-40 sec.

This is expected, because SQLite has half as much data to store on disk. Let me underline: the table has 1 field, so this is a fair comparison of the index-building algorithms. If the table had N fields, Valentina would gain an additional benefit from its columnar format.
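For readers who want to try the SQLite side of the comparison, the following is a rough sketch using Python's standard sqlite3 module. It is not the procedure used for the numbers above: the table and column names (T1, f1), the index name, the row count, and the in-memory database are all assumptions chosen to keep the example short.

```python
import random
import sqlite3
import string
import time


def time_index_build(row_count, db_path=":memory:", seed=1):
    """Fill a one-column table with unique strings and time CREATE INDEX."""
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE T1 (f1 TEXT)")
    # 80 random characters plus the row number, as in the preparation steps.
    rows = (
        ("".join(rng.choices(alphabet, k=80)) + str(i),)
        for i in range(1, row_count + 1)
    )
    conn.executemany("INSERT INTO T1 (f1) VALUES (?)", rows)
    conn.commit()

    # Time only the index build, not the data load.
    start = time.perf_counter()
    conn.execute("CREATE INDEX idx_f1 ON T1 (f1)")
    conn.commit()
    elapsed = time.perf_counter() - start

    index_names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'index'"
        )
    ]
    conn.close()
    return elapsed, index_names


elapsed, index_names = time_index_build(10_000)
print(f"CREATE INDEX on 10,000 rows took {elapsed:.3f} s")
```

To involve the disk as the bench does, pass a file path as db_path and raise row_count to 1_000_000; timings will depend heavily on the hardware and on the SQLite cache settings.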

 

Bench using v5.6b25

We implemented one more optimization, and now the time is:
=========================================================
Build Index about 7-8 sec.
=========================================================

We estimate the best possible time for this task on this hardware at about 5 seconds.
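The numbers above can be checked with a little arithmetic. The sketch below computes the speedups from the reported timings and then a rough lower bound for one sequential pass over the data. The second part rests on assumptions that are not stated in the post: an average value length of 85 characters, 2 bytes per character (suggested by the "half as much data" remark about SQLite), and a disk rate of 45-50 megabytes per second. It is only one plausible way to arrive near the 5-second estimate.

```python
# Timings reported in the post, in seconds.
baseline_v558 = 120.0
time_b10 = 75.0
time_b25_fast, time_b25_slow = 7.0, 8.0

speedup_b10 = baseline_v558 / time_b10            # v5.6b10 vs v5.5.8
speedup_b25_low = baseline_v558 / time_b25_slow   # v5.6b25, slower run
speedup_b25_high = baseline_v558 / time_b25_fast  # v5.6b25, faster run

# Assumptions (not from the post): average length and bytes per character.
records = 1_000_000
avg_chars = 85
bytes_per_char = 2
data_mb = records * avg_chars * bytes_per_char / 1_000_000

# Time for one sequential pass over the data at the quoted disk speed,
# assuming that speed is in megabytes per second.
pass_fast = data_mb / 50.0
pass_slow = data_mb / 45.0

print(f"b10 speedup: {speedup_b10:.1f}x")
print(f"b25 speedup: {speedup_b25_low:.1f}x to {speedup_b25_high:.1f}x")
print(f"data size: {data_mb:.0f} MB, one pass: {pass_fast:.1f}-{pass_slow:.1f} s")
```

Under these assumptions a single pass over roughly 170 MB takes about 3.4 to 3.8 seconds, so 7-8 seconds is within about a factor of two of what the disk alone allows.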

Published by

Ruslan Zasukhin

VP Engineering and New Technology, Paradigma Software, Inc.