We have realize problem of unicode text normalization and have implement tools into 4.9.1 to work with normalization issue. Detailed description of this issue is given on the following wiki page.
Category: Valentina KERNEL
[DEPRECATED] I_Field::put_ID()
We have note that put_ID() should not be in the public interfaces of Schema Objects, because developer should not be able change ID of a scheme object.
We have check our sources, and found that only I_Field interface did have such method. So we move it into internal I_FieldEx interface. In the C++ header this method marked as DEPRECATED, and do nothing now. We believe that nobody from C++ developers have used it.
We will consider future ability to do “SET PROPERTY ID of Object name” via SQL, but this will be recommended for use by our own SQL scripts only.
[NEW] VLink.Count Property
We have discover that interface of VLink miss Count property to easy ask VLink how much pairs it contains.
Now this property is implemented in the vkernel. It works for all kinds of links (Binary, ObjectPtr and ForeignKey). It is implemented in effective way — only header of index is asked actually.
Adding of this property do minor extension of VSERVER protocol also.
[Improve] Warnings.log gets Header with Description
We have realize that if to add simple header to Warnings.log that describes what this log contains, when it should be used, and how it can be disabled for RELEASE then this will help to developers and remove often questions about this.
Index search warnings
There are many find-like methods which are able to employ indexes to speed up the searching (FindLike, FindRegEx, FindContains, and so on). Of course, the field must be indexed and search preference is set to kPreferIndexed.
But currently there are two issues when index search may bring a wrong result:
1.Index keeps only first 256 symbols of the value – so rest of the symbols are just ignored in such a search.
2. Index-By-Words index and multi-word searching pattern.
Some of the methods like FindRegEx don’t use an index-search but scan the table instead if any of the above occurred. The rest of the methods are less strict choosing the searching algorithm. Now you will get the warning in such cases and you can switch algorithm to not-index search (kPreferNotIndexed) if it is relevant (less speedy but more accurate). Also you should consider to change index type of such a lengthly fields to the index-by-words.
Warning examples:
“Index search in the first 256 symbols only : FindContains t1:f1”
“Index-By-Words search : FindContains t1:f1”
[TIP] Partial Index in Valentina Database.
Today one of user have asked if Valentina supports partial index. Answer is YES OF COURSE, you can easy create a partial index using Table Methods.
NEW TIP about Reindex Event to Improve Performance
Index fragmentation can cause problems with query performance. Indexes therefore need to be occasionally rebuilt. Read the full article in the “Valentina Performance Guide” document.
New LogFilesLocation property
There is a new property – LogFilesLocation for both server and local.
It defines a folder where the logs should be placed in.
Default folder is “./vlogs”
Altering ObjectPtr
ObjectPtr field contains RecID values of some “target” table. If you need to change ObjectPtr in order to point another table those values are not valid anymore – so default behavior is to nullify them. But sometime it looks to be too strong. For instance – changing field or link name or on-delete policy should not cause such data zeroing.
There are two ways to keep existed values altering ObjectPtr:
- There is a global property – OnChangeTypeZeroPtr. It is set to true by default. But you may change it with following statement – “SET PROPERTY OnChangeTypeZeroPtr TO false”. After this altering ObjectPtr will prevent zeroing data (sure it will be nulls for records which are absent in the new target table). This property is not “schema-storable” – so you should set it up on each run if you need it.
- The first approach is a global setting – so you will be able to keep some ObjectPtr values even changing “target” table. So we implement one more way – much more simple because the only thing you should keep in your mind is – the values will be preserved until you touch “target table” ObjectPtr property (because all values will be valid for the same target anyway).
NEW Function Added: hash MD5()
Ivan have implement today MD5() function, which can be used in EXPRESSION of Valentina SQL.
Bart have asked this function to be able produce UNIQUE INDEX for strings/texts bigger of 128 chars. This limit comes from fact that indexes of Valentina use pascal-like storage format for strings, e.g. 256 bytes maximum, 128 for UTF16.
SPEED of DB Open for Database with 1000+ Tables: -20 sec
44 Seconds for that 1023 tables DB still not looks good for me, and I have continue search for ways to improve it. And after one more day of work time have go down to 34 sec from original 54.
And this new optimization is even better! It is also made on INDEX-level and affects absolutely ANY index search operation!! Especially we will see effect for searches that return many records into Set.
I see yet two possible steps to optimize speed of open for database.
* [minor] – …
* [major] – start to cache data in the algorithms that read schema from recursive sysItem table.
Target is to make time few seconds only.
SPEED of DB Open for Database with 1000+ Tables: -10 sec
One developer have produce Valentina database with 1023 tables. And note us that it takes long time to open it. On my macbook it have take 54 sec.
After some profiling and optimization, time is now 44 sec.
Nice thing here is that optimization was made on INDEX-level for functions that return ArraySets. This means that all other index-based operations may benefit from this.
[NEW] REGEX_REPLACE() function.
In 4.5.1 release we have add a new REGEX_REPLACE() function to Valentina Expresions, which can be used in Table Methods, SQL and therefore, in Reports.
Up to now, we was able to do searches with the help of REGEX operator:
… WHERE fld REGEX ‘ab.[cdf]*’
New function brings the awesome power of REGEX to manipulate by strings stored in the database! Now we can not only search using REGEX, but modify strings! You can do that in the
- SELECT EXPR, …
- UPDATE TABLE T SET fld = EXPR
where EXPR is REGEX_REPLACE() function or contains it as a part.
Read details about this function here.
FindContains() speed improved x10 times
Kem Tekinay have found that FindContains is not fast enough. We was able speed up this place x10 times. Indexed search on table with 7 million rows have go down from 46 sec to 4.6 sec.
You can get this using
- API – VField.FindContains()
- SQL – WHERE LOCATE( “what”, fld ) <> 0
Working with big vDatabases
Valentina is a perfect choice to work with a big data. It employs vertical storage approach. It means – each column has own virtual file to be stored in. Most of db vendors use a horizontal storage – it stores a table data in record-by-record manner.
There are some vertical storage advantages: