Choosing appropriate database’s segment size

Each Valentina’s volume consists of set of segments, even internal service-data placed in such segments.

Database storage is implemented similar to some file system. There are volumes (.vdb, .dat, .blb, .ind, .tmp) and there are some embedded files on that volumes (all the data like field-data, indexes and so on go to that files). Each volume operates with own segment map – so we can find (and allocate new) segments for particular embedded file easy and fast.

Assume, you create some field – we create a new embedded file for this field and ask SegmentMap to find a free (or allocate new) first segment for this file. Then, you create an index for this field – the same scenario – new embedded file and additional segment is allocated for it. And on and on… After that, you start filling field with some data. They go to the first segment (allocated for embedded file) until the segment is full of data and segment map allocates another segment for this file…

Segment map is able to maintain limited segment count – 2^21. So there is a limit for volume size and it depends on a segment size.

for Db.SegmentSize = 1KB max volume size is 2 GB
for Db.SegmentSize = 4KB max volume size is 8 GB
for Db.SegmentSize = 32KB max volume size is 64 GB
for Db.SegmentSize = 512KB max volume size is 1 TB

  •  what is the default size of the segment file?

Default value for database segment size is 32Kb. It means each volume may grow up to 64Gb (32Kb * 2^21). If you think it is not enough for your planned data you may choose a bigger segment size.

  • what size is best for me?

bigger than default value:

Let segment size be 512Kb. The good news is:

– Each volume may grow up to 1Tb;
– There is relatively low level of new segment assignments;
– Db size stays unchanged after adding some more data – until the first segment is full of data and we need to allocate additional segments.

But there are some drawbacks as well:

– Assume your data is about 1KB only, but you choose 512KB as db’s segment size – so, you will get approximately 1.5MB (first segment for service internal data + second segment for your db-structure + third segment for a single field data). All of these is just for storing your 1KB data. Obviously it is too much if your data never grows;

– If you have a lot of fields (let it be 1000) (especially “small” fields like boolean, char(1), integer) you need at least one segment for every field – so, you will get 1000 segments – it would be ~512MB database even without any data;

– Assume your data is slightly bigger then a segment size. Even if it’s 1 byte bigger, the next segment will be allocated (512KB) to store that additional 1 byte.

smaller than default value:

The only reason to choose smaller segment size is reducing disk space usage. Assume, you choose something really small, like 1KB segment size.

Pros:
– You get a smaller db – only few KBytes initially;
– There is a low level of wasting db-space;
– 1000-fields database from example above looks much more compact – ~1MB instead of 512MB.

Cons:
– Each volume may grow up to 2GB only;
– More segment find/allocations which can significantly reduce performance;

  • why do I need to manage this manually?

Default segment size lets you have a db up to 64GB. It is hard to say how much is it – obviously, it depends on db nature. But in average, it seems to be a good compromise between a “db chunk size” (32KB is not so much to spare even to keep that additional single byte) and “growing capabilities”.

But if you plan to keep some genome data or lot of pictures, documents and so on, then, probably, 64GB limit is too low – so you should choose bigger segment size. On the other hand, it could be db for storing preferences which is obviously never be as big as 64GB and may have a lot of “small fields” – so, you should consider smaller segment size.

Anyway, you should remember two things:

    1. There is no easy way to change db’s segment size after db creation – the only way for now is creating another db and copy data with SQL/XML dumps;
    2. The less segment size you have, the less data you can write to the Database.
  • what happens if you hit the max number of segments?

You should get an error, but this area is very complex, complicated, with lot of optimizations. You’d better stay away from this situation. And finally you’ll get no option besides migration data to the new db with dumps.

BTW, starting from v.5.0 you can clone the database overriding source-db properties (including segment size). The syntax is similar to “CREATE DATABASE” statement.

Example :

CLONE DATABASE TO ‘cloned.vdb’ WITH SET PROPERTY [SegmentSize] of DATABASE TO 65536