Dave's Free Press

Technology: Data-CompactReadonly-0.0.5

 


NAME

Data::CompactReadonly


DESCRIPTION

A Compact Read Only Database that consumes very little memory. Once created, a database cannot practically be updated except by rewriting the whole thing. The aim is for random-access read performance to be on a par with DBM::Deep and for files to be much smaller.


VERSION 'NUMBERS'

This module uses semantic versioning. That means that the version 'number' isn't really a number but has three parts: major.minor.patch.

The major number will increase when the API changes incompatibly;

The minor number will increase when backward-compatible additions are made to the API;

The patch number will increase when bugs are fixed backward-compatibly.
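
Because the three parts are compared separately, a version such as 0.10.0 is newer than 0.9.0 even though a plain text comparison says otherwise. A minimal sketch using Perl's core version module (an illustration only, not something this module requires you to use) shows the difference:

```perl
use strict;
use warnings;
use version;

# Dotted versions compare part by part: minor 10 is newer than minor 9.
my $older = version->declare('0.9.0');
my $newer = version->declare('0.10.0');

my $dotted_says_newer = ($newer > $older) ? 1 : 0;

# A plain string comparison gets this wrong, because '1' sorts before '9'.
my $string_says_newer = ('0.10.0' gt '0.9.0') ? 1 : 0;

print "dotted comparison says newer: $dotted_says_newer\n";
print "string comparison says newer: $string_says_newer\n";
```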


FILE FORMAT VERSIONS

All versions so far support file format version 0 only.

See Data::CompactReadonly::V0::Format for details of what that means.


METHODS

create

Takes two arguments, the name of a file into which to write a database, and some data. The data can be undef, a number, some text, or a reference to an array or hash that in turn consists of undefs, numbers, text, references to arrays or hashes, and so on ad infinitum.
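
As a rough sketch of a call, assuming create is invoked as a class method (the text above only specifies the two arguments), and using a temporary file whose name is purely illustrative:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Data::CompactReadonly;

# A nested structure built only from the supported types:
# undef, numbers, text, and references to arrays and hashes.
my $data = {
    name    => 'example',
    version => 3,
    missing => undef,
    tags    => [ 'small', 'read-only' ],
    nested  => { depth => [ 1, [ 2, [ 3 ] ] ] },
};

# The file name is arbitrary; a temporary file keeps the sketch tidy.
my (undef, $filename) = tempfile(SUFFIX => '.db', UNLINK => 1);

# Assumption: create() is called on the class, file name first, data second.
Data::CompactReadonly->create($filename, $data);

my $size = -s $filename;
print "wrote $size bytes to $filename\n";
```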

This method may be very slow. It constructs a file by making lots of little writes and seek()ing all over the place. It doesn't do anything clever to figure out what pointer size to use; it just tries the shortest first and, if that's not enough, tries again and again, bigger each time. See Data::CompactReadonly::V0::Format for more on pointer sizes. It may also eat lots of memory: it keeps a cache of everything it has seen while building your database, so that it can re-use data by just pointing at it instead of writing multiple copies of the same data into the file.

Note that it will carefully preserve things that look like numbers but have extraneous leading or trailing zeroes. "007", for instance, is text, not a number; the leading zeroes are important. And while 7.10 is a number, the extra zero has meaning: it tells you that the value is accurate to three significant figures. If it were stored as a number, it would be retrieved as merely 7.1, accurate to only two significant figures. We are happy to spend a little extra storage in the interests of correctly storing your data. If you then go on to just treat 7.10 as a number in perl, and so as equivalent to 7.1, that is of course up to you.
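
The distinction is ordinary Perl behaviour rather than anything specific to this module. A small sketch with plain scalars (the variable names are just for illustration) shows what is lost once such a value is treated as a number:

```perl
use strict;
use warnings;

# Kept as text, the zeroes survive.
my $agent   = '007';
my $reading = '7.10';

# Forcing numeric context discards them.
my $agent_as_number   = $agent + 0;
my $reading_as_number = $reading + 0;

print "text:   $agent, $reading\n";                      # text:   007, 7.10
print "number: $agent_as_number, $reading_as_number\n";  # number: 7, 7.1
```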

Finally, while the file format permits numeric keys in hashes, this method always coerces them to text. This is because if you allow numeric keys, numbers that can't be represented in an int, such as 1e100 or 3.14, will be subject to floating point imprecision, and so it is unlikely that you will ever be able to retrieve them, as no exact match is possible.
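
The underlying problem can be illustrated in plain Perl, without this module. The sketch below assumes a typical perl built with double-precision floats; it compares two ways of arriving at "the same" number, first numerically and then as text keys:

```perl
use strict;
use warnings;

# Two routes to "the same" number that differ in their last binary digits.
my $sum     = 0.1 + 0.2;
my $literal = 0.3;

# Numeric equality fails, so an exact numeric key match would miss.
my $numeric_match = ($sum == $literal) ? 1 : 0;

# Perl hash keys are always text, and both values stringify identically.
my %lookup = ($literal => 'found');
my $text_match = exists $lookup{$sum} ? 1 : 0;

print "numeric match: $numeric_match\n";
print "text match:    $text_match\n";
```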

read

Takes a single compulsory argument, which is a filename or an already open filehandle, followed by some options.

If the first argument is a filehandle, the current file pointer should be at the start of the database (not necessarily at the start of the file; the database could be in a __DATA__ segment) and the handle must have been opened in "just the bytes ma'am" mode.

It is a fatal error to pass in a filehandle which was not opened correctly or the name of a file that can't be opened or which doesn't contain a valid database.
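
A sketch of reading from a filehandle follows. It assumes that read is called as a class method, and that calling binmode on the handle is enough to satisfy the "just the bytes" requirement; the file is a temporary one created only for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Data::CompactReadonly;

my (undef, $filename) = tempfile(SUFFIX => '.db', UNLINK => 1);
Data::CompactReadonly->create($filename, [ 'alpha', 'beta', 'gamma' ]);

# Open for reading and switch to raw bytes before handing the handle over.
# Assumption: binmode() is what "just the bytes" mode amounts to.
open(my $fh, '<', $filename) or die "can't open $filename: $!";
binmode($fh);

# The file pointer is at the start of the database, as required.
my $root = Data::CompactReadonly->read($fh);
my $elements = $root->count();

# Bad input is documented as fatal, so eval can be used to trap it.
my $error_seen = eval { Data::CompactReadonly->read('/no/such/file'); 1 } ? 0 : 1;

print "elements: $elements\n";
print "bad file name was rejected\n" if $error_seen;
```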

The options are name/value pairs. Valid options are:

tie
If true, return tied objects instead of normal objects. This means that you will be able to access data by de-referencing and pretending to access elements directly. Under the bonnet this wraps around the objects as documented below, so is just a layer of indirection. On modern hardware you probably won't notice the concomitant slow-down but may appreciate the convenience.

fast_collections
If true, Dictionary keys and values will be permanently cached in memory the first time they are seen, instead of being fetched from the file when needed. Yes, this means that objects will grow in memory, potentially becoming very large. Only use this if that is an acceptable trade-off for much faster access.

This is not yet implemented for Arrays.

Returns the "root node" of the database. If that root node is a number, some piece of text, or Null, then it is decoded and the value returned. Otherwise an object (possibly a tied object) representing an Array or a Dictionary is returned.
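
The options can be sketched as follows, again assuming the class-method calling form. The data and variable names are invented for the example; the first read asks for tied objects and the second asks for a cached Dictionary accessed through the object interface described under OBJECTS.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Data::CompactReadonly;

my (undef, $filename) = tempfile(SUFFIX => '.db', UNLINK => 1);
Data::CompactReadonly->create($filename, {
    colours => [ 'red', 'green', 'blue' ],
    answer  => 42,
});

# Options follow the file name as name/value pairs.
my $tied = Data::CompactReadonly->read($filename, tie => 1);

# With tie => 1 the root can be used like an ordinary hash reference.
my $answer = $tied->{answer};
my $second = $tied->{colours}->[1];
my @keys   = sort keys %{$tied};

# Without tie, the same data is reached through method calls.
my $fast        = Data::CompactReadonly->read($filename, fast_collections => 1);
my $fast_answer = $fast->element('answer');

print "answer: $answer (tied), $fast_answer (object)\n";
print "second colour: $second\n";
print "keys: @keys\n";
```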


OBJECTS

If you asked for normal objects to be returned instead of tied objects, then these are sub-classes of either Data::CompactReadonly::Array or Data::CompactReadonly::Dictionary. Both implement the following methods:

id

Returns a unique id for this object within the database. Note that circular data structures are supported, and looking at the id is the only way to detect them.

This is not accessible when using tied objects.

count

Returns the number of elements in the structure.

indices

Returns a list of all the available indices in the structure.

element

Takes a single argument, which must match one of the values that would be returned by indices, and returns the associated data.

If the data is a number, Null, or text, the value will be returned directly. If the data is in turn another array or dictionary, an object will be returned.

exists

Takes a single argument and tells you whether an index exists for it. It will still die if you ask it for something stupid such as a floating point array index or a Null dictionary entry.
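
A short sketch of these methods on a small database follows. The data is invented for the example, and because the text above only says that indices returns "a list", the code flattens the result so that it also copes if an array reference comes back instead.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Data::CompactReadonly;

my (undef, $filename) = tempfile(SUFFIX => '.db', UNLINK => 1);
Data::CompactReadonly->create($filename, {
    fruit => [ 'apple', 'banana' ],
    total => 2,
});

my $root = Data::CompactReadonly->read($filename);

# For a Dictionary the indices are its keys. Flatten defensively in case
# an array reference is returned rather than a plain list (an assumption).
my @indices = sort map { ref($_) eq 'ARRAY' ? @{$_} : $_ } $root->indices();

# Nested arrays and dictionaries come back as objects, scalars as values.
my $fruit       = $root->element('fruit');
my $first_fruit = $fruit->element(0);
my $has_veg     = $root->exists('veg') ? 'yes' : 'no';

printf "root has %d entries: %s\n", $root->count(), join(', ', @indices);
printf "fruit has %d elements, the first is %s\n", $fruit->count(), $first_fruit;
print  "has a 'veg' key: $has_veg\n";
print  "root id: ", $root->id(), "\n";
```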


UNSUPPORTED PERL TYPES

Globs, Regexes, References (except to Arrays and Dictionaries)


BUGS/FEEDBACK

Please report bugs at https://github.com/DrHyde/perl-modules-Data-CompactReadonly/issues, including, if possible, a test case.


SEE ALSO

DBM::Deep if you need updateable databases.


SOURCE CODE REPOSITORY

git://github.com/DrHyde/perl-modules-Data-CompactReadonly.git


AUTHOR, COPYRIGHT and LICENCE

Copyright 2020 David Cantrell <david@cantrell.org.uk>

This software is free-as-in-speech software, and may be used, distributed, and modified under the terms of either the GNU General Public Licence version 2 or the Artistic Licence. It's up to you which one you use. The full text of the licences can be found in the files GPL2.txt and ARTISTIC.txt, respectively.


CONSPIRACY

This module is also free-as-in-mason software.