Dave's Free Press

Technology: Scalar-Type-0.3.1

 


NAME

Scalar::Type


DESCRIPTION

Figure out what type a scalar is


SYNOPSIS

  use Scalar::Type qw(is_number);
  if(is_number(2)) {
      # yep, 2 is a number
      # it is_integer too
  }
  if(is_number("2")) {
      # no, "2" is a string
  }


OVERVIEW

Perl scalars can be either strings or numbers, and normally you don't really care which is which as perl will do all the necessary type conversions automagically. This means that you can perform numeric operations on strings and, provided that they look like numbers, you'll get a sensible result:

    my $string = "4";
    my $number = 1;
    my $result = $string + $number; # 5

But in some rare cases, generally when you are serialising data, the difference matters. This package provides some useful functions to help you figure out what's what. The following functions are available. None of them are exported by default. If you want them all, import ':all':

    use Scalar::Type qw(:all);

and if you just want the 'is_*' functions you can get them all in one go:

    use Scalar::Type qw(is_*);

For Reasons, :is_* is equivalent.


FUNCTIONS

All of these functions require an argument. It is a fatal error to call them without one.

type

Returns the type of its argument.

If the argument is a reference then it returns either blessed($argument) (if it's an object), or 'REF_TO_'.ref($argument).

If the argument is undef then it returns 'UNDEF'.

If you are using perl 5.35.7 or later and the argument is the result of a comparison then it returns 'BOOL'.

Otherwise it looks for the IOK or NOK flags on the underlying SV (see GORY DETAILS for the exact mechanics) and returns INTEGER or NUMBER as appropriate. Finally, if neither of those are set it returns SCALAR.
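
As a rough illustration of the rules above, the following sketch calls type on a handful of values. The My::Class package name is made up for the example, and the results in the comments are what the descriptions above imply rather than captured output:

```perl
use strict;
use warnings;
use Scalar::Type qw(type);

# My::Class is a hypothetical class name, used only for this example
my $object = bless {}, 'My::Class';

print type(1),       "\n";   # INTEGER
print type(1.5),     "\n";   # NUMBER
print type("1"),     "\n";   # SCALAR, because it was created as a string
print type(undef),   "\n";   # UNDEF
print type([]),      "\n";   # REF_TO_ARRAY
print type($object), "\n";   # My::Class
```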

bool_supported

Returns true if the 'BOOL' type is supported on this perl (ie if your perl version is 5.35.7 or later) and false otherwise.

sizeof

Returns the size, in bytes, of the underlying storage for numeric types, and die()s for any other type.

is_integer

Returns true if its argument is an integer. Note that "1" is not an integer, it is a string. 1 is an integer. 1.1 is obviously not an integer. 1.0 is also not an integer, as it makes a different statement about precision - 1 is *exactly* one, but 1.0 is only one to two significant figures.

All integers are of course also numbers.

is_number

Returns true if its argument is a number. "1" is not a number, it is a string. 1 is a number. 1.0 and 1.1 are numbers too.
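
The serialisation case mentioned in the overview might look something like the sketch below. The to_literal helper is hypothetical and not part of this module. It only shows how is_number could be used to decide whether a value needs quoting, and it does not attempt to be a complete serialiser:

```perl
use strict;
use warnings;
use Scalar::Type qw(is_number);

# Hypothetical helper: render a scalar as a JSON-ish literal.
# Numbers are emitted bare, everything else is quoted.
sub to_literal {
    my ($value) = @_;
    return 'null' unless defined $value;
    return "$value" if is_number($value);      # check the type before stringifying
    (my $escaped = $value) =~ s/(["\\])/\\$1/g; # escape quotes and backslashes
    return qq{"$escaped"};
}

print to_literal(42),   "\n";   # 42
print to_literal("42"), "\n";   # "42"
print to_literal(1.5),  "\n";   # 1.5
```

Note that the type check happens before the value is used as a string, so the check sees the scalar as it was passed in.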

is_bool

It is a fatal error to call this on perl versions earlier than 5.35.7.

Returns true if its argument is a Boolean - ie, the result of a comparison.


GORY DETAILS

PERL VARIABLE INTERNALS

As far as Perl code is concerned scalars will present themselves as integers, floats or strings on demand. Internally scalars are stored in a C structure, called an SV (scalar value), which contains several slots. The important ones for our purposes are:

IV
an integer value

UV
an unsigned integer value, only used for ints > MAXINT / 2.

NV
a numeric value (ie a float)

PV
a pointer value (ie a string)

When a value is created one of those slots will be filled. As various operations are done on a value the slot's contents may change, and other slots may be filled.

For example:

    my $foo = "4";        # fill $foo's PV slot, as "4" is a string
    my $bar = $foo + 1;   # fill $bar's IV slot, as 4 + 1 is an int,
                          # and fill $foo's IV slot, as we had to figure
                          # out the numeric value of the string
    $foo = "lemon";       # fill $foo's PV slot, as "lemon" is a string

That last operation immediately shows a problem. $foo's IV slot was filled with the integer value 4, but the assignment of the string "lemon" only filled the PV slot. So what's in the IV slot? There's a handy tool for that, Devel::Peek, which is distributed with perl. Here's part of Devel::Peek's output:

    $ perl -MDevel::Peek -E 'my $foo = 4; $foo = "lemon"; Dump($foo);'
      IV = 4
      PV = 0x7fe6e6c04c90 "lemon"\0

So how, then, does perl know that even though there's a value in the IV slot it shouldn't be used? Because once you've assigned "lemon" to the variable you can't get that 4 to show itself ever again, at least not from pure perl code.

The SV also has a flags field, which I missed out above. (I've also missed out some of the flags here, I'm only showing you the relevant ones):

    $ perl -MDevel::Peek -E 'my $foo = 4; $foo = "lemon"; Dump($foo);'
      FLAGS = (POK)
      IV = 4
      PV = 0x7fe6e6c04c90 "lemon"\0

The POK flag means, as you might have guessed, that the PV slot has valid contents - in case you're wondering, the PV slot there contains a pointer to the memory address 0x7fe6e6c04c90, at which can be found the word lemon.

It's possible to have multiple flags set. That's the case in the second line of code in the example. In that example a variable contains the string "4", so the PV slot is filled and the POK flag is set. We then take the value of that variable, add 1, and assign the result to another variable. Obviously adding 1 to a string is meaningless, so the string has to first be converted to a number. That fills the IV slot:

    $ perl -MDevel::Peek -E 'my $foo = "4"; my $bar = $foo + 1; Dump($foo);'
      FLAGS = (IOK,POK)
      IV = 4
      PV = 0x7fd6e7d05210 "4"\0

Notice that there are now two flags. IOK means that the IV slot's contents are valid, and POK that the PV slot's contents are valid. Why do we need both slots in this case? Because a non-numeric string such as "lemon" is treated as the integer 0 if you perform numeric operations on it.
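
If you would rather test the flags from code than read Devel::Peek's output, the core B module exposes them. This is only a sketch: the ok_flags helper is a made-up name, and it reports just the three flags discussed here, ignoring any others that your perl may set:

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: report which of IOK, NOK and POK are set.
# It takes a reference so that we inspect the variable itself, not a copy.
sub ok_flags {
    my ($ref) = @_;
    my $flags = B::svref_2object($ref)->FLAGS;
    my @set;
    push @set, 'IOK' if $flags & B::SVf_IOK;
    push @set, 'NOK' if $flags & B::SVf_NOK;
    push @set, 'POK' if $flags & B::SVf_POK;
    return join ',', @set;
}

my $foo = "4";
print ok_flags(\$foo), "\n";   # POK

my $bar = $foo + 1;            # numifies $foo as a side effect
print ok_flags(\$foo), "\n";   # IOK,POK

$foo = "lemon";
print ok_flags(\$foo), "\n";   # POK
```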

All that I have said above about IVs also applies to NVs, and you will sometimes come across a variable with both the IV and NV slots filled, or even all three:

    $ perl -MDevel::Peek -E 'my $foo = 1e2; my $bar = $foo + 0; $bar = $foo . ""; Dump($foo)'
      FLAGS = (IOK,NOK,POK)
      IV = 100
      NV = 100
      PV = 0x7f9ee9d12790 "100"\0

Finally, it's possible to have multiple flags set even though the slots contain what looks (to a human) like different values:

    $ perl -MDevel::Peek -E 'my $foo = "007"; $foo + 0; Dump($foo)'
      FLAGS = (IOK,POK)
      IV = 7
      PV = 0x7fcf425046c0 "007"\0

That code initialises the variable to the string "007", then uses it in a numeric operation. That causes the string to be numified, the IV slot to be filled, and the IOK flag set. It should, of course, be clear to any fan of classic literature that "007" and 7 are very different things. "007" is not an integer.

Booleans

In perl 5.35.7 and later, Boolean values - ie the results of comparisons - have some extra magic. As well as their value, which is either 1 (true, an integer) or '' (false, an empty string), they have a flag to indicate their Booleanness. This is exposed via the builtin::isbool perl function so we don't need to do XS voodoo to interrogate it.
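
A small sketch of that, assuming perl 5.36 or later, where the function is spelled builtin::is_bool and is still flagged as experimental (hence the no warnings line):

```perl
use v5.36;
no warnings 'experimental::builtin';

my $comparison = (1 == 1);   # the result of a comparison
my $plain_one  = 1;          # an ordinary integer with the same value

my $first  = builtin::is_bool($comparison) ? 'Boolean' : 'not Boolean';
my $second = builtin::is_bool($plain_one)  ? 'Boolean' : 'not Boolean';

say "comparison: $first";    # comparison: Boolean
say "plain one:  $second";   # plain one:  not Boolean
```

Both variables compare equal to 1. Only the flag tells them apart.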

WHAT Scalar::Type DOES (at least in version 0.1.0)

NB that this section documents an internal function that is not intended for public use. The interface of _scalar_type should be considered to be unstable, not fit for human consumption, and subject to change without notice. This documentation is correct as of version 0.1.0 but may not be updated for future versions - its purpose is pedagogical only.

The is_* functions are just wrappers around the type function. That in turn delegates most of the work to a few lines of C code which grovel around looking at the contents of the individual slots and flags. That function isn't exported, but if you really want to call it directly it's called _scalar_type and will return one of three strings, INTEGER, NUMBER, or SCALAR. It will return SCALAR even for a reference or undef, which is why I said that the type function only *mostly* wraps around it :-)

The first thing that _scalar_type does is look at the IOK flag. If it's set, and the POK flag is not set, then it returns INTEGER. If IOK and POK are set it stringifies the contents of the IV slot, compares to the contents of the PV slot, and returns INTEGER if they are the same, or SCALAR otherwise.

The reason for jumping through those hoops is so that we can correctly divine the type of "007" in the last example above.

If IOK isn't set we then look at NOK. That follows exactly the same logic, looking also at POK, and returning either NUMBER or SCALAR, being careful about strings like "007.5".

If neither IOK nor NOK is set then we return SCALAR.

And what about UVs? They are treated exactly the same as IVs, and a variable with a valid UV slot will have the IOK flag set. It will also have the IsUV flag set, which we use to determine how to stringify the number.
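
The decision procedure described in this section can be approximated in pure perl with the core B module. This is a sketch for illustration only: the real _scalar_type is C code and may differ in detail, sketch_scalar_type is a made-up name, and unlike the real function it takes a reference to the scalar so that it inspects the original rather than a copy:

```perl
use strict;
use warnings;
use B ();

# Hypothetical pure-perl approximation of the logic described above
sub sketch_scalar_type {
    my ($ref) = @_;
    my $sv    = B::svref_2object($ref);
    my $flags = $sv->FLAGS;
    my $pok   = $flags & B::SVf_POK;

    if ($flags & B::SVf_IOK) {
        return 'INTEGER' unless $pok;
        # int_value reads the slot as signed or unsigned as appropriate
        my $int = $sv->int_value;
        return "$int" eq $sv->PV ? 'INTEGER' : 'SCALAR';
    }
    if ($flags & B::SVf_NOK) {
        return 'NUMBER' unless $pok;
        my $num = $sv->NVX;
        return "$num" eq $sv->PV ? 'NUMBER' : 'SCALAR';
    }
    return 'SCALAR';
}

my $bond = "007";
my $sum  = $bond + 0;                     # numifies $bond, setting IOK
print sketch_scalar_type(\$bond), "\n";   # SCALAR, as "7" is not "007"

my $four = "4";
$sum = $four + 1;                         # numifies $four, setting IOK
print sketch_scalar_type(\$four), "\n";   # INTEGER, as "4" round-trips
```

The second case shows the edge case noted under BUGS: a string that looks exactly like an integer is reported as one after it has been used in a numeric operation.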


SEE ALSO

Scalar::Util, in particular its blessed function.

builtin if you have perl 5.35.7 or later.


BUGS

If you find any bugs please report them on Github, preferably with a test case.

Integers that are specified using exponential notation, such as if you say 1e2 instead of 100, are *not* internally treated as integers. The perl parser is lazy and only bothers to convert them into an integer after you perform int-ish operations on them, such as adding 0. Likewise if you add 0 to the thoroughly non-numeric "100" perl will convert it to an integer. These edge cases are partly why you almost certainly don't care about what this module does. If they irk you, complain to p5p.


FEEDBACK

I welcome feedback about my code, especially constructive criticism.


AUTHOR, COPYRIGHT and LICENCE

Copyright 2021 David Cantrell <david@cantrell.org.uk>

This software is free-as-in-speech software, and may be used, distributed, and modified under the terms of either the GNU General Public Licence version 2 or the Artistic Licence. It's up to you which one you use. The full text of the licences can be found in the files GPL2.txt and ARTISTIC.txt, respectively.


CONSPIRACY

This module is also free-as-in-mason software.