git.subgeniuskitty.com - OpenSPARC-T2-DV/.git/blame_incremental - tools/perlmod/SunOS-i386/Bit/Vector/String.pod

Commit	Line	Data
	1
	2	=head1 NAME
	3
	4	Bit::Vector::String - Generic string import/export for Bit::Vector
	5
	6	=head1 SYNOPSIS
	7
	8	use Bit::Vector::String;
	9
	10	to_Oct
	11	$string = $vector->to_Oct();
	12
	13	from_Oct
	14	$vector->from_Oct($string);
	15
	16	new_Oct
	17	$vector = Bit::Vector->new_Oct($bits,$string);
	18
	19	String_Export
	20	$string = $vector->String_Export($type);
	21
	22	String_Import
	23	$type = $vector->String_Import($string);
	24
	25	new_String
	26	$vector = Bit::Vector->new_String($bits,$string);
	27	($vector,$type) = Bit::Vector->new_String($bits,$string);
	28
	29	=head1 DESCRIPTION
	30
	31	=over 2
	32
	33	=item *
	34
	35	C<$string = $vector-E<gt>to_Oct();>
	36
	37	Returns an octal string representing the given bit vector.
	38
	39	Note that this method is not particularly efficient: it is
	40	implemented almost entirely in Perl, and internally builds a
	41	Perl list of individual octal digits which it then concatenates
	42	into the final string using "C<join('', ...)>".
	43
	44	A benchmark reveals that this method is about 40 times slower
	45	than the method "C<to_Bin()>" (which is implemented in C):
	46
	47	Benchmark: timing 10000 iterations of to_Bin, to_Hex, to_Oct...
	48	to_Bin: 1 wallclock secs ( 1.09 usr + 0.00 sys = 1.09 CPU)
	49	to_Hex: 1 wallclock secs ( 0.53 usr + 0.00 sys = 0.53 CPU)
	50	to_Oct: 40 wallclock secs (40.16 usr + 0.05 sys = 40.21 CPU)
	51
	52	Note that since an octal digit always represents three bits,
	53	the number of bits represented by the resulting string is
	54	always a multiple of three, regardless of the true length
	55	(in bits) of the given bit vector.
	56
	57	Also note that the B<LEAST> significant octal digit is
	58	located at the B<RIGHT> end of the resulting string, and
	59	the B<MOST> significant digit at the B<LEFT> end.
	60
	61	Finally, note that this method does B<NOT> prepend any uniquely
	62	identifying format prefix (such as "0o") to the resulting string,
	63	which means that the result of this method contains only valid
	64	octal digits, i.e., [0-7].
	65
	66	If needed, however, such a prefix is easily added yourself,
	67	as follows:
	68
	69	$string = '0o' . $vector->to_Oct();
	70
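The rounding up to whole octal digits described above can be illustrated with a small stand-alone sketch. It does not use Bit::Vector and is not the module's implementation; the helper name C<bin_to_oct()> is hypothetical, and the bit vector is simply modelled as a binary string with the most significant bit on the left.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bit::Vector::String): convert a binary
# string (most significant bit first) into an octal string.
sub bin_to_oct {
    my ($bin) = @_;
    die "not a binary string\n" unless $bin =~ /^[01]+$/;

    # Pad on the left with zeros up to a multiple of three bits.
    my $pad = (3 - length($bin) % 3) % 3;
    $bin = ('0' x $pad) . $bin;

    # Split into groups of three bits and map each group to one octal digit.
    my @digits = map { oct("0b$_") } ($bin =~ /([01]{3})/g);

    # The most significant digit ends up at the left end of the string.
    return join('', @digits);
}

my $octal = bin_to_oct('1010011101');    # a 10-bit value
print "$octal\n";                        # prints "1235": 4 digits, 12 bits
print '0o' . $octal, "\n";               # with a manually added prefix
```

A 10-bit input therefore yields four digits, one more than the nine bits that three digits could hold.
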
	71	=item *
	72
	73	C<$vector-E<gt>from_Oct($string);>
	74
	75	Reads the contents of a bit vector from an octal string, such as
	76	the one returned by the method "C<to_Oct()>" (see above).
	77
	78	Note that this method is not particularly efficient: it is
	79	implemented almost entirely in Perl, and chops the input string
	80	into individual characters using "C<split(//, $string)>".
	81
	82	Remember also that the least significant bits are always to the
	83	right of an octal string, and the most significant bits to the left.
	84	Therefore, the string is actually reversed internally before storing
	85	it in the given bit vector using the method "C<Chunk_List_Store()>",
	86	which expects the least significant chunks of data at the beginning
	87	of a list.
	88
	89	A benchmark reveals that this method is about 40 times slower than
	90	the method "C<from_Bin()>" (which is implemented in C):
	91
	92	Benchmark: timing 10000 iterations of from_Bin, from_Hex, from_Oct...
	93	from_Bin: 1 wallclock secs ( 1.13 usr + 0.00 sys = 1.13 CPU)
	94	from_Hex: 1 wallclock secs ( 0.80 usr + 0.00 sys = 0.80 CPU)
	95	from_Oct: 46 wallclock secs (44.95 usr + 0.00 sys = 44.95 CPU)
	96
	97	If the given string contains any character which is not an octal digit
	98	(i.e., [0-7]), a fatal syntax error ensues ("unknown string type").
	99
	100	Note especially that this method does B<NOT> accept any uniquely
	101	identifying format prefix (such as "0o") in the given string; the
	102	presence of such a prefix will also lead to the fatal "unknown
	103	string type" error.
	104
	105	If the given string contains fewer octal digits than are needed to
	106	completely fill the given bit vector, the remaining (most significant)
	107	bits are all cleared (i.e., set to zero).
	108
	109	This also means that, even if the given string does not contain
	110	enough digits to completely fill the given bit vector, the previous
	111	contents of the bit vector are erased completely.
	112
	113	If the given string is longer than needed to fill the given bit
	114	vector, the superfluous characters are simply ignored.
	115
	116	This behaviour is intentional so that you may read in the string
	117	representing one bit vector into another bit vector of different
	118	size, i.e., as much of it as will fit.
	119
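The rules above (validation, reversal, zero fill and truncation) can be summarized in a stand-alone sketch that models the bit vector as a plain binary string. The helper C<oct_to_bits()> is hypothetical and is not the module's code. In particular, it assumes that the digits which do not fit are the leftmost (most significant) ones, which is what filling from the least significant end suggests but which the text above does not state explicitly.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's code): emulate the documented
# behaviour of from_Oct() on a binary string of $bits characters.
sub oct_to_bits {
    my ($bits, $string) = @_;

    # Any character outside [0-7], including a "0o" prefix, is fatal.
    die "unknown string type\n" unless $string =~ /^[0-7]+$/;

    # Walk the digits from least to most significant and expand each one
    # into three bits, least significant bit first.
    my $lsb_first = '';
    for my $digit (reverse split(//, $string)) {
        $lsb_first .= scalar reverse(sprintf('%03b', $digit));
    }

    # Missing digits leave the high bits cleared; digits that do not fit
    # are dropped (assumption: the most significant ones).
    $lsb_first = substr($lsb_first . ('0' x $bits), 0, $bits);

    return scalar reverse($lsb_first);    # most significant bit first
}

print oct_to_bits(8, '17'),  "\n";    # short string: high bits stay zero
print oct_to_bits(4, '177'), "\n";    # long string: only four bits are kept
```
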
	120	=item *
	121
	122	C<$vector = Bit::Vector-E<gt>new_Oct($bits,$string);>
	123
	124	This method is an alternative constructor which allows you to create
	125	a new bit vector object (with "C<$bits>" bits) and to initialize it
	126	all in one go.
	127
	128	The method internally first calls the bit vector constructor method
	129	"C<new()>" and then stores the given string in the newly created
	130	bit vector using the same approach as the method "C<from_Oct()>"
	131	(described above).
	132
	133	Note that this approach is not particularly efficient: it is
	134	implemented almost entirely in Perl, and chops the input string
	135	into individual characters using "C<split(//, $string)>".
	136
	137	An exception will be raised if the necessary memory cannot be allocated
	138	(see the description of the method "C<new()>" in L<Bit::Vector(3)> for
	139	possible causes) or if the given string cannot be converted successfully
	140	(see the description of the method "C<from_Oct()>" above for details).
	141
	142	Note especially that this method does B<NOT> accept any uniquely
	143	identifying format prefix (such as "0o") in the given string and that
	144	such a prefix will lead to a fatal "unknown string type" error.
	145
	146	In case of an error, the memory occupied by the new bit vector is
	147	released again before the exception is actually thrown.
	148
	149	If the number of bits "C<$bits>" given has the value "C<undef>",
	150	the method will automatically allocate a bit vector with a size
	151	(i.e., number of bits) of three times the length of the given string
	152	(since every octal digit is worth three bits).
	153
	154	Note that this behaviour is different from that of the methods
	155	"C<new_Hex()>", "C<new_Bin()>", "C<new_Dec()>" and "C<new_Enum()>"
	156	(which are implemented in C); these methods will silently
	157	assume a value of 0 bits if "C<undef>" is given (and may warn
	158	about the "Use of uninitialized value" if warnings are enabled).
	159
	160	=item *
	161
	162	C<$string = $vector-E<gt>String_Export($type);>
	163
	164	Returns a string representing the given bit vector in the
	165	format specified by "C<$type>":
	166
	167	1 | b | bin     => binary        (using "to_Bin()")
	168	2 | o | oct     => octal         (using "to_Oct()")
	169	3 | d | dec     => decimal       (using "to_Dec()")
	170	4 | h | hex | x => hexadecimal   (using "to_Hex()")
	171	5 | e | enum    => enumeration   (using "to_Enum()")
	172	6 | p | pack    => packed binary (using "Block_Read()")
	173
	174	The case (lower/upper/mixed case) of "C<$type>" is ignored.
	175
	176	If "C<$type>" is omitted or "C<undef>" or false ("0"
	177	or the empty string), a hexadecimal string is returned
	178	as the default format.
	179
	180	If "C<$type>" does not have any of the values described
	181	above, a fatal "unknown string type" error will occur.
	182
	183	Beware that in order to guarantee that the strings can
	184	be correctly parsed and read in by the methods
	185	"C<String_Import()>" and "C<new_String()>" (described
	186	below), the method "C<String_Export()>" provides
	187	uniquely identifying prefixes (and, in one case,
	188	a suffix) as follows:
	189
	190	1 | b | bin     => '0b' . $vector->to_Bin();
	191	2 | o | oct     => '0o' . $vector->to_Oct();
	192	3 | d | dec     => $vector->to_Dec();    # prefix is [+-]
	193	4 | h | hex | x => '0x' . $vector->to_Hex();
	194	5 | e | enum    => '{' . $vector->to_Enum() . '}';
	195	6 | p | pack    => ':' . $vector->Size() .
	196	                   ':' . $vector->Block_Read();
	197
	198	This is necessary because certain strings can be valid
	199	representations in more than one format.
	200
	201	All strings in binary format, i.e., which only contain "0"
	202	and "1", are also valid number representations (of a different
	203	value, of course) in octal, decimal and hexadecimal.
	204
	205	Likewise, a string in octal format is also valid in decimal
	206	and hexadecimal, and a string in decimal format is also valid
	207	in hexadecimal.
	208
	209	Moreover, if the enumeration of set bits (as returned by
	210	"C<to_Enum()>") only contains one element, this element could
	211	be mistaken for a representation of the entire bit vector
	212	(instead of just one bit) in decimal.
	213
	214	Beware also that the string returned by format "6" ("packed
	215	binary") will in general B<NOT BE PRINTABLE>, because it will
	216	usually consist of many unprintable characters!
	217
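The way "C<$type>" is interpreted (aliases, case-insensitivity and the hexadecimal default) can be sketched with a simple lookup table. This is only an illustration built from the alias list above. The names C<%TYPE> and C<normalize_type()> are hypothetical and do not exist in the module.

```perl
use strict;
use warnings;

# Hypothetical lookup table built from the documented alias list.
my %TYPE;
for my $row (
    [ 1, qw(1 b bin)   ],
    [ 2, qw(2 o oct)   ],
    [ 3, qw(3 d dec)   ],
    [ 4, qw(4 h hex x) ],
    [ 5, qw(5 e enum)  ],
    [ 6, qw(6 p pack)  ],
) {
    my ($number, @names) = @$row;
    $TYPE{$_} = $number for @names;
}

# Hypothetical helper: map a user-supplied $type to its format number.
sub normalize_type {
    my ($type) = @_;
    return 4 unless $type;               # omitted, undef, "0" or "": hex
    my $number = $TYPE{ lc $type };      # case is ignored
    die "unknown string type\n" unless defined $number;
    return $number;
}

print normalize_type('OCT'), "\n";       # prints 2
print normalize_type(undef), "\n";       # prints 4 (the default)
```
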
	218	=item *
	219
	220	C<$type = $vector-E<gt>String_Import($string);>
	221
	222	Reads the contents of a bit vector from a string which has
	223	previously been produced by "C<String_Export()>", "C<to_Bin()>",
	224	"C<to_Oct()>", "C<to_Dec()>", "C<to_Hex()>", "C<to_Enum()>" or
	225	"C<Block_Read()>", or which has been written by hand or by
	226	another program.
	227
	228	Beware however that the string must have the correct format;
	229	otherwise a fatal "unknown string type" error will occur.
	230
	231	The correct format is the one returned by "C<String_Export()>"
	232	(see immediately above).
	233
	234	The method will also try to automatically recognize formats
	235	without an identifying prefix, such as those returned by the
	236	methods "C<to_Bin()>", "C<to_Oct()>", "C<to_Dec()>", "C<to_Hex()>"
	237	and "C<to_Enum()>".
	238
	239	However, as explained above for the method "C<String_Export()>",
	240	due to the fact that a string may be a valid representation in
	241	more than one format, this may lead to unwanted results.
	242
	243	The method will try to match the format of the given string
	244	in the following order:
	245
	246	If the string consists only of [01], it will be considered
	247	to be in binary format (although it could be in octal, decimal
	248	or hexadecimal format or even be an enumeration with only
	249	one element as well).
	250
	251	If the string consists only of [0-7], it will be considered
	252	to be in octal format (although it could be in decimal or
	253	hexadecimal format or even be an enumeration with only
	254	one element as well).
	255
	256	If the string consists only of [0-9], it will be considered
	257	to be in decimal format (although it could be in hexadecimal
	258	format or even be an enumeration with only one element as well).
	259
	260	If the string consists only of [0-9A-Fa-f], it will be considered
	261	to be in hexadecimal format.
	262
	263	If the string only contains numbers in decimal format, separated
	264	by commas (",") or dashes ("-"), it is considered to be an
	265	enumeration (a single decimal number also qualifies).
	266
	267	And if the string starts with ":[0-9]+:" (a colon-delimited size),
	268	the remainder of the string is read in with "C<Block_Store()>".
	269
	270	To avoid misinterpretations, it is therefore advisable either
	271	to always use the method "C<String_Export()>" or to provide
	272	some uniquely identifying prefix (and suffix, in one case)
	273	yourself:
	274
	275	binary        => '0b' . $string;
	276	octal         => '0o' . $string;
	277	decimal       => '+'  . $string;    # in case "$string"
	278	              => '-'  . $string;    # has no sign yet
	279	hexadecimal   => '0x' . $string;
	280	              => '0h' . $string;
	281	enumeration   => '{'  . $string . '}';
	282	              => '['  . $string . ']';
	283	              => '<'  . $string . '>';
	284	              => '('  . $string . ')';
	285	packed binary => ':'  . $vector->Size() .
	286	                 ':'  . $vector->Block_Read();
	287
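The matching order for strings without a prefix, and the ambiguity it causes, can be sketched as a chain of pattern tests where the first match wins. This is a simplified illustration, not the module's code: the helper C<guess_format()> is hypothetical, and the patterns are an approximation of the rules above that ignores prefixes, signs and bracketed enumerations.

```perl
use strict;
use warnings;

# Hypothetical helper: guess the format number of an unprefixed string,
# testing the patterns in the documented order. The first match wins.
sub guess_format {
    my ($string) = @_;
    return 1 if $string =~ /^[01]+$/;                     # binary
    return 2 if $string =~ /^[0-7]+$/;                    # octal
    return 3 if $string =~ /^[0-9]+$/;                    # decimal
    return 4 if $string =~ /^[0-9A-Fa-f]+$/;              # hexadecimal
    return 5 if $string =~ /^[0-9]+(?:[,-][0-9]+)*$/;     # enumeration
    return 6 if $string =~ /^:[0-9]+:/;                   # packed binary
    die "unknown string type\n";
}

# The string "10" was perhaps meant as decimal ten, but matches binary first.
printf "%-6s => %d\n", $_, guess_format($_) for qw(10 17 19 1F 1,5-9);
```

With an explicit prefix such as '+' or '0x' this guesswork is not needed.
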
	288	Note that case (lower/upper/mixed case) is not important
	289	and will be ignored by this method.
	290
	291	Internally, the method uses the methods "C<from_Bin()>",
	292	"C<from_Oct()>", "C<from_Dec()>", "C<from_Hex()>",
	293	"C<from_Enum()>" and "C<Block_Store()>" for actually
	294	importing the contents of the string into the given
	295	bit vector. See their descriptions here in this document
	296	and in L<Bit::Vector(3)> for any further conditions that
	297	must be met and corresponding possible fatal error messages.
	298
	299	The method returns the number of the format that has been
	300	recognized:
	301
	302	1 => binary
	303	2 => octal
	304	3 => decimal
	305	4 => hexadecimal
	306	5 => enumeration
	307	6 => packed binary
	308
	309	=item *
	310
	311	C<$vector = Bit::Vector-E<gt>new_String($bits,$string);>
	312
	313	C<($vector,$type) = Bit::Vector-E<gt>new_String($bits,$string);>
	314
	315	This method is an alternative constructor which allows you to create
	316	a new bit vector object (with "C<$bits>" bits) and to initialize it
	317	all in one go.
	318
	319	The method internally first calls the bit vector constructor method
	320	"C<new()>" and then stores the given string in the newly created
	321	bit vector using the same approach as the method "C<String_Import()>"
	322	(described immediately above).
	323
	324	An exception will be raised if the necessary memory cannot be allocated
	325	(see the description of the method "C<new()>" in L<Bit::Vector(3)> for
	326	possible causes) or if the given string cannot be converted successfully
	327	(see the description of the method "C<String_Import()>" above for details).
	328
	329	In case of an error, the memory occupied by the new bit vector is
	330	released again before the exception is actually thrown.
	331
	332	If the number of bits "C<$bits>" given has the value "C<undef>", the
	333	method will automatically determine this value for you and allocate
	334	a bit vector of the calculated size.
	335
	336	Note that this behaviour is different from that of the methods
	337	"C<new_Hex()>", "C<new_Bin()>", "C<new_Dec()>" and "C<new_Enum()>"
	338	(which are implemented in C); these methods will silently
	339	assume a value of 0 bits if "C<undef>" is given (and may warn
	340	about the "Use of uninitialized value" if warnings are enabled).
	341
	342	The necessary number of bits is calculated as follows:
	343
	344	binary        => length($string);
	345	octal         => 3 * length($string);
	346	decimal       => int( length($string) * log(10) / log(2) + 1 );
	347	hexadecimal   => 4 * length($string);
	348	enumeration   => (maximum of values found in $string) + 1
	349	packed binary => $1 from $string =~ /^:(\d+):/;
	350
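The size calculation in the table above can be written out as a small stand-alone function. This is a sketch, not the module's code. The helper C<bits_needed()> is hypothetical, takes the format number as its first argument, and assumes that "C<$string>" carries no prefix or sign (except for packed binary, where the size is read from the leading ":n:" header).

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: number of bits implied by a string of a given format.
sub bits_needed {
    my ($type, $string) = @_;
    return length($string)     if $type == 1;    # one bit per digit
    return 3 * length($string) if $type == 2;    # three bits per digit
    return int(length($string) * log(10) / log(2) + 1)
                               if $type == 3;    # estimate from digit count
    return 4 * length($string) if $type == 4;    # four bits per digit
    return max($string =~ /(\d+)/g) + 1
                               if $type == 5;    # highest index plus one
    return $1 if $type == 6 && $string =~ /^:(\d+):/;
    die "unknown string type\n";
}

print bits_needed(2, '17'),    "\n";    # prints 6
print bits_needed(3, '255'),   "\n";    # prints 10
print bits_needed(5, '1,5-9'), "\n";    # prints 10
```

Note that the decimal formula is an estimate based on the number of digits only, not an exact count: "255" would fit in 8 bits as an unsigned value, yet the formula yields 10.
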
	351	If called in scalar context, the method returns the newly created
	352	bit vector object.
	353
	354	If called in list context, the method additionally returns the
	355	number of the format which has been recognized, as explained
	356	above for the method "C<String_Import()>".
	357
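The following sketch shows how the constructors and "C<String_Export()>" described in this document might be combined. It uses only the calls listed in the SYNOPSIS. The module is loaded inside an "C<eval>" (an addition made here purely so that the script also runs where Bit::Vector is not installed), and the variable names are arbitrary.

```perl
use strict;
use warnings;

# Guarded load, so the script also runs where the module is not installed.
my $have_module = eval { require Bit::Vector::String; 1 };

my ($exported, $type, $round_trip);
if ($have_module) {
    # With undef as the size, new_Oct() derives it from the string length.
    my $vector = Bit::Vector->new_Oct(undef, '17');
    $exported  = $vector->String_Export('oct');    # carries the "0o" prefix

    # List context: the recognized format number is returned as well.
    (my $copy, $type) = Bit::Vector->new_String(undef, $exported);
    $round_trip = $copy->to_Oct();

    print "$exported -> type $type -> $round_trip\n";
}
else {
    print "Bit::Vector::String is not installed; nothing to demonstrate\n";
}
```

Because the exported string carries its prefix, "C<new_String()>" does not have to guess the format.
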
	358	=back
	359
	360	=head1 SEE ALSO
	361
	362	Bit::Vector(3), Bit::Vector::Overload(3).
	363
	364	=head1 VERSION
	365
	366	This man page documents "Bit::Vector::String" version 6.4.
	367
	368	=head1 AUTHOR
	369
	370	Steffen Beyer
	371	mailto:sb@engelschall.com
	372	http://www.engelschall.com/u/sb/download/
	373
	374	=head1 COPYRIGHT
	375
	376	Copyright (c) 2004 by Steffen Beyer. All rights reserved.
	377
	378	=head1 LICENSE
	379
	380	This package is free software; you can redistribute it and/or
	381	modify it under the same terms as Perl itself, i.e., under the
	382	terms of the "Artistic License" or the "GNU General Public License".
	383
	384	The C library at the core of this Perl module can additionally
	385	be redistributed and/or modified under the terms of the "GNU
	386	Library General Public License".
	387
	388	Please refer to the files "Artistic.txt", "GNU_GPL.txt" and
	389	"GNU_LGPL.txt" in this distribution for details!
	390
	391	=head1 DISCLAIMER
	392
	393	This package is distributed in the hope that it will be useful,
	394	but WITHOUT ANY WARRANTY; without even the implied warranty of
	395	MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
	396
	397	See the "GNU General Public License" for more details.
	398