CXC::Data::Visitor - Invoke a callback on every element at every level of a data structure.
version 0.09
use v5.36;    # enables say and subroutine signatures
use CXC::Data::Visitor 'visit', 'RESULT_CONTINUE';

my %struct = (
    fruit => {
        berry  => 'purple',
        apples => [ 'fuji', 'macoun' ],
    } );

visit(
    \%struct,
    sub ( $kydx, $vref, @ ) {
        $vref->$* = 'blue' if $kydx eq 'berry';
        return RESULT_CONTINUE;
    } );

say $struct{fruit}{berry};    # 'blue'
CXC::Data::Visitor::visit performs a depth-first traversal of a data structure, invoking a provided subroutine on elements in the structure.
Here's a partial list of features:
The type of element passed to the callback (containers, terminal elements) can be selected.
The ordering of traversal at any depth is customizable.
The callback can modify the traversal process.
The complete path from the structure to an element (both the ancestor containers and the keys and indexes required to traverse the path) is available to the callback.
Cycles are detected upon traversing a container a second time in a depth-first search, and the resultant action is customizable.
Objects are treated as terminal elements and are not traversed.
Containers that can be reached multiple times without cycling are visited once per parent.
"visit" has the following signature:
( $completed, $context, $metadata ) = visit( $struct, $callback, %options )
The two mandatory arguments are $struct, a reference to either a hash or an array, and $callback, a reference to a subroutine.
"visit" returns the following:
$completed is true if all elements were visited, and false if $callback requested a premature return.
$context is the variable of the same name passed to $callback; see the "context" option.
$metadata is the collected metadata. See "Metadata".
"visit" invokes $callback on selected elements of $struct (see "Element Filters"). $callback is invoked as
$directive = $callback->( $kydx, $vref, $context, \%metadata );
The returned value, $directive, informs "visit" how it should proceed. Its values are described in "Traversal Directives".
The arguments passed to $callback are:
$kydx is the location (key or index) of the element in its parent container (hash or array).
$vref is a reference to the element. Use $vref->$* to extract or modify the element's value. Do not cache this value; the full path to the element is provided via the "$metadata" argument.
$context is a reference to data reserved for use by $callback. See the "context" option.
$metadata is a reference to a hash of state information used to keep track of progress. While primarily of use to "visit", some entries may be of interest to $callback. See "Metadata".
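As an illustration of the callback arguments and return values described above, the following sketch collects terminal values into the context. It assumes only what is documented here: that the context defaults to a fresh hash and that "visit" returns the completion flag first and the context second. The "leaves" key is a name invented for this example.

```perl
use v5.36;
use CXC::Data::Visitor 'visit', 'RESULT_CONTINUE';

my %struct = (
    fruit => {
        berry  => 'purple',
        apples => [ 'fuji', 'macoun' ],
    },
);

# No "context" option is given, so visit() supplies a fresh hash.
my ( $completed, $context ) = visit(
    \%struct,
    sub ( $kydx, $vref, $ctx, $metadata ) {
        # Record terminal (non-reference) values only.
        # "leaves" is a hypothetical key chosen for this example.
        push $ctx->{leaves}->@*, $vref->$* unless ref( $vref->$* );
        return RESULT_CONTINUE;
    },
);

say $completed ? 'complete' : 'interrupted';
say join ', ', $context->{leaves}->@*;
```

With the default traversal order, the leaves should be collected as fuji, macoun, purple.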
$struct is traversed in depth-first order. At each level, elements are traversed in sorted order: alphabetical for hash keys and numerical for array indices. The defaults may be changed with the "key_sort" and "idx_sort" options.
The default traversal order for the structure in the "SYNOPSIS" is
+---------------------------+-------------------------+
| Path                      | Value                   |
+---------------------------+-------------------------+
| $struct{fruit}            | \$struct{fruit}         |
| $struct{fruit}{apples}    | \$struct{fruit}{apples} |
| $struct{fruit}{apples}[0] | fuji                    |
| $struct{fruit}{apples}[1] | macoun                  |
| $struct{fruit}{berry}     | purple                  |
+---------------------------+-------------------------+
Containers that can be reached multiple times without cycling, e.g.
%hash = ( a => { b => 1 }, );
$hash{c} = $hash{a};
are visited once per parent, e.g.
{a}, {a}{b}
{c}, {c}{b}
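A minimal sketch of this behaviour, recording only the key passed to the callback at each step (the @seen array is a name chosen for this example):

```perl
use v5.36;
use CXC::Data::Visitor 'visit', 'RESULT_CONTINUE';

my %hash = ( a => { b => 1 } );
$hash{c} = $hash{a};    # the same inner hash, reached via a second parent

my @seen;
visit(
    \%hash,
    sub ( $kydx, $vref, @ ) {
        push @seen, $kydx;
        return RESULT_CONTINUE;
    },
);

say "@seen";
```

The shared inner hash is not a cycle, so its key b should appear twice, once under a and once under c.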
"$callback" must return a constant (see "EXPORTS") indicating what "visit" should do next.
Return RESULT_CONTINUE to continue traversing as normal.
Return RESULT_RETURN to return immediately to the caller of "visit".
Return RESULT_STOP_DESCENT to skip the contents of the current element if it is a hash or array.
For non-container elements, this is equivalent to "RESULT_CONTINUE".
For example, if RESULT_STOP_DESCENT is returned when $struct{fruit}{apples} is traversed, the traversal would look like this:
+------------------------+-------------------------+
| Path                   | Value                   |
+------------------------+-------------------------+
| $struct{fruit}         | \$struct{fruit}         |
| $struct{fruit}{apples} | \$struct{fruit}{apples} |
| $struct{fruit}{berry}  | purple                  |
+------------------------+-------------------------+
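The traversal in the table above could be produced by a callback along these lines. This is a sketch using the SYNOPSIS structure; the @seen array is a name chosen for this example.

```perl
use v5.36;
use CXC::Data::Visitor 'visit', 'RESULT_CONTINUE', 'RESULT_STOP_DESCENT';

my %struct = (
    fruit => {
        berry  => 'purple',
        apples => [ 'fuji', 'macoun' ],
    },
);

my @seen;
visit(
    \%struct,
    sub ( $kydx, $vref, @ ) {
        push @seen, $kydx;

        # Visit the "apples" array itself, but not its elements.
        return RESULT_STOP_DESCENT if $kydx eq 'apples';
        return RESULT_CONTINUE;
    },
);

say "@seen";
```

The callback should see fruit, apples, and berry, and never the array indices.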
Return RESULT_REVISIT_CONTAINER to stop processing the current container and start traversing it again.
For example, if RESULT_REVISIT_CONTAINER is returned the first time $struct{fruit}{apples}[1] is traversed, the traversal would look like this:
+---------------------------+-------------------------+
| Path                      | Value                   |
+---------------------------+-------------------------+
| $struct{fruit}            | \$struct{fruit}         |
| $struct{fruit}{apples}    | \$struct{fruit}{apples} |
| $struct{fruit}{apples}[0] | fuji                    |
| $struct{fruit}{apples}[1] | macoun                  |
| $struct{fruit}{apples}[0] | fuji                    |
| $struct{fruit}{apples}[1] | macoun                  |
| $struct{fruit}{berry}     | purple                  |
+---------------------------+-------------------------+
To avoid inadvertent infinite loops, the number of revisits during a traversal of a container is limited (see "revisit_limit"). Containers with multiple parents are traversed once per parent; the limit is reset for each traversal.
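One plausible use of RESULT_REVISIT_CONTAINER is to restart a container after the callback has changed it. The sketch below appends an element to the "apples" array the first time 'macoun' is seen, then asks for the array to be traversed again. The $extended flag and the added value 'gala' are inventions for this example, and the expected order assumes the re-scan picks up the new element.

```perl
use v5.36;
use CXC::Data::Visitor 'visit', 'RESULT_CONTINUE', 'RESULT_REVISIT_CONTAINER';

my %struct = (
    fruit => {
        berry  => 'purple',
        apples => [ 'fuji', 'macoun' ],
    },
);

my @seen;
my $extended = 0;    # guards against requesting the revisit more than once

visit(
    \%struct,
    sub ( $kydx, $vref, @ ) {
        push @seen, $kydx;

        if ( !$extended && !ref( $vref->$* ) && $vref->$* eq 'macoun' ) {
            $extended = 1;
            push $struct{fruit}{apples}->@*, 'gala';
            return RESULT_REVISIT_CONTAINER;
        }

        return RESULT_CONTINUE;
    },
);

say "@seen";
```

Under those assumptions the keys and indices seen should be fruit, apples, 0, 1, 0, 1, 2, berry. Guarding the revisit with a flag keeps the traversal well within the revisit limit.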
If RESULT_REVISIT_ELEMENT is returned and the element is a container, it will be revisited after its contents are traversed.
During the initial visit
$metadata->{pass} & PASS_VISIT_ELEMENT
will be true. During the followup visit
$metadata->{pass} & PASS_REVISIT_ELEMENT
will be true. During this visit, "$callback" must return either RESULT_RETURN or RESULT_CONTINUE, otherwise an exception will be thrown.
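A sketch of how the two passes might be told apart, logging when each container is first visited and when it is revisited. The log format and the @log array are inventions for this example.

```perl
use v5.36;
use CXC::Data::Visitor 'visit', 'RESULT_CONTINUE', 'RESULT_REVISIT_ELEMENT',
  'PASS_REVISIT_ELEMENT';

my %struct = (
    fruit => {
        berry  => 'purple',
        apples => [ 'fuji', 'macoun' ],
    },
);

my @log;
visit(
    \%struct,
    sub ( $kydx, $vref, $context, $metadata ) {

        # Follow-up visit: only RESULT_RETURN or RESULT_CONTINUE is allowed.
        if ( $metadata->{pass} & PASS_REVISIT_ELEMENT ) {
            push @log, "leave $kydx";
            return RESULT_CONTINUE;
        }

        push @log, "visit $kydx";

        # Ask for containers to be revisited once their contents are done.
        return RESULT_REVISIT_ELEMENT if ref( $vref->$* );
        return RESULT_CONTINUE;
    },
);

say for @log;
```

Each container's "leave" entry should follow the entries for its contents, which makes this pattern suitable for post-order work such as summarizing a subtree.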
$callback is passed a hash of state information ($metadata) kept by CXC::Data::Visitor::visit, some of which may be of interest to the callback.
$metadata has the following entries:
A reference to the hash or array which contains the element being visited.
An array which contains the path (keys and indices) used to arrive at the current element from $struct.
An array containing references to the ancestor containers of the current element.
A constant indicating the current visit pass through an element. See "RESULT_REVISIT_ELEMENT".
visit may be passed the following options:
Arbitrary data to be passed to "$callback" via the $context argument. Use it for whatever you'd like. If not specified, a hash will be created.
How cycles within $struct should be handled. See "Cycles".
Filter out elements passed to $callback. See "Element Filtering".
The "key_sort" option sets the order of keys when traversing hashes. If true (the default), the order is that returned by Perl's sort routine. If false, it is the order returned by Perl's keys routine.
If a coderef, it is used to sort the keys. It is called as
\@sorted_keys = $coderef->( \@unsorted_keys );
By default array elements are traversed in order of their ascending index. Use "idx_sort" to specify a subroutine which returns them in an alternative order. It is called as
\@indices = $coderef->( $n );
where $n is the number of elements in the array.
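The calling conventions for the two sort options can be sketched as follows. The example reverses both orders; the sorters take and return array references as described above, and the @seen array is a name chosen for this example.

```perl
use v5.36;
use CXC::Data::Visitor 'visit', 'RESULT_CONTINUE';

my %struct = (
    fruit => {
        berry  => 'purple',
        apples => [ 'fuji', 'macoun' ],
    },
);

my @seen;
visit(
    \%struct,
    sub ( $kydx, $vref, @ ) {
        push @seen, $kydx;
        return RESULT_CONTINUE;
    },

    # Receives an arrayref of keys; returns an arrayref in descending order.
    key_sort => sub ( $keys ) { [ sort { $b cmp $a } $keys->@* ] },

    # Receives the element count; returns an arrayref of indices, last first.
    idx_sort => sub ( $n ) { [ reverse 0 .. $n - 1 ] },
);

say "@seen";
```

With both orders reversed, berry should be visited before apples, and index 1 before index 0.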
DEPRECATED
An optional coderef which implements a caller specific sort order. It is passed two keys as arguments. It should return -1, 0, or 1 indicating that the sort order of the first argument is less than, equal to, or greater than that of the second argument.
If "$callback" returns RESULT_REVISIT_CONTAINER, the element's parent container is re-scanned for its elements and revisited. To avoid an inadvertent infinite loop, an exception is thrown if the parent container is revisited more than this number of times. It defaults to 10; set it to 0 to indicate no limit.
The parts of the structure that will trigger a callback. See "EXPORTS" to import the constants.
VISIT_CONTAINER invokes "$callback" on containers (either hashes or arrays). For example, the elements in the following structure
$struct = { a => { b => 1, c => [ 2, 3 ] } }
passed to "$callback" are:
a => {...} # $struct->{a}
c => [...] # $struct->{a}{c}
Only visit containers of the given type.
VISIT_LEAF invokes "$callback" on terminal (leaf) elements. For example, the elements in the following structure
$struct = { a => { b => 1, c => [ 2, 3 ] } }
passed to "$callback" are:
b => 1 # $struct->{a}{b}
0 => 2 # $struct->{a}{c}[0]
1 => 3 # $struct->{a}{c}[1]
VISIT_ALL invokes "$callback" on all elements. This is the default.
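A sketch of restricting callbacks to leaf elements. It assumes the filter is passed through an option named "visit"; that option name is not stated in this text, so treat it as an assumption and check it against the installed version.

```perl
use v5.36;
use CXC::Data::Visitor 'visit', 'RESULT_CONTINUE', 'VISIT_LEAF';

my $struct = { a => { b => 1, c => [ 2, 3 ] } };

my @leaves;
visit(
    $struct,
    sub ( $kydx, $vref, @ ) {
        push @leaves, "$kydx=" . $vref->$*;
        return RESULT_CONTINUE;
    },
    visit => VISIT_LEAF,    # option name "visit" is an assumption
);

say "@leaves";
```

If the assumption holds, the callback should be invoked only for b, 0, and 1, matching the leaf list above.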
CYCLE_DIE throws an exception (the default).
CYCLE_CONTINUE pretends the container has not been seen before. This will cause stack exhaustion if $callback does not handle it.
CYCLE_TRUNCATE truncates the traversal before entering the cycle a second time.
A $coderef may examine the situation and request a particular resolution. $coderef is called as
$coderef->( $container, $context, $metadata );
where $container is the hash or array which has already been traversed. See below for "$context" and "$metadata".
$coderef should return one of CYCLE_DIE, CYCLE_CONTINUE, or CYCLE_TRUNCATE, indicating what should be done.
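A sketch of a cycle handler that counts cycles and truncates them. It assumes the handler is passed through an option named "cycle" and that it receives the same context as the callback; neither is stated explicitly in this text. The "cycles" key is a name invented for this example.

```perl
use v5.36;
use CXC::Data::Visitor 'visit', 'RESULT_CONTINUE', 'CYCLE_TRUNCATE';

my %tree = ( child => { name => 'leaf' } );
$tree{child}{self} = $tree{child};    # child -> self -> child is a cycle

my ( $completed, $context ) = visit(
    \%tree,
    sub ( $kydx, $vref, @ ) { return RESULT_CONTINUE },

    # Option name "cycle" is an assumption.
    cycle => sub ( $container, $ctx, $metadata ) {
        $ctx->{cycles}++;    # hypothetical counter kept in the context
        return CYCLE_TRUNCATE;
    },
);

say 'cycles seen: ', $context->{cycles} // 0;
```

Without a handler, the default CYCLE_DIE behaviour would throw an exception on this structure instead of returning.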
This module uses Exporter::Tiny, which provides enhanced import utilities.
The following symbols may be exported:
visit
VISIT_CONTAINER VISIT_LEAF VISIT_ALL
CYCLE_DIE CYCLE_CONTINUE CYCLE_TRUNCATE
RESULT_RETURN RESULT_CONTINUE
RESULT_REVISIT_CONTAINER RESULT_REVISIT_ELEMENT
RESULT_STOP_DESCENT
PASS_VISIT_ELEMENT PASS_REVISIT_ELEMENT
The available tags and their respective imported symbols are:
Import all symbols.
RESULT_RETURN RESULT_CONTINUE
RESULT_REVISIT_CONTAINER RESULT_REVISIT_ELEMENT
RESULT_STOP_DESCENT
CYCLE_DIE CYCLE_CONTINUE CYCLE_TRUNCATE
VISIT_CONTAINER VISIT_LEAF VISIT_ALL
PASS_VISIT_ELEMENT PASS_REVISIT_ELEMENT
Import tags results, cycles, visits.
Please report any bugs or feature requests to bug-cxc-data-visitor@rt.cpan.org or through the web interface at: https://rt.cpan.org/Public/Dist/Display.html?Name=CXC-Data-Visitor
Source is available at
https://gitlab.com/djerius/cxc-data-visitor
and may be cloned from
https://gitlab.com/djerius/cxc-data-visitor.git
Diab Jerius <djerius@cpan.org>
This software is Copyright (c) 2024 by Smithsonian Astrophysical Observatory.
This is free software, licensed under:
The GNU General Public License, Version 3, June 2007