whisker.random

whisker/random.py


Copyright © 2011-2020 Rudolf Cardinal (rudolf@pobox.com).

This file is part of the Whisker Python client library.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


Randomization functions that may be used by Whisker tasks.

class whisker.random.ShuffleLayerMethod(flat: bool = False, layer_key: int | None = None, layer_attr: str | None = None, layer_func: Callable[[Any], Any] | None = None, shuffle_func: Callable[[Sequence[Any]], List[int]] | None = None)

Class representing instructions to layered_shuffle() (q.v.).

Parameters:
  • flat – take data as x[index]?

  • layer_key – take data as x[index][layer_key]?

  • layer_attr – take data as getattr(x[index], layer_attr)?

  • layer_func – take data as layer_func(x[index])?

  • shuffle_func – function (N.B. may be a lambda function with parameters attached) that takes a list of objects and returns a list of INDEXES, suitably shuffled.

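Typical values of shuffle_func, as used in the layered_shuffle() examples, include:

  • None – no shuffling at this layer

  • random_shuffle_indexes

  • block_shuffle_indexes_by_value

  • dwor_shuffle_indexes

  • reverse_sort_indexes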

get_indexes_for_value(x: List[Any], value: Any) → List[int]

Returns a list of indexes of x where its value (as defined by this layer) is value.

get_unique_values(x: List[Any]) → List[Any]

Returns all the unique values of x for this layer.

get_values(x: List[Any]) → List[Any]

Returns all the values of interest from x for this layer.
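
As a rough illustration of the four ways a layer can read its value from an item, the sketch below uses a hypothetical helper, layer_value(), which is not part of whisker.random. It only mirrors the parameter descriptions above; the order in which the options are checked is an assumption.

```python
from collections import namedtuple
from typing import Any, Callable, Optional


def layer_value(item: Any,
                flat: bool = False,
                layer_key: Optional[int] = None,
                layer_attr: Optional[str] = None,
                layer_func: Optional[Callable[[Any], Any]] = None) -> Any:
    """Hypothetical helper: pick the value of interest from one item."""
    if flat:
        return item  # the item itself is the value
    if layer_key is not None:
        return item[layer_key]  # index into the item
    if layer_attr is not None:
        return getattr(item, layer_attr)  # read a named attribute
    if layer_func is not None:
        return layer_func(item)  # compute the value from the item
    raise ValueError("No extraction method specified")


Trio = namedtuple("Trio", ["upper", "lower", "digit"])
item = Trio("A", "x", "1")
print(layer_value(item, flat=True))            # Trio(upper='A', lower='x', digit='1')
print(layer_value(item, layer_key=1))          # x
print(layer_value(item, layer_attr="digit"))   # 1
print(layer_value(item, layer_func=lambda t: t.upper + t.lower))  # Ax
```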

whisker.random.block_shuffle_by_attr(x: List[Any], attrorder: List[str], start: int | None = None, end: int | None = None) → None

DEPRECATED: layered_shuffle() is more powerful.

Exactly as for block_shuffle_by_item(), but by item attribute rather than item index number.

For example:

from collections import namedtuple
import itertools
from whisker.random import block_shuffle_by_attr

p = list(itertools.product("ABC", "xyz", "123"))
Trio = namedtuple("Trio", ["upper", "lower", "digit"])
q = [Trio(*x) for x in p]
block_shuffle_by_attr(q, ['upper', 'lower', 'digit'])

q started off as:

[
    Trio(upper='A', lower='x', digit='1'),
    Trio(upper='A', lower='x', digit='2'),
    Trio(upper='A', lower='x', digit='3'),
    Trio(upper='A', lower='y', digit='1'),
    Trio(upper='A', lower='y', digit='2'),
    Trio(upper='A', lower='y', digit='3'),
    Trio(upper='A', lower='z', digit='1'),
    Trio(upper='A', lower='z', digit='2'),
    Trio(upper='A', lower='z', digit='3'),
    Trio(upper='B', lower='x', digit='1'),
    Trio(upper='B', lower='x', digit='2'),
    Trio(upper='B', lower='x', digit='3'),
    Trio(upper='B', lower='y', digit='1'),
    Trio(upper='B', lower='y', digit='2'),
    Trio(upper='B', lower='y', digit='3'),
    Trio(upper='B', lower='z', digit='1'),
    Trio(upper='B', lower='z', digit='2'),
    Trio(upper='B', lower='z', digit='3'),
    Trio(upper='C', lower='x', digit='1'),
    Trio(upper='C', lower='x', digit='2'),
    Trio(upper='C', lower='x', digit='3'),
    Trio(upper='C', lower='y', digit='1'),
    Trio(upper='C', lower='y', digit='2'),
    Trio(upper='C', lower='y', digit='3'),
    Trio(upper='C', lower='z', digit='1'),
    Trio(upper='C', lower='z', digit='2'),
    Trio(upper='C', lower='z', digit='3')
]

but after the shuffle q might now be:

[
    Trio(upper='B', lower='z', digit='1'),
    Trio(upper='B', lower='z', digit='3'),
    Trio(upper='B', lower='z', digit='2'),
    Trio(upper='B', lower='x', digit='1'),
    Trio(upper='B', lower='x', digit='3'),
    Trio(upper='B', lower='x', digit='2'),
    Trio(upper='B', lower='y', digit='3'),
    Trio(upper='B', lower='y', digit='2'),
    Trio(upper='B', lower='y', digit='1'),
    Trio(upper='A', lower='z', digit='2'),
    Trio(upper='A', lower='z', digit='1'),
    Trio(upper='A', lower='z', digit='3'),
    Trio(upper='A', lower='x', digit='1'),
    Trio(upper='A', lower='x', digit='2'),
    Trio(upper='A', lower='x', digit='3'),
    Trio(upper='A', lower='y', digit='3'),
    Trio(upper='A', lower='y', digit='1'),
    Trio(upper='A', lower='y', digit='2'),
    Trio(upper='C', lower='x', digit='2'),
    Trio(upper='C', lower='x', digit='3'),
    Trio(upper='C', lower='x', digit='1'),
    Trio(upper='C', lower='y', digit='2'),
    Trio(upper='C', lower='y', digit='1'),
    Trio(upper='C', lower='y', digit='3'),
    Trio(upper='C', lower='z', digit='1'),
    Trio(upper='C', lower='z', digit='2'),
    Trio(upper='C', lower='z', digit='3')
]

You can see that the A/B/C groups have been shuffled as blocks. Then, within B, the x/y/z groups have been shuffled (and so on for A and C). Then, within B.z, the 1/2/3 values have been shuffled (and so on).
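
The nesting described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: nested_block_shuffle() is a hypothetical name, it returns a new list rather than shuffling in place, and it assumes the attribute values are hashable.

```python
import itertools
import random
from collections import namedtuple
from typing import Any, Dict, List


def nested_block_shuffle(items: List[Any], attrorder: List[str]) -> List[Any]:
    """Return a new list, shuffled block-wise by each attribute in turn."""
    if not attrorder:
        return list(items)
    attr, rest = attrorder[0], attrorder[1:]
    # Group items by the current attribute, keeping first-seen order.
    blocks: Dict[Any, List[Any]] = {}
    for item in items:
        blocks.setdefault(getattr(item, attr), []).append(item)
    block_list = list(blocks.values())
    random.shuffle(block_list)  # shuffle the blocks themselves
    result: List[Any] = []
    for block in block_list:
        # ... then shuffle within each block by the remaining attributes
        result.extend(nested_block_shuffle(block, rest))
    return result


Trio = namedtuple("Trio", ["upper", "lower", "digit"])
q = [Trio(*t) for t in itertools.product("ABC", "xyz", "123")]
shuffled = nested_block_shuffle(q, ["upper", "lower", "digit"])
for trio in shuffled[:9]:  # the first "upper" block
    print(trio)
```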

whisker.random.block_shuffle_by_item(x: List[Any], indexorder: List[int], start: int | None = None, end: int | None = None) → None

DEPRECATED: layered_shuffle() is more powerful.

Shuffles the list x[start:end] hierarchically, in place.

Parameters:
  • x – list to shuffle

  • indexorder – a list of indexes into each item of x. The first index varies slowest; the last varies fastest.

  • start – start index of x

  • end – end index of x

For example:

import itertools
from whisker.random import block_shuffle_by_item

p = list(itertools.product("ABC", "xyz", "123"))

p is now a list of tuples looking like ('A', 'x', '1').

block_shuffle_by_item(p, [0, 1, 2])

p might now look like:

C z 1 } all values of "123" appear  } first "xyz" block
C z 3 } once, but randomized        }
C z 2 }                             }
                                    }
C y 2 } next "123" block            }
C y 1 }                             }
C y 3 }                             }
                                    }
C x 3                               }
C x 2                               }
C x 1                               }

A y 3                               } second "xyz" block
...                                 } ...

A clearer explanation is in block_shuffle_by_attr().

whisker.random.block_shuffle_indexes_by_value(x: List[Any]) → List[int]

Returns a list of indexes of x, block-shuffled by value.

That is: we aggregate items into blocks, defined by value, and shuffle those blocks, returning the corresponding indexes of the original list.
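
A minimal sketch of that idea follows. The function name block_shuffle_indexes_sketch() is hypothetical, and keeping each block's indexes in their original relative order is an assumption rather than documented behaviour.

```python
import random
from typing import Any, List


def block_shuffle_indexes_sketch(x: List[Any]) -> List[int]:
    """Group indexes by value, shuffle the groups, and concatenate them."""
    values: List[Any] = []        # unique values, in first-seen order
    groups: List[List[int]] = []  # indexes of x for each unique value
    for i, v in enumerate(x):
        if v in values:
            groups[values.index(v)].append(i)
        else:
            values.append(v)
            groups.append([i])
    random.shuffle(groups)  # shuffle whole blocks, not individual items
    return [i for group in groups for i in group]


x = ["a", "b", "c", "a", "b", "c"]
indexes = block_shuffle_indexes_sketch(x)
print(indexes)                  # e.g. [1, 4, 0, 3, 2, 5]
print([x[i] for i in indexes])  # e.g. ['b', 'b', 'a', 'a', 'c', 'c']
```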

whisker.random.dwor_shuffle_indexes(x: List[Any], multiplier: int = 1) → List[int]

Returns a list of indexes of x, DWOR-shuffled by value.

This is a bit tricky as we don’t have a guarantee of equal numbers. It does sensible things in those circumstances.

whisker.random.gen_dwor(values: Iterable[Any], multiplier: int = 1) → Generator[Any, None, None]

Generates values using a draw-without-replacement (DWOR) system.

Parameters:
  • values – values to generate

  • multiplier – DWOR multiplier; see below.

Yields:

successive values

Here’s how it works.

  • Suppose values == [A, B, C].

  • We’ll call n the number of values (here, 3), and k the “multiplier” parameter.

  • If you iterate through gen_dwor(values, multiplier=1), you will get a sequence that might look like this (with spaces added for clarity):

    CAB ABC BCA BAC BAC ACB CBA ...
    

    That is, individual values are drawn randomly from a “hat” of size n = 3, containing one of each thing from values. When the hat is empty, it is refilled with n more.

  • If you iterate through gen_dwor(values, multiplier=2), however, you might get this:

    AACBBC CABBAC BAACCB ...
    

    The computer has put k copies of each value in the hat, and then draws one each time at random (so the hat starts with nk values in it). When the hat is exhausted, it re-populates.

The general idea is to provide randomness, but randomness that is constrained to prevent unlikely but awkward sequences like

AAAAAAAAAAAAAAAA ... unlikely but possible with full randomness!

yet also have the option to avoid predictability. With k = 1, a clever subject could infer exactly what’s coming up on every nth trial. So a low value of k brings very few “runs” but some predictability; as k approaches infinity, it’s equivalent to full randomness; some reasonably low value of k in between may be a useful experimental sweet spot.
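
The hat mechanism can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: dwor_sketch() is a hypothetical name, and shuffling a freshly filled hat and emitting it in order is used here as an equivalent of drawing one item at a time at random.

```python
import itertools
import random
from typing import Any, Generator, Iterable


def dwor_sketch(values: Iterable[Any],
                multiplier: int = 1) -> Generator[Any, None, None]:
    """Yield values forever, drawing without replacement from a refilled hat."""
    values = list(values)
    if not values or multiplier < 1:
        return  # nothing to draw (guard added for this sketch)
    while True:
        hat = values * multiplier  # k copies of each of the n values
        random.shuffle(hat)        # random draws until empty == one shuffle
        yield from hat             # when exhausted, loop round and refill


draws = list(itertools.islice(dwor_sketch("ABC", multiplier=2), 12))
print("".join(draws[:6]), "".join(draws[6:]))  # e.g. AACBBC CABBAC
```

Every run of nk consecutive draws that starts on a hat boundary contains each value exactly k times, which is what limits long runs of one value.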

See also, for example:

whisker.random.get_dwor_list(values: Iterable[Any], length: int, multiplier: int = 1) → List[Any]

Makes a fixed-length list via gen_dwor().

Parameters:
  • values – values to pick from

  • length – list length

  • multiplier – DWOR multiplier

Returns:

list of length length

Example:

from whisker.random import get_dwor_list
values = ["a", "b", "c"]
print(get_dwor_list(values, length=24, multiplier=1))
print(get_dwor_list(values, length=24, multiplier=2))
print(get_dwor_list(values, length=24, multiplier=3))

whisker.random.get_indexes_for_value(x: List[Any], value: Any) → List[int]

Returns a list of indexes of x where its value is value.

whisker.random.get_unique_values(iterable: Iterable[Any]) → List[Any]

Gets the unique values of its input. See https://stackoverflow.com/questions/12897374/get-unique-values-from-a-list-in-python.

(We don’t use list(set(x)), because if the elements of x are themselves lists (perfectly common!), that gives TypeError: unhashable type: 'list'.)
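
One way to collect unique values without hashing is an equality-based scan, sketched below. The name unique_values_sketch() is hypothetical, and whether the library preserves first-seen order in the same way is an assumption.

```python
from typing import Any, Iterable, List


def unique_values_sketch(iterable: Iterable[Any]) -> List[Any]:
    """Collect unique items by equality, so unhashable items are fine."""
    seen: List[Any] = []
    for item in iterable:
        if item not in seen:  # uses ==, not hash()
            seen.append(item)
    return seen


print(unique_values_sketch([[1, 2], [3], [1, 2], [3], [4]]))  # [[1, 2], [3], [4]]
```

The scan is quadratic in the number of unique values, which is usually acceptable for trial lists of modest size.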

whisker.random.last_index_of(x: List[Any], value: Any) → int

Gets the index of the last occurrence of value in the list x.
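
A possible one-line equivalent is shown below as a sketch; last_index_of_sketch() is a hypothetical name, and raising ValueError when the value is absent is a property of this sketch, not a documented behaviour of the library function.

```python
from typing import Any, List


def last_index_of_sketch(x: List[Any], value: Any) -> int:
    """Find the first match in the reversed list and convert the index back."""
    return len(x) - 1 - x[::-1].index(value)


print(last_index_of_sketch(["a", "b", "a", "c"], "a"))  # 2
```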

whisker.random.layered_shuffle(x: List[Any], layers: List[ShuffleLayerMethod]) → None

The most powerful hierarchical shuffle function in this module.

Shuffles x in place in a layered way as specified by the sequence of methods.

In more detail:

  • for each layer, it shuffles values of x as defined by the ShuffleLayerMethod (for example: “shuffle x in blocks based on the value of x.someattr”, or “shuffle x randomly”)

  • it then proceeds to deeper layers within sub-lists defined by each unique value from the previous layer.

Parameters:
  • x – sequence (e.g. list) to shuffle

  • layers – list of ShuffleLayerMethod instructions
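
The layer-by-layer recursion described above might look roughly like the following. This is a simplified sketch, not the library's implementation: each layer is represented here as a hypothetical (value getter, shuffle_func) pair instead of a ShuffleLayerMethod, and all function names are invented for the illustration.

```python
import itertools
import random
from collections import namedtuple
from typing import Any, Callable, List, Optional, Sequence, Tuple

IndexFunc = Callable[[Sequence[Any]], List[int]]
Layer = Tuple[Callable[[Any], Any], Optional[IndexFunc]]


def random_indexes(values: Sequence[Any]) -> List[int]:
    indexes = list(range(len(values)))
    random.shuffle(indexes)
    return indexes


def reverse_sorted_indexes(values: Sequence[Any]) -> List[int]:
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


def layered_shuffle_sketch(x: List[Any], layers: List[Layer],
                           positions: Optional[List[int]] = None) -> None:
    """Shuffle x in place, one layer at a time (simplified illustration)."""
    if not layers:
        return
    if positions is None:
        positions = list(range(len(x)))
    get_value, shuffle_func = layers[0]
    items = [x[i] for i in positions]
    values = [get_value(item) for item in items]
    if shuffle_func is not None:
        order = shuffle_func(values)  # new order, as indexes into values
        items = [items[j] for j in order]
        values = [values[j] for j in order]
        for pos, item in zip(positions, items):
            x[pos] = item  # write the reordered items back
    # Recurse into the sub-list defined by each unique value of this layer.
    unique: List[Any] = []
    for v in values:
        if v not in unique:
            unique.append(v)
    for u in unique:
        sub = [pos for pos, v in zip(positions, values) if v == u]
        layered_shuffle_sketch(x, layers[1:], sub)


Trio = namedtuple("Trio", ["upper", "lower", "digit"])
q = [Trio(*t) for t in itertools.product("ABC", "xyz", "123")]
layered_shuffle_sketch(q, [
    (lambda t: t.upper, None),                    # leave A/B/C order alone
    (lambda t: t.lower, reverse_sorted_indexes),  # z, y, x within each letter
    (lambda t: t.digit, random_indexes),          # digits in random order
])
print("\n".join(str(t) for t in q[:9]))
```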

Examples:

from collections import namedtuple
import itertools
import logging
import random
from whisker.random import *
logging.basicConfig(level=logging.DEBUG)

startlist = ["a", "b", "c", "d", "a", "b", "c", "d", "a", "b", "c", "d"]
x1 = startlist[:]
x2 = startlist[:]
x3 = startlist[:]
x4 = startlist[:]

do_nothing_method = ShuffleLayerMethod(flat=True, shuffle_func=None)
do_nothing_method.get_unique_values(x1)
do_nothing_method.get_indexes_for_value(x1, "b")

layered_shuffle(x1, [do_nothing_method])
print(x1)

flat_randomshuffle_method = ShuffleLayerMethod(
    flat=True, shuffle_func=random_shuffle_indexes)
flat_randomshuffle_method.get_unique_values(x1)
flat_randomshuffle_method.get_indexes_for_value(x1, "b")
layered_shuffle(x1, [flat_randomshuffle_method])
print(x1)

flat_blockshuffle_method = ShuffleLayerMethod(
    flat=True, shuffle_func=block_shuffle_indexes_by_value)
layered_shuffle(x2, [flat_blockshuffle_method])
print(x2)

flat_dworshuffle_method = ShuffleLayerMethod(
    flat=True, shuffle_func=dwor_shuffle_indexes)
layered_shuffle(x3, [flat_dworshuffle_method])
print(x3)

flat_dworshuffle2_method = ShuffleLayerMethod(
    flat=True, shuffle_func=lambda x: dwor_shuffle_indexes(x, multiplier=2))
layered_shuffle(x4, [flat_dworshuffle2_method])
print(x4)

p = list(itertools.product("ABC", "xyz", "123"))
Trio = namedtuple("Trio", ["upper", "lower", "digit"])
q = [Trio(*x) for x in p]
print("\n".join(str(x) for x in q))

upper_method = ShuffleLayerMethod(
    layer_attr="upper", shuffle_func=block_shuffle_indexes_by_value)
lower_method = ShuffleLayerMethod(
    layer_attr="lower", shuffle_func=reverse_sort_indexes)
digit_method = ShuffleLayerMethod(
    layer_attr="digit", shuffle_func=random_shuffle_indexes)

layered_shuffle(q, [upper_method, lower_method, digit_method])
print("\n".join(str(x) for x in q))

whisker.random.make_dwor_hat(values: Iterable[Any], multiplier: int = 1) → List[Any]

Makes a “hat” to draw values from. See gen_dwor(). Does not modify the starting list; returns a copy.

whisker.random.random_shuffle_indexes(x: List[Any]) → List[int]

Returns a list of indexes of x, randomly shuffled.

whisker.random.reverse_sort_indexes(x: List[Any]) → List[int]

Returns the indexes of x in an order that would reverse-sort x by value.
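
To make the "list of indexes" convention concrete, here is a sketch of two index functions and of how a returned index list is applied. The function names are hypothetical, and the treatment of ties in the reverse sort is an assumption.

```python
import random
from typing import Any, List


def random_indexes_sketch(x: List[Any]) -> List[int]:
    """Indexes 0..len(x)-1 in random order."""
    indexes = list(range(len(x)))
    random.shuffle(indexes)
    return indexes


def reverse_sort_indexes_sketch(x: List[Any]) -> List[int]:
    """Indexes that would put x into descending order of value."""
    return sorted(range(len(x)), key=lambda i: x[i], reverse=True)


x = ["b", "d", "a", "c"]
order = reverse_sort_indexes_sketch(x)
print(order)                  # [1, 3, 0, 2]
print([x[i] for i in order])  # ['d', 'c', 'b', 'a']
```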

whisker.random.shuffle_list_chunks(x: List[Any], chunksize: int) → None

Divides a list into chunks and shuffles the chunks themselves (in place). For example:

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
shuffle_list_chunks(x, 4)

x might now be

[5, 6, 7, 8, 1, 2, 3, 4, 9, 10, 11, 12]
 ^^^^^^^^^^  ^^^^^^^^^^  ^^^^^^^^^^^^^

Uses cardinal_pythonlib.lists.flatten_list() and cardinal_pythonlib.lists.sort_list_by_index_list(). (I say that mainly to test Intersphinx, when it is enabled.)
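
For comparison, the same effect can be sketched with the standard library alone. The name shuffle_chunks_sketch() is hypothetical, and how the library treats a final chunk shorter than chunksize is not stated here; in this sketch a short final chunk is shuffled like any other.

```python
import random
from typing import Any, List


def shuffle_chunks_sketch(x: List[Any], chunksize: int) -> None:
    """Cut x into chunks, shuffle the chunks, and write them back in place."""
    chunks = [x[i:i + chunksize] for i in range(0, len(x), chunksize)]
    random.shuffle(chunks)
    x[:] = [item for chunk in chunks for item in chunk]


x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
shuffle_chunks_sketch(x, 4)
print(x)  # e.g. [5, 6, 7, 8, 1, 2, 3, 4, 9, 10, 11, 12]
```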

whisker.random.shuffle_list_slice(x: List[Any], start: int | None = None, end: int | None = None) → None

Shuffles a segment of a list, x[start:end], in place.

Note that start=None means “from the beginning” and end=None means “to the end”.
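
A sketch of the same operation using slice assignment (shuffle_slice_sketch() is a hypothetical name, not the library function):

```python
import random
from typing import Any, List, Optional


def shuffle_slice_sketch(x: List[Any],
                         start: Optional[int] = None,
                         end: Optional[int] = None) -> None:
    """Shuffle x[start:end] in place, leaving the rest of x untouched."""
    segment = x[start:end]  # None bounds behave as in ordinary slicing
    random.shuffle(segment)
    x[start:end] = segment  # same length, so x keeps its size


x = list(range(10))
shuffle_slice_sketch(x, 2, 6)
print(x)  # e.g. [0, 1, 4, 2, 5, 3, 6, 7, 8, 9]
```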

whisker.random.shuffle_list_subset(x: List[Any], indexes: List[int]) → None

Shuffles some elements of a list (in place). The elements to interchange (shuffle) are specified by indexes.
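
In outline, this amounts to pulling out the chosen elements, shuffling them, and putting them back at the same positions. The sketch below uses the hypothetical name shuffle_subset_sketch() and assumes indexes contains no duplicates.

```python
import random
from typing import Any, List


def shuffle_subset_sketch(x: List[Any], indexes: List[int]) -> None:
    """Shuffle only the elements of x at the given indexes, in place."""
    elements = [x[i] for i in indexes]
    random.shuffle(elements)
    for i, element in zip(indexes, elements):
        x[i] = element


x = list("abcdef")
shuffle_subset_sketch(x, [0, 2, 4])
print(x)  # e.g. ['e', 'b', 'a', 'd', 'c', 'f']
```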

whisker.random.shuffle_list_within_chunks(x: List[Any], chunksize: int) → None

Divides a list into chunks and shuffles WITHIN each chunk (in place). For example:

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
shuffle_list_within_chunks(x, 4)

x might now be:

[4, 1, 3, 2, 7, 5, 6, 8, 9, 12, 11, 10]
 ^^^^^^^^^^  ^^^^^^^^^^  ^^^^^^^^^^^^^
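
A standard-library sketch of the same idea (shuffle_within_chunks_sketch() is a hypothetical name; a final chunk shorter than chunksize is simply shuffled as a shorter chunk here, which is an assumption):

```python
import random
from typing import Any, List


def shuffle_within_chunks_sketch(x: List[Any], chunksize: int) -> None:
    """Shuffle each consecutive chunk of x in place; chunks stay in order."""
    for start in range(0, len(x), chunksize):
        chunk = x[start:start + chunksize]
        random.shuffle(chunk)
        x[start:start + chunksize] = chunk


x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
shuffle_within_chunks_sketch(x, 4)
print(x)  # e.g. [4, 1, 3, 2, 7, 5, 6, 8, 9, 12, 11, 10]
```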

whisker.random.shuffle_where_equal_by_attr(x: List[Any], attrname: str) → None

DEPRECATED: layered_shuffle() is more powerful.

Shuffles a list x, in place, where list members are equal as judged by the attribute attrname.

This is easiest to show by example:

from collections import namedtuple
import itertools
from whisker.random import shuffle_where_equal_by_attr

p = list(itertools.product("ABC", "xyz", "123"))
Trio = namedtuple("Trio", ["upper", "lower", "digit"])
q = [Trio(*x) for x in p]
shuffle_where_equal_by_attr(q, 'digit')

q started off as:

[
    Trio(upper='A', lower='x', digit='1'),
    Trio(upper='A', lower='x', digit='2'),
    Trio(upper='A', lower='x', digit='3'),
    Trio(upper='A', lower='y', digit='1'),
    Trio(upper='A', lower='y', digit='2'),
    Trio(upper='A', lower='y', digit='3'),
    Trio(upper='A', lower='z', digit='1'),
    Trio(upper='A', lower='z', digit='2'),
    Trio(upper='A', lower='z', digit='3'),
    Trio(upper='B', lower='x', digit='1'),
    Trio(upper='B', lower='x', digit='2'),
    Trio(upper='B', lower='x', digit='3'),
    Trio(upper='B', lower='y', digit='1'),
    Trio(upper='B', lower='y', digit='2'),
    Trio(upper='B', lower='y', digit='3'),
    Trio(upper='B', lower='z', digit='1'),
    Trio(upper='B', lower='z', digit='2'),
    Trio(upper='B', lower='z', digit='3'),
    Trio(upper='C', lower='x', digit='1'),
    Trio(upper='C', lower='x', digit='2'),
    Trio(upper='C', lower='x', digit='3'),
    Trio(upper='C', lower='y', digit='1'),
    Trio(upper='C', lower='y', digit='2'),
    Trio(upper='C', lower='y', digit='3'),
    Trio(upper='C', lower='z', digit='1'),
    Trio(upper='C', lower='z', digit='2'),
    Trio(upper='C', lower='z', digit='3')
]

but after the shuffle q might now be:

[
    Trio(upper='A', lower='x', digit='1'),
    Trio(upper='A', lower='y', digit='2'),
    Trio(upper='A', lower='z', digit='3'),
    Trio(upper='B', lower='z', digit='1'),
    Trio(upper='A', lower='z', digit='2'),
    Trio(upper='C', lower='x', digit='3'),
    Trio(upper='B', lower='y', digit='1'),
    Trio(upper='A', lower='x', digit='2'),
    Trio(upper='C', lower='y', digit='3'),
    Trio(upper='A', lower='y', digit='1'),
    Trio(upper='C', lower='y', digit='2'),
    Trio(upper='C', lower='z', digit='3'),
    Trio(upper='C', lower='y', digit='1'),
    Trio(upper='C', lower='z', digit='2'),
    Trio(upper='A', lower='y', digit='3'),
    Trio(upper='B', lower='x', digit='1'),
    Trio(upper='B', lower='z', digit='2'),
    Trio(upper='B', lower='y', digit='3'),
    Trio(upper='C', lower='z', digit='1'),
    Trio(upper='C', lower='x', digit='2'),
    Trio(upper='B', lower='z', digit='3'),
    Trio(upper='C', lower='x', digit='1'),
    Trio(upper='B', lower='x', digit='2'),
    Trio(upper='A', lower='x', digit='3'),
    Trio(upper='A', lower='z', digit='1'),
    Trio(upper='B', lower='y', digit='2'),
    Trio(upper='B', lower='x', digit='3')
]

As you can see, the digit attribute seems to have stayed frozen and everything else has been jumbled. What has actually happened is that all items with digit == 1 have been shuffled among themselves, and similarly for digit == 2 and digit == 3.
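
That behaviour can be sketched as follows. The name shuffle_where_equal_sketch() is hypothetical, and the sketch assumes the attribute values are hashable.

```python
import itertools
import random
from collections import namedtuple
from typing import Any, Dict, List


def shuffle_where_equal_sketch(x: List[Any], attrname: str) -> None:
    """Shuffle items among the positions that share the same attribute value."""
    positions: Dict[Any, List[int]] = {}
    for i, item in enumerate(x):
        positions.setdefault(getattr(item, attrname), []).append(i)
    for indexes in positions.values():
        items = [x[i] for i in indexes]
        random.shuffle(items)
        for i, item in zip(indexes, items):
            x[i] = item  # same positions, different occupants


Trio = namedtuple("Trio", ["upper", "lower", "digit"])
q = [Trio(*t) for t in itertools.product("ABC", "xyz", "123")]
shuffle_where_equal_sketch(q, "digit")
print("".join(t.digit for t in q))  # 123123123123123123123123123
```

The digit printed at each position is unchanged because items only ever swap places with items that carry the same digit.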

whisker.random.sort_indexes(x: List[Any]) → List[int]

Returns the indexes of x in an order that would sort x by value.

See https://stackoverflow.com/questions/7851077/how-to-return-index-of-a-sorted-list