Trait halo2_middleware::multicore::ParallelSliceMut

pub trait ParallelSliceMut<T>
where T: Send,
{
Show 15 methods // Required method fn as_parallel_slice_mut(&mut self) -> &mut [T]; // Provided methods fn par_split_mut<P>(&mut self, separator: P) -> SplitMut<'_, T, P> where P: Fn(&T) -> bool + Sync + Send { ... } fn par_split_inclusive_mut<P>( &mut self, separator: P ) -> SplitInclusiveMut<'_, T, P> where P: Fn(&T) -> bool + Sync + Send { ... } fn par_chunks_mut(&mut self, chunk_size: usize) -> ChunksMut<'_, T> { ... } fn par_chunks_exact_mut( &mut self, chunk_size: usize ) -> ChunksExactMut<'_, T> { ... } fn par_rchunks_mut(&mut self, chunk_size: usize) -> RChunksMut<'_, T> { ... } fn par_rchunks_exact_mut( &mut self, chunk_size: usize ) -> RChunksExactMut<'_, T> { ... } fn par_sort(&mut self) where T: Ord { ... } fn par_sort_by<F>(&mut self, compare: F) where F: Fn(&T, &T) -> Ordering + Sync { ... } fn par_sort_by_key<K, F>(&mut self, f: F) where K: Ord, F: Fn(&T) -> K + Sync { ... } fn par_sort_by_cached_key<K, F>(&mut self, f: F) where F: Fn(&T) -> K + Sync, K: Ord + Send { ... } fn par_sort_unstable(&mut self) where T: Ord { ... } fn par_sort_unstable_by<F>(&mut self, compare: F) where F: Fn(&T, &T) -> Ordering + Sync { ... } fn par_sort_unstable_by_key<K, F>(&mut self, f: F) where K: Ord, F: Fn(&T) -> K + Sync { ... } fn par_chunk_by_mut<F>(&mut self, pred: F) -> ChunkByMut<'_, T, F> where F: Fn(&T, &T) -> bool + Send + Sync { ... }
}
Expand description

Parallel extensions for mutable slices.

Required Methods§

fn as_parallel_slice_mut(&mut self) -> &mut [T]

Returns a plain mutable slice, which is used to implement the rest of the parallel methods.

Provided Methods§

fn par_split_mut<P>(&mut self, separator: P) -> SplitMut<'_, T, P>
where P: Fn(&T) -> bool + Sync + Send,

Returns a parallel iterator over mutable subslices separated by elements that match the separator.

§Examples
use rayon::prelude::*;
let mut array = [1, 2, 3, 0, 2, 4, 8, 0, 3, 6, 9];
array.par_split_mut(|i| *i == 0)
     .for_each(|slice| slice.reverse());
assert_eq!(array, [3, 2, 1, 0, 8, 4, 2, 0, 9, 6, 3]);

fn par_split_inclusive_mut<P>( &mut self, separator: P ) -> SplitInclusiveMut<'_, T, P>
where P: Fn(&T) -> bool + Sync + Send,

Returns a parallel iterator over mutable subslices separated by elements that match the separator, including the matched part as a terminator.

§Examples
use rayon::prelude::*;
let mut array = [1, 2, 3, 0, 2, 4, 8, 0, 3, 6, 9];
array.par_split_inclusive_mut(|i| *i == 0)
     .for_each(|slice| slice.reverse());
assert_eq!(array, [0, 3, 2, 1, 0, 8, 4, 2, 9, 6, 3]);

fn par_chunks_mut(&mut self, chunk_size: usize) -> ChunksMut<'_, T>

Returns a parallel iterator over at most chunk_size elements of self at a time. The chunks are mutable and do not overlap.

If the number of elements in the iterator is not divisible by chunk_size, the last chunk may be shorter than chunk_size. All other chunks will have that exact length.

§Examples
use rayon::prelude::*;
let mut array = [1, 2, 3, 4, 5];
array.par_chunks_mut(2)
     .for_each(|slice| slice.reverse());
assert_eq!(array, [2, 1, 4, 3, 5]);

fn par_chunks_exact_mut(&mut self, chunk_size: usize) -> ChunksExactMut<'_, T>

Returns a parallel iterator over chunk_size elements of self at a time. The chunks are mutable and do not overlap.

If chunk_size does not divide the length of the slice, then the last up to chunk_size-1 elements will be omitted and can be retrieved from the remainder function of the iterator.

§Examples
use rayon::prelude::*;
let mut array = [1, 2, 3, 4, 5];
array.par_chunks_exact_mut(3)
     .for_each(|slice| slice.reverse());
assert_eq!(array, [3, 2, 1, 4, 5]);

fn par_rchunks_mut(&mut self, chunk_size: usize) -> RChunksMut<'_, T>

Returns a parallel iterator over at most chunk_size elements of self at a time, starting at the end. The chunks are mutable and do not overlap.

If the number of elements in the iterator is not divisible by chunk_size, the last chunk may be shorter than chunk_size. All other chunks will have that exact length.

§Examples
use rayon::prelude::*;
let mut array = [1, 2, 3, 4, 5];
array.par_rchunks_mut(2)
     .for_each(|slice| slice.reverse());
assert_eq!(array, [1, 3, 2, 5, 4]);

fn par_rchunks_exact_mut(&mut self, chunk_size: usize) -> RChunksExactMut<'_, T>

Returns a parallel iterator over chunk_size elements of self at a time, starting at the end. The chunks are mutable and do not overlap.

If chunk_size does not divide the length of the slice, then the last up to chunk_size-1 elements will be omitted and can be retrieved from the remainder function of the iterator.

§Examples
use rayon::prelude::*;
let mut array = [1, 2, 3, 4, 5];
array.par_rchunks_exact_mut(3)
     .for_each(|slice| slice.reverse());
assert_eq!(array, [1, 2, 5, 4, 3]);

fn par_sort(&mut self)
where T: Ord,

Sorts the slice in parallel.

This sort is stable (i.e., does not reorder equal elements) and O(n * log(n)) worst-case.

When applicable, unstable sorting is preferred because it is generally faster than stable sorting and it doesn’t allocate auxiliary memory. See par_sort_unstable.

§Current implementation

The current algorithm is an adaptive merge sort inspired by timsort. It is designed to be very fast in cases where the slice is nearly sorted, or consists of two or more sorted sequences concatenated one after another.

Also, it allocates temporary storage the same size as self, but for very short slices a non-allocating insertion sort is used instead.

In order to sort the slice in parallel, the slice is first divided into smaller chunks and all chunks are sorted in parallel. Then, adjacent chunks that together form non-descending or descending runs are concatenated. Finally, the remaining chunks are merged together using parallel subdivision of chunks and parallel merge operation.

§Examples
use rayon::prelude::*;

let mut v = [-5, 4, 1, -3, 2];

v.par_sort();
assert_eq!(v, [-5, -3, 1, 2, 4]);

fn par_sort_by<F>(&mut self, compare: F)
where F: Fn(&T, &T) -> Ordering + Sync,

Sorts the slice in parallel with a comparator function.

This sort is stable (i.e., does not reorder equal elements) and O(n * log(n)) worst-case.

The comparator function must define a total ordering for the elements in the slice. If the ordering is not total, the order of the elements is unspecified. An order is a total order if it is (for all a, b and c):

  • total and antisymmetric: exactly one of a < b, a == b or a > b is true, and
  • transitive, a < b and b < c implies a < c. The same must hold for both == and >.

For example, while f64 doesn’t implement Ord because NaN != NaN, we can use partial_cmp as our sort function when we know the slice doesn’t contain a NaN.

use rayon::prelude::*;

let mut floats = [5f64, 4.0, 1.0, 3.0, 2.0];
floats.par_sort_by(|a, b| a.partial_cmp(b).unwrap());
assert_eq!(floats, [1.0, 2.0, 3.0, 4.0, 5.0]);

When applicable, unstable sorting is preferred because it is generally faster than stable sorting and it doesn’t allocate auxiliary memory. See par_sort_unstable_by.

§Current implementation

The current algorithm is an adaptive merge sort inspired by timsort. It is designed to be very fast in cases where the slice is nearly sorted, or consists of two or more sorted sequences concatenated one after another.

Also, it allocates temporary storage the same size as self, but for very short slices a non-allocating insertion sort is used instead.

In order to sort the slice in parallel, the slice is first divided into smaller chunks and all chunks are sorted in parallel. Then, adjacent chunks that together form non-descending or descending runs are concatenated. Finally, the remaining chunks are merged together using parallel subdivision of chunks and parallel merge operation.

§Examples
use rayon::prelude::*;

let mut v = [5, 4, 1, 3, 2];
v.par_sort_by(|a, b| a.cmp(b));
assert_eq!(v, [1, 2, 3, 4, 5]);

// reverse sorting
v.par_sort_by(|a, b| b.cmp(a));
assert_eq!(v, [5, 4, 3, 2, 1]);

fn par_sort_by_key<K, F>(&mut self, f: F)
where K: Ord, F: Fn(&T) -> K + Sync,

Sorts the slice in parallel with a key extraction function.

This sort is stable (i.e., does not reorder equal elements) and O(m * n * log(n)) worst-case, where the key function is O(m).

For expensive key functions (e.g. functions that are not simple property accesses or basic operations), par_sort_by_cached_key is likely to be significantly faster, as it does not recompute element keys.

When applicable, unstable sorting is preferred because it is generally faster than stable sorting and it doesn’t allocate auxiliary memory. See par_sort_unstable_by_key.

§Current implementation

The current algorithm is an adaptive merge sort inspired by timsort. It is designed to be very fast in cases where the slice is nearly sorted, or consists of two or more sorted sequences concatenated one after another.

Also, it allocates temporary storage the same size as self, but for very short slices a non-allocating insertion sort is used instead.

In order to sort the slice in parallel, the slice is first divided into smaller chunks and all chunks are sorted in parallel. Then, adjacent chunks that together form non-descending or descending runs are concatenated. Finally, the remaining chunks are merged together using parallel subdivision of chunks and parallel merge operation.

§Examples
use rayon::prelude::*;

let mut v = [-5i32, 4, 1, -3, 2];

v.par_sort_by_key(|k| k.abs());
assert_eq!(v, [1, 2, -3, 4, -5]);

fn par_sort_by_cached_key<K, F>(&mut self, f: F)
where F: Fn(&T) -> K + Sync, K: Ord + Send,

Sorts the slice in parallel with a key extraction function.

During sorting, the key function is called at most once per element, by using temporary storage to remember the results of key evaluation. The key function is called in parallel, so the order of calls is completely unspecified.

This sort is stable (i.e., does not reorder equal elements) and O(m * n + n * log(n)) worst-case, where the key function is O(m).

For simple key functions (e.g., functions that are property accesses or basic operations), par_sort_by_key is likely to be faster.

§Current implementation

The current algorithm is based on pattern-defeating quicksort by Orson Peters, which combines the fast average case of randomized quicksort with the fast worst case of heapsort, while achieving linear time on slices with certain patterns. It uses some randomization to avoid degenerate cases, but with a fixed seed to always provide deterministic behavior.

In the worst case, the algorithm allocates temporary storage in a Vec<(K, usize)> the length of the slice.

All quicksorts work in two stages: partitioning into two halves followed by recursive calls. The partitioning phase is sequential, but the two recursive calls are performed in parallel. Finally, after sorting the cached keys, the item positions are updated sequentially.

§Examples
use rayon::prelude::*;

let mut v = [-5i32, 4, 32, -3, 2];

v.par_sort_by_cached_key(|k| k.to_string());
assert!(v == [-3, -5, 2, 32, 4]);

fn par_sort_unstable(&mut self)
where T: Ord,

Sorts the slice in parallel, but might not preserve the order of equal elements.

This sort is unstable (i.e., may reorder equal elements), in-place (i.e., does not allocate), and O(n * log(n)) worst-case.

§Current implementation

The current algorithm is based on pattern-defeating quicksort by Orson Peters, which combines the fast average case of randomized quicksort with the fast worst case of heapsort, while achieving linear time on slices with certain patterns. It uses some randomization to avoid degenerate cases, but with a fixed seed to always provide deterministic behavior.

It is typically faster than stable sorting, except in a few special cases, e.g., when the slice consists of several concatenated sorted sequences.

All quicksorts work in two stages: partitioning into two halves followed by recursive calls. The partitioning phase is sequential, but the two recursive calls are performed in parallel.

§Examples
use rayon::prelude::*;

let mut v = [-5, 4, 1, -3, 2];

v.par_sort_unstable();
assert_eq!(v, [-5, -3, 1, 2, 4]);

fn par_sort_unstable_by<F>(&mut self, compare: F)
where F: Fn(&T, &T) -> Ordering + Sync,

Sorts the slice in parallel with a comparator function, but might not preserve the order of equal elements.

This sort is unstable (i.e., may reorder equal elements), in-place (i.e., does not allocate), and O(n * log(n)) worst-case.

The comparator function must define a total ordering for the elements in the slice. If the ordering is not total, the order of the elements is unspecified. An order is a total order if it is (for all a, b and c):

  • total and antisymmetric: exactly one of a < b, a == b or a > b is true, and
  • transitive, a < b and b < c implies a < c. The same must hold for both == and >.

For example, while f64 doesn’t implement Ord because NaN != NaN, we can use partial_cmp as our sort function when we know the slice doesn’t contain a NaN.

use rayon::prelude::*;

let mut floats = [5f64, 4.0, 1.0, 3.0, 2.0];
floats.par_sort_unstable_by(|a, b| a.partial_cmp(b).unwrap());
assert_eq!(floats, [1.0, 2.0, 3.0, 4.0, 5.0]);
§Current implementation

The current algorithm is based on pattern-defeating quicksort by Orson Peters, which combines the fast average case of randomized quicksort with the fast worst case of heapsort, while achieving linear time on slices with certain patterns. It uses some randomization to avoid degenerate cases, but with a fixed seed to always provide deterministic behavior.

It is typically faster than stable sorting, except in a few special cases, e.g., when the slice consists of several concatenated sorted sequences.

All quicksorts work in two stages: partitioning into two halves followed by recursive calls. The partitioning phase is sequential, but the two recursive calls are performed in parallel.

§Examples
use rayon::prelude::*;

let mut v = [5, 4, 1, 3, 2];
v.par_sort_unstable_by(|a, b| a.cmp(b));
assert_eq!(v, [1, 2, 3, 4, 5]);

// reverse sorting
v.par_sort_unstable_by(|a, b| b.cmp(a));
assert_eq!(v, [5, 4, 3, 2, 1]);

fn par_sort_unstable_by_key<K, F>(&mut self, f: F)
where K: Ord, F: Fn(&T) -> K + Sync,

Sorts the slice in parallel with a key extraction function, but might not preserve the order of equal elements.

This sort is unstable (i.e., may reorder equal elements), in-place (i.e., does not allocate), and O(m * n * log(n)) worst-case, where the key function is O(m).

§Current implementation

The current algorithm is based on pattern-defeating quicksort by Orson Peters, which combines the fast average case of randomized quicksort with the fast worst case of heapsort, while achieving linear time on slices with certain patterns. It uses some randomization to avoid degenerate cases, but with a fixed seed to always provide deterministic behavior.

Due to its key calling strategy, par_sort_unstable_by_key is likely to be slower than par_sort_by_cached_key in cases where the key function is expensive.

All quicksorts work in two stages: partitioning into two halves followed by recursive calls. The partitioning phase is sequential, but the two recursive calls are performed in parallel.

§Examples
use rayon::prelude::*;

let mut v = [-5i32, 4, 1, -3, 2];

v.par_sort_unstable_by_key(|k| k.abs());
assert_eq!(v, [1, 2, -3, 4, -5]);

fn par_chunk_by_mut<F>(&mut self, pred: F) -> ChunkByMut<'_, T, F>
where F: Fn(&T, &T) -> bool + Send + Sync,

Returns a parallel iterator over the slice producing non-overlapping mutable runs of elements using the predicate to separate them.

The predicate is called on two elements following themselves, it means the predicate is called on slice[0] and slice[1] then on slice[1] and slice[2] and so on.

§Examples
use rayon::prelude::*;
let mut xs = [1, 2, 2, 3, 3, 3];
let chunks: Vec<_> = xs.par_chunk_by_mut(|&x, &y| x == y).collect();
assert_eq!(chunks[0], &mut [1]);
assert_eq!(chunks[1], &mut [2, 2]);
assert_eq!(chunks[2], &mut [3, 3, 3]);

Object Safety§

This trait is not object safe.

Implementations on Foreign Types§

§

impl<T> ParallelSliceMut<T> for [T]
where T: Send,

§

fn as_parallel_slice_mut(&mut self) -> &mut [T]

Implementors§