Comments
You can use your Mastodon account to reply to this post.
I often encounter variations of the following problem: You are given two or more lists (as
e.g. std::vector
) of elements and you want to sort them “in parallel”, using one of the lists as
keys. I always found the solutions to these kind of problems to be clunky - but with C++23
(or the excellent range-v3
library) we get a very elegant way of doing this.
Say you have two std::vector
(call them vecA
and vecB
) filled with elements that “belong
together”, i.e., vecA[i]
and vecB[i]
somehow belong together. You want them to be sorted by
the values in vecA
, but the elements should stay in sync, i.e., vecA[j]
and vecB[j]
should
later still belong together.
This problem arises surprisingly often when building high-performance algorithms. Taking an example from my day job (I currently build a public transit router for a living), say you have all of the calls of vehicles at a particular station in a particular day. These calls have various attributes, e.g. an arrival time, a departure time, one or several identifier(s) for the vehicle involved, and some other stuff. One thing your algorithm will need to do often is iterate through these in chronological order of departure time. You could just build a large vector of all of these objects and sort that vector by e.g. departure time.
However, this vector would be rather large because the contained objects are large. This means that your iterative algorithm needs to walk all that memory just to iterate the departure times. This is very inefficient - all that memory needs to be loaded into the CPU. It would be a lot more efficient if you had a - much smaller - vector of only the departure times, then you could just iterate in that smaller vector and when you have found what you are looking for, you could use the index of your found element to access whatever you were interested in in the first place.
This is what is often done in practice: All of the attributes of your objects (i.e., calls in this example) are stored in separate vectors, and these vectors are being kept in sync.
range-v3
SolutionLet’s get to the point. You’re here for solutions, not problems. This is the neat C++23 / range-v3 solution (obligatory Godbolt link):
1#include <vector>
2#include <iostream>
3#include <ranges>
4#include <algorithm>
5
6int main() {
7 std::vector<size_t> keys { 1, 4, 3, 0, 2, 8, 6, 5, 7, 9 };
8 std::vector<char> values {'E', 'O', 'L', 'H', 'L', 'L', 'O', 'W', 'R', 'D'};
9
10 // A zip view creates a range of (x,y) tuple-like objects from one range of x's and one range of
11 // y's
12 auto zip = std::ranges::views::zip(keys, values);
13 // ranges::sort (or std::ranges::sort) basically behaves like std::sort, but does take a range
14 // instead of an iterator pair
15 std::ranges::sort(zip, [](const auto & lhs, const auto & rhs) {
16 // lhs and rhs are elements of the zip range, i.e., 'tuple-like objects'
17 return std::get<0>(lhs) < std::get<0>(rhs);
18 });
19
20 for (char c : values) {
21 std::cout << c;
22 }
23}
Note that the actual sorting only takes four lines, without any “magic”: In line 12
, a zip view is
created that transforms the two vectors into a single range of tuple-like (well, pair-like in this
example) objects, and this is then sorted in lines 15-18
. No temporary vectors are created, no
objects are (unnecessarily) copied. So elegant!
The end result is that the values
vector now contains the characters as sorted by keys
(which
should spell HELLOWORLD
).
You can use your Mastodon account to reply to this post.