PWC 277 › We were a Strong Pair, but now we’re Counting Common

This post is part of a series on Mohammad Anwar’s excellent Weekly Challenge, where hackers submit solutions in Perl, Raku, or any other language, to two different challenges every week. (It’s a lot of fun, if you’re into that sort of thing.)

The tasks this week are nice and quick.

You can browse the full source code on GitHub.

Task 1 › Count Common

The first challenge this week has us finding the count of words that appears exactly once in each of two lists. For example:

  1. ('Perl', 'is', 'my', 'friend'), ('Perl', 'and', 'Raku', 'are', 'friend')
    Should return 2 since Perl and friend appear once in each list. But,
  2. ('twice', 'twice'), ('twice') should return 0, since twice appears more than once in one of the arrays.

Since it was easy, I opted to support an arbitrary number of arrays instead of just two. It would of course be trivial to limit the input to two arrays. I also support scalar and list context, so you can easily get just the count (as the task asks), or call the function in list context to get the actual list of common words.

Since my algorithm is O(n), I do not return sorted results by default, as a minor optimization. If stable output is needed, simply sort the result.

Here is the code:

sub count_common {
    'ARRAY' ne ref and croak 'Arguments must be ARRAY refs' for @_; # VAL
    my @once; # $once[$idx]{word} = # True if 'word' appears once in $_[$idx]

    for my $i (keys @_) {
        my %freq; $freq{$_}++ for @{ $_[$i] };
        $once[$i]{$_} = 1 for grep { $freq{$_} == 1 } keys %freq;
    }
    
    grep { my $w = $_; all { $_->{$w} } @once } keys %{$once[0]}
}

If given just two arrays, the complexity is O(n) where n is the maximum array length. However, with an arbitrary number of arrays, m, the complexity is O(mn). This worst case applies if all arrays are the same length and all words appear exactly once in each array.

Task 2 › Strong Pair

The second challenge this week asks us to find strong pairs in a given list of integers. A strong pair is defined as any two consecutive integers x,y and 0 < |x-y| < min(x,y). The provided examples were:

  1. (1..5) ⟼ (2,3), (3,4), (3,5), (4,5)
  2. (5, 7, 1, 7) ⟼ (5,7)

Now, since example #2 does not include flipped pairs (only (5,7), not (7,5)), it is ambiguous whether the intention is to impose a further condition that x < y, or that we are simply to omit the trivially repeated flipped cases. Since there was no x < y constraint given, I will assume the latter condition is what was intended.

This is a simple bit of coding:

sub strong_pair {
    my %seen;

    grep { my ($x,$y) = @$_; my $abs = abs($x - $y);
                      $abs and $abs < $x and $abs < $y
                      and not $seen{$x}{$y}++ } 
     map { my $i = $_; map { [ @_[$i,$_] ] } $i+1..$#_ } 0..$#_ 
}

The bottom map { ... } line gives us every ascending pair of integers from @i (note the inner map { ... }, which operates on the indices of @_ greater than where we are in the outer loop, and returns an ARRAY ref of @_[$i,$_] (the outer and inner indices).

$abs < $x and $abs < $y is an expansion of $abs < min($x,$y). I could use List::Util‘s min() function here (and I will do, later), but if I was stopping here, I wouldn’t bother importing a whole module for that, even though it’s a core module.

The %seen tracking was necessary to capture the provided Example #2. However, if we instead sort and get the unique integers from the list, we can eliminate this check, and also simplify the math. Now that we can expect a list of sorted unique integers, we only need do a bit of arithmetic simplification and check that x < y < 2x, so the final version of the function (with some simple input validation added) becomes:

use List::Util qw< min uniq >;

sub strong_pair {
    ref || $_ !~ /^\-?\d+$/ and croak 'Arguments must be integers' for @_;
    my @i = uniq sort { $a <=> $b } @_;

    grep { my ($x,$y) = @$_; $x < $y and $y < 2*$x }
     map { my $i = $_; map { [ @_[$i,$_] ] } $i+1..$#i } 0..$#i
}

If you are concerned over the cost of the call to sort, recall that sorting runs in O(n log n) time, whereas the above algorithm is already O(n2).

Leave a Reply

Your email address will not be published. Required fields are marked *