kelan.io

Collection Extensions in Swift - uniq() and tapDescription()

Here are two protocol extensions I’ve found useful in Swift 2.

uniq() – Find unique elements in a Collection

This is inspired by Ruby’s #uniq and based on this implementation from alskipp, updated for Swift 2.

extension RangeReplaceableCollectionType where Self.Generator.Element: Hashable {
    /// - Returns: The unique elements of the collection, in order of first appearance.
    func uniq() -> Self {
        var seen: Set<Self.Generator.Element> = Set()

        return reduce(Self()) { result, item in
            if seen.contains(item) {
                return result
            }
            seen.insert(item)
            return result + [item]
        }
    }
}

Here are some simple example uses:

[1,2,3,4].uniq()  // => [1, 2, 3, 4]

[1,2,3,4,4,2].uniq()  // => [1, 2, 3, 4]

[ "one", "two", "one" ].uniq()  // => ["one", "two"]

String("aaabbbccc".characters.uniq())  // => "abc"
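
The examples above all keep the first occurrence of each element, in its original order. For readers on a current toolchain, the same behavior can be sketched in Swift 5.x syntax, where `RangeReplaceableCollectionType` is spelled `RangeReplaceableCollection` and `Generator.Element` is just `Element`. This is a port under those assumptions, not the code used in this post: it relies on `Set.insert(_:)` reporting whether the element was new, and it appends in place rather than building a fresh collection with `+` at every step.

```swift
extension RangeReplaceableCollection where Element: Hashable {
    /// Returns the elements of the collection with duplicates removed,
    /// keeping the first occurrence of each element in its original order.
    func uniq() -> Self {
        var seen = Set<Element>()
        var result = Self()
        for item in self {
            // insert(_:) returns (inserted: Bool, memberAfterInsert: Element);
            // `inserted` is true only the first time an element is seen.
            if seen.insert(item).inserted {
                result.append(item)
            }
        }
        return result
    }
}

let numbers = [1, 2, 3, 4, 4, 2].uniq()   // [1, 2, 3, 4]
let letters = "aaabbbccc".uniq()          // "abc"
print(numbers, letters)
```

Because `String` is itself a range-replaceable collection of `Character` in current Swift, the `.characters` view and the wrapping `String(...)` call from the Swift 2 example are no longer needed.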

But here’s how I actually use it:

let links = WebPageParser(data: dataOfPage).allLinksOnPage()
    .uniq()
    // find URLs like: https://developer.apple.com/videos/wwdc/2015/?id=610
    .filter { $0.lastPathComponent == "2015" }
    // might as well be sorted so it's a stable list
    .sort { $0.absoluteString.compare($1.absoluteString) == .OrderedAscending }

tapDescription() – Print out all elements in a collection, and return self

This is useful in long pipelines like the one directly above, to see if each stage is happening as you expect.

extension CollectionType {
    /// Add this to a chain of methods, to print out the state at that point in the chain.
    /// - Parameter msg: A prefix printed before the items.
    /// - Parameter transform: An optional closure that transforms each item to a string.
    /// - Returns: self, so that it doesn't affect the overall pipeline.
    func tapDescription(msg: String, transform: (Self.Generator.Element -> String) = { String($0) } ) -> Self {
        print("\(msg): \(self.map(transform))")
        return self
    }
}

So, during debugging, the above pipeline could be:

let links = WebPageParser(data: dataOfPage).allLinksOnPage()
    .tapDescription("before uniq")
    .uniq()
    .tapDescription("after uniq")
    // find URLs like: https://developer.apple.com/videos/wwdc/2015/?id=610
    .filter { $0.lastPathComponent == "2015" }
    .tapDescription("after filter")
    // might as well be sorted so it's a stable list
    .sort { $0.absoluteString.compare($1.absoluteString) == .OrderedAscending }
    .tapDescription("after sort")

That might produce output like this:

before uniq: [https://developer.apple.com/, https://developer.apple.com/, https://developer.apple.com/technologies/, https://developer.apple.com/resources/, https://developer.apple.com/programs/, https://developer.apple.com/support/]
after uniq: [https://developer.apple.com/, https://developer.apple.com/technologies/, https://developer.apple.com/resources/, https://developer.apple.com/programs/, https://developer.apple.com/support/]
after filter: []
after sort: []

This makes it clear what each step in the pipeline is actually doing.
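
None of the calls above pass the optional `transform` argument, which is handy when the default description of each item is too noisy. As a rough sketch of how it could be used, here is a port of the extension to current Swift (5.x) syntax, where `Sequence` and `Element` replace the Swift 2 names. The `Link` type and its sample values are hypothetical stand-ins for the parsed links, not part of the original pipeline.

```swift
extension Sequence {
    /// Prints `msg` followed by every element run through `transform`,
    /// then returns self unchanged so the chain can continue.
    func tapDescription(_ msg: String,
                        transform: (Element) -> String = { String(describing: $0) }) -> Self {
        print("\(msg): \(self.map(transform))")
        return self
    }
}

// Hypothetical element type, used only for illustration.
struct Link {
    let title: String
    let path: String
}

let sampleLinks = [
    Link(title: "Videos", path: "/videos/wwdc/2015"),
    Link(title: "Support", path: "/support"),
]

let kept = sampleLinks
    // Print only the path of each link instead of the whole struct.
    .tapDescription("before filter") { $0.path }
    .filter { $0.path.hasSuffix("2015") }
    .tapDescription("after filter") { $0.path }
// prints:
// before filter: ["/videos/wwdc/2015", "/support"]
// after filter: ["/videos/wwdc/2015"]
```

Passing the transform as a trailing closure keeps each tap on one line, so it is easy to add or remove while debugging.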