KISS

Keep It Simple Stupid

Lightweight validation in Swift from scratch


A program that takes any input from the outside world must validate it. In a project I worked on, validation of a received response was failing and I couldn’t tell why, because the logs only said “Rejected response X because it’s invalid”. The cause was that the validation function returned just a Bool, which carries no information about the reason for a failure, as I show in this post.

This article is about a general idea of how to get more information from various processes in your program, in this case, from validation. It describes only the first steps and can be extended further.

The repository with the sample code is at https://github.com/eunikolsky/LightweightValidation.

Domain model

Say we have a sample response that we need to validate:

/// A sample response from a service.
struct Response {
    /// An identifier to correlate the response with its request.
    let correlationId: Int
    /// User who created the data provided in this response.
    let userName: String
    /// Some extra data.
    let extraData: Data
}

Basic validation

The most basic validation is a function taking a response and returning a Bool, here separated into logical, independent steps:

extension Response {
    public func validate(sentIds: Set<Int>) -> Bool {
        validateCorrelationId(sentIds)
            && validateUserName()
    }

    /// The correlation id must be in the list of correlation ids of sent requests.
    private func validateCorrelationId(_ sentIds: Set<Int>) -> Bool {
        sentIds.contains(correlationId)
    }

    /// Usernames are minimum 3 chars long and cannot include `@`.
    private func validateUserName() -> Bool {
        userName.count >= 3 && !userName.contains("@")
    }
}

That’s not a good design, as there are various problems with it:

  1. you can’t distinguish between the original and validated responses in terms of types, so there is always a possibility of logic errors when you accidentally use the original response when you meant only validated ones;
  2. it returns a plain Bool which tells you nothing about why it failed.

We’ll tackle only the second point in this post.

Basic V type

What we need is a type that could be either a valid result or an error, which is a job for a sum type:

/// Simple validation result.
public enum V <T> {
    /// A valid value.
    case value(T)
    /// An error.
    case error
}

In the first step, the error case doesn’t carry any actual messages. Don’t worry, we’ll add those shortly.

If we implement && for our use case, then the logic in the validation functions stays the same:

public func && (lhs: V<()>, rhs: V<()>) -> V<()> {
    switch (lhs, rhs) {
    case (.value, .value): return lhs
    default: return .error
    }
}

extension Bool {
    var V: V<()> {
        self ? .value(()) : .error
    }
}


public typealias SimpleValidationResult = V<()>

public func validate(sentIds: Set<Int>) -> SimpleValidationResult {
    validateCorrelationId(sentIds)
        && validateUserName()
}

/// The correlation id must be in the list of correlation ids of sent requests.
private func validateCorrelationId(_ sentIds: Set<Int>) -> SimpleValidationResult {
    sentIds.contains(correlationId).V
}

/// Usernames are minimum 3 chars long and cannot include `@`.
private func validateUserName() -> SimpleValidationResult {
    (userName.count >= 3).V
        && (!userName.contains("@")).V
}

Our && combinator contains the crucial logic for the validator: if and only if both validation results are successful, then the result is successful; an error otherwise. It propagates errors from the lower-level validators (validateCorrelationId and validateUserName) to the higher-level ones (validate) so that we don’t have to remember to check the results manually.

SimpleValidationResult is now isomorphic to a regular Bool: V.value(()) is true and V.error is false.
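
To make the isomorphism concrete, here is a minimal, self-contained sketch. The helpers `fromBool` and `toBool` are hypothetical names introduced only for this illustration (the post itself uses `Bool.V`); the check at the end shows that our `&&` agrees with the built-in one for every combination of inputs.

```swift
/// Simple validation result, as defined above.
enum V<T> {
    case value(T)
    case error
}

typealias SimpleValidationResult = V<()>

/// Hypothetical helper: `true` becomes a valid value, `false` an error.
func fromBool(_ flag: Bool) -> SimpleValidationResult {
    flag ? .value(()) : .error
}

/// Hypothetical helper: the inverse conversion.
func toBool(_ result: SimpleValidationResult) -> Bool {
    switch result {
    case .value: return true
    case .error: return false
    }
}

/// The `&&` combinator from this step.
func && (lhs: V<()>, rhs: V<()>) -> V<()> {
    switch (lhs, rhs) {
    case (.value, .value): return lhs
    default: return .error
    }
}

// Converting there and back is the identity, and `&&` on `V<()>`
// mirrors `&&` on `Bool` for all four input combinations.
for a in [true, false] {
    assert(toBool(fromBool(a)) == a)
    for b in [true, false] {
        assert(toBool(fromBool(a) && fromBool(b)) == (a && b))
    }
}
```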

See more details in the commit f52a09464ae55bfad75cea86712b6931f4714dfe.

Allowing specific errors

The next commit adds a really simple error type with just a string and extends V.error to contain an array of errors of type E:

/// Primitive error type that contains only a message string.
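public struct StringError {
    let message: String

    /// Unlabeled initializer (assumed here; the snippet omits it) so that
    /// `StringError("...")` compiles at the call sites.
    init(_ message: String) {
        self.message = message
    }
}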

public func && <E> (lhs: V<(), E>, rhs: V<(), E>) -> V<(), E> {
    switch (lhs, rhs) {
    case (_, .value): return lhs
    case (.value, .error): return rhs
    case let (.error(e1), .error(e2)): return .error(e1 + e2)
    }
}

extension Bool {
    var V: V<(), StringError> {
        self ? .value(()) : .error([StringError("Failed validation")])
    }
}

The Bool.V converter now provides some placeholder validation error so that the ResponseValidator code doesn’t have to change. We’ll fix it in the next step. The && implementation is updated to correctly combine the errors from both validators.
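
The accumulation behavior of the updated `&&` can be checked in isolation. The following is a self-contained sketch, not the repository code: the two-parameter `V` is reconstructed from the description above, and the `errors` helper is a hypothetical accessor added only to inspect the results.

```swift
/// Validation result whose error case carries a list of errors.
enum V<T, E> {
    case value(T)
    case error([E])
}

/// Primitive error type that contains only a message string.
struct StringError: Equatable {
    let message: String
}

/// Succeeds only if both sides succeed; otherwise keeps every error.
func && <E>(lhs: V<(), E>, rhs: V<(), E>) -> V<(), E> {
    switch (lhs, rhs) {
    case (_, .value): return lhs
    case (.value, .error): return rhs
    case let (.error(e1), .error(e2)): return .error(e1 + e2)
    }
}

/// Hypothetical accessor: the collected errors, empty for a valid value.
func errors<T, E>(_ result: V<T, E>) -> [E] {
    switch result {
    case .value: return []
    case .error(let list): return list
    }
}

let ok: V<(), StringError> = .value(())
let tooShort: V<(), StringError> = .error([StringError(message: "too short")])
let hasAt: V<(), StringError> = .error([StringError(message: "contains @")])

// A single failure is propagated; two failures are concatenated in order.
assert(errors(ok && ok).isEmpty)
assert(errors(tooShort && ok).map { $0.message } == ["too short"])
assert(errors(tooShort && hasAt).map { $0.message } == ["too short", "contains @"])
```

Unlike the built-in `&&` on Bool, this version does not short-circuit: both sides are always evaluated, which is exactly what lets it report every failed check at once.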

Supplying the correct validation errors

Here we introduce the <?> operator (instead of Bool.V) so that our validator checks look nice:

/// Converts the `condition` to a `V` value. `true` means a valid value and `false`
/// is replaced with the `rhs` error.
public func <?> <E> (condition: Bool, rhs: @autoclosure () -> E) -> V<(), E> {
    condition
        ? .value(())
        : .error([rhs()])
}


/// The correlation id must be in the list of correlation ids of sent requests.
private func validateCorrelationId(_ sentIds: Set<Int>) -> SimpleValidationResult {
    sentIds.contains(correlationId) <?> StringError("Correlation id \(correlationId) is not in the sent ids set \(sentIds)")
}

/// Usernames are minimum 3 chars long and cannot include `@`.
private func validateUserName() -> SimpleValidationResult {
    userName.count >= 3 <?> StringError("Username \(userName) must be 3+ chars")
        && !userName.contains("@") <?> StringError("Username \(userName) must not contain '@'")
}
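
The snippet above doesn’t show how the `<?>` operator is declared. For the unparenthesized checks to parse, it has to bind looser than comparisons such as `>=` and tighter than `&&`. Below is one possible declaration as a self-contained sketch; the precedence group name `ValidationCheckPrecedence` and the `errors` helper are my assumptions for this example, and plain `String` stands in for the error type.

```swift
// Assumed precedence: between comparison operators and `&&`.
precedencegroup ValidationCheckPrecedence {
    higherThan: LogicalConjunctionPrecedence
    lowerThan: ComparisonPrecedence
}
infix operator <?>: ValidationCheckPrecedence

enum V<T, E> {
    case value(T)
    case error([E])
}

func && <E>(lhs: V<(), E>, rhs: V<(), E>) -> V<(), E> {
    switch (lhs, rhs) {
    case (_, .value): return lhs
    case (.value, .error): return rhs
    case let (.error(e1), .error(e2)): return .error(e1 + e2)
    }
}

/// `true` means a valid value; `false` is replaced with the `rhs` error.
/// Thanks to `@autoclosure`, the error is only built when the check fails.
func <?> <E>(condition: Bool, rhs: @autoclosure () -> E) -> V<(), E> {
    condition ? .value(()) : .error([rhs()])
}

/// Hypothetical accessor used only to inspect the result.
func errors<T, E>(_ result: V<T, E>) -> [E] {
    switch result {
    case .value: return []
    case .error(let list): return list
    }
}

/// Parses as `((count >= 3) <?> msg) && ((!contains) <?> msg)`.
func validateUserName(_ userName: String) -> V<(), String> {
    userName.count >= 3 <?> "Username \(userName) must be 3+ chars"
        && !userName.contains("@") <?> "Username \(userName) must not contain '@'"
}

assert(errors(validateUserName("alice")).isEmpty)
assert(errors(validateUserName("a@")).count == 2)
```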

Tidying up the errors

A small last step is to make the validator code a bit nicer and not have to wrap the errors in StringError:

/// The correlation id must be in the list of correlation ids of sent requests.
private func validateCorrelationId(_ sentIds: Set<Int>) -> SimpleValidationResult {
    sentIds.contains(correlationId) <?> "Correlation id \(correlationId) is not in the sent ids set \(sentIds.sorted())"
}

/// Usernames are minimum 3 chars long and cannot include `@`.
private func validateUserName() -> SimpleValidationResult {
    userName.count >= 3 <?> "Username \(userName) must be 3+ chars"
        && !userName.contains("@") <?> "Username \(userName) must not contain '@'"
}

In our case of simple StringErrors, this is done by conforming the type to ExpressibleByStringInterpolation.
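
A minimal sketch of such a conformance is shown below; the repository version may differ in details. With the default `DefaultStringInterpolation`, the standard library supplies `init(stringInterpolation:)` in terms of `init(stringLiteral:)`, so that single initializer is enough. The `Hashable` conformance is added here on the assumption that the errors need to go into a `Set`.

```swift
/// Primitive error type that can be written as a plain or interpolated
/// string literal.
struct StringError: Hashable, ExpressibleByStringInterpolation {
    let message: String

    /// Called for plain literals and, through the standard library's
    /// default implementation, for interpolated ones too.
    init(stringLiteral value: String) {
        self.message = value
    }
}

let correlationId = 99
let error: StringError = "Correlation id \(correlationId) is not in the sent ids set"
assert(error.message == "Correlation id 99 is not in the sent ids set")
```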

This test verifies the error accumulation behavior:

func testValidationErrorsShouldAccumulate() {
    let sut = response(withCorrelationId: 99, withUserName: "a@")
    XCTAssertEqual(sut.validate(sentIds: [T.anonymousCorrelationId]).error.map(Set.init),
                   Set(["Correlation id 99 is not in the sent ids set [200]",
                        "Username a@ must not contain '@'",
                        "Username a@ must be 3+ chars"]))
}

Comparison

Finally we can compare the contents of the validation functions and see how they have changed:

// before:
sentIds.contains(correlationId)

userName.count >= 3 && !userName.contains("@")

// after:
sentIds.contains(correlationId) <?> "Correlation id \(correlationId) is not in the sent ids set \(sentIds.sorted())"

userName.count >= 3 <?> "Username \(userName) must be 3+ chars"
    && !userName.contains("@") <?> "Username \(userName) must not contain '@'"

The logic has stayed exactly the same; we just added the explanations with a nice API! I like the result.

Notes

The result at the last step works and is much better than the original basic implementation. But it’s still a very limited implementation for the sake of simplicity in this post. Further improvements are definitely possible.

Depending on the requirements, instead of using a simple StringError it could be preferable to use an error type specific for your validator. Then the client code could check which exact requirements failed and do something specific instead of simply showing the error strings.
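
As a rough sketch of that idea, the example below swaps StringError for an enum. Everything specific to it is hypothetical and not part of the sample repository: the `ResponseValidationError` type, the `check` helper (a plain function standing in for `<?>`), and the `hint` function that represents client code.

```swift
enum V<T, E> {
    case value(T)
    case error([E])
}

func && <E>(lhs: V<(), E>, rhs: V<(), E>) -> V<(), E> {
    switch (lhs, rhs) {
    case (_, .value): return lhs
    case (.value, .error): return rhs
    case let (.error(e1), .error(e2)): return .error(e1 + e2)
    }
}

/// Hypothetical validator-specific error: one case per requirement.
enum ResponseValidationError: Equatable {
    case unknownCorrelationId(Int)
    case userNameTooShort(String)
    case userNameContainsAt(String)
}

/// Hypothetical stand-in for `<?>`, written as a plain function.
func check<E>(_ condition: Bool, orFail error: @autoclosure () -> E) -> V<(), E> {
    condition ? .value(()) : .error([error()])
}

func validateUserName(_ userName: String) -> V<(), ResponseValidationError> {
    check(userName.count >= 3,
          orFail: ResponseValidationError.userNameTooShort(userName))
        && check(!userName.contains("@"),
                 orFail: ResponseValidationError.userNameContainsAt(userName))
}

/// Client code can now react to the exact requirement that failed.
func hint(for error: ResponseValidationError) -> String {
    switch error {
    case .unknownCorrelationId(let id): return "Drop response \(id)"
    case .userNameTooShort: return "Ask for a longer name"
    case .userNameContainsAt: return "Ask the user to remove '@'"
    }
}

if case .error(let failures) = validateUserName("a@") {
    assert(failures == [.userNameTooShort("a@"), .userNameContainsAt("a@")])
    assert(failures.map(hint(for:)).count == 2)
}
```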

The && as it is implemented now has a very specific use case. In fact, the <*> (apply) operator from the Applicative typeclass (interface) is the generic version of this and && can be implemented in terms of it.
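
To illustrate the relationship, here is a sketch of what `<*>` could look like for `V`, with `&&` rebuilt on top of it. This is my own reconstruction under assumptions, not code from the repository: the `ApplicativePrecedence` group, the curried `ignoreBoth` function, and the `errors` helper are all introduced just for this example.

```swift
// Assumed precedence group; left associativity lets us chain `f <*> a <*> b`.
precedencegroup ApplicativePrecedence {
    associativity: left
    higherThan: LogicalConjunctionPrecedence
}
infix operator <*>: ApplicativePrecedence

enum V<T, E> {
    case value(T)
    case error([E])
}

/// Applies a validated function to a validated argument,
/// accumulating the errors from both sides.
func <*> <A, B, E>(lhs: V<(A) -> B, E>, rhs: V<A, E>) -> V<B, E> {
    switch (lhs, rhs) {
    case let (.value(f), .value(x)): return .value(f(x))
    case let (.error(e), .value): return .error(e)
    case let (.value, .error(e)): return .error(e)
    case let (.error(e1), .error(e2)): return .error(e1 + e2)
    }
}

/// `&&` expressed through `<*>`: lift a curried function that ignores
/// both unit values, then apply it to the two validation results.
func && <E>(lhs: V<(), E>, rhs: V<(), E>) -> V<(), E> {
    let ignoreBoth: (()) -> (()) -> () = { _ in { _ in () } }
    let lifted: V<(()) -> (()) -> (), E> = .value(ignoreBoth)
    return lifted <*> lhs <*> rhs
}

/// Hypothetical accessor used only to inspect the result.
func errors<T, E>(_ result: V<T, E>) -> [E] {
    switch result {
    case .value: return []
    case .error(let list): return list
    }
}

let ok: V<(), String> = .value(())
let first: V<(), String> = .error(["first"])
let second: V<(), String> = .error(["second"])

assert(errors(ok && ok).isEmpty)
assert(errors(first && second) == ["first", "second"])
```

The payoff of the generic version is that the lifted function doesn’t have to ignore its arguments: the same `<*>` can build a validated value out of several validated parts.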

The V.error case currently holds an array of errors: [E]; the code is forced to know that it’s an array even though the only operation it uses is concatenation (.error(e1 + e2) in &&). In a more general case, it can be any Semigroup of errors, because a semigroup is a typeclass that only defines an associative operation <> to combine two values into one.
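
A possible shape of that generalization is sketched below. Swift has no built-in Semigroup, so the protocol, the `<>` operator declaration, and the `FailureCount` type are assumptions made for this example; `failure` is a hypothetical accessor.

```swift
// Assumed declaration of the semigroup operator.
infix operator <>: AdditionPrecedence

/// A type with an associative operation that combines two values into one.
protocol Semigroup {
    static func <> (lhs: Self, rhs: Self) -> Self
}

/// Arrays form a semigroup under concatenation.
extension Array: Semigroup {
    static func <> (lhs: [Element], rhs: [Element]) -> [Element] {
        lhs + rhs
    }
}

/// Hypothetical alternative: only count the failures.
struct FailureCount: Semigroup, Equatable {
    let count: Int
    static func <> (lhs: FailureCount, rhs: FailureCount) -> FailureCount {
        FailureCount(count: lhs.count + rhs.count)
    }
}

/// `V` no longer knows that the errors are stored in an array.
enum V<T, E: Semigroup> {
    case value(T)
    case error(E)
}

func && <E: Semigroup>(lhs: V<(), E>, rhs: V<(), E>) -> V<(), E> {
    switch (lhs, rhs) {
    case (_, .value): return lhs
    case (.value, .error): return rhs
    case let (.error(e1), .error(e2)): return .error(e1 <> e2)
    }
}

/// Hypothetical accessor: the combined error, if any.
func failure<T, E: Semigroup>(_ result: V<T, E>) -> E? {
    if case .error(let e) = result { return e }
    return nil
}

let a: V<(), [String]> = .error(["first"])
let b: V<(), [String]> = .error(["second"])
assert(failure(a && b) == ["first", "second"])

let one: V<(), FailureCount> = .error(FailureCount(count: 1))
assert(failure(one && one) == FailureCount(count: 2))
```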

A more complete validation implementation is described in the “PureScript by Example” book: https://leanpub.com/purescript/read#leanpub-auto-applicative-validation.
