For a small project, I need to parse contact information from vcf files, which contain vCards, specifically in the vCard 3.0 format defined by RFC 2426. I only need each contact’s name and, if present, birthday.
It’s a good project for me to learn more about parser combinators in Haskell. I’ve implemented a proof of concept using megaparsec because it has a great tutorial and produces good error messages. The code in this post is truncated and somewhat edited for (hopefully) easier understanding, but may not always make total sense, so you can take a look at the full working code in the PoC repository at https://github.com/eunikolsky/b2c/tree/poc/full.
(I know, the spec says “A vCard object MUST include the VERSION, FN and N types”, but I haven’t implemented these checks in the PoC yet.)
The content lines may be in any order; otherwise I would skip lines until the expected fields come up and return their values directly. Parsing worked great once I realized how to parse some lines until an end token: someTill. Here’s how:
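A minimal sketch of the someTill idea, not the exact PoC code: the names contentLine and vCard are made up for illustration, parameters such as ";TYPE=..." are left inside the property name, and folded (continued) lines are not handled.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Data.Void (Void)
import Text.Megaparsec (Parsec, parseMaybe, someTill, takeWhile1P, takeWhileP)
import Text.Megaparsec.Char (char, eol, string)

type Parser = Parsec Void Text

-- | True for any character that is not part of a line ending.
notEol :: Char -> Bool
notEol c = c /= '\r' && c /= '\n'

-- | One content line, e.g. "BDAY:1984-01-01", as a (name, value) pair.
-- Simplification (assumption): everything before the first ':' is the name.
contentLine :: Parser (Text, Text)
contentLine = do
  name <- takeWhile1P (Just "property name") (\c -> c /= ':' && notEol c)
  _ <- char ':'
  value <- takeWhileP (Just "property value") notEol
  _ <- eol
  pure (name, value)

-- | A whole vCard: the BEGIN line, then one or more content lines,
-- collected until the END line is seen.
vCard :: Parser [(Text, Text)]
vCard = do
  _ <- string "BEGIN:VCARD" <* eol
  someTill contentLine (string "END:VCARD" <* eol)
```

After the first content line, someTill tries the end parser before each further content line, so the END line is never collected as a (name, value) pair. The result is a plain association list, which is what makes the "any order" requirement easy: the fields can be looked up by key afterwards.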
Now that I can find the tuple whose first element (the key) is "BDAY" and get its value ("1984-01-01" here), how do I convert it to a Day? String extraction because the format is pretty simple? Regular expressions? Wait, I already have megaparsec for parsing, so I should use it!
Conceptually, I need a function with a type signature like Text -> Maybe Day. However, Parser Day encapsulates this idea and is a higher-level type: a parser that needs to be “run” and can return a lot of extra information.
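To make the Parser Day idea concrete, here is a hedged sketch that handles only the YYYY-MM-DD form from the example above; RFC 2426 allows other date and date-time forms that are ignored here. The names digits and birthdayP are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Data.Time.Calendar (Day, fromGregorianValid)
import Data.Void (Void)
import Text.Megaparsec (Parsec, count, parseMaybe)
import Text.Megaparsec.Char (char, digitChar)

type Parser = Parsec Void Text

-- | Exactly n ASCII digits, read as a number.
-- `read` is safe here because digitChar only accepts '0'..'9'.
digits :: Read a => Int -> Parser a
digits n = read <$> count n digitChar

-- | A birthday in the YYYY-MM-DD form only (assumption of this sketch).
birthdayP :: Parser Day
birthdayP = do
  year <- digits 4
  _ <- char '-'
  month <- digits 2
  _ <- char '-'
  day <- digits 2
  -- fromGregorianValid rejects impossible dates such as 1984-02-30.
  maybe (fail "invalid calendar date") pure (fromGregorianValid year month day)

-- | The simple Text -> Maybe Day view falls out of running the parser.
toDay :: Text -> Maybe Day
toDay = parseMaybe birthdayP
```

Running the parser with parseMaybe recovers the plain Text -> Maybe Day function, while running it with parse keeps the detailed error information.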
The key here is that I use the parse function to run an inner parse of the birthday value. If it fails, I stop the outer (vCard’s) parsing with that error using parseError; otherwise I produce a complete Contact.
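Roughly, the nesting could look like the sketch below. It is an illustration under assumptions, not the PoC code: subParse, dayP, mkContact and the Contact fields are hypothetical names, dayP is a laxer stand-in for a real birthday parser, and failing on a missing FN is my own choice here. One caveat: the rethrown error keeps the offset from the inner input, so the position reported against the vCard text may be off.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.List.NonEmpty as NE
import Data.Text (Text)
import Data.Time.Calendar (Day, fromGregorianValid)
import Data.Void (Void)
import Text.Megaparsec (Parsec, bundleErrors, parse, parseError, parseMaybe)
import Text.Megaparsec.Char (char)
import qualified Text.Megaparsec.Char.Lexer as L

type Parser = Parsec Void Text

data Contact = Contact
  { contactName :: Text
  , contactBirthday :: Maybe Day
  } deriving (Eq, Show)

-- | Stand-in birthday parser: decimal fields separated by '-'.
dayP :: Parser Day
dayP = do
  y <- L.decimal <* char '-'
  m <- L.decimal <* char '-'
  d <- L.decimal
  maybe (fail "invalid calendar date") pure (fromGregorianValid y m d)

-- | Run an inner parser on an already extracted value. On failure, rethrow
-- the first inner error in the outer parser, which stops the outer parse.
subParse :: Parser a -> Text -> Parser a
subParse inner input = case parse inner "" input of
  Right x -> pure x
  Left bundle -> parseError (NE.head (bundleErrors bundle))

-- | Build a Contact from the parsed (name, value) pairs.
-- A missing BDAY is fine; a malformed BDAY fails the whole parse.
mkContact :: [(Text, Text)] -> Parser Contact
mkContact fields = do
  fullName <- maybe (fail "missing FN") pure (lookup "FN" fields)
  bday <- traverse (subParse dayP) (lookup "BDAY" fields)
  pure (Contact fullName bday)
```

Because mkContact is itself a Parser, it can be chained onto the content-line parser with >>=, so a bad birthday surfaces as an ordinary parse failure of the whole vCard.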