#2883 new

Is it possible to apply a `.distinct()` operation to a multivalue list of addresses?

Reported by Ayhan | August 20th, 2021 @ 02:55 AM

Hi Benny,

Is there any standard "function" (or "parser") that can be used in any format string to apply a .distinct() operation to a multivalue list (like the addresses in "To", "cc", "#recipients", or in my case #original-to)?

BTW, other such useful functions (across the board) could be .lc (lowercase), .uc (uppercase), .first-item, .last-item ...

(Perhaps these are all there but remain undocumented ?)

Initially, I thought it should be easy enough to cook up a regexp (in specifiers.plist) for the ".first-item" function ... but it's not that easy, because the multi-value aspect appears to be handled implicitly (at least for output columns). In any case, I wouldn't even want to imagine implementing a ".distinct" function with a regexp (I doubt it would be possible, at least without all sorts of backtracking hacks).

As you might have guessed, this is an XY problem, but it's something that can be useful in a number of situations.

If needed, here's my original problem:

I am one of those hoarders trying to preserve at least part of my email archive over a few decades... which contains mail received at many different personal or professional email addresses (past and current) over the years, through various MTAs, MDAs, and MUAs.

Some of those also involve "plus/sub addressing" (which is another matter).

I want to be able to visually distinguish the related email account (in output columns and also as smart sub-mailboxes).

I partially figured out how to add a custom column in outputColumns.plist... So far so good...

The #original-to shortcut seemed like a good candidate... It is currently defined in specifiers.plist as ("x-original-to", "x-delivered-to"). The problem was that some of my older mail has this information in a delivered-to header, some in envelope-to, and some doesn't have it at all (for those, I imagine #recipent.#identity could perhaps be a starting point to dig into, in conjunction with defining some "retired" email addresses) ...

Anyhow, it first seemed easy enough to just create another shortcut named #~original-to (note the tilde [~] in the name which I use as a convention to mark my custom hacks) in specifiers.plist like below:

"#~original-to" = {
  specifiers = ( "x-original-to", "x-delivered-to", "delivered-to", "envelope-to", "x-envelope-to", "x-real-to", "x-msreally-to", "x-apparently-to" );
};

(I also added the appropriate headers for the address and addressFilters parser definitions as explained in the comments).

Well... That sort of works... but introduces another problem... because many (but not all) messages have got more than one of those headers set ( "x-original-to" and "delivered-to" are common duplicates in my current setup, for example)... Most of the time, it's the same email address that just gets repeated... or else the same address with another character casing (differing only in lower/upper casing)... cluttering both the display output column... and also the submailboxes of smart-mailboxes.

The custom output column I have set up often has cells like the below:

a1@example.com, a1@example.com
a2@example.com, A2@EXAMPLE.COM
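
To make the request concrete, here is a rough Python sketch of what I would expect `.distinct` and `.first-item` to do with cells like these. It is only an illustration of the semantics, not MailMate code: the names `distinct` and `first_item` are made up, and comparing the whole address case-insensitively is my assumption (strictly speaking, the local part of an address may be case-sensitive).

```python
def distinct(addresses):
    """Drop duplicates that differ only in case, keeping first-seen order.

    Assumption: the whole address is compared case-insensitively and the
    lowercased form is what gets displayed.
    """
    seen = set()
    result = []
    for addr in addresses:
        key = addr.strip().lower()
        if key and key not in seen:
            seen.add(key)
            result.append(key)
    return result


def first_item(addresses):
    """Return the first distinct address, or None for an empty list."""
    items = distinct(addresses)
    return items[0] if items else None


# The two example cells from my custom output column.
cells = [
    ["a1@example.com", "a1@example.com"],
    ["a2@example.com", "A2@EXAMPLE.COM"],
]
for cell in cells:
    print(", ".join(distinct(cell)))
# prints:
# a1@example.com
# a2@example.com
```

With something like this applied to the multivalue list, each cell would collapse to a single address.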

The display for the submailboxes is actually worse. The same address appears four times(!) : two rows where each row contains the same address more than once, like below:

a1@example.com, a1@example.com
a1@example.com, a1@example.com
a2@example.com, a2@example.com
a2@example.com, a2@example.com

The messages appear to be randomly scattered in various (identically named) submailboxes.

What makes things even worse (in my case) is my heavy use of plus addressing... resulting in hundreds of lines like the above... killing the whole initial idea of mine altogether...

Mind you, there remain two other hurdles that prevent achieving an optimal solution for my use case, both related to the way submailboxes work as of today (both were already mentioned/requested by others on the forum, I think):

1) Currently, MailMate applies the distinctness (uniqueness) test on whatever header (or shortcut) was chosen in the combo (instead of the formatString for the submailbox name). This is an overall issue with submailboxes, which I consider to be a bug, since the most usual expectation would almost always be the formatString (if set).

2) It would be great to have the ability to have more than one level of nested submailboxes...

Well, you can see how that would elegantly solve the above described use case by defining three levels of submailboxes with the below format strings:

  • ${#original-to.domain}
  • ${#original-to.user.notag}
  • ${#original-to.user.tag}

This kind of thing is not at all limited to the described use case, however. Such a feature would be very useful across the board!
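
For what it's worth, the three levels above could be sketched like this in Python. Again, this is only an illustration of the grouping I have in mind, not anything MailMate does today: `split_address` and `nest` are hypothetical names, and I am assuming "+" is the only tag separator in use.

```python
from collections import defaultdict


def split_address(address):
    """Split 'user+tag@domain' into (domain, user without tag, tag).

    Assumption: '+' is the tag separator and case is ignored.
    """
    local, sep, domain = address.strip().lower().rpartition("@")
    if not sep:
        # No '@' at all: treat the whole string as the local part.
        local, domain = domain, ""
    notag, _, tag = local.partition("+")
    return domain, notag, tag


def nest(addresses):
    """Group addresses as domain -> user (no tag) -> set of tags."""
    tree = defaultdict(lambda: defaultdict(set))
    for addr in addresses:
        domain, notag, tag = split_address(addr)
        tree[domain][notag].add(tag)
    return tree


tree = nest(["a1@example.com", "A1+lists@example.com", "a2+shop@example.org"])
for domain in sorted(tree):
    for user in sorted(tree[domain]):
        print(domain, user, sorted(tree[domain][user]))
# prints:
# example.com a1 ['', 'lists']
# example.org a2 ['shop']
```

Each level of the tree corresponds to one level of submailboxes, with the empty tag standing for mail sent to the untagged address.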

My apologies for having overloaded this ticket with the last two points about submailboxes. If needed, I can try to open new tickets for those.

In any case, this ticket could remain dedicated to the request for (documenting?) "distinct/uniq" or "first-item", "lc", "uc" functions/parsers, which appear to be independently useful, I suppose (at least for output columns and perhaps elsewhere in query criteria, for distinct counts, for example).

Best Regards,

Some of those Delivered-To

No comments found
