Joël de Bruijn

  • 1 Post
  • 73 Comments
Joined 1 year ago
Cake day: June 23rd, 2023






  • I don’t know.

    • I don’t need formatting but it doesn’t get in the way either. So I am not bothered by it.
    • Also, PDF, and especially the PDF/A standard, is widely used for archiving and for complying with archival and preservation regulations.
    • If you want text, the same tactic applies: just export in bulk to txt instead of PDF.
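
As a rough illustration of the bulk export to txt, here is a minimal sketch. It assumes the mail is already available as a single mbox file, which the comment does not state; the function names and the `message-000000.txt` output naming are hypothetical. It uses only Python's standard `mailbox` module.

```python
import mailbox
from pathlib import Path


def plain_text_body(msg) -> str:
    """Return the first text/plain part of a message, or an empty string."""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return ""


def export_mbox_to_txt(mbox_path: str, out_dir: str) -> int:
    """Write every message in an mbox file to its own .txt file.

    Returns the number of messages written. Attachments and HTML-only
    bodies are ignored in this sketch.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    box = mailbox.mbox(mbox_path)
    try:
        for index, msg in enumerate(box):
            # Keep a few headers at the top so the file is searchable on its own.
            header = (
                f"From: {msg['From']}\n"
                f"Date: {msg['Date']}\n"
                f"Subject: {msg['Subject']}\n\n"
            )
            target = out / f"message-{index:06d}.txt"  # hypothetical naming
            target.write_text(header + plain_text_body(msg), encoding="utf-8")
            count += 1
    finally:
        box.close()
    return count
```

The resulting folder of text files can then be stored, backed up and indexed like any other set of files.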

    My main point is: why would you want a mail-specific stack of hosting, storage, indexing and frontends? If it’s all plain text anyway, regular file storage solutions go a long way.

    There is an entire industry (which has its own disadvantages) built around getting communication artefacts out of those systems and putting them into document management systems or other forms of file-based archival.


  • Joël de Bruijn@lemmy.ml to Selfhosted@lemmy.world: Store (and access) old emails
    English · 2 months ago

    I had roughly the same goals (archive and search two decades of mail) but approached it completely differently: I export every mail to PDF with a strict naming convention.

    • Backend: No mailserver, just storage and backup for files.
    • Search: based on filenames, using FSearch and voidtools Everything. I could also use local indexing of the PDF content.
    • Frontend: a pdf viewer.
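
The comment does not spell out the naming convention, so the following is only a sketch of what one could look like. The `date_sender_subject` pattern and both function names are assumptions. The point it shows is that a sortable timestamp plus sender and subject in the filename is enough for filename-only search tools to be useful.

```python
import re
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


def archive_filename(raw_message: str, extension: str = "pdf") -> str:
    """Build a sortable filename from a raw RFC 5322 message.

    Assumed pattern: YYYY-MM-DD_HHMM_sender_subject-slug.ext
    The message is expected to carry a parseable Date header.
    """
    msg = message_from_string(raw_message)
    sent = parsedate_to_datetime(msg["Date"])
    _, sender = parseaddr(msg["From"] or "")
    subject = msg["Subject"] or ""
    # Collapse everything that is not a letter or digit into single dashes.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", subject).strip("-").lower()[:60]
    slug = slug or "no-subject"
    return f"{sent:%Y-%m-%d_%H%M}_{sender.lower()}_{slug}.{extension}"


def find_by_name(filenames, *terms):
    """Return the filenames that contain every search term (case-insensitive)."""
    wanted = [term.lower() for term in terms]
    return [name for name in filenames if all(t in name.lower() for t in wanted)]
```

Because the date comes first, a plain alphabetical sort of the folder is also a chronological sort, and a query such as `find_by_name(names, "invoice", "2023")` narrows the archive without any content index.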







  • Also, I’m very cautious about them for anything browsing-related. I discovered (after others had too) that they let in-shop search result pages get indexed.

    Meaning I could go to Caterpillar, search for “Wabtec is better”, and this search URL (with 0 products) would then turn up in Google searches, and the URL persisted, text and all.

    Basically, one could spray-paint and tag sites with this graffiti. Shop admins didn’t even have the means to remove it.

    The problem was ignored and stayed that way for months.