Definitely this. I use PostgreSQL (which Lemmy uses on the backend) for an enterprise-grade system that has anywhere from 700-1k users at any given point in time, and it also takes in several million messages from external systems throughout the day. PostgreSQL is excellent at caching data in memory. I’ve got the code for that system up in another window while I write this.
At this point in time, it doesn’t look like Lemmy is using any form of an L2 cache like Redis or Memcached. The only single point of failure (that’s not horizontally scalable) looks like the pict-rs server that Lemmy is using for image hosting. If anything, that could be swapped over to use something S3 compatible, either hosted locally using something like MinIO, or served directly off of B2 or Linode cloud storage (doesn’t charge for requests).
In that case, I’d use a message queue. RabbitMQ would work, or Pulsar, which is what I use at work: multiple subscribers (using the same subscription name) attached to one queue of messages that need to be processed. One worker picks a message up, processes it, and marks it as processed. The worker then either passes it into a different queue for further processing, or persists it to the DB.
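To make the "one worker picks it up" part concrete, here is a rough in-memory sketch of that competing-consumers pattern using only the Python standard library. It is not RabbitMQ or Pulsar client code; `run_workers`, the worker names, and the `persisted` list (standing in for the DB) are all made up for illustration.

```python
import queue
import threading


def run_workers(messages, worker_count=3):
    """Competing consumers: each message is handled by exactly one worker."""
    inbox = queue.Queue()
    for message in messages:
        inbox.put(message)

    persisted = []            # hypothetical stand-in for the DB
    lock = threading.Lock()   # protects the shared "DB" list

    def worker(name):
        while True:
            try:
                message = inbox.get_nowait()   # only one worker gets each message
            except queue.Empty:
                return                         # nothing left to process
            result = message.upper()           # placeholder for real processing
            with lock:
                persisted.append((name, result))
            inbox.task_done()                  # roughly: mark the message as processed

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return persisted


if __name__ == "__main__":
    for worker_name, result in run_workers(["a", "b", "c", "d"]):
        print(worker_name, result)
```

Which worker ends up with which message varies from run to run, but every message is processed once and only once. Adding capacity is just a matter of raising `worker_count`.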
The nice thing about the Pulsar paradigm is that you can have multiple subscriptions to the same message queue, each one carrying its own state as to which messages have been processed. So say I get one message from an external system, have one system that is processing it right now, and need to add a second system. In that case I just use a different subscription name for the second system, and it works independently of the first with no issues.
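Here is a toy model of that idea, again plain Python rather than the real Pulsar client. The `Topic` class and the subscription names are hypothetical, and it simplifies things a lot: each subscription is just a position in a shared message log, receiving a message also acknowledges it, and a new subscription starts from the oldest message (an assumption of this sketch, not a statement about Pulsar's defaults).

```python
class Topic:
    """Toy message log with named subscriptions, each tracking its own progress."""

    def __init__(self):
        self._log = []        # every published message, in order
        self._cursors = {}    # subscription name -> index of next unprocessed message

    def publish(self, message):
        self._log.append(message)

    def subscribe(self, name):
        # Reusing an existing name shares that subscription's state;
        # a new name gets its own independent cursor.
        self._cursors.setdefault(name, 0)

    def receive(self, name):
        """Return the next message for this subscription, or None if caught up."""
        position = self._cursors[name]
        if position >= len(self._log):
            return None
        self._cursors[name] = position + 1   # receive and acknowledge in one step
        return self._log[position]


if __name__ == "__main__":
    topic = Topic()
    for message in ["m1", "m2", "m3"]:
        topic.publish(message)

    topic.subscribe("system-a")
    print(topic.receive("system-a"))   # first system starts working through the log
    print(topic.receive("system-a"))

    topic.subscribe("system-b")        # second system added later, different name
    print(topic.receive("system-b"))   # it has its own state, unaffected by system-a
```

Workers that subscribe under the same name share one cursor, so they split the work between them. A different name gets its own cursor, which is why the second system can be added without touching the first.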