• 0 Posts
  • 326 Comments
Joined 2 years ago
Cake day: June 15th, 2023

  • It ranges from “automatic” to “infuriating”.

    If you have Secure Boot enabled, there are some hoops to jump through. Read the docs and follow the steps for DKMS.

    Depending on your distro and your requirements, you might want to install the drivers manually from Nvidia rather than using older drivers from your distro.

    If you need CUDA, god help you. Choose a distro that makes this easy and use containers to avoid dependency hell. Note that this is not any easier on Windows (at least not last I checked, which was a few years ago).




  • SEO (search engine optimization) has dominated search results for almost as long as search engines have existed. The entire field of SEO is about gaming the system at the expense of users, and often also at the expense of search platforms.

    The audience for an author’s gripping life story in every goddamn recipe was never humans, either. That was just for Google’s algorithm.

    Slop is not new. It’s just more automated now. There are two new problems for users, though:

    1. Google no longer gives a shit. They used to play the cat-and-mouse game, and while their victories were never long-lasting, at least their defeats were not permanent. (Remember ExpertsExchange? It took years before Google brought down the hammer on that. More recently, think of how many results you’ve seen from Pinterest, Forbes, or Medium, and think of how few of those deserved even a second of your time.)
    2. Companies that still do give a shit face a much more rapid exploitation cycle. The cats are still plain ol’ cats, but the mice are now Borg.

  • Well I’m sorry, but most PDF distillers since the 90s have come with OCR software that can extract text from the images and store it in a way that preserves the layout AND the meaning

    The accuracy rate of even the best OCR software is far, far too low for a wide array of potential use cases.

    Let’s say I have an archive of a few thousand scientific papers. These are neatly formatted digital documents, not even scanned images (though “scanned images” would be within scope of this task and should not be ignored). Even for that, there’s nothing out there that can produce reliably accurate results. Everything requires painstaking validation and correction if you really care about accuracy.

    Even ArXiv can’t do a perfect job of this. They launched their “beta” HTML converter a couple years ago. Improving accuracy and reliability is an ongoing challenge. And that’s with the help of LaTeX source material! It would naturally be much, much harder if they had to rely solely on the PDFs generated from that LaTeX. See: https://info.arxiv.org/about/accessible_HTML.html

    As for solving this problem with “AI”…uh…well, it’s not like “OCR” and “AI” are mutually exclusive terms. OCR tools have been using neural networks for a very long time already; it just wasn’t a buzzword back then, so nobody called it “AI”. However, in the current landscape of “AI” in 2025, “accuracy” is usually just a happy accident. It doesn’t need to be that way, and I’m sure the folks behind commercial and open-source OCR tools are hard at work implementing new technology in a way that Doesn’t Suck.

    I’ve played around with various VL models and they still seem to be in the “proof of concept” phase.


  • I’ve been using cryptpad.fr (the “flagship instance” of CryptPad) for years. It’s…fine. Really, it’s fine. I’m not thrilled with the experience, but it is functional and I’m not aware of any viable alternatives that are end-to-end encrypted.

    It’s based on OnlyOffice, which is basically a heavyweight web-first Microsoft Office clone. Set your expectations accordingly.

    No mobile apps, and the web UI is not optimized for mobile. I mean, it works, but does using the desktop MS Office UI on a smartphone sound like fun to you?

    Performance is tolerable but if you’re used to Google Sheets, it’s a big downgrade. Some of this is just the necessary overhead involved in an end-to-end encrypted cloud service. Some of it is because, again, this is a heavyweight desktop UI running in a web browser. It’s functional, but it’s not fast and it’s not pretty.


  • The far right are well-practiced at co-opting and twisting concepts. It’s classic doublespeak.

    It’s why you have “Christians” who are staunchly opposed to feeding the hungry, or treating the sick. (See: school lunches.)

    It’s why “capitalism” now represents the complete lack of meaningful competition, when that competition is the only thing that ever made capitalism worthwhile in the first place. (See: Microsoft getting away scot-free after being found guilty of illegal, anticompetitive business practices all throughout the 90s.)

    It’s why “free speech” proponents are laser-focused on creating new and terrifying mechanisms for censorship. (See: *gestures widely*)

    I could go on.

    It’s sad how little resistance has been made against this corruption. How easily our natural allies have been turned into our greatest enemies.


  • For instance, Mozilla said it may have removed blanket claims that it never sells user data because the legal definition of “sale of data” is now “broad and evolving,” Mozilla’s blog post stated.

    Uh huh.

    The company pointed to the California Consumer Privacy Act (CCPA) as an example of why the language was changed, noting that the CCPA defines “sale” as the “selling, renting, releasing, disclosing, disseminating, making available, transferring, or otherwise communicating orally, in writing, or by electronic or other means, a consumer’s personal information by [a] business to another business or a third party” in exchange for “monetary” or “other valuable consideration.”

    Yes. That’s what “sale of data” means. Everybody understood that. That’s exactly what we don’t want you to do.


  • DNS over HTTPS. It allows encrypted DNS lookups via a URL, which allows for URL-based customizations not possible with traditional DNS lookups (e.g. the server could have /ads or /trackers endpoints so you can choose what to block).

    DNS over TLS (DoT) is similar, but it doesn’t use URLs, just IP addresses like generic DNS. Both are encrypted.
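
    To make the “lookup with a URL” part concrete, here’s a minimal sketch of how a DoH GET request URL is put together per RFC 8484: the DNS query is built in ordinary wire format, base64url-encoded without padding, and passed as the `dns` parameter. The endpoint `https://dns.example/dns-query` is a made-up placeholder, not a real resolver, and this only builds the URL; it doesn’t send anything.

```python
import base64
import struct


def build_dns_query(name: str, qtype: int = 1) -> bytes:
    """Build a minimal DNS query in wire format (qtype 1 = A record)."""
    # Header: ID=0, flags=0x0100 (recursion desired), one question, no other records.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE followed by QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", qtype, 1)


def doh_get_url(endpoint: str, name: str) -> str:
    """Return the GET URL for a DoH lookup of `name` at `endpoint`."""
    query = build_dns_query(name)
    # RFC 8484 uses base64url with the trailing '=' padding removed.
    encoded = base64.urlsafe_b64encode(query).rstrip(b"=").decode("ascii")
    return f"{endpoint}?dns={encoded}"


# Hypothetical endpoint, for illustration only.
print(doh_get_url("https://dns.example/dns-query", "www.example.com"))
```

    A server offering per-path filtering like the /ads or /trackers idea above would just be a different `endpoint` string here; the query encoding stays the same.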



  • Honestly, that sounds great.

    My biggest problem with Flatpak is that Flathub has all sorts of weird crap, and depending on your UI it’s not always easy to tell what’s official and what’s just from some rando. I don’t want a repo full of “unverified” packages to be a first-class citizen in my distro.

    Distros can and should curate packages. That’s half the point of a distro.

    And yes, the idea of packaging dependencies in their own isolated container per-app comes with real downsides: I can’t simply patch a library once at the system level.

    I’m running a Fedora derivative and I wasn’t even aware of this option. I’m going to look into it now because it sounds better than Flathub.





  • But any 50 watt chip will get absolutely destroyed by a 500 watt gpu

    If you are memory-bound (and since OP’s talking about 192GB, it’s pretty safe to assume they are), then it’s hard to make a direct comparison here.

    You’d need 8 high-end consumer GPUs to get 192GB. Not only is that insanely expensive to buy and run, but you won’t even be able to support it on a standard residential electrical circuit, or any consumer-level motherboard. Even 4 GPUs (which would be great for 70B models) would cost more than a Mac.
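
    For anyone who wants to check the arithmetic, here’s a rough sketch. The 24 GB of VRAM and 450 W per card are my assumptions for a “high-end consumer GPU”, and 120 V at 15 A is an assumed typical North American household circuit; plug in your own numbers.

```python
import math

# Assumptions for illustration, not specs of any particular card or building.
ASSUMED_VRAM_GB = 24
ASSUMED_GPU_WATTS = 450
ASSUMED_CIRCUIT_VOLTS = 120
ASSUMED_CIRCUIT_AMPS = 15


def gpus_needed(total_memory_gb: float, vram_per_gpu_gb: float) -> int:
    """Number of cards needed to hold total_memory_gb entirely in VRAM."""
    return math.ceil(total_memory_gb / vram_per_gpu_gb)


count = gpus_needed(192, ASSUMED_VRAM_GB)
gpu_watts = count * ASSUMED_GPU_WATTS
circuit_watts = ASSUMED_CIRCUIT_VOLTS * ASSUMED_CIRCUIT_AMPS

print(f"{count} GPUs, about {gpu_watts} W for the cards alone")
print(f"circuit capacity: {circuit_watts} W")
```

    Under those assumptions the cards alone draw roughly twice what the circuit can supply, before counting the CPU, fans, or PSU losses.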

    The speed advantage you get from discrete GPUs rapidly disappears as your memory requirements exceed VRAM capacity. Partial offloading to GPU is better than nothing, but if we’re talking about standard PC hardware, it’s not going to be as fast as Apple Silicon for anything that requires a lot of memory.
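
    Here’s a back-of-envelope model of why that happens. It assumes token generation is purely memory-bandwidth-bound and that every weight gets read once per token, which is a big simplification. All the bandwidth figures are illustrative round numbers I picked, not measurements of any real hardware.

```python
def tokens_per_second(
    model_gb: float,
    fast_capacity_gb: float,
    fast_bw_gb_s: float,
    slow_bw_gb_s: float,
) -> float:
    """Estimate tokens/s when weights are split between fast and slow memory."""
    in_fast = min(model_gb, fast_capacity_gb)  # portion that fits in fast memory
    in_slow = model_gb - in_fast               # remainder spills to slow memory
    seconds_per_token = in_fast / fast_bw_gb_s + in_slow / slow_bw_gb_s
    return 1.0 / seconds_per_token


# Illustrative assumptions: 24 GB VRAM at 1000 GB/s, system RAM at 80 GB/s,
# and a unified-memory machine with 192 GB at 800 GB/s.
for model_gb in (20, 40, 120):
    discrete = tokens_per_second(model_gb, 24, 1000, 80)
    unified = tokens_per_second(model_gb, 192, 800, 800)
    print(f"{model_gb:>4} GB model: discrete+offload {discrete:6.2f} tok/s, "
          f"unified {unified:6.2f} tok/s")
```

    With these made-up numbers the discrete GPU wins while the model fits in VRAM, and falls behind as soon as a meaningful share of the weights has to be read from system RAM.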

    This might change in the near future as AMD and Intel catch up to Apple Silicon in terms of memory bandwidth and integrated NPU performance. Then you can sidestep the Apple tax, and perhaps you will be able to pair a discrete GPU and get a meaningful performance boost even with larger models.


  • This will be highly platform-dependent, and also dependent on your threat model.

    On PC laptops, you should probably enable Secure Boot (if it’s not enabled by default), and password-protect your BIOS. On Macs you can disable booting from external media (I think that’s even the default now, but not totally sure). You should definitely enable full-disk encryption – that’s FileVault on Mac and BitLocker on Windows.

    On Apple devices, you can enable USB Restricted Mode, which will protect against some attacks with USB cables or devices.

    Apple devices also have Lockdown Mode, which restricts or disables a whole bunch of functionality in an effort to reduce your attack surface against a variety of sophisticated attacks.

    If you’re worried about hardware hacks, then on a laptop you’d want to apply some tamper-evident stickers or something similar, so if an evil maid opens it up and tampers with the hardware, at least you’ll know something fishy happened, so you can go drop your laptop in an active volcano or something.

    If you use any external devices, like a keyboard, mouse, hard drive, whatever…well…how paranoid are you? I’m going to be honest: there is a near 0% chance I would even notice if someone replaced my charging cables or peripheral cables with malicious ones. I wouldn’t even notice if someone plugged in a USB keylogger between my desktop PC and my keyboard, because I only look at the back of my PC once in a blue moon. Digital security begins with physical security.

    On the software side, make sure you’re the only one with admin rights, and ideally you shouldn’t even log into admin accounts on a day-to-day basis.



  • vd (VisiData) is a wonderful TUI spreadsheet program. It can read lots of formats, like csv, sqlite, and even nested formats like json. It supports Python expressions and replayable commands.

    I find it most useful for large CSV files from various sources. Logs and reports from a lot of the tools I use can easily be tens of thousands of rows, and it can take many minutes just to open them in GUI apps like Excel or LibreOffice.

    I frequently need to re-export fresh data, so I find myself needing to re-process and re-arrange it every time, which VisiData makes easy (well, easier) with its replayable command files. So e.g. I can write a script to open a raw csv, add a formula column, resize all columns to fit their content, set the column types as appropriate, and sort it the way I need it. That way I can go directly from exporting the data to reading it with no preprocessing in between.
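
    If you don’t have VisiData handy, the same kind of repeatable pipeline can be roughly sketched in plain Python. To be clear, this is not VisiData’s command-file format, just a stand-in using the stdlib `csv` module, and the column names (`host`, `requests`, `errors`) are hypothetical. Column resizing is a display thing, so it has no equivalent here.

```python
import csv
import io


def process(raw_csv: str) -> list[dict]:
    """Parse raw CSV text, set column types, add a formula column, and sort."""
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    for row in rows:
        # Set column types: CSV fields always arrive as strings.
        row["requests"] = int(row["requests"])
        row["errors"] = int(row["errors"])
        # Formula column: errors per request, guarding against division by zero.
        row["error_rate"] = row["errors"] / row["requests"] if row["requests"] else 0.0
    # Sort with the worst offenders first.
    rows.sort(key=lambda r: r["error_rate"], reverse=True)
    return rows


sample = "host,requests,errors\na,100,5\nb,50,10\nc,0,0\n"
for row in process(sample):
    print(row["host"], row["requests"], row["errors"], row["error_rate"])
```

    The point is the same either way: the steps live in a file, so a fresh export just means re-running it.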


  • My experience might be a bit outdated, but I remember finding the default Mac OS X Terminal extremely slow. A few years back I ran an output-heavy command, and the speed difference between displaying the output in the terminal vs writing it to a file was orders of magnitude. The same thing on my Linux system was much, much faster. I’m not sure how much of that was due specifically to rendering, vs memory management or something else, though.

    I might see if I can still reproduce this in Sequoia and if Ghostty is faster on Mac.
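
    A minimal sketch of how I’d measure it, assuming the simplest possible test: write the same lines to the terminal and to a file and time both. The function names and the `out.log` path are just placeholders, and this only measures how long the writing process is blocked, not how long the terminal takes to finish drawing.

```python
import sys
import time


def time_writes(stream, lines: int = 100_000) -> float:
    """Write `lines` short lines to `stream` and return elapsed seconds."""
    start = time.perf_counter()
    for i in range(lines):
        stream.write(f"line {i}\n")
    stream.flush()
    return time.perf_counter() - start


def compare(log_path: str = "out.log", lines: int = 100_000) -> tuple[float, float]:
    """Time the same output going to the terminal and to a file."""
    terminal_seconds = time_writes(sys.stdout, lines)
    with open(log_path, "w") as f:
        file_seconds = time_writes(f, lines)
    # Report on stderr so the summary is not buried in the test output.
    print(f"terminal: {terminal_seconds:.3f}s, file: {file_seconds:.3f}s", file=sys.stderr)
    return terminal_seconds, file_seconds
```

    Call `compare()` from a script run inside each terminal emulator you want to test; the file timing gives you a baseline that doesn’t depend on rendering.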