PM_ME_VINTAGE_30S [he/him]

Anarchist, autistic, engineer, and Certified Professional Life-Regretter. If you got a brick of text, don’t be alarmed; that’s normal.

No, I’m not interested in voting for your candidate.

  • 3 Posts
  • 217 Comments
Joined 1 year ago
Cake day: July 9th, 2023



  • PM_ME_VINTAGE_30S [he/him]@lemmy.sdf.org to Memes@lemmy.ml • AI bros
    4 months ago

    “Gradient descent” ≈ on a “hilly” (mathematical) surface, try to find the lowest point near an initial guess by repeatedly stepping downhill. “Gradient” is basically the steepness, or the rate at which the thing you’re trying to optimize changes as you move through “space”. The gradient points in the direction of steepest increase, so going the opposite way tells you mathematically which direction you need to go to reach the bottom. “Descent” means “head downhill toward the minimum”.

    I’m glossing over a lot of details, particularly what a “surface” actually means in the high dimensional spaces that AI uses, but a lot of problems in mathematical optimization are solved like this. And one of the steps in training an AI agent is to do an optimization, which often does use a gradient descent algorithm. That being said, not every process that uses gradient descent is necessarily AI or even machine learning. I’m actually taking a course this semester where a bunch of my professor’s research is in optimization algorithms that don’t use gradient descent!
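
    A minimal sketch of the idea, assuming a toy bowl-shaped surface f(x, y) = (x - 3)^2 + (y + 1)^2 whose lowest point is known in advance. The function names, step size, and iteration count are made up for illustration; real training loops work in far more dimensions and usually use fancier update rules.

```python
def gradient_descent(grad, start, step_size=0.1, iterations=100):
    """Follow the negative gradient from `start` for a fixed number of steps."""
    x, y = start
    for _ in range(iterations):
        gx, gy = grad(x, y)
        # The gradient points uphill, so subtract it to step downhill.
        x -= step_size * gx
        y -= step_size * gy
    return x, y


def bowl_gradient(x, y):
    """Gradient of the toy surface f(x, y) = (x - 3)**2 + (y + 1)**2."""
    return 2 * (x - 3), 2 * (y + 1)


minimum = gradient_descent(bowl_gradient, start=(0.0, 0.0))
print(minimum)  # close to (3, -1), the bottom of the bowl
```

    On a surface with several valleys, the same loop only finds the valley nearest the initial guess, which is why the starting point matters.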


  • They created a good product so people used it and there were no alternatives when it got shit.

    They created an inherently centralizing implementation of a video sharing platform. Even if it was done with good intentions (which it wasn’t, it was some capitalist’s hustle, and its social importance is a side effect), we should basically always condemn centralizing implementations of a given technology because they reinforce existing power structures regardless of the intentions of their creators.

    It’s their fault because they’re a corporation that does what corporations do. Even when corporations try to do right by the world (which is an extremely generous appraisal of YouTube’s existence), they still manage to create centralizing technologies that ultimately serve to reinforce their existing power, because that’s all they can do. Otherwise, they would have set themselves up as a non-profit or some other type of organization. I refuse to accept the notion of a good corporation.

    There’s no lock in. They don’t force you off the platform if you post elsewhere (like twitch did).

    That’s a good point, but while there isn’t a de jure lock-in for creators, there is a de facto lock-in that prevents them from migrating elsewhere. Namely, YouTube is a centralized, proprietary service that can’t be accessed from other services.


  • Also: model trains. I was into model trains for a few years, but I realized that I didn’t really have the life experience to make a fulfilling model trainset. Like I did the thing, I made a (really childish) layout with some crappy blocks and streets, and I got the trains to move and stuff, but it didn’t…say much? It was “I’m a child and I like trains”, which is great! Probably wouldn’t have become interested in trains at all otherwise!

    But I want more…I always want more. I need to go more hardcore into the few things I can actually tolerate doing…

    And as a child, I saw some really cool trainsets built by adults that told stories, made me laugh, made my parents laugh, made me feel awe at the storytelling and creativity of the craft. Even my cousin, who built a trainset in his basement in his early twenties, had a much more inspired trainset than mine (when I was much younger, like 10 or 12). His trainset was cool. He studied how trains worked, how to make a realistic line with realistic scenery and infrastructure. His trainset reflected who he was, and ultimately forecasted what he became. He literally works for a rail company now designing the train tracks.

    So I’m kinda “saving” that hobby for when I’m in my 60s after I integrate enough life experience (and hopefully some capital) to build a trainset that really reflects the person I ultimately became.

    My trainset is gonna have a sick, functioning roller coaster, some overly complicated automated control circuits, some heavy metal references, some intentionally goofy shit, serious shit, an anarcho-communist bent, a layout that at least is informed by modern infrastructure design, etc., because that’s at least partially the person I will have become.









  • It’s so hard!

    It’s really hard! But it’s really rewarding too. And as a computing/music student [1], you’re in a great major to start!

    First off, if you just want to make your own effects and you’re not really interested in distributing them or making them public, I recommend using JSFX. It’s way easier. You can read through the entire spec in a night. JSFX support is built into REAPER, and apparently YSFX allows you to load JSFX code into other DAWs, although I haven’t tested it. JSFX plugins are compiled on the fly (unlike VST plugins, which are compiled ahead of time and distributed as DLLs), so you just write them up as text files.

    However, their capabilities are limited compared to VST, AU, LV2, AAX [2], and other similar plugin formats. Also, pre-compiled plugins perform better. That’s why plugins are released as such.

    So if you plan on writing pre-compiled plugins for public consumption, you’ll need to do some C++ programming.


    IMO the most important thing to learn for plugin design is how to code well, particularly in C++ with Git and JUCE.

    If you learn how to code with good practices, you can compensate for all other deficiencies.


    Between “music”, “engineering”, and “software development”, plugin design feels the most like “software development”.

    99.9% of all plugins are written in C++, and most of those are done (both proprietary and FOSS) with the JUCE library. School taught me the basics of C++ but they don’t teach you how to code well. Particularly, your DSP code needs to meet a soft real-time constraint. You have to use multithreading because you have a thread for the audio signal (which must NEVER get interrupted) and at least one thread for the GUI.
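
    To put a number on that soft real-time constraint: the audio thread has to finish each buffer before the hardware needs the next one. A rough sketch of the arithmetic, assuming typical buffer sizes and a 48 kHz sample rate (both are illustrative choices, and `callback_budget_ms` is a made-up helper, not a JUCE function):

```python
def callback_budget_ms(buffer_size, sample_rate):
    """Time available to process one audio buffer, in milliseconds."""
    return 1000.0 * buffer_size / sample_rate


for buffer_size in (64, 256, 1024):
    budget = callback_budget_ms(buffer_size, sample_rate=48_000)
    print(f"{buffer_size:>5} samples at 48 kHz -> {budget:.2f} ms per callback")
```

    Anything on the audio thread that can take an unpredictable amount of time (locks, memory allocation, file access) risks blowing that budget, which is what makes the "which parts are real-time safe" question matter.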

    You also need to figure out which parts of the C++ standard library are real-time safe, and which aren’t. Here’s a good talk on that.

    If you use JUCE or a similar development library then they have well-tested basic DSP functions, meaning you can get by without doing all the math from scratch.

    Start watching Audio Developer Conference talks like TV as they come out. JUCE has a tutorial, and MatKat released a video tutorial guiding the viewer through coding a simple EQ plugin [3]. JUCE plugins are basically cross platform, and can typically be compiled as VSTs on Windows, AU plugins on Mac, and LV2 plugins on Linux.

    JUCE is a really complicated library even though it vastly simplifies the process (because audio plugin development is inherently hard!). You’re going to have to learn to read a LOT of documentation and code.

    I also recommend learning as much math as you can stomach. Start with linear algebra, calculus, Fourier analysis, circuit theory, and numerical analysis (especially Padé approximants), in that order. Eventually, you’ll want to roll your own math, or at least do something that JUCE doesn’t provide out of the box. Julius O. Smith has some really good free online books on filters, Fourier analysis, and DSP with a music focus.
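
    As one small illustration of why Padé approximants come up: they replace an expensive function with a ratio of polynomials that is cheap to evaluate per sample. The sketch below uses the [3/2] Padé approximant of tanh, x(15 + x^2)/(15 + 6x^2). Picking tanh (a common soft-clipping curve) is an assumption made for this example, not something the comment specifies.

```python
import math


def tanh_pade(x):
    """[3/2] Padé approximant of tanh(x): x * (15 + x^2) / (15 + 6 * x^2)."""
    x2 = x * x
    return x * (15.0 + x2) / (15.0 + 6.0 * x2)


for x in (0.1, 0.5, 1.0, 2.0):
    error = abs(tanh_pade(x) - math.tanh(x))
    print(f"x = {x}: Padé = {tanh_pade(x):.6f}, tanh = {math.tanh(x):.6f}, error = {error:.2e}")
```

    The approximation is very tight near zero and drifts as |x| grows, so in practice the input range usually has to be limited or a higher-order approximant used.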

    If you’re willing to sail the high seas to LibGen or buy a book, I recommend Digital Audio Signal Processing by Udo Zölzer for “generic” audio signal processing, and DAFX: Digital Audio Effects by Zölzer for coverage of nonlinear effects, which are typically absent from DSP engineering books. I also recommend keeping a copy of Digital Signal Processing by Proakis and Manolakis on hand because of its detailed coverage of DSP fundamentals, particularly the coverage of filter structures, numerical errors, multirate signal processing, and the Z-transform.

    A little bit of knowledge about machine learning and optimization is good too, because sometimes you need to solve an optimization problem to synthesize a filter, or possibly solve one in a fixed amount of time to produce your actual output (example: pitch shifting). Deep learning is yielding some seriously magical effects, so I do recommend you learn it at your own pace.

    DSP basically requires all the math ever, especially the kind of DSP that we want to do as musicians, so the more you have, the better off you’ll be.

    [1] IMO that would have been the perfect major for me, that or acoustical engineering, if anything like that existed in my area when I went to recording school 10 years ago. While my recording degree taught me some really valuable stuff, I kinda wish that they pushed us harder into programming, computing, and electronics.

    [2] AAX requires you to pay Avid to develop. So I never use AAX plugins, and I have no intention of supporting the format once I start releasing plugins for public consumption, despite its other technical merits.

    [3] Over half of MatKat’s tutorial is dedicated towards GUI design, i.e. the audio part is basically done but the interface looks boring and default. GUI design and how your GUI (editor component) interacts with the audio processor component are extremely important and time-consuming parts of plugin design. Frankly, GUI design has been by far the most complicated thing to “pick up”, and it’s why I haven’t released anything yet.


  • So I don’t value high fidelity video because I don’t see very well even with glasses, so it wouldn’t make a difference for me.


    I do value high fidelity audio because:

    • I am a musician and producer, although not as much as I used to
    • I have ear training
    • I went to recording school
    • I am autistic with sensitive hearing
    • I have audio and acoustical engineering as special interests
    • I’m doing a master’s degree in electrical engineering where I’ve already designed audio gear for my projects
    • I am teaching myself audio plugin design for fun

    But I simply can’t afford high fidelity gear for everyday listening. For my studio monitors, I spent as much as I could to get the best speakers I could afford so that I can be certain that what I’m hearing is an accurate representation of what I “commit to tape”. However, for walking to class or going to the market, I’m not gonna pay for expensive headphones that could get stolen, broken, or lost. It’s impractical.

    My $20 Bluetooth headphones [1] are sufficient for everyday carry. They sound “95% of the way there”, they don’t get in the way when I’m walking, and if I lose them, I can have an identical pair delivered to my door within a couple days. 95% is good enough for me. Actually, I could probably settle for less.

    And then there’s storage. My library is already > 110GB in MP3 format, so storing it all in uncompressed formats would be unwieldy.
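
    A back-of-the-envelope estimate of what "unwieldy" could mean, assuming the library is constant-bitrate MP3 and the uncompressed target is CD-quality PCM (16-bit, 44.1 kHz, stereo). The actual bitrates of the library are not stated, so the values tried below are guesses, and `uncompressed_size_gb` is a made-up helper.

```python
def uncompressed_size_gb(mp3_size_gb, mp3_kbps, sample_rate=44_100, bit_depth=16, channels=2):
    """Estimate the PCM size of a library currently stored as constant-bitrate MP3."""
    pcm_kbps = sample_rate * bit_depth * channels / 1000.0
    return mp3_size_gb * pcm_kbps / mp3_kbps


for mp3_kbps in (128, 192, 320):
    size = uncompressed_size_gb(110, mp3_kbps)
    print(f"{mp3_kbps} kbps MP3 -> about {size:.0f} GB as 16-bit/44.1 kHz PCM")
```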

    So in the rare cases that my listening hardware is insufficient, I’ll usually consult a software equalizer. For example, on Linux, Easy Effects allows me to apply equalizers, dynamic compression, and a bunch of other plugins in LV2 format to the PipeWire output (and input). It’s super convenient for watching YouTube college lectures with questionable microphone quality on my shitty TV speakers. Other than dynamic compression for leveling and an equalizer for frequency effects, I am typically not interested in doing anything else for intelligibility. Said differently, I am not interested in exploiting the nonlinearities in real speaker systems (other than possibly dynamic compression), so I should be able to fix any linear defects (bad frequency response) with a digital equalizer. The nonlinearities in real speaker systems are, for HiFi listening purposes [2], defects.
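
    For a sense of what one band of a digital equalizer does, here is a sketch of a peaking filter using the widely circulated Audio EQ Cookbook biquad formulas by Robert Bristow-Johnson. This is not a claim about how Easy Effects implements its equalizer, and the 200 Hz speaker "bump" is a hypothetical defect invented for the example.

```python
import cmath
import math


def peaking_eq(f0, gain_db, q, sample_rate):
    """Return normalized biquad coefficients (b, a) for a peaking EQ band."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha / amp  # used to normalize so that a[0] == 1
    b = [(1.0 + alpha * amp) / a0, -2.0 * math.cos(w0) / a0, (1.0 - alpha * amp) / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha / amp) / a0]
    return b, a


def gain_db_at(b, a, freq, sample_rate):
    """Magnitude response of the biquad at `freq`, in dB."""
    z_inv = cmath.exp(-2j * math.pi * freq / sample_rate)
    numerator = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    denominator = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return 20.0 * math.log10(abs(numerator / denominator))


# Hypothetical fix: a speaker with a 6 dB bump around 200 Hz gets a 6 dB cut there.
b, a = peaking_eq(f0=200.0, gain_db=-6.0, q=1.0, sample_rate=48_000)
for freq in (50.0, 200.0, 5_000.0):
    print(f"{freq:>7.0f} Hz: {gain_db_at(b, a, freq, 48_000):+.2f} dB")
```

    The filter is itself a linear system, which is why it can cancel a linear defect like a lumpy frequency response but cannot undo distortion.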

    Also, I’m extremely skeptical of products marketed towards “audiophiles” because there’s so much marketing bullshit pseudoscience surrounding the field that all the textbooks that cover loudspeaker design and HiFi audio electronics have paragraphs warning about it as the first thing.

    Like I experience the difference between different pairs of binoculars and speakers dramatically, and graphical analysis backs up the differences, so how could they sound/look negligibly different to others?

    Next time you do a graphical analysis, check out the magnitudes of the differences in your graphs versus the magnitude of the Just Noticeable Difference in amplitude or frequency. We probably do experience the differences between speakers differently than others. We’re outliers.
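
    A sketch of that comparison for level differences. The 1 dB threshold is only a rule-of-thumb assumption: real just-noticeable differences vary with level, frequency, the test signal, and the listener. The measurement values are made up.

```python
import math


def level_difference_db(amplitude_a, amplitude_b):
    """Level difference between two linear amplitudes, in dB."""
    return 20.0 * math.log10(amplitude_a / amplitude_b)


def is_probably_audible(difference_db, jnd_db=1.0):
    """Compare a measured difference to an assumed just-noticeable difference."""
    return abs(difference_db) >= jnd_db


# Made-up measurements of two speakers at the same frequency (linear amplitude).
difference = level_difference_db(1.00, 0.95)
print(f"{difference:.2f} dB, audible under this assumption: {is_probably_audible(difference)}")
```

    A gap that looks obvious on a linear-scale plot can land well under the assumed threshold once it is converted to dB.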

    What’s your take on both major and, at the high end, diminishing returns on higher quality sensory experiences?

    For personal listening, the point of diminishing returns is basically $20 because I can’t afford shit. For listening to something I plan on sharing with others, I’d be willing to put in whatever I can afford. But frankly, I’d be just as likely to straight-up do the math and design my systems myself because I 100% don’t trust any “”“high fidelity”“” system that doesn’t come with a datasheet and frequency response.


    Lastly, I do wear glasses. I typically get my glasses online because, once you have the prescription and your facial measurements, it is the same quality as the stuff you get at the big-box stores.

    [1] I acknowledge that Bluetooth sucks, particularly for audio.

    [2] As a metal guitarist, I’m not against speaker nonlinearity for guitar speakers, but then again, guitar speakers are really convincingly simulated by impulse responses, which are a core linear systems concept, implying that they are nearly linear devices even at the volumes they are typically played at.
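
    The link between impulse responses and linearity can be checked directly: applying an impulse response is a convolution, and convolution treats a mix of signals the same as mixing the separately processed signals. A small sketch with a made-up three-tap "cabinet" response and two short made-up signals (real cabinet impulse responses are thousands of samples long):

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two lists of samples."""
    output = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, sample in enumerate(signal):
        for k, tap in enumerate(impulse_response):
            output[n + k] += sample * tap
    return output


# Made-up impulse response and test signals, for illustration only.
cabinet_ir = [0.5, 0.3, 0.2]
riff_a = [1.0, 0.0, -1.0, 0.5]
riff_b = [0.2, 0.4, 0.0, -0.3]

# Linearity: processing a scaled mix equals mixing the processed signals.
mixed_then_processed = convolve([2.0 * a + 3.0 * b for a, b in zip(riff_a, riff_b)], cabinet_ir)
processed_then_mixed = [
    2.0 * a + 3.0 * b
    for a, b in zip(convolve(riff_a, cabinet_ir), convolve(riff_b, cabinet_ir))
]
matches = all(abs(p - q) < 1e-12 for p, q in zip(mixed_then_processed, processed_then_mixed))
print(matches)
```

    A strongly nonlinear device would fail this check, so an impulse response alone could not capture it.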