I know this isn’t strictly piracy related, and I apologise, but I think it is tangentially related: piracy protects you from data theft by letting you avoid the services the biggest thieves operate. Also, I feel like people here might be very interested in this take.

Apparently, the “legal” data brokerage industry was worth $319 billion in 2021, and is predicted to be worth $545 billion in 2028.[1]

Meanwhile, in 2021 there were only 7.9 billion people in the world[2] - many of whom do not have internet access or have very little data being traded. If we generously assume 6 billion people have equal volumes of data being traded, that means each person’s data is worth $53.17 per year on the market.

Data is effectively stolen from people. We do not get anything in return for it. We may be offered access to a website free of charge, but that is a separate transaction - it is not appropriate for another transaction to be hidden in the fine print of the terms and conditions. When you buy insurance, the key terms have to be front and centre - you pay x, you get y service. Not “You can have y for free!!! (But also you give us x for free.)” You’re supposed to be able to compare the value of the things being traded.

Bearing in mind that this is merely data brokerage, not actual processing or deriving any value from the data, a simple profit margin can be applied. They simply collect the data - easily and at low cost through automated processes - and then sell it. If businesses still took a very generous 30% profit on top of what they pay for the data (rather than a ludicrous infinite and pure profit), then the average person is owed around $40 per year for their data ($53.17 / 1.3 ≈ $40.90).
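
For anyone who wants to check the sums, here is a rough Python sketch of the calculation so far. The 6 billion figure is my own generous guess rather than a measured number, and the function names are just made up for illustration. It also assumes the 30% is a markup on top of what the person is paid (so we divide by 1.3); if you read it as a 30% margin on the sale price instead, the person would get about $37.

```python
# Figures from the post; the user count is an assumption, not a measurement.
MARKET_2021_USD = 319e9   # "legal" data brokerage market value, 2021 [1]
ASSUMED_USERS = 6e9       # generous guess at people with data being traded
BROKER_MARKUP = 0.30      # the "very generous 30% profit", treated as a markup


def market_value_per_person(market_usd: float, people: float) -> float:
    """Average yearly market value of one person's data."""
    return market_usd / people


def owed_to_person(value_per_person: float, markup: float = BROKER_MARKUP) -> float:
    """Amount paid to the person if the broker adds `markup` on top of it."""
    return value_per_person / (1 + markup)


per_person = market_value_per_person(MARKET_2021_USD, ASSUMED_USERS)
print(f"Market value per person: ${per_person:.2f}")
print(f"Owed to the person:      ${owed_to_person(per_person):.2f}")
```

Swapping in a different user count or markup is a one-line change, which is handy given how rough the inputs are.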


To run the other numbers to check, the global population in 2028 is predicted to be 8.4 billion - a growth of 6.329%. So our 6 billion population would become 6.38 billion, and with the $545 billion market value an individual’s data would be worth $85.43 on the market, or $65.71 to the individual. The value of user data is predicted to rise.

Obviously that 6 billion population figure I used is an approximation - a blind one at that. To give a worst case valuation for 2021, if we assume all 7.9 billion people equally have data being traded, then an individual’s data is worth $40.38 on the market, and $31.06 to the user. These are the minimum values, averaged evenly across the entire global population.
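
The 2028 cross-check and the worst case can be reproduced the same way. This is only a sketch under the same assumptions as the text: the user base is scaled in line with world population, the 30% profit is treated as a markup (divide by 1.3), and the helper name `per_person_values` is something I made up for the example.

```python
def per_person_values(market_usd, people, markup=0.30):
    """Return (market value per person, amount owed to the person)."""
    market_value = market_usd / people
    return market_value, market_value / (1 + markup)


POP_2021 = 7.9e9                      # world population, 2021 [2]
POP_2028 = 8.4e9                      # predicted world population, 2028
growth = POP_2028 / POP_2021 - 1      # relative growth from 2021 to 2028
users_2028 = 6e9 * (1 + growth)       # assumed 6 billion users, scaled up

scenarios = {
    "2021, 6 billion users": (319e9, 6e9),
    "2028, users scaled with population": (545e9, users_2028),
    "2021, all 7.9 billion people": (319e9, POP_2021),
}

print(f"Population growth 2021-2028: {growth:.3%}")
for label, (market, people) in scenarios.items():
    market_value, owed = per_person_values(market, people)
    print(f"{label}: ${market_value:.2f} on the market, ${owed:.2f} to the person")
```

Note that the 2028 figure uses the unrounded scaled population (about 6.38 billion), which is why it lands on $85.43 rather than a cent lower.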


When Google and Facebook started out, data had very little value - there was no market for it. Thus it seemed reasonable to let them just take it, even if maybe it could be worth something. The service they offered was new and novel, a shiny new toy for everyone to play with. They then used this data to become some of the wealthiest businesses in the world. Now, even big players like Microsoft have joined in, in spite of the fact that their main products are paid products.

One form of bank fraud is where the criminal takes pennies out of multiple accounts, the idea being that people won’t notice such a small debit, and banks might write it off as some kind of error. This has been legislated against and proven illegal - yet these assholes take $40 each from everyone and get away with it!


  1. https://www.knowledge-sourcing.com/report/global-data-broker-market (Edit: lmao we broke it, archived copy: https://web.archive.org/web/20240107042301/https://www.knowledge-sourcing.com/report/global-data-broker-market …or did they maybe take it down?? /tinfoil. Edit2: it’s back up lol)

  2. https://www.populationpyramid.net/world/2021/

  • TWeaK@lemm.ee (OP) · 88 upvotes · 11 months ago

    Also, I’m really happy I finally found a genuine excuse to show off Lemmy’s citation feature lol

    • fmstrat@lemmy.nowsci.com · 5 upvotes · 11 months ago

      Oh interesting, I didn’t know there was a citation feature, but I can’t see the citations in Thunder. PR time perhaps.

      • TWeaK@lemm.ee (OP) · 2 upvotes · 11 months ago

        Yeah it’s really not very well known, also the links don’t actually work properly.

      • TWeaK@lemm.ee (OP) · 3 upvotes · 11 months ago

        Yes I’ve noticed that as well, the links have always been borked. Doubt it’ll get fixed any time soon, but at least the ground work is there and it makes it ever so slightly easier to make the formatting.

  • bionicjoey@lemmy.ca · 42 upvotes · 11 months ago

    $53 is a lot less than I would have expected honestly. But I guess that’s a mean average figure. It’s going to be practically worthless for poorer people. And since wealth is not evenly distributed, and since personal data of people with disposable income is worth a lot more, the average internet user’s data is probably worth a lot more.

    • TWeaK@lemm.ee (OP) · 29 upvotes, 1 downvote · 11 months ago

      Sure, it’s not the hundreds of dollars I’d estimated previously. In the past I’ve said “the data brokerage industry is a multi-trillion dollar industry” and come up with figures ranging from $100-$700 per year owed to the user.

      However, it should be said that this is just data brokerage. Not all businesses sell the data they collect, instead they keep it proprietary and use it themselves. Google, for example, sells advertising, not user data.

      So I think my estimations here have been very conservative overall, and the real value may well be much higher.

      Also, it’s not just about it being a small amount from an individual, it’s the fact that they’re robbing everyone blind that really gets on my wick. No one really understands the value of user data, not intuitively, and the whole transaction is done in a deceptive manner to abuse this fact.

      • burningmatches@feddit.uk · 6 upvotes · 11 months ago

        It’s true that a lot of data isn’t sold, but a large chunk of the figure you quote also seems to include business data — stuff that contains zero personal information but is still hugely valuable to companies and investors (look at how much this report costs, for example, or consider that a Bloomberg terminal costs around $25k/yr).

        And remember, those investment buyers make up a big chunk of the consumer data market too and are only interested in aggregated insights to inform trading strategies. They don’t care about personal info or targeted ads.

        • TWeaK@lemm.ee (OP) · 2 upvotes, 1 downvote · 11 months ago (edited)

          Damn lmao did we kill my first source? It won’t load anymore for me to double check what is included.

          With regards to consumer data being aggregated insights, rather than personal info or targeted ads, that still doesn’t mean they should get it for free, though. Furthermore, I’d argue that all info is personal info, given that it is so easy to identify a person with very few data points.

          Edit: You’re right, it includes business data. However I’d expect much of that data is paid for down to the data subject, excluding the stuff that’s public domain.

          It’s not reasonable that business data should be fairly paid for, while consumer data isn’t.

    • xep@kbin.social · 9 upvotes · 11 months ago

      The way I see it, that number is a baseline figure for what their services would be offered for in exchange. If someone came up to me and said “here, I’ll give you $53 and in exchange you’ll let me surveil you for a year” I’d say no, but maybe someone else would’ve said yes. Then, as an experiment, maybe we can let the market take it from there, now that there’s a price and some form of discovery mechanism.

      • TWeaK@lemm.ee (OP) · 3 upvotes · 11 months ago

        Exactly. Also, the main point I’m trying to make here is that data does not have a completely trivial value - it’s not pennies per year, even with a conservative estimate.

    • Appoxo@lemmy.dbzer0.com · 7 upvotes · 11 months ago

      6 bil is a stretch.
      I’d go with 60-70% of the general population being perpetually online, +10% of the older folks only having a smartphone for WhatsApp and some other stuff.

  • leanleft@lemmy.ml · 24 upvotes · 11 months ago

    this service claims:

    "Ad based search engines make almost $300 a year off their users.

    Google generated $76 billion in US ad revenue in 2023. Google had 274 million unique visitors in the US as of February 2023.

    To estimate the revenue per user, we can divide the 2023 US ad revenue by the 2023 number of users: $76 billion / 274 million = $277 revenue per user in the US or $23 USD per month, on average! That means there is someone, somewhere, a third party and a complete stranger, an advertiser, paying $23 per month for your searches."

    https://help.kagi.com/kagi/why-kagi/why-pay-for-search.html

    • harc@szmer.info · 7 upvotes, 1 downvote · 11 months ago

      Would that factor in the [unknown] costs of that revenue? Running all the servers (incl. YouTube), offices and staff ain’t cheap. So more likely someone is paying enough to leave 23 USD on top of massive costs.

    • TWeaK@lemm.ee (OP) · 3 upvotes · 11 months ago

      That’s very interesting! I’d also read somewhere that data collection was a trillion dollar industry, however the figure I found here is purely data brokerage so does not include Google per se - Google sell advertising, the data they collect is kept to themselves, so it’s much harder to pin down a value.

      It also stands to reason that an American’s data is worth more on the market than, say, a North Korean’s - users who use the internet more will have more data being traded.

  • Monument@lemmy.sdf.org · 15 upvotes, 1 downvote · 11 months ago

    To buy weed, my state requires folks to hand over their ID, and the shop records the person’s info to make sure they’re not selling to a minor.
    As someone who doesn’t want their info anywhere, I’m mildly annoyed by this, but I understand it.

    My weed shop had a loyalty program where (because obviously they have to track your purchases because of state law), you got points based on how much you spent. It was automatic. No opting in or out or whatever. They had to collect the data, and figured they’d reward their customers for coming back.

    Last week, they told me they were discontinuing the existing rewards program, and spinning up a new one that customers have to sign up for.
    To me, that means they’re not just handling the data they’re required to maintain in house, but need me to opt in to something or otherwise waive my right to privacy in some fashion. I scanned the QR code they referenced and the page (off-site from their actual website) wouldn’t even load unless I disabled tracking protection/ad-blocking.
    I closed the tab and am now wondering if I need a different weed shop.

    • wowwoweowza@lemmy.ml · 11 upvotes · 11 months ago

      You need to hand a twenty to a dude on the corner. That’s privacy. We used to have it.

    • mnemonicmonkeys@sh.itjust.works · 7 upvotes · 11 months ago

      “I closed the tab and am now wondering if I need a different weed shop.”

      The answer is yes. Make sure to also let management know exactly why, so they know how badly they fucked up.

    • TWeaK@lemm.ee (OP) · 6 upvotes, 1 downvote · 11 months ago

      Yeah I really hate that kind of thing. I went into a gas station once, and at the registers it had a tiny little label saying they had CCTV with facial recognition, for crime prevention and “legitimate interest” - the GDPR term that websites always hide and sneak in pre-ticked, even when you think the main points are completely unchecked. There wasn’t even a clear way to opt out either, just a QR code you could scan. I didn’t scan, I’ve avoided that place since.

      I also vote for finding a new weed shop, but ideally do tell them your reason why.

    • pbjamm@beehaw.org · 5 upvotes · 11 months ago

      Go olde skool and find a local weed guy. I assume they must still exist.

      • Monument@lemmy.sdf.org · 6 upvotes · 11 months ago

        I don’t smoke, though. I’m a gummies kinda guy, and those are hard to get right unless you’re like, an operation, you know?

        Dispensary gummies are lab tested. Although there’s a bit of a problem with lab shopping here, they’re going to be pretty consistent in terms of dosage. I won’t wind up accidentally couch-locked because the dose was too high or the gummies had an unexpected activation time.

  • wowwoweowza@lemmy.ml · 15 upvotes, 1 downvote · 11 months ago

    Perfectly relevant. Thank you.

    I enjoyed reaching the gist of your meaning: Legislation needs to be written.

    So let’s hope that can happen.

    And on a personal level, what have you heard about people who intentionally make their data useless?

    This has been my strategy.

    I never buy what’s recommended.

    I purchase items with cash when I can so they are not added to my “profile”.

    Have you heard much about this strategy? How might it work if everyone used it? Any general thoughts on how we can defy their machine and protect ourselves?

    • TWeaK@lemm.ee (OP) · 6 upvotes · 11 months ago

      “I enjoyed reaching the gist of your meaning: Legislation needs to be written.

      So let’s hope that can happen.”

      Agreed. The first step I think is education, letting people know the value, pointing out that it is a pandemic problem that affects everyone, then convincing politicians that they are being robbed too. If a lawmaker thinks they’re a victim, then they might actually pull their finger out.

      “And on a personal level, what have you heard about people who intentionally make their data useless?

      This has been my strategy.”

      I do that to some degree, with some things. Like with captcha, I play a game of getting things wrong, but just enough to get through. Not every attempt though, I want it to still think I’m a human that’s smarter than the machine, then when I think it’s giving me a genuine training screen I spoil it.

      I don’t use cash as much as I maybe should, I prefer it in some regards, but contactless card purchases are just so easy. I’ve never used Google or Apple Pay, though, but that’s more because I run custom firmware. Also, I’ve since learned that when you use your phone to pay it’s the equivalent to chip and PIN. You are authorising the transaction and taking responsibility for it, whereas if you use a contactless debit/credit card it is processed as “cardholder not present”, whereby the seller assumes more responsibility if you dispute it. This method of transaction isn’t new, it’s how catalogue or telephone purchases were always done, as well as online purchases. But if you use your card with chip and PIN, or if you use your phone, you will have a much harder time disputing any transaction.

      “Have you heard much about this strategy? How might it work if everyone used it? Any general thoughts on how we can defy their machine and protect ourselves?”

      In terms of user data protection, really I think the cat is long since out of the bag. There’s no putting it back in - and in many ways we shouldn’t, as data is useful and has benefits to society. I think it should go one of two ways:

      1. Allow businesses to continue their free data collection, but force them to make the raw data public. Any processing they do can be private, but the raw data doesn’t belong to them.
      2. Have businesses start paying the data subject for their data.

      In the meantime, one way a user can limit the data collected about them is by using restrictive privacy settings in their browser. For my personal PCs, I not only run uBlock Origin but also uMatrix - a deprecated extension made by the same author. This has similar functionality to uBlock Origin when you set it to advanced user mode, where it can selectively block different domains, but uMatrix presents it as a matrix which also allows you to select the type of content as well as domains. By default, it blocks all 3rd party frames, audio/video media, scripts, XHR, and “other”, so quite often it leaves websites broken on first load, but then I pick through and enable the bare minimum of content to get it working. This isn’t for everyone, of course, as it can be a hassle sometimes - particularly with payment processors, which are all done on multiple 3rd party servers. However, it does highlight to me how endemic Google are with captcha, even when it doesn’t give you a captcha prompt. I can’t log into some of my online banking without enabling connections to Google, which is sickening. This is an example of what uMatrix looks like:

      The extension doesn’t get updates anymore, so my lists are out of date compared to uBlock Origin. I’m pretty sure I could update them manually, but since I run uBO as well I don’t really feel the need. I’ve tried running just uMatrix, but uBO has its own array of special lists and without those YouTube ad blocking doesn’t work.

      • wowwoweowza@lemmy.ml · 2 upvotes · 11 months ago

        uBlock Origin and uMatrix are entirely new to me - of course I can’t have missed references to them in discussions, but I just don’t have the time to chase down every good idea.

        Simply getting engaged with the Fediverse has been adventure enough.

        Here’s hoping more and more people start valuing their privacy more and more.

        • TWeaK@lemm.ee (OP) · 3 upvotes · 11 months ago (edited)

          uBlock Origin is essential. Firefox (or a hardened fork) with uBlock Origin is the bare minimum protection, IMO. Definitely don’t use Chrome or any derivative (which is basically all of them these days, e.g. Microsoft Edge, Brave).

          uMatrix is deprecated and breaks websites by default. I love it, but it’s not for everyone. I don’t use it on all my devices, though.

  • iopq@lemmy.world · 6 upvotes · 11 months ago

    This is a more believable figure than the trillions people were throwing around

  • acchariya@lemmy.world · 3 upvotes · 11 months ago

    Disclosure: I’m affiliated with this company.

    There’s a platform where you can add personal data in the form of questionnaires, documents, and integrations that pull profile data from social media, and that then lets you sell the data to buyers at your discretion. The platform does not own your data, does not access it, and simply acts as a broker directly between you and the buyer. There’s not a ton of activity on it at the moment, but it’s picking up as clients shift spending from big tech to paying users directly for their own acquisition.

    https://tartle.co

    • TWeaK@lemm.ee (OP) · 2 upvotes · 11 months ago

      Glad to see there’s someone trying to make the process more legitimate.

      However, I feel like the better solution is to require that all raw data be publicly available - no one pays for it, but everyone can access it. Then, when people process the data, they can keep their methods and results secret. I think this is perhaps a more practical solution as the cat is already out of the bag, you’re not going to get the likes of Facebook paying users appropriately.

      • acchariya@lemmy.world · 1 upvote · 11 months ago

        You are right about not getting Facebook to pay for the data, but each time a company pays you $2 to be referred to their site, that’s $2 Facebook didn’t receive. Anything you earn on TARTLE comes directly out of the purse of big tech.