Edit: obligatory explanation (thanks mods for squaring me away)…
What you see via the UI isn’t “all that exists”. Unlike Reddit, where everything is a black box, there are a lot more eyeballs who can see “under the hood”. Any instance admin, proper or rogue, gets a ton of information that users won’t normally see. The attached example demonstrates that while users will only see upvote/downvote tallies, admins can see who actually performed those actions.
Edit: To clarify, not just YOUR instance admin gets this info. This is ANY instance admin across the Fediverse.
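To make that concrete, here is a minimal sketch of why a receiving server can tell who voted. It assumes votes federate as ActivityStreams `Like`/`Dislike` activities that carry an `actor` field, which is how the ActivityStreams vocabulary defines them. The sample payloads, the URLs, and the `tally` helper are made up for illustration and are not taken from Lemmy's code.

```python
from collections import defaultdict

# Hypothetical activities as a receiving instance might store them.
# The actor/object URLs are invented for this example.
activities = [
    {"type": "Like", "actor": "https://a.example/u/alice", "object": "https://c.example/post/1"},
    {"type": "Dislike", "actor": "https://b.example/u/bob", "object": "https://c.example/post/1"},
    {"type": "Like", "actor": "https://b.example/u/carol", "object": "https://c.example/post/1"},
]

def tally(activities):
    """Return (public score per object, voters per object)."""
    score = defaultdict(int)
    voters = defaultdict(list)
    for act in activities:
        if act["type"] not in ("Like", "Dislike"):
            continue  # ignore anything that is not a vote
        delta = 1 if act["type"] == "Like" else -1
        score[act["object"]] += delta
        voters[act["object"]].append((act["actor"], delta))
    return dict(score), dict(voters)

score, voters = tally(activities)
print(score)   # what the UI shows: a number per post
print(voters)  # what an admin can read: who voted which way
```

The tally and the voter list come from the same records, so any server that receives the activities can build both.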
To anyone surprised at this: welcome to the fediverse, please treat everything you do or say as public.
The way to achieve privacy around here is by following the long forgotten arts of the old internet before Facebook was a thing:
Use a nickname and don't tell strangers on the internet your real identity.
Your home instance will act as a proxy and only they have access to your email and IP address. That does stay private.
So, as long as you trust your home instance to not leak or disclose your connection or sign up data (which would be illegal in EU countries), just sign up with an alias.
A very positive aspect of this is that it should allow us to detect voting manipulation by correlating the activity of certain potentially malicious actors. If Lemmy instances take vote manipulation seriously and do their best to block bots this has the chance to make Lemmy / Kbin much more transparent and credible than Reddit ever was.
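One way to picture the correlation idea is to compare which posts each account upvoted and flag pairs that overlap suspiciously. The sketch below uses Jaccard similarity. The account names, the vote data, and the 0.8 threshold are all assumptions for illustration, and real detection would need far more care to avoid false positives.

```python
from itertools import combinations

# Hypothetical data: account -> set of post IDs it upvoted.
upvotes = {
    "bot_1": {1, 2, 3, 4, 5},
    "bot_2": {1, 2, 3, 4, 5},
    "human": {2, 7, 9},
}

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def suspicious_pairs(upvotes, threshold=0.8):
    """Return account pairs whose upvote sets overlap at or above threshold."""
    pairs = []
    for x, y in combinations(sorted(upvotes), 2):
        sim = jaccard(upvotes[x], upvotes[y])
        if sim >= threshold:
            pairs.append((x, y, sim))
    return pairs

print(suspicious_pairs(upvotes))
```

This only works because the voter identities are visible to the server doing the analysis.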
Lol. kids these days would post their bank info online if the banks didn’t prevent them from doing so.
You say that like A/S/L wasn’t a thing back in the day.
19/f/Cali was the only acceptable response
I think we cybered
As I put on my robe and wizard hat…
puts on wizard’s hat
Yall remember those “your stripper name is the street you grew up on and your pet’s name” challenges? Literally phishing for password recovery keys.
I don’t want to shame anyone, but I’ve had people who signed up give me their full DoB and offer to show me their ID. I know of people who disclose their ID to get access to NSFW Discord communities.
DUDE MY GIRLFRIEND FUCKING DID THAT AND I JUST LOOKED AT HER AND ASKED HER IF SHE THOUGHT THAT WAS A GOOD IDEA. In hindsight no, thankfully she’s gonna be moving soonish. This was from before we were together, otherwise I would have warned her not to do that. It was the same discord she got a cyberstalker from, thankfully the stalker wasn’t a friend of the owner because otherwise he totally could have gotten her address and irl info.
Wasn’t there a twitter account that retweeted people posting photos of their credit cards?
I wholeheartedly agree with this perspective.
Additionally, and this is an unpopular opinion, but trying to maintain a Nick or online identity over many years is folly. You end up with a huge repository of personal information, increasing the risk that it can be connected to you personally.
This has come up as part of those requests to migrate accounts between instances. “I want a persona that stays with me for years”… Is that actually a good idea though!?
Your home instance will act as a proxy and only they have access to your email and IP address.
Your home instance typically doesn’t proxy image loading; images are hotlinked to the Lemmy server they were uploaded to. So your IP address and browser string are going to other Lemmy servers.
The posts just contain a URL which doesn’t include the uploader’s ip address or their browser string.
When your browser loads that hotlinked image URL, the server hosting it has to see your IP address to return the result. Just browsing posts causes those images to be loaded.
Of course. They don’t get any info to associate your IP with your Lemmy account. You could even not have a Lemmy account at all.
To illustrate op’s point I’m going to spin up an instance, federate with everyone, and not tell anyone what that instance is.
Then I’m going to feed all that data into my new website, called Open Lemmy Stats, where anyone can query the user data I’ve accumulated. The homepage will be rife with insights, leaderboards and all kinds of data on prolific users.
Additionally, I’ll display a snapshot/profile of a random user by feeding that user’s data to GPT-4 to make inferences about the user’s political affiliations and display the results.
Worst of all, I’m not going to out my instance for everyone to know it as the one to defederate. In fact I’m spinning up a few instances that will host innocuous communities that I plan to mod and support to give my instances cover for their true purpose: redundant fediverse datastreams for my site, Open Lemmy Stats.
I’ll also have a store where anyone can buy my collected fediverse data for a handsome sum.
Just kidding I’m not doing any of this. But someone absolutely will or already is.
You know, I came in here with the mindset that the topic of discussion here isn’t a bad thing; I’m largely pro information-should-be-open-and-available. But you’ve argued a very solid point, and I’ve changed my mind on the issue. I appreciate you sharing this perspective!
I think your comment clearly illustrates what might go wrong with it. If they absolutely need this data for sorting or something else, then I would be happy if they just hashed the usernames/instances or used some other form of UID.
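A rough sketch of the hashing idea, with one caveat: usernames are public, so a plain unsalted hash could be reversed by hashing every known username and comparing. The version below assumes the home instance keeps a secret key and uses HMAC-SHA256 from Python's standard library. `INSTANCE_SECRET` and `pseudonym` are hypothetical names, and nothing in this thread suggests Lemmy or ActivityPub does this today.

```python
import hashlib
import hmac

# Hypothetical secret known only to the voter's home instance.
INSTANCE_SECRET = b"replace-with-a-random-secret"

def pseudonym(actor_url: str, secret: bytes = INSTANCE_SECRET) -> str:
    """Stable opaque ID for an actor; cannot be recomputed without the secret."""
    return hmac.new(secret, actor_url.encode("utf-8"), hashlib.sha256).hexdigest()

a = pseudonym("https://a.example/u/alice")
b = pseudonym("https://a.example/u/alice")
c = pseudonym("https://a.example/u/bob")
print(a == b)  # same user always maps to the same ID, so duplicate votes can be rejected
print(a == c)  # different users map to different IDs
```

A stable pseudonym hides the name but still lets an observer link all of one account's votes to each other, so it reduces the exposure rather than removing it.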
And just think how much data you can gather by sending out puppet accounts on various instances, accounts that will serve only to publicly state an opinion, such as “I support this candidate”, so the data on the people who upvote it can be harvested and categorized more easily. There is so much data harvesting potential here with a little imagination, and with a little more, a lot of ways to use that data to influence the way average users engage with the fediverse.
That site would also be a great advertisement for Lemmy. Come here to our decentralized platform, where you can vote…but you better not, lest you end up on the site. What social network wouldn’t grow when users are peer pressured into not using one of its basic underlying mechanics that makes the whole thing work?
Can your instance secretly run a fork that doesn’t respect deletes?
I’m almost willing to bet that big tech companies are already doing this. They got the motive and the means. No doubt Meta or Google have dedicated some of their servers to mining our Lemmy data in this way.
With only around 100k users and most people using anonymous usernames that cannot be connected to their identity it would hardly be worth the effort, time or money.
You’re looking at this from the wrong point of view. The fediverse is not just lemmy: Threads, Tumblr, even BlueSky (albeit with their own protocol, but anyone could just modify their fediverse enabled app to convert their data to be applicable to BlueSky’s protocol) are quickly setting the stage for a new norm. The more websites integrate the fediverse into their stack, the more data outside the immediate sphere of influence of these major corporations can be harvested. To what ends they’ll use it, I don’t know – but I don’t trust them with it.
They will know the user but not the person in real life. Even if you know that my user is more conservative on some points or more liberal on others, how can you use that for nefarious action? Unless you know where I live and who I am, the data is useless.
People need to be aware that sharing your personal information on the internet is never a good idea.
It’s very difficult to both A) have meaningful conversations in a public space, and B) conceal your identity from a dedicated adversary. Once a person has a long post history, it’s likely that an observer could narrow down their identity to a very small group, if not a single person. Every post you make reveals something.
Even if you don’t ever explicitly state it, your age range and gender can likely be guessed with high probability by your writing style and/or little tidbits of info you leak without thinking about it. Same for political leanings. You might casually mention the brand of car you drive, or your favorite foods, or just reference something you experienced as a child that is not universal. All of these things leak information, and while each one seems insignificant, in aggregate they can tell a detailed story. Just knowing that you’re a Canadian who speaks both French and English eliminates about 99.8% of the world’s population as possibilities.
Back on Reddit I used to create fresh accounts all the time, but then I’d go and join the same subs, post with the same writing style, and generally express the same worldview. If anybody cared, had a good grasp of statistics, bothered to collect the data, and put in a stupid amount of time to it, they could likely match all of my accounts together. I was never too worried about this because…well I just didn’t care. But I did have a cyberstalker at one point and it made me think.
I wouldn’t be shocked if someone could match me to one or more of my Reddit accounts just from this one comment, tbh. I’m leaking information here like a sieve! Not many people have the skills to do that, and the few who do are unlikely to give a rat’s ass about me. HOWEVER, as AI becomes more advanced, anyone with computer literacy will be able to do analysis in minutes that might currently take an expert days or weeks.
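The aggregation argument can be put in rough numbers. The sketch below takes the 99.8% figure from the comment at face value and assumes a world population of 8 billion. Both are inputs for illustration, not verified statistics, and the function names are made up.

```python
import math

WORLD_POPULATION = 8_000_000_000  # assumed round figure

def remaining_pool(population, fraction_eliminated):
    """People still consistent with what has been revealed."""
    return population * (1 - fraction_eliminated)

def bits_revealed(fraction_eliminated):
    """Identifying information, in bits, gained by ruling out that fraction."""
    return math.log2(1 / (1 - fraction_eliminated))

pool = remaining_pool(WORLD_POPULATION, 0.998)
print(round(pool))                            # people left after this one fact
print(round(bits_revealed(0.998), 2))         # bits revealed by this one fact
print(round(math.log2(WORLD_POPULATION), 2))  # bits needed to single out one person
```

Under these assumptions one such fact yields about 9 of the roughly 33 bits needed, so a handful of independent details of similar size would be enough to narrow the pool to a single person.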
Honestly, why not? The data is already being recorded. At least this way it’s public and the rest of us get to interact with it. It might even scare a few people into paying attention to the information that they disclose about themselves and increase their digital hygiene.
Edit: Obligatory RIP my inbox.
Can we leave this kinda stuff behind? It is NOT obligatory.
I’m going to start throwing “edit: thanks for the gold kind stranger!” on the end of my comments just to induce some nostalgic cringe.
You are a gentleman and a scholar. /s
That’s a pretty common turn-of-phrase in Ireland, I remember hearing it in the early 90s!, and it’s still common to hear it from older generations too. I wouldn’t equate it with reddit slang/culture at all. I wonder when it made its way to reddit?
This.
EDIT: Thanks for the awards kind stranger!
EDIT 2: Rip my inbox
These are all examples of Reddit shit that is really dumb. We don’t need to bring it over here.
Reading these comments, seeing so many excuses, sarcastic responses, and handwaving, makes me realize a great deal of users really need to develop some imagination.
This is not about privacy. It’s about data that can easily be used for targeting and profiling users, and how that creates countless avenues for targeted harassment and wide scale retaliation. It’s about all of the innumerable ways public vote information can and will be abused to manipulate scoring across the site with targeted/automated shadow banning and shared blocklists. Raise your hand if you trust every single admin to never abuse such a tool to curate the outward appearance of an instance to fit a narrative.
For a different example: I could say something about how great Nazis are right now, and have a bot programmed to record every single person who downvoted me, add those names to a shared blocklist, and voilà, I’ve made myself and all my alts invisible to the people that would challenge me on a massive scale.
I promise you this is going to be a big issue as tools for this site get more sophisticated over time.
Not to sound harsh or anything, but those of you saying that it’s okay that all this data is public are insane. This completely goes against the entire philosophy of the Fediverse and FOSS in general. The reason we all are fleeing from Big Tech is because they collect so much data on us. At least, they keep it hidden from public view. This is a major issue in my opinion, and needs to be addressed ASAP before we can claim to have superior platforms on the Fediverse. Why can’t this data at least be encrypted?
Agreed, I am incredibly confused by what seems to be the majority reaction to this.
I’ve never been particularly involved with the FOSS community, though I do use a few FOSS apps and generally appreciate their view on what FOSS means. I also strongly appreciate data privacy, and it was my observation that the FOSS community was (generally) relatively the same way. So to see this reaction is very surprising. It’s quite literally the same terrible argument of “Why fear it if you have nothing to hide” used against multiple data privacy concerns throughout the years.
I think the worst are the bad faith “But Reddit…!” arguments. For one, we’re not on Reddit anymore, this is about Lemmy’s issues that can be corrected. And for two, whilst Reddit potentially outsourcing that data to the highest bidder is far from ideal, at the very least the data wasn’t outright PUBLIC to anyone who wishes to set up a simple server.
You say these issues can be corrected but I am not sure they can. ActivityPub is a protocol managed by the W3C. So to have different behavior you’d have to change the specification there. That is possible but it will take some time. Still you’d need a way to make votes not bound to a user and still hard to spoof. That sounds hard. Apart from that, upvotes and downvotes are not really the most interesting datapoints you can gather. You can still collect posts. These can’t be obfuscated. There is simply no way to have an open network that shares data between servers and also guarantees that no one harvests the data. It is simply not possible. As soon as it is public it is public. This has nothing to do with FOSS. If you have a solution you can implement it. That is what it means. If you have one then go ahead.
You’d have to change the specification there. That is possible but it will take some time.
Then they should do so, these issues need to be fixed ASAP.
Still you’d need a way to make votes not bound to a user and still hard to spoof.
Obfuscating user IDs via a hash or something would seem like the way to make it work. I’m not a professional programmer, I only know a little bit of Python, so I have no idea if I’m talking nonsense on that front. And whilst still not an ideal solution, sharing non-private votes with your own instance admin and having them share only the total vote count with other instances is another option. That way you need only trust your instance admin, whom you can choose and who can even be yourself.
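The second idea, where the home instance keeps the individual votes and only federates totals, could look roughly like this. All names here (`HomeInstance`, `cast_vote`, `public_totals`) are hypothetical, and this is not how ActivityPub voting works today. As a later reply in this thread points out, totals without names are open to fraud because other instances cannot check them.

```python
from collections import defaultdict

class HomeInstance:
    """Keeps per-user votes locally and exposes only per-post totals."""

    def __init__(self):
        # post_id -> {username: +1 or -1}; never leaves this instance
        self._votes = defaultdict(dict)

    def cast_vote(self, username, post_id, value):
        if value not in (1, -1):
            raise ValueError("vote must be +1 or -1")
        # Overwriting means each user counts at most once per post.
        self._votes[post_id][username] = value

    def public_totals(self):
        """What would be shared with other instances: counts only."""
        return {
            post_id: {
                "up": sum(1 for v in votes.values() if v == 1),
                "down": sum(1 for v in votes.values() if v == -1),
            }
            for post_id, votes in self._votes.items()
        }

inst = HomeInstance()
inst.cast_vote("alice", "post/1", 1)
inst.cast_vote("bob", "post/1", -1)
inst.cast_vote("alice", "post/1", -1)  # alice changes her vote
print(inst.public_totals())
```

The trade-off is that every other instance has to take the reported counts on trust.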
That is what it means. If you have one then go ahead.
Putting the onus on me is a shitty thing to do. I’m not the one running this site in any capacity, but this is an issue that many users are unhappy with. If the issue with the site won’t or even can’t be fixed, then I will simply not use the site. I don’t know how many people feel the same on that front, but I’d imagine there’s quite a few.
then I will simply not use the site
Maybe that’s what you should do. But don’t do it as a protest. Do it because you don’t want to share that data publicly.
The entire point of social media is sharing things publicly. If you’re worried about people collecting that data, then you shouldn’t have put it in public.
There aren’t good ways to keep a public secret. That’s inherent to how information works and not a failing of ActivityPub. It’s the same reason media will never stop being pirated. If I can see/hear it, I can repeat it.
But don’t do it as a protest. Do it because you don’t want to share that data publicly.
I mean yeah, that’s what I’d do it for. It’s a suggestion for the site and it’s a sentiment that seems to be shared by several people here, but it ultimately falls down to me to decide whether or not I want to continue using it, much the same as with my usage of Reddit.
If you’re worried about people collecting that data, then you shouldn’t have put it in public.
Voting is a core functionality of the site. It’s something I don’t think should be public as it puts more emphasis on what content I interact with in what is now apparently a public manner. If you want to debate that a mere vote is something I shouldn’t put in public, then fine, you do you. But for me, it defeats half the point of me even having an account here. What one comments on is often an incredibly small portion of what one actually votes on, simply because voting is so easy.
And I know I said “But Reddit…!” is a bad argument earlier, but even so, I’d like to say that even Reddit’s voting is not publicly accessible (as in not accessible by other users, even if Reddit almost certainly collects and sells such data), so clearly there should be ways to do it. If ActivityPub requires public voting and the people who have the ability to change it are unwilling or even unable to do so, then fair enough. But equally, I will refrain from contributing to such a site, which seems like a bit of a shame when it seems close to ideal otherwise.
clearly there should be ways to do it
Your votes on Reddit are public to Reddit admins. On Lemmy anyone can be an admin.
Giving vote totals without names makes the system ripe for fraud and abuse. In real life votes the decision to make votes public or private is a major one. In a system like Lemmy, the problems with private votes are exaggerated, and the problems with public votes are much smaller. Your Lemmy name shouldn’t be tied to your real name. It’s unlikely anyone is going to coerce your vote like they might coerce your political vote.
If you’re concerned about anonymity, maybe use more than one name or a different name so that your account isn’t so easily tied back to you.
The purpose behind having votes be more public is to have some kind of reputation behind those votes. It’s still possible to shill, but it requires more depth and effort, and the shills may still be discovered if there are too many.
Oh no, so my upvotes on c/spacedicks aren’t private?
/s
So when Threads decides to federate, they can slurp all this information.
That would be massively concerning and that should be blocked. Ideally votes should remain only on the current instance. Anything shared with other instances should be anonymised. This would need to be re-architected imho.
People come here to get away from Reddit now that trust has gone. Trust and a feeling of safety is vitally important to continue to build this platform.
I’m safe, I upboated the beans
There’s something amusing about people feeling violated by their activity being made public, but not necessarily by corporations hoarding and capitalizing on that activity & data. I mean, one of them is out in the open. The other is pure abuse.
Activities are public and easily viewable on kbin. It’s been interesting. Seems mostly positive other than people harassing those who down-vote them demanding explanations.
Knowing they’re visible on kbin made me realize that most Lemmy users probably weren’t aware, as it’s non-obvious.
Yeah, I had a good natured discussion with a Lemmy user on feddit.uk the other day where they were still inexplicably downvoting my responses each time, despite us both being polite and constructive.
It made me realise that a) they use the downvote button quite differently to how I use it and b) they probably didn’t know that I, as a kbinaut, could literally see they were the one downvoting.
Yea, good call. I wonder if kbin makes them viewable because the ActivityPub protocol does not allow them to be easily hidden.
Seems to be Ernest’s attitude about that sort of thing, he doesn’t like to hide things from the average user that someone more technically inclined would still be able to access
And I like it. It’s pretty earnest :)
One thing I really like is that it makes it easy to identify users to block. If there’s a post stating that “Nazis are bad” and it has ten downvotes, it’s very easy to use that to block future content from trolls and people I’m not interested in hearing from.
Yeah, and guess what? They can do that to you.
Effectively, every single person can use a bot that will automate the blocking of any user that ever downvotes them ever.
Like if I made a post that says I like Nazis, and then waited for the downvotes to pour in. Add every single one of those names to a block list, share that block list with all of my alts and all of my friends, and suddenly you have a whole army of Nazi sympathizers that are invisible to the users that would downvote them.
These hand-waving excuses about votes being public are really lacking imagination. This is extremely abusable information, and cursory tools can and will be put together to make abusing it simple.
I think there are some problems about voting being public. I don’t think this is one of them.
I don’t mind people blocking me, and if I don’t appreciate the type of content people provide I’ll block them liberally. It’s not necessarily anything personal, I’m just curating my experience.
Furthermore, I strive to be on instances where nazi sympathisers would be banned, and where instances tolerating them would be defederated. The only issue is identifying and weeding out troll accounts.
I’ll just use my short username then
Just commenting so this stays one of the most commented posts. Feel free to keep scrolling
…and your point is… ? Admins and sysadmins as a general rule can see everything inside the technical system behind the interface. This is not news.
Good find, albeit a bit horrifying.
I wonder what the GDPR implications of this are. As far as I understand, even free, privately run services are required to abide by GDPR and offer data insight and deletion. They’re also required to state clearly what happens to user data.
Edit: Apparently people have varying takes and feelings on what the GDPR does and does not say, so I urge you to please read the summary of GDPR data privacy here: https://gdpr.eu/data-privacy/ as well as the summary of what constitutes personal data here: https://gdpr.eu/eu-gdpr-personal-data/ It’s easier to have a good and fruitful discussion if we talk about what the GDPR actually says.
How often are we going to see this post? I think this is the third time I’ve seen it at least.