I must be rare in that the longer I use tiktok, the less relevant the recommendations feel. Maybe because I compulsively watch videos until the end even if I don't like them.
> Maybe because I compulsively watch videos until the end even if I don't like them.
That would definitely do it, basically destroying their most important signal.
TikTok is best in class for recommending content and I personally haven't seen a dip in quality. I.e., I never get trashy videos or anything cringe, just a consistent stream of science/tech, local Toronto restaurant reviews, cat videos, etc.
Do you have any proof things aren't working out for TikTok's algorithm, besides a few people whining on HN who probably weren't the target market in the first place? Because they seem to be doing just fine.
GP didn't say anything about TikTok's algorithm not working out for people in general, but I'll bite.
The selling point behind TikTok is that by using the app normally (i.e. watching what you want) TikTok can figure out what stuff you want to watch.
But pretty much every suggestion that comes up on HN for people who complain about the algorithm on sites like this is to do unnatural things to "train" the algorithm: be very careful about how long you watch videos, click on videos that you think are relevant even if you don't want to watch them, watch things you're not interested in to prevent falling into a rut, avoid clicking on things you're curious about (unless you want to watch hundreds of them) so you don't taint your recommendations, etc.
If that's "working" I think I'd agree that TikTok was mostly hype.
Yep. It's almost like TikTok targets users who are only using system 1. If you just go with lizard-brain reactions you don't care about any of that. But once system 2 is in play you start doing "crazy" things like watching a video forensically even if you don't want to be fed similar videos on the regular, or skipping videos even if they're about things you like, and you need some control over what you watch.
A good example here is YouTube, which started recommending me a bunch of really short (5-second) meme videos lately which I really don’t want polluting my feed. Even if I really want to click and watch them just out of dumb curiosity… (I don’t go to YouTube for 5-second meme videos, I prefer longer-form content.)
I spent some time disliking every 5-second video being recommended to me, and the problem went away. It was easy.
I'm another one who's tried TikTok on a number of occasions and it's always failed badly with me, worse than any other site I can think of. It never improved despite people telling me it eventually would.
My impression is that it was making too many assumptions about me based on the wrong signals or something.
But to your point: wouldn't there be a survivorship bias of sorts in assessing recommendation performance? I didn't like it and left, so presumably the people who remained liked it.
It's obviously popular so there's that, but it seems circular to say it's working for the people who like it and stay. You're losing information about opportunity costs of lost users.
I mean, there is an upvote/downvote, in that there's a like button and a "not interested" button. If TikTok shows you something you really don't like, just long-press on the screen and hit "not interested".
They used to have options for like "don't show me videos with this sound" or "don't show me videos from this user" but now it's just a general purpose button.
>The lack of up/downvote is what destroys their signal, I think.
They don't need up/downvotes. They have signals such as liking, sharing, watching multiple times, commenting, looking at comments, saving, skipping, looking at creators account and other videos by the creator. If you want to focus on up and downvotes, you end up with reddit.
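Roughly, as a toy sketch (every signal name and weight below is made up, not anything TikTok has published; the point is just that many weak implicit signals can replace an explicit up/downvote):

    # Toy implicit-feedback score. Signal names and weights are invented.
    SIGNAL_WEIGHTS = {
        "watch_fraction": 1.0,          # share of the video actually watched
        "rewatch": 2.0,
        "like": 1.5,
        "share": 3.0,
        "comment": 2.5,
        "save": 2.5,
        "opened_creator_profile": 2.0,
        "skip": -2.0,                   # swiped away early
    }

    def engagement_score(events):
        return sum(SIGNAL_WEIGHTS.get(name, 0.0) * value
                   for name, value in events.items())

    print(engagement_score({"watch_fraction": 0.1, "skip": 1.0}))                   # negative
    print(engagement_score({"watch_fraction": 1.0, "rewatch": 1.0, "share": 1.0}))  # strongly positive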
I want to know what portion of the algorithm is responsible for "trying" certain categories when it's handed a brand-new, blank-slate user
like, let's present this user with travel or cooking material, that's usually safe
then, let's try things like certain genres of music, we'll see what they like/don't like
what I don't get is... how does that first recommendation on the #foryoupage (or Discover, or whatever it's called) start recommending you the sex workers who post material as close to NSFW as possible, get you to land on their profile, where the bio has a link to their Instagram or Linktree, and from there it's an OnlyFans link?
does the system try to recommend a soft entry into this content and then just pivot away if the user doesn't like it?
TikTok likely has enough information about others that it can begin to build a profile of you from the moment you log in.
Let's use a hypothetical scenario: someone states that they identify as a man, they're in the 20-25 age range, and based on phone location you can tell they live in Texas. Now they're labeled a "20-25yo Texas Man". Then you can look at others who fall in that category and show the things you'd expect the group to like, because chances are they're more similar to the others in the group than they are a true outlier. If other people in the "20-25yo Texas Man" group have expressed interest in apples, NSFW material, and lawn-mowing videos, then a new member of that group is going to start off with that same material.
disclaimer: i've never signed up for tiktok and have no clue if this is how they do it.
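In code, that cohort-based cold start might look something like this (purely illustrative; all field names and data are invented):

    from collections import Counter

    # Hypothetical cold start: bucket a new user into a demographic cohort
    # and seed their feed from what that cohort already engages with.
    def cohort_key(profile):
        return (profile["gender"], profile["age_bucket"], profile["region"])

    def seed_categories(profile, cohort_history, k=3):
        # cohort_history: cohort -> list of categories its users engaged with
        seen = cohort_history.get(cohort_key(profile), [])
        return [cat for cat, _ in Counter(seen).most_common(k)]

    history = {("man", "20-25", "TX"): ["apples", "lawn mowing", "nsfw-adjacent",
                                        "lawn mowing", "apples"]}
    new_user = {"gender": "man", "age_bucket": "20-25", "region": "TX"}
    print(seed_categories(new_user, history))
    # ['apples', 'lawn mowing', 'nsfw-adjacent']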
100% it is. I'm a 34 year old woman and have to be pretty aggressive about not wanting the mommy and wifey shit. I want cats, watching things explode, and the Zoomers' digital Dadaism.
The classic terminology for this in AI/ML is "explore vs exploit", i.e. striking a balance between trying new things (in hopes of finding a new favorite) vs going back to the tried-and-true.
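The textbook baseline is epsilon-greedy; a minimal sketch, assuming per-category engagement averages are already being tracked:

    import random

    # Epsilon-greedy: with probability epsilon, explore a random category;
    # otherwise exploit the best-performing one so far. Purely illustrative.
    def pick_category(avg_engagement, epsilon=0.1):
        if random.random() < epsilon:
            return random.choice(list(avg_engagement))       # explore
        return max(avg_engagement, key=avg_engagement.get)   # exploit

    scores = {"science": 0.8, "cooking": 0.5, "travel": 0.3}
    print(pick_category(scores))  # usually "science", occasionally a random pick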
I have noticed these types of videos slip into Facebook’s Reels late at night. It’ll switch from showing me people making candy and doing home-improvement stuff (videos with tens of thousands of views and likes) to a video with almost zero views/likes that is basically the beginning of the OnlyFans sales funnel. Never when the sun is up!
> It probably just recommends a bunch of stuff that is popular at the moment for new users.
I get that, but I feel like it starts with "known safe/neutral" material like cooking/traveling/photography/whatever
How can it detect "hey, this person might like it if we introduce softcore porn into their timeline"? Like, do they have softcore porn identified on a scale, where they introduce the really "safe" stuff and then gradually crank it up? Why are they presenting softcore porn at all? Is the Apple App Store cool with that, ToS-wise?
It's not trying to start with "safe" stuff, it's not trying to "gently introduce" softcore porn.
It's going "This video got a billion views in the last 30 minutes, people must love it, let's keep amplifying it to any account that hasn't explicitly rejected this category of content"
Presumably blank slate accounts are treated as open to anything, until people start curating.
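A crude sketch of that "amplify whatever is spiking, minus rejected categories" idea (thresholds and field names invented):

    # "Amplify whatever is spiking" with a per-user category blocklist.
    def trending_candidates(videos, rejected_categories, min_views_per_min=1000):
        return [v for v in videos
                if v["recent_views"] / v["minutes_live"] >= min_views_per_min
                and v["category"] not in rejected_categories]

    videos = [
        {"id": 1, "category": "nsfw-adjacent", "recent_views": 900_000, "minutes_live": 30},
        {"id": 2, "category": "cooking",       "recent_views": 45_000,  "minutes_live": 30},
    ]
    print(len(trending_candidates(videos, rejected_categories=set())))              # 2: a blank slate sees both
    print(len(trending_candidates(videos, rejected_categories={"nsfw-adjacent"})))  # 1: a curated account doesn't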
At the risk of going down a rabbit hole for no real reason, I don't use tiktok but when I speak to those that do I've not yet heard this softcore porn/sex worker thing.
For example, in my mind, not all ASMR content might lead to sexualized recommendations, but a girl in a bikini top with cat ears doing ASMR might generate both recommendations for ASMR and other more cam-girl like content. So I guess my question is, when you're starting off in tiktok seeing cooking videos, do you trend towards ones that feature 'sexier' hosts? They might not be sex workers to you, but they might be making tiktok think you're interested.
Also, what does tiktok know about you to start? What info do you have to give it to start an account?
So you agree that TikTok is able to classify "cooking videos" vs. "cooking videos with slightly sexualized hosts"? And that they willingly push posts with more sexualized content in recommendations?
No, again, my assumption is that the user would trend towards that content. You don't need to push people towards it if you have a nuanced enough profile of each video.
(all things made up for this example)
cookinglady39 does a beach bbq recipe tiktok, in a bathing suit. You watch it. They give you another cookinglady39 video where she's back in the kitchen, you skip it, they give you a new cooking host also female, also dressed in summer attire cooking outside. You watch til the end. It gives you a man cooking outside, you skip. Nothing you've seen so far has been sexual, but tiktok is probably picking up on some trends that might lead them to give you more and more things done by women, then women in a certain setting, dressed a certain way and so on.
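In toy code, that attribute-level drift could look like this (the tags, learning rate, and viewing history are all invented for the example):

    # Per-tag preference weights nudged up by a watch, down by a skip.
    def update_weights(weights, video_tags, watched, lr=0.1):
        delta = lr if watched else -lr
        for tag in video_tags:
            weights[tag] = weights.get(tag, 0.0) + delta

    w = {}
    update_weights(w, {"cooking", "female_host", "outdoors", "swimwear"}, watched=True)
    update_weights(w, {"cooking", "female_host", "indoors"}, watched=False)
    update_weights(w, {"cooking", "female_host", "outdoors", "summer_attire"}, watched=True)
    update_weights(w, {"cooking", "male_host", "outdoors"}, watched=False)
    print(sorted(w.items(), key=lambda kv: -kv[1]))
    # "female_host" and "outdoors" drift up even though no single video was sexual,
    # and "cooking" nets out to zero despite appearing in everything.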
TikTok gives you the content you enjoy. When someone complains about TikTok content, I basically assume they don't understand how good the algorithm is, and that they just like that kind of stuff. I don't care whether you do or not, but TikTok thinks you do because of the feedback you're giving the app. I mean, you clicked their profile and followed their links all the way to OnlyFans. It has to assume you like it.
Assuming we're starting with a blank slate, and a heteronormative male user that would happen to enjoy consuming that content on TikTok:
In the initial set of recommendations based only on overall popularity, there might be a video that's popular that incidentally contains a pretty woman. If the user skips most videos after barely a few seconds, but watches that one fully 3 times through, then the recommendation engine probably looks at users it does know more about that exhibit similar behavior and have higher engagement. It will then recommend videos that those users would probably watch a lot. Now the recommendations are shifted in the direction from "generally popular" to "contains pretty women". You repeat this enough times and the user ends up navigating the space of recommendations until they're maximally engaged (in theory). That means they might end up at softcore porn. Goodness knows that porn is popular if nothing else.
The recommendation engine doesn't even have to know anything about the content of the video. It just needs to know what other high-engagement users who watched that video a lot also watched a lot.
That's it at its most basic, really; I'm sure there's additional cleverness on top in practice.
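If it helps, here's a bare-bones user-based collaborative filtering sketch of that mechanism: no knowledge of video content, just overlap in what people watch heavily (the IDs and data are made up):

    # Recommend what behaviorally similar users watched heavily.
    def similarity(a, b):
        return len(a & b) / len(a | b) if a | b else 0.0   # Jaccard

    def recommend(new_user_watches, known_users, k=2):
        scores = {}
        for watches in known_users.values():
            sim = similarity(new_user_watches, watches)
            for video in watches - new_user_watches:
                scores[video] = scores.get(video, 0.0) + sim
        return sorted(scores, key=scores.get, reverse=True)[:k]

    known = {"u1": {"v_pretty_woman", "v_softcore_1", "v_softcore_2"},
             "u2": {"v_pretty_woman", "v_softcore_1"},
             "u3": {"v_cooking", "v_travel"}}
    print(recommend({"v_pretty_woman"}, known))
    # ['v_softcore_1', 'v_softcore_2']: the feed drifts toward what similar users watch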
The point of the original poster was that those videos have zero views and likes (not popular by any means) and they appear out of place in the stream.
YouTube does it too, and my best guess is that it's a form of supervised training; the real question is who's being trained.
This is a critical step to get rid of bad recommendations, and the algorithm seems to treat it as a very strong signal: after one or a few "not interested" labels you won't see that content for weeks, if ever.
That's the flip side of algorithmic recommendations that everyone forgets: the platform is primarily built around an ad system that generates money. There is always going to be a conflict with recommendations, because the platform needs to placate and accommodate paying users of all kinds, even users with low-quality content and outright commercials. Pretty much all of these social platforms work this way now. As a result, everyone sees undesired (ad-)boosted content on top of the other ads (the ones marked as ads).
They are probably trying to save GPU power on already-hooked users. This is a common trick in recommendation systems: you want to spend the most resources (run your most expensive model) on users who are just checking out your platform.
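Something like this hypothetical routing, presumably (the thresholds and model names are invented):

    # Tiered serving sketch: expensive model for new/at-risk users, cheap
    # model for the already hooked.
    def pick_ranker(user):
        if user["days_active"] < 7 or user["sessions_last_week"] < 3:
            return "big_expensive_ranker"   # still deciding: spend the GPUs
        return "small_cheap_ranker"         # hooked: good enough

    print(pick_ranker({"days_active": 2, "sessions_last_week": 1}))    # big_expensive_ranker
    print(pick_ranker({"days_active": 400, "sessions_last_week": 20})) # small_cheap_ranker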
A bit like how Poker sites give you better cards in the beginning.
It's less idiotic if you assume the bias goes the other way. People who stick with online poker had better cards in the beginning, leading to early wins and continued play. They then experience a reversion to the mean. Those who lose too often too early lose interest and sign off.
They say the worst result to get on your first roulette spin is to hit your number.
Obviously, these are just trends among people. Some gambling addicts never won and skilled poker players won't go on tilt after 20 bad hands.
Eh, it’s bad, but not as bad as the “companies intentionally leak false information to make their upcoming products look unappealing just so the real announcement is a pleasant surprise” 8D chess players.
This can happen to me if it gets stuck down an avenue it thought I was interested in. But the next day, or even a few hours later, it seems to correct itself.
I've avoided downloading TikTok due to security concerns and its ties to the People's Republic of China. Are these concerns valid? I remember reading articles that claimed it was stealing data from people's phones and spying.
Anecdotal, but my daughter had TikTok before the iOS update that showed when apps were listening, and right after the update we caught TikTok listening when she wasn’t recording any videos. I almost couldn’t believe how brazen they were, considering the iOS change was well documented and reported on.
The PRC has the world's most advanced and far-reaching data mining and censorship apparatus to tightly (and quite effectively) control the flow of information to its citizens. Does that sound like the type of organization you would entrust your data to?
Bytedance is beholden to the PRC, and while TikTok data may (or may not) be monitored, the Chinese version of it is subject to the same monitoring and censorship gates as any other social media app in China.
Interesting. Despite these concerns about privacy and security, many high-profile people, including celebrities, are using TikTok. They likely have dedicated devices for social media rather than installing TikTok on their primary devices, but it still amazes me how many people, including important ones, have ignored these warnings. Last I heard, TikTok has a billion monthly users.
Lots of people also just don't care about either China or the data-privacy concerns, because they're not privacy-minded people, let alone data-privacy activists. I don't think it has much to do with having the luxury of a second device so much as the issue not actually being an issue to most people.
It’s the only app/service that makes people smile non-stop. It’s endless videos of cool stuff, comedy, cat videos, tech tidbits, cooking videos, etc. I’ve deleted all other social media accounts because browsing Facebook, Instagram, and Twitter made me feel like shit.
TikTok doesn’t make me feel that way, and it’s that simple. This is the same reason why a ton of people are dialing back their Instagram usage or creating “finsta” accounts - Who wants to feel insecure constantly? Same goes with Facebook.
There’s the argument “Well you’re not using those service correctly. I use FB/IG/Twitter and I’m fine. Follow these steps on how to avoid toxicity on your FB feed…” (articles like this used to be posted to HN constantly).
But the thing is… why should we have to do that? TikTok shows me content that makes me smile BY DEFAULT. I am not being milked for engagement using my fight or flight response like Meta’s lolcow.
If other apps weren’t so toxic (by design), then people wouldn’t be nervous when posting on IG, or avoiding Facebook because of the cancer comment sections and ranking systems/algorithms designed to make you angry.
If you reread this and replace TikTok with Soma, it’s eerie.
To me, TikTok just found the proper levers to hit psychologically to keep people’s attention.
I jokingly think all the creators of each social media app are in a makeshift lab, and one says “wait… if we make them happy, they stay longer!” Then another goes “Drats! I thought jealousy and conflict were the key!”
I predict in the next 5 years we’ll be banning/have a campaign against people under 18 (or maybe 16) from even using these apps, and even go so far as to say anyone spending more than X time on them per day/week, should seek help. (With X being a relatively small number compared to the norm)
Just remember, not too long ago we thought: Cigarettes are a great way to lose weight! Even my doctor recommends them.
Following the link to the cuckoo hashing algorithm on Wikipedia, I don't quite understand what it's doing. I looked up another couple of articles but still find myself confused. Does anyone have a link to a resource with an easy-to-follow writeup of how cuckoo hashing works?
Chained (aka open hashing) hashtables store pointers rather than values in the "main" array, and just put collisions in a linked list on that hash cell. Easy, but indirections have a cost.
Closed hashing (aka open addressing, yes it's confusing) hashtables take hash(input) and just do it again to get a second location, and put it there. Repeat N times for N hash(hash(hash(...))) collisions. Dense, but needs more complex logic to figure out when to stop looking / what to do when deleting because anything could be at location X due to a collision with something else.
Cuckoo hashtables use two (or more) hash algorithms rather than one, and dedicate a portion of the memory to each algorithm. If something's already in the first algorithm's location, put the thing it's colliding with in that thing's second location. On read, check both locations. Dense, relatively simple for both insert and deletion, and tolerant of a few collisions with low cost.
(cuckoo hashtables are a form of closed hashing / open addressing, because they keep all the data within the data-sized arrays, not storing extra info. and all of these are over-generalizing / there are fairly different-looking strategies available, e.g. it's not necessarily pointers or strictly repeated hashing)
And you could just reject the existence of collisions entirely and move to a new, larger array immediately. That tends to perform so poorly in both cpu and memory that nothing really does it in practice, but it is technically an option.
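Since the original question asked for an easy-to-follow writeup, here's a minimal, illustrative cuckoo table in Python. The second hash function is a stand-in, and a real implementation would resize and rehash instead of giving up, and handle updating an existing key:

    # Minimal cuckoo hash table: two sub-tables, each with its own hash
    # function. Illustrative only, not production code.
    class CuckooHash:
        def __init__(self, size=11):
            self.size = size
            self.tables = [[None] * size, [None] * size]

        def _slots(self, key):
            # Two "independent" hash functions; the second is a stand-in.
            return [hash(key) % self.size,
                    (hash(key) // self.size) % self.size]

        def get(self, key):
            # Check both possible homes; the item is in one or neither.
            for table, slot in zip(self.tables, self._slots(key)):
                if table[slot] is not None and table[slot][0] == key:
                    return table[slot][1]
            raise KeyError(key)

        def put(self, key, value, max_kicks=32):
            # Place the item; if the slot is taken, evict the occupant
            # to its *other* home, and repeat for the evicted item.
            item, which = (key, value), 0
            for _ in range(max_kicks):
                slot = self._slots(item[0])[which]
                item, self.tables[which][slot] = self.tables[which][slot], item
                if item is None:
                    return
                which = 1 - which
            raise RuntimeError("eviction cycle: grow the table and rehash")

    t = CuckooHash()
    t.put("apple", 1)
    t.put("pear", 2)
    print(t.get("apple"), t.get("pear"))  # 1 2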
> Closed hashing (aka open addressing, yes it's confusing) hashtables take hash(input) and just do it again to get a second location, and put it there. Repeat N times for N hash(hash(hash(...))) collisions. Dense, but needs more complex logic to figure out when to stop looking / what to do when deleting because anything could be at location X due to a collision with something else.
That's not really the common way to do open addressing [1].
Normally you either use (1) linear probing, where you look at `h(input)`, then `h(input)+1`, ... and so on. This is nice because of data locality. Though you need a good hash function for it to work well. Or (2) Double hashing, where you first look at `h_1(input)` then `h_1(input) + h_2(input)` then `h_1(input) + 2*h_2(input)` and so on. I'm not saying repeatedly hashing `hash(hash(hash(...)))` doesn't work, but it's more expensive to call the hash function so many times.
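For the curious, here are the two probe sequences as toy generators (h2 is a stand-in; the table size should be prime, or at least coprime with the step, for double hashing to visit every slot):

    # Toy probe-sequence generators for open addressing.
    def linear_probe(key, size):
        start = hash(key) % size
        for i in range(size):
            yield (start + i) % size               # h(k), h(k)+1, h(k)+2, ...

    def double_hash_probe(key, size):
        h1 = hash(key) % size
        h2 = 1 + (hash(key) // size) % (size - 1)  # step is never 0
        for i in range(size):
            yield (h1 + i * h2) % size             # h1(k) + i*h2(k)

    print(list(linear_probe("k", 7)))       # e.g. [3, 4, 5, 6, 0, 1, 2]
    print(list(double_hash_probe("k", 7)))  # hits all 7 slots since 7 is prime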
This isn’t the recommendation algorithm or system for TikTok; it’s for https://www.byteplus.com/en/product/recommend, and that’s outlined in the paper. People who skim it miss this.
I tried registering an account on TikTok once, tried uploading a video of mine, and never got any views.
The ideal social network, to me, is one where everybody has a fair chance to get enough views to make seriously producing good content worthwhile and to retain followers.
That means the "more money to get more views" technique should be obsolete in this system.