"I've learned a lot from [name]!" scams on YouTube

After programmatically looking for scams on TikTok, I decided to try to do the same on YouTube. After a brief look into comments in finance-related channels, I found a very popular pattern of scams that's designed to avoid Google's moderation models.

Initially, a thread of comments is created with seemingly non-malicious questions and replies. The top comment in the thread gets a lot of likes, likely from the attackers' other accounts.

After a few replies in the thread, the scammers set up the real trap: they mention a financial advisor by name. A few other accounts quickly join the thread and reply that the advisor helped them as well.

In some cases, the thread also includes a way to find the advisor ("she is on telegram" or "just google her"). 

None of the comments in the thread include links, which are usually a huge red flag for moderation ML models (even when obfuscated like "http@@s scammer com"). Furthermore, having to find the "financial advisor" on the internet themselves makes victims trust the scammer more.

It's also interesting that the first few comments are never related to the scam itself and do not promote any particular advisor. Instead, they act as a setup ("what can i do in this situation?") for the real scam. Even if the advisor-mentioning comment is later deleted, the scammers can always add another one to the previously created thread.
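Because the pattern lives at the thread level rather than in any single comment, a cheap heuristic can pre-flag suspicious threads before an LLM ever sees them. A minimal sketch; the trigger phrases and the two-hit threshold are my own illustrative assumptions, not a production rule set:

```python
import re

# Phrases that often appear in the "setup" and "payoff" comments of these
# threads. Illustrative only, not exhaustive.
TRIGGER_PHRASES = [
    r"\badvis[oe]r\b",
    r"\bshe is on telegram\b",
    r"\bjust google (her|him)\b",
    r"\bchanged my life\b",
]

def thread_looks_suspicious(thread):
    """Score a comment thread (list of comment strings) for the advisor-scam pattern.

    Flags threads where several different comments mention an advisor or a way
    to find one, which is the hallmark of this coordinated pattern.
    """
    hits = sum(
        1 for comment in thread
        if any(re.search(p, comment, re.IGNORECASE) for p in TRIGGER_PHRASES)
    )
    # A single mention can be organic; multiple mentions in one thread rarely are.
    return hits >= 2
```

Such a filter would never be accurate enough on its own, but it could cheaply narrow down which threads are worth sending to a model.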

Dataset & Approach

To find more examples of these scams, I chose two relatively small YouTube channels related to retirement planning and investing: Tom Crosshill and Retire Confidently. Using RapidAPI, I downloaded comments with replies from 230 videos across the two channels.

To filter scams, I didn't build a tool to catch this specific pattern. Instead, I used a slightly adapted version of the prompt from the previous article:

import json
from openai import AsyncOpenAI

client = AsyncOpenAI()

async def analyze_comment(comment_text, video_id):
    """Analyze a single YouTube comment for scam content using OpenAI"""
    prompt = f"""Analyze this YouTube comment for potential scam content. Look for:
- Investment scams (fake crypto, trading schemes, get-rich-quick)
- Romance/relationship scams
- Phishing attempts
- Fake business opportunities
- Pyramid schemes or MLMs
- Identity theft attempts
- Fake giveaways or contests
- Suspicious links or contact requests
- Fake WhatsApp/Telegram contact sharing
- Impersonation of the channel owner or celebrities

Comment: "{comment_text}"
Video ID: {video_id}

Classify as either "scam" or "not_scam" and provide a brief explanation."""

    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are an expert at detecting scams in YouTube comments. Respond with a JSON object containing 'classification' (either 'scam' or 'not_scam') and 'explanation' fields."},
            {"role": "user", "content": prompt}
        ],
        response_format={"type": "json_object"},
        temperature=0.1
    )

    return json.loads(response.choices[0].message.content)
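Running this over thousands of comments one at a time is slow, while firing all requests at once hits rate limits. A semaphore-bounded runner (my own addition, not part of the original script) keeps a fixed number of requests in flight:

```python
import asyncio

async def classify_all(comments, analyze_fn, max_concurrency=10):
    """Run analyze_fn over (comment_text, video_id) pairs with bounded concurrency."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(comment_text, video_id):
        # Only max_concurrency coroutines pass this point at the same time.
        async with semaphore:
            return await analyze_fn(comment_text, video_id)

    # gather preserves input order, so results line up with the comments.
    return await asyncio.gather(
        *(bounded(text, vid) for text, vid in comments)
    )
```

Usage would look like `results = asyncio.run(classify_all(pairs, analyze_comment))`, where `pairs` is the list of `(comment_text, video_id)` tuples.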

Results

This script classified 640 out of 5,149 comments (about 12%) as scams. That's a lot for two small YouTube channels!

Example comments that were caught:

False positives

Both channel authors had comments classified as scams.

According to LLM: "The comment promotes a specific website for investment training, which raises red flags for potential investment scams. The inclusion of a link to an external site, especially one that claims to offer easy ways to optimize portfolios, is a common tactic used in scams. Additionally, the comment attempts to discredit other recommendations, which is a typical strategy to steer users away from legitimate advice and towards the scammer's offering."

This is a relatively easy fix: comments from the channel author can be special-cased.

A comment about using social security in a country with lower cost of living was mistakenly classified as a scam: "SS provides 100% of my needs with so much left over I paid off 20k in CC debt in last few years, resulting in even more disposable income. I rent a furnished 1 bdrm, 2 bath seaview apt  /w pool and maid service for $500 including utilities. Thailand." According to LLM: "The comment promotes an unrealistic financial situation and implies a get-rich-quick scheme...".

Another comment that was mistakenly classified as scam: "I have my own indicator "common sense"! When I have doubled my money or seen stocks increase by $60-100 shar; well that's my indicator. Foolish men stay until the end!".  According to LLM: "The comment suggests a get-rich-quick mentality by claiming to have a personal indicator for investment success. Phrases like 'doubled my money' and 'stocks increase by $60-100' imply unrealistic returns...".

Without context of Roth vs traditional IRA discussion, this comment was mistakenly classified as a scam: "Put money into it now and pay taxes now. Most likely, it will never be missed. When eligible, withdraw untaxed AND - HERE'S THE BIGGIE - all the interest you've earned will not be taxed. That's huge. Once you get to retirement age you will be glad you went through your younger years paying taxes and not have to pay now.".

Sometimes, the script would catch self-promotions, but not necessarily scams. Here is one from a real estate advisor in Florida with real YouTube videos on the account: "If LEAVING tax free wealth is your goal, and you live in Florida … I can help you do that."

In general, it seems like the promising next step to deal with false positives is including more context from the source video, the comment thread and author's channel.

Overall, 11 of the 30 comments I manually reviewed turned out to be false positives. A ~37% FP rate is far too high, making this initial script unsuitable for production.

False negatives

To find false negatives, I looked at the "not_scam" comments of users who also had comments classified as "scam".
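This grouping is a few lines of pure bookkeeping. A sketch, assuming each classified comment is a dict with "author" and "classification" fields (the field names are my assumption):

```python
from collections import defaultdict

def suspicious_users(classified_comments):
    """Return users with at least one 'scam' comment, mapped to all their comments.

    Their 'not_scam' comments are the place to mine for false negatives,
    since the same account rarely mixes organic and scam activity.
    """
    by_user = defaultdict(list)
    for c in classified_comments:
        by_user[c["author"]].append(c)

    return {
        user: comments
        for user, comments in by_user.items()
        if any(c["classification"] == "scam" for c in comments)
    }
```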

As expected, many "setup" comments (but not all!) within the scam thread were classified as non-malicious. "this is huge! would you mind revealing info of your advisor here please? in dire need of portfolio rebalancing" — classified as "non-scam" with explanation "The comment expresses a genuine interest in financial advice and portfolio management without any overt signs of scam tactics such as suspicious links, requests for personal information, or offers of fake investment opportunities. It appears to be a request for legitimate financial guidance rather than a scam."

The false-negative comment, shown within its full thread.

Interestingly enough, the same account was used for a comment that directly mentioned the financial advisor under another video.

A potential strategy to fix these false negatives would be to include more context: the whole thread and/or other comments from the same user. It might also help to describe this specific scam pattern directly in the prompt.
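One way to give the model that context is to render the whole thread into the prompt and classify a single marked comment within it. A rough sketch of how such a prompt might be assembled; the wording is hypothetical, not a tested prompt:

```python
def build_thread_prompt(thread, target_index):
    """Build a prompt that classifies one comment using its full thread as context.

    The prompt also names the known pattern: a harmless-looking setup question
    followed by replies praising a named advisor, with no links anywhere.
    """
    # Mark the comment under review with '>>' so the model knows its target.
    rendered = "\n".join(
        f"{'>> ' if i == target_index else '   '}[{i}] {text}"
        for i, text in enumerate(thread)
    )
    return (
        "Analyze the comment marked with '>>' for scam content, using the whole "
        "thread as context. Watch for coordinated 'financial advisor' praise "
        "threads: an innocent setup question, several replies naming the same "
        "advisor, and hints like 'she is on telegram', with no links anywhere.\n\n"
        f"Thread:\n{rendered}\n\n"
        "Classify the marked comment as 'scam' or 'not_scam' and explain briefly."
    )
```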

Reporting to YouTube

I reported 20 comments to YouTube as a test batch to see whether they would actually be deleted. If it works, I'll report bigger batches until all 640 comments are covered.

No comments were blocked after a few hours, but I'm still full of hope! 

Update, July 6th: two comments (out of 20 reported) were removed from YouTube, as opposed to none removed from 20 comments in the control group, which I didn't report. So while it definitely helps to report these comments, YouTube still doesn't block most of them even after reports. 


Finding 7k TikTok scam comments with gpt-4o-mini and Batch API

Whenever I check the comments under a finance‑related TikTok video, the thread is almost always filled with sketchy promotions.

TikTok probably relies on machine‑learning models for moderation, but they clearly aren’t catching enough. Could a one‑shot LLM classification workflow do better?

Dataset

For a first experiment, I focused on a very specific niche inside the broader finance category: Dave Ramsey‑related channels. Ramsey, an American radio personality famous for his get‑out‑of‑debt advice, attracts even more scammers than the average finance creator—likely because people in debt are more vulnerable.

Using a RapidAPI provider, I downloaded 44,187 comments and replies from 140 videos across three channels: ramsey.solutions, daveramsey and ramsey.show.

Workflow

I added a simple one‑shot prompt (written with Claude) to a batch request:

import json

system_message = "You are an expert at detecting scams in social media comments. Respond with a JSON object containing 'classification' (either 'scam' or 'not_scam') and 'explanation'."

with open(batch_input_file, 'w') as f:
    for i, comment in enumerate(comments):
        prompt = f"""Analyze this TikTok comment for potential scam content. Look for:

* Investment scams (fake crypto, trading schemes, get-rich-quick)
* Romance/relationship scams
* Phishing attempts
* Fake business opportunities
* Pyramid schemes or MLMs
* Identity-theft attempts
* Fake giveaways or contests
* Suspicious links or contact requests

Comment: "{comment['text']}"
URL: {comment['url']}

Classify as either 'scam' or 'not_scam' and provide a brief explanation."""

        batch_request = {
            "custom_id": f"comment-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "gpt-4o-mini",
                "messages": [
                    {"role": "system", "content": system_message},
                    {"role": "user", "content": prompt}
                ],
                "response_format": {"type": "json_object"},
                "temperature": 0.1
            }
        }

        f.write(json.dumps(batch_request) + '\n')

I then submitted the batch through the Batch API (very handy for processing a bunch of data in bulk without a long-running script, and 50% cheaper):

with open(batch_input_file, 'rb') as f:
    batch_input_file_obj = client.files.create(
        file=f,
        purpose="batch"
    )

batch = client.batches.create(
    input_file_id=batch_input_file_obj.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
    metadata={"description": "TikTok scam detection batch job"}
)
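The batch then runs asynchronously: you poll its status until it reaches a terminal state, then download and parse the JSONL output file. A sketch of both steps as helpers; the parsing function is my own addition, not from the original script:

```python
import json
import time

def wait_for_batch(client, batch_id, poll_seconds=60):
    """Poll the Batch API until the job reaches a terminal state."""
    while True:
        batch = client.batches.retrieve(batch_id)
        if batch.status in ("completed", "failed", "expired", "cancelled"):
            return batch
        time.sleep(poll_seconds)

def parse_batch_results(jsonl_text):
    """Map custom_id -> the model's parsed JSON verdict from the output file."""
    results = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        # Each record wraps a full chat-completions response body.
        content = record["response"]["body"]["choices"][0]["message"]["content"]
        results[record["custom_id"]] = json.loads(content)
    return results
```

Usage would look like `done = wait_for_batch(client, batch.id)` followed by `parse_batch_results(client.files.content(done.output_file_id).text)`.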

Three hours and $2.53 later, the batch was done.

Results

Out of 44,187 comments, the workflow flagged 7,733 (about 17%) as potential scams, coming from 1,622 users. That's a lot for just a few channels!

Three scammy comments in one screenshot.

A few more examples:

{"text": "\"What if I told you your money could grow in crypto—even while you sleep?\"👍✅", "url": "", "classification": "scam", "explanation": "Promises passive crypto gains—classic investment scam."}

{"text": "Thanks for making me live up to my standard life and I was able to clear outstanding debts. Your good work has made you popular and everyone is talking about you. Honestly, thank you Sir @Carlo Haley", "url": "...", "classification": "scam", "explanation": "Debt‑relief miracle + named guru = red flag."}

{"text": "Thanks for your comments and involvement on my page. L!fe‑chang!ng opt!ons?? ?", "url": "", "classification": "scam", "explanation": "Odd punctuation to bypass filters; vague life‑changing promise."}

I grouped by user and manually reviewed 100 accounts: two had already been deleted, 22 were false positives, and the rest were genuine scams. A 22% FP rate is too high for production, but likely fixable with better context or prompt engineering.

False positives

Most false positives occurred because the model didn’t have the video’s context.

 “The comment suggests a financial strategy that promises tax benefits, which can indicate a scam…”

$3,000 and “FOR SIX YEARS” look suspicious, but in a video about offsetting capital losses, the comment is harmless.

On a hunting video, “Do the hunt.” is an organic comment, yet the LLM saw “a potential phishing attempt or a fake giveaway.”

As another example: “I’m up my money right now everything is on real discount.” Without context (a legit user profile without links, a video about market performance), the model read it as "a get-rich-quick scheme or investment opportunity".

False negatives

To find false negatives (comments the LLM marked as safe but are actually scams), I looked at users with multiple comments where only some were flagged.

A frequent pattern that I observed was a friendly compliment that links to a profile filled with scammy content. The model saw only the compliment, not the destination — so it was another case of missing context.

Reporting to TikTok

I reported the 100 manually inspected scam comments through TikTok’s in‑app form. Every single one came back “no violation,” and each decision arrived exactly 30 minutes after submission.

Needless to say, this “Dave Ramsey” account isn’t the real Dave Ramsey.

I had planned to ask TikTok for a way to report in bulk after cleaning up the first batch, but if we can't even agree on what violates policy, that may be moot. And the suspiciously uniform 30-minute turnaround makes me wonder: coincidence, SLA artefact, or were the reports never reviewed at all?