Day 59!
Look at karlllllll
These videos follow a common recipe: a narrator takes a fandom (usually an anime one like My Hero Academia or Naruto) and explores an alternative timeline where something is different. Maybe the main character has extra powers, maybe a key plot point goes differently. They then go on to make up a whole new story, detailing the conflicts and romance between characters, much like an ordinary fanfic.
Except, they are fanfics. Actual fanfics, pulled off AO3, FFN and Wattpad, given a different title, with random thumbnail and background images added to them, narrated by computer text-to-speech synthesizers.
They are very easy to make: pick a fanfic, copy all the text into a text-to-speech generator, mix the resulting audio file with some generic art from the fandom as the background, give it a snappy title like “What if Deku had the Power of Ten Rings”, photoshop an attention-grabbing thumbnail, dump it onto YouTube and get thousands of views.
In fact, the process is so straightforward and requires so little effort that it’s pretty clear some of these channels have automated pipelines to pump these out en masse. They don’t bother asking the fic authors for permission. Sometimes they don’t even bother putting the fic’s link in the description or crediting the author. These content farms then monetise the videos, so they get a cut from YouTube’s ads.
In short, an industry has emerged from the systematic copyright theft of fanfiction, for profit.
Since the adversaries almost certainly have automated systems set up for this, the only realistic countermeasure is another automated system. Identifying fanfics manually by listening to the videos and searching for them by tag is just too slow and impractical.
And so, I came up with a simple automated pipeline to identify the original authors of “What If” videos.
It would download these videos, run speech recognition on them, search the resulting text against a database full of AO3 fics, and identify which work each one came from. After manual confirmation, the original authors will be notified that their works have been subject to copyright theft, and given instructions on how to DMCA-strike the channel out of existence.
I built a prototype over the weekend, and it works surprisingly well:
On a randomly selected YouTube channel (in this case Infinite Paradox Fanfic), the toolchain was able to identify the origin of half of the content. The raw output, after manual verification, turned out to be extremely accurate. Identifying the source of a video took about 5 minutes, most of which was spent running Whisper; the actual full-text-search query and Levenshtein analysis took less than 5 seconds.
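For the curious, the matching step can be sketched roughly as follows. This is a minimal illustration and not the actual toolchain: it assumes the transcript has already been produced by speech recognition, it uses a plain dictionary of candidate texts as a stand-in for the full-text-search results, and the function names and the 0.8 threshold are hypothetical choices made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def normalize(text: str) -> str:
    """Lowercase and drop punctuation, since transcripts rarely preserve either."""
    kept = "".join(c.lower() if c.isalnum() else " " for c in text)
    return " ".join(kept.split())


def best_window_similarity(snippet: str, work_text: str) -> float:
    """Slide a snippet-sized word window over the work; return the best similarity (0 to 1)."""
    snippet_norm = normalize(snippet)
    n = len(snippet_norm.split())
    words = normalize(work_text).split()
    if n == 0 or not words:
        return 0.0
    best = 0.0
    for start in range(max(1, len(words) - n + 1)):
        window = " ".join(words[start:start + n])
        distance = levenshtein(snippet_norm, window)
        similarity = 1.0 - distance / max(len(snippet_norm), len(window))
        best = max(best, similarity)
    return best


def identify_source(snippet: str, candidates: dict, threshold: float = 0.8):
    """Return (work_id, score) for the closest candidate, or None if nothing is close enough."""
    scored = [(best_window_similarity(snippet, text), work_id)
              for work_id, text in candidates.items()]
    if not scored:
        return None
    score, work_id = max(scored)
    return (work_id, score) if score >= threshold else None


# Made-up example data: a slightly misheard transcript snippet and two candidate works.
candidates = {
    "work_a": "Izuku stared at the ten rings on his arms. They hummed with power.",
    "work_b": "Naruto walked through the village gates at dawn.",
}
result = identify_source("izuku stared at the ten rings on his arm they hummed", candidates)
print(result)
```

Edit distance is forgiving of the small transcription errors that speech recognition introduces, which is why a fuzzy comparison suits this job better than exact string matching. Anything the sketch flags would still go through the manual verification step.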
The other videos probably came from fanfiction websites other than AO3, like fanfiction.net or Wattpad. As I do not have access to archives of those websites, I cannot identify the other ones, but they are almost certainly not original.
Armed with this fantastic proof-of-concept, I’m officially declaring war against “What If” videos. The mission statement of Project Copy-Knight will be the elimination of “What If” videos based on the theft of AO3 content on YouTube.
I am acutely aware that I cannot accomplish this on my own. There are many moving parts in this system that simply cannot be completely automated – like the selection of YouTube channels to feed into the toolchain, the manual verification step to prevent false-positives being sent to authors, the reaching-out to authors who have comments disabled, etc, etc.
So, if you are interested in helping to defend fanworks, or just want to have a chat or ask about the technical details of the toolchain, please consider joining my Discord server. I could really use your help.
------
See full blog article and acknowledgements here: https://echoekhi.com/2023/11/25/project-copy-knight/
eepy
Shigaraki: Allmight is the sinner for everything
book 3 hulians so sickening
The above examples have been provided with the authors' permission to demonstrate what these look like.
Basic rundown:
They are all 3 sentences long
Perfect grammar, capitalization, and punctuation
Like absolutely flawless English teacher-style writing with only a single exclamation mark, ever
No mentions whatsoever of character names, settings, situations, or anything that could be tied to the story
The usernames may be identical to those of people who exist on AO3, but the name is not clickable, and no profile is associated with it EXCEPT when you directly search for that name. What this means: the comments come from an unregistered (not logged in) reader. Bots scrape the site for real usernames, attach one to the comment, and post
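For authors who want to pre-sort a batch of comments, the rundown above can be turned into a rough check. This is only a sketch under stated assumptions: the function name and its inputs are hypothetical, "perfect grammar" is approximated by capitalization and end punctuation alone, and the list of story terms has to be supplied by the author. Treat a match as a reason to look closer, not as proof.

```python
import re


def looks_like_bot_comment(text: str, is_guest: bool, story_terms: list) -> bool:
    """Heuristic only: True when a comment matches every trait in the rundown."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    three_sentences = len(sentences) == 3
    # Crude stand-in for "perfect grammar": capital first letter, proper end mark.
    tidy = all(s[0].isupper() and s[-1] in ".!?" for s in sentences)
    at_most_one_exclamation = text.count("!") <= 1
    # Generic: no character names, settings, or other story-specific terms.
    lowered = text.lower()
    generic = not any(term.lower() in lowered for term in story_terms)
    return is_guest and three_sentences and tidy and at_most_one_exclamation and generic


# Made-up example comments.
terms = ["Deku", "Shigaraki"]
suspicious = ("This chapter was a delight to read. The pacing felt natural and the "
              "emotions were handled with care. I look forward to the next update!")
genuine = "omg Shigaraki nooo!!! i cried"
print(looks_like_bot_comment(suspicious, True, terms))
print(looks_like_bot_comment(genuine, True, terms))
```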
Please spread the word about this so authors can filter comments and report them accordingly
There has been some speculation about why this is happening at all, and the best guess is that this is a feature that AI-training story-scraping tools are implementing to try and make their browsing traffic look legitimate
bottle of cats
1st set of svsss characters done!
SJ - LQG - YQY - MBJ - GYX - LBH (OG) - young LBH
✨ Some more edits of my edits + an Ida ✨
you know, the first time i saw ons it was through an edit
and my brain automatically assumed that mikayuu were lovers that kept reincarnating in tragic situations
i dont know how i came to that conclusion but tbh
accurate
sasuke x the living dead by marina & the diamonds
(SprinK/PhoeniX on AO3) mxtx and mha fics | she/her | hi, i don't know what I'm doing