AI Loophole #1; Your GitHub README.md

Elias Griffin@lemmy.world · edit-2 6 months ago

bamboo@lemm.ee · 6 months ago

Anything you put publicly on the internet in a well-known format is likely to end up in a training set. It hasn’t been decided legally yet, but it’s very likely that training a model will fall under fair use. Commercial solutions go a step further and prevent exact 1:1 reproductions, which would likely settle any ambiguity. You can throw anti-AI licenses on it, but until training is determined to be a violation of copyright, the license is literally meaningless.

Also, if you just hope to spam Tab with any of the AI code generators and get good results, you won’t. That’s not how those work. Saying something like this just shows the world that you have no idea how to use the tool; it says nothing about the quality of the tool itself. AI is a useful tool, not a magic bullet.

Elias Griffin@lemmy.world · edit-2 6 months ago

Sounds like AI or an AI influencer post. The first paragraph is so far off-topic, you might as well be talking about sailing. You completely misunderstood what I meant about using TabNine. I wrote my own code and obfuscated my own code. Then I tried to have the AI complete another function using my code.

Nothing you said is relevant in any way, shape, or form.

[EDIT] https://www.tabnine.com/

wizardbeard@lemmy.dbzer0.com · edit-2 6 months ago

My guy, your posts are particularly hard to follow, and you are very very quick to jump to the conclusion that you’re somehow being targeted and under attack. It’s no surprise that people aren’t responding to what you think is appropriate for them to respond to.

You’ve gone out of your way to provide extra info about irrelevant details: Why does the particular flavor of git you use matter at all to this conversation beyond the fact that you self-host? Why does it matter that you are on GitHub as well, when we are specifically discussing things you believe were sourced from README.md files you have self-hosted?

Meanwhile you don’t give many details or explanation about the core thing you are trying to discuss, seemingly expecting people to be able to just follow your ramblings.

Edit: After having re-read your OP, it’s less messy than I initially thought, but jesus christ man you need to work on arranging your points better. It shouldn’t take reading your main post, a few of your comments, and the main post again to get your point: “AI data scrapers appear to treat readme files as public data regardless of any anti-AI precautions or licensing you’ve tried to apply, and they appear to not only grab from GitHub but also from self-hosted git repositories.”

Chronographs@lemmy.zip · 6 months ago

Seriously. OP might have a legitimate point but they’re making it with the energy of someone trying to convince me that vole people live in the antiposition of the time cube.