Is it Safe to Let an LLM See Your Entire Codebase?

Here’s an interesting app that can be run locally via npx: it packages your codebase for review by cloud-based LLMs. The developers claim:

    Security-Focused
    Incorporates Secretlint for robust security checks to detect and prevent inclusion of sensitive information.

…hopefully including secret keys.
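
For what it’s worth, you can run Secretlint directly against the repo yourself to see what those checks actually catch before trusting any packager. A minimal sketch, assuming Node.js is installed and following Secretlint’s documented setup:

    npm install --save-dev secretlint @secretlint/secretlint-rule-preset-recommend
    npx secretlint --init      # writes .secretlintrc.json using the recommended preset
    npx secretlint "**/*"      # scans every file and reports API keys, tokens, private keys, etc.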

Request for thoughts:

  • Is it safe to use it?
  • If so, would you feel safe uploading your entire codebase to a cloud-based LLM?

What is your definition of safe?

At the very least, hosting your code on GitHub/GitLab already means uploading your entire codebase to a third-party service.


That’s a great question. I don’t really have one yet.

That’s an excellent point. I guess I’d ask whether, with AI, your code could start showing up in other people’s apps, and whether that would be acceptable.

I use WebStorm with Junie (JetBrains).
I couldn’t find an official solution or a response on their forums, but I managed to get this done using one of Junie’s features.
I expect that in a future version Junie will respect the folders WebStorm marks as excluded, or WebStorm will add a menu item to exclude a folder from AI tools only.

In WebStorm, create a folder named .junie containing a config file named guidelines.md.
I used that config to ask Junie to check whether any folder it is about to access, open, or read is marked for exclusion (using the WebStorm config file).

My guidelines.md looks like this:

    Before everything, if any of the folders you are going to access, open or read is excluded in e-commerce.iml, revert all changes and then stop.

    MUI 7.
    Do not use ending semicolon.
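
For reference, the exclusions that guideline points Junie at live in the project’s .iml module file. An excluded folder entry in a WebStorm .iml looks roughly like this (the folder name here is just an illustration):

    <module type="WEB_MODULE" version="4">
      <component name="NewModuleRootManager">
        <content url="file://$MODULE_DIR$">
          <excludeFolder url="file://$MODULE_DIR$/secrets" />
        </content>
      </component>
    </module>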


As mentioned above, if your code is hosted on GitHub or similar services, it has likely already been slurped into LLM training data. Kinda crazy that most people seem fine with this. It extends far beyond code, though. My guess is that the law will be too slow to do much about it.

I think LLMs are clearly useful even though they hallucinate, but I also wonder if there’s an opportunity for products that are basically anti-AI. A hosted service like GitHub that is verifiably private would be an example. Does it exist?

Very interesting suggestion!

What else are LLMs doing with your codebase? Could there be an exploit where a bad actor asks an LLM to find a potential way to hack your code?

Who knows what they are doing 🙂

Regarding exploits, here’s something I just heard about today:
