IP ban investigation #318

Closed
sigaloid opened this issue Nov 12, 2024 · 45 comments
Labels
bug Something isn't working

Comments

@sigaloid
Member

sigaloid commented Nov 12, 2024

Right now (11/18) there are breakages; comment on those in #324. This issue is reserved for IP bans specifically.

Update (11/19): #324 is solved. Please do not comment here about a JSON error unless you are certain that you're on the latest version; the widespread outage was unrelated to the issue of IP bans.

There are a number of reports of continued JSON errors, even though most instance operators running on clean IPs do not get them. If you are affected, please make sure the problem is specific to certain IPs: confirm that you can visit the Redlib home page on one IP (perhaps try your home IP, a VPN to another place, etc.) but cannot on the IP in question. Then comment or email (ipban @ my domain, linked in profile) the following info:

  • Confirm you're running the latest commit (you should be able to see "✅ Instance is up to date" on the error page)
  • Confirm impact (can you view any pages?)
  • IPv4/IPv6
  • Your IP's ASN and ISP (and IP if you're comfortable sharing privately)

Reminder: Do not comment on this post if you're getting errors, unless you've CONFIRMED it works on some IP that isn't yours.

@ggtylerr

  • Confirmed.
  • Occasional blockage, ~75-90% of runtime is unblocked.
  • NYC-1: 45.137.206.17, CAL-1: 71.19.146.127
    • (IPv6 ranges can be disclosed in PGP email.)
  • NYC-1: RoyaleHosting BV, ASN 212477. CAL-1: PRGMR.com Inc, ASN 47066.

It should be noted that both are currently fully operational but have previously experienced #301.

@sigaloid
Member Author

Can you set the environment variable RUST_LOG=redlib=trace? Then when you encounter the error, excerpt from the latest logs? Thanks.

@hyperreal64

  • Confirmed.
  • I cannot view any pages.
  • IPv4: 152.53.37.179
  • IPv6: 2a0a:4cc0:2000:2a:1416:76ff:fe0c:d737
  • netcup GmbH ASN: 214996

@ggtylerr

ggtylerr commented Nov 14, 2024

> Can you set the environment variable RUST_LOG=redlib=trace? Then when you encounter the error, excerpt from the latest logs? Thanks.

Just got the error on both servers. It looks like NYC-1 just became 3 commits too old due to unrelated issues with quay.io (it looks like their IPv6 connection is down and docker compose pull is forcing connections over it; for some reason quay.io can't be reached at all right now, even though the site is still up). However, CAL-1 is on the latest commit. The logs, unfortunately, don't seem to be of much use:

redlib  |  WARN  redlib::client > Rate limit 9 is low. Spawning force_refresh_token()
redlib  |  TRACE redlib::oauth  > Rolling over refresh token. Current rate limit: 8
redlib  |  INFO  redlib::oauth  > [🔄] Spoofing Android client with headers: {"Client-Vendor-Id": "d5bc0918-e77d-40f6-9ef0-69274c59539a", "X-Reddit-Device-Id": "d5bc0918-e77d-40f6-9ef0-69274c59539a", "User-Agent": "Reddit/Version 2024.04.0/Build 1391236/Android 9"}, uuid: "d5bc0918-e77d-40f6-9ef0-69274c59539a", and OAuth ID "ohXpoqrZYub1kg"
redlib  |  TRACE redlib::oauth  > Sending token request...
redlib  |  ERROR redlib::client > Got an invalid response from reddit expected value at line 1 column 1. Status code: 403 Forbidden
redlib  |  ERROR redlib::utils  > Error page rendered: Failed to parse page JSON data: expected value at line 1 column 1
redlib  |  ERROR redlib::client > Got an invalid response from reddit expected value at line 1 column 1. Status code: 403 Forbidden
redlib  |  ERROR redlib::utils  > Error page rendered: Failed to parse page JSON data: expected value at line 1 column 1
redlib  |  TRACE redlib::oauth  > Received response with status 200 OK and length Some("1308")
redlib  |  TRACE redlib::oauth  > Serializing response...
redlib  |  TRACE redlib::oauth  > Accessing relevant fields...
redlib  |  INFO  redlib::oauth  > [✅] Success - Retrieved token "eyJhbGciOiJSUzI1NiIsImtpZCI6IlNI...", expires in 86399
redlib  |  INFO  redlib::oauth  > [✅] Successfully created OAuth client
redlib  |  ERROR redlib::client > Got an invalid response from reddit expected value at line 1 column 1. Status code: 403 Forbidden
redlib  |  ERROR redlib::utils  > Error page rendered: Failed to parse page JSON data: expected value at line 1 column 1
redlib  |  ERROR redlib::client > Got an invalid response from reddit expected value at line 1 column 1. Status code: 403 Forbidden
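
Editorial note: an excerpt like the one above is easier to compare across instances when it is reduced to a few counts. The following is a minimal sketch, assuming the line layout shown in this thread (`LEVEL module > message`); the function name `summarize` and the returned keys are hypothetical and not part of Redlib.

```python
import re
from collections import Counter

# Matches the level, module, and message of lines such as
# "redlib  |  ERROR redlib::client > Got an invalid response ..."
LINE_RE = re.compile(r"\b(TRACE|INFO|WARN|ERROR)\s+(redlib(?:::\w+)*)\s+>\s+(.*)$")
# Matches the HTTP status that redlib::client appends to its error message.
STATUS_RE = re.compile(r"Status code: (\d{3})")


def summarize(lines):
    """Count log levels, reported HTTP status codes, and retrieved tokens."""
    levels = Counter()
    statuses = Counter()
    tokens_retrieved = 0
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a Redlib log line in the assumed layout
        level, module, message = match.groups()
        levels[level] += 1
        status = STATUS_RE.search(message)
        if status is not None:
            statuses[int(status.group(1))] += 1
        if module == "redlib::oauth" and "Retrieved token" in message:
            tokens_retrieved += 1
    return {
        "levels": dict(levels),
        "statuses": dict(statuses),
        "tokens_retrieved": tokens_retrieved,
    }


if __name__ == "__main__":
    import sys

    # Usage: docker logs redlib 2>&1 | python summarize_logs.py
    print(summarize(sys.stdin))
```

A summary with many 403 statuses and a recently retrieved token would match the pattern described in this comment, where errors cluster around a token refresh.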

@sigaloid
Member Author

Hm, the fact that it happens close to a token refresh makes me wonder whether the token is reaching its rate limit too quickly. It currently starts a background task when there are 9 requests left, which is working:

[image]

But... if 9 concurrent requests come in simultaneously, it's possible they exhaust the remaining rate limit count. Is this a high-traffic instance, specifically?
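
Editorial note: the race described here can be illustrated with a toy model. This is a sketch under stated assumptions, not Redlib's implementation: the function name, the `refresh_after` parameter (how many further requests arrive before the background refresh finishes), and the fresh budget of 99 are all hypothetical.

```python
def simulate_burst(remaining, threshold, burst, refresh_after, fresh_budget=99):
    """Return how many requests in a burst fail because the old token ran out.

    remaining     -- requests left on the current token when the burst starts
    threshold     -- remaining count at which a background refresh is spawned
    burst         -- number of requests arriving back to back
    refresh_after -- further requests that arrive before the refresh completes
    fresh_budget  -- requests available on a new token (assumed value)
    """
    failed = 0
    refresh_countdown = None  # None means no refresh is in flight
    for _ in range(burst):
        if refresh_countdown is not None:
            if refresh_countdown == 0:
                remaining = fresh_budget  # the new token is now in place
                refresh_countdown = None
            else:
                refresh_countdown -= 1
        if remaining == 0:
            failed += 1  # old token exhausted, new one not ready yet
            continue
        remaining -= 1
        if remaining <= threshold and refresh_countdown is None:
            refresh_countdown = refresh_after  # spawn the background refresh
    return failed


if __name__ == "__main__":
    # A slow refresh during a burst loses requests; an instant one does not.
    print(simulate_burst(remaining=10, threshold=9, burst=20, refresh_after=15))
    print(simulate_burst(remaining=10, threshold=9, burst=20, refresh_after=0))
```

In this model, failures appear only when the burst is larger than the headroom left at the threshold and the refresh is still in flight, which is why the traffic level of the instance matters.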

@ggtylerr

ggtylerr commented Nov 15, 2024

> Is this a high-traffic instance, specifically?

Definitely not. CAL-1 is one of two secondary servers, only listed on my site. It's primarily used for Invidious.

This doesn't appear to be deliberate DDoS either, as my other server, POL-1, has been completely unaffected so far.

EDIT: That said, Invidious does have a feature to fetch comments from Reddit. But this hasn't been functional since Reddit's API fiasco, and it only attempts to fetch when the user explicitly clicks the "View Reddit comments" button on a video. Plus, it wouldn't make sense for that to become a problem now, many months after I started hosting both.

@sigaloid
Member Author

Hm, yeah, that checks out. So far every IP has been a commercial ASN and I'm hoping that's not the commonality, that they all have flagged some heuristic for being suspicious because they're hosted.

@ggtylerr

I'm seeing an unusual number of requests on NYC-1 (the public server on Redlib's list). I've tightened the nginx rate limits to only 5 requests per second (burst of 10), but without further logging it's difficult to say whether this is regular user activity or possibly scrapers.
https://github.com/user-attachments/assets/8ed8eb93-d0f5-430d-89bb-93bc830ab8e2
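
Editorial note: to get a feel for what a limit of 5 requests per second with a burst of 10 lets through, here is a simplified leaky-bucket model in the spirit of nginx's `limit_req`. It is only a sketch: the class name is hypothetical, it models acceptance and rejection only, and real nginx additionally delays queued requests unless `nodelay` is set.

```python
class LeakyBucket:
    """Simplified leaky-bucket limiter: steady rate plus a burst allowance."""

    def __init__(self, rate_per_s, burst):
        self.rate = rate_per_s   # requests drained per second
        self.burst = burst       # extra requests tolerated above the rate
        self.excess = 0.0        # requests currently "in the bucket"
        self.last = None         # time of the last accepted request

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) passes."""
        if self.last is None:
            excess = 0.0  # very first request always passes
        else:
            drained = self.rate * (now - self.last)
            excess = max(0.0, self.excess - drained + 1.0)
        if excess > self.burst:
            return False  # rejected; bucket state is left unchanged
        self.excess = excess
        self.last = now
        return True


if __name__ == "__main__":
    bucket = LeakyBucket(rate_per_s=5, burst=10)
    first_second = sum(bucket.allow(0.0) for _ in range(20))
    next_second = sum(bucket.allow(1.0) for _ in range(20))
    print(first_second, next_second)
```

Under this model a client that fires 20 requests at once gets the first request plus the burst through, and after that only the sustained rate, so a scraper and a busy human user look quite different in the rejection counts.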

@np22-jpg
Contributor

I was running the same commit and couldn’t view any pages, either, on a residential IP (which is why I’m not comfortable with sharing it). I was able to get around it by taking my instance offline, requesting an IP unban from Reddit, and bringing it back up.

I'm not sure if this is at all helpful information.

@sigaloid
Member Author

That's very helpful. How did you request an unban? And did you have the same IP the whole time (during the ban and after it worked again)? Sometimes residential IPs cycle and change normally.

@np22-jpg
Contributor

When I got IP banned, I received a page that looked like this or this when trying to access from my browser. Based on my previous emails, the link to contact support led me to [email protected], who then pointed me to fill out this form. I have had the same IP the entire time.

@hyperreal64

> Hm, yeah, that checks out. So far every IP has been a commercial ASN and I'm hoping that's not the commonality, that they all have flagged some heuristic for being suspicious because they're hosted.

So I set up Redlib on one of my homelab machines which uses my residential IP address. It worked for a few hours, then I noticed I get the same error. My residential IP ASN is COMCAST-7922. I'm not sure if this is classified as a "commercial ASN", but if it's not, then hopefully this quells your concern.

@sigaloid
Member Author

Weird. After only a few hours, it should have requested only one token in total and presumably not had enough traffic (more than 99 reqs in a 5 minute period) to require a token rollover. It's really surprising that that's all it took to get your IP under their watchful eye. I'm going to think a bit about how to proceed here. I don't know if it makes sense to try to identify a single factor that causes this policy ban, at least via trial and error, for a few reasons:

  • it could be any number of factors
  • it could be an ML-type ban based on many factors, in which case we really couldn't guess and check over and over
  • I don't have unlimited clean IPs that I can test with
  • I already put a lot of work into copying the exact behavior of the mobile apps in terms of new tokens; it's unclear whether it's a single thing I missed or some lower-level thing like TLS signatures, etc.

Maybe I need to take a look at the auth flow on a current app since it's been a few months. Not sure if anything that has changed should really cause this, since people who haven't used the app in a few months should still be able to use the app without being IP banned.

@hyperreal64

OK, right now I can access Redlib from my home IP instance. My public-facing instance is still showing the error.

@Shagon94

Seeing the same issue again for some reason. I'm self-hosting it and I'm the only one interacting with the instance (single user).

@donslice

donslice commented Nov 18, 2024

Getting this as well on my private (single user) self-hosted instance.

Issue started at 2:25 PM Eastern Time. I can access the website just fine

redlib | Starting Redlib...
redlib | INFO redlib > Evaluating config.
redlib | INFO redlib > Evaluating instance info.
redlib | INFO redlib > Creating OAUTH client.
redlib | INFO redlib::oauth > [🔄] Spoofing Android client with headers: {"User-Agent": "Reddit/Version 2023.21.0/Build 956283/Android 13", "X-Reddit-Device-Id": "0511a00b-c160-4a0c-8b26-1e47ac7b12e9", "Client-Vendor-Id": "0511a00b-c160-4a0c-8b26-1e47ac7b12e9"}, uuid: "0511a00b-c160-4a0c-8b26-1e47ac7b12e9", and OAuth ID "ohXpoqrZYub1kg"
redlib | TRACE redlib::oauth > Sending token request...
redlib | TRACE redlib::oauth > Received response with status 200 OK and length Some("1308")
redlib | TRACE redlib::oauth > Serializing response...
redlib | TRACE redlib::oauth > Accessing relevant fields...
redlib | INFO redlib::oauth > [✅] Success - Retrieved token "eyJhbGciOiJSUzI1NiIsImtpZCI6IlNI...", expires in 86399
redlib | INFO redlib::oauth > [✅] Successfully created OAuth client
redlib | INFO redlib::oauth > [⏳] Waiting for 86279s seconds before refreshing OAuth token...
redlib | Running Redlib v0.35.1 on [::]:8080!
redlib | ERROR redlib::client > Got an invalid response from reddit expected value at line 1 column 1. Status code: 403 Forbidden
redlib | ERROR redlib::utils > Error page rendered: Failed to parse page JSON data: expected value at line 1 column 1
redlib | ERROR redlib::client > Got an invalid response from reddit expected value at line 1 column 1. Status code: 403 Forbidden
redlib | ERROR redlib::utils > Error page rendered: Failed to parse page JSON data: expected value at line 1 column 1

@kotx
Contributor

kotx commented Nov 18, 2024

  • Confirmed
  • No pages can be viewed (although curling reddit frontpage with both v4 and v6 somehow works?)
  • 69.31.3.91, 2605:4c40:118:ed69:0:6990:3f1c:1
  • AS30081 CacheNetworks, Inc. (hosted on Fly.io)

2024-11-18T19:48:11.214 runner[8749e0b0249d28] ord [info] Machine created and started in 4.719s
2024-11-18T19:48:11.226 app[8749e0b0249d28] ord [info] Starting Redlib...
2024-11-18T19:48:11.228 app[8749e0b0249d28] ord [info] INFO redlib > Evaluating config.
2024-11-18T19:48:11.228 app[8749e0b0249d28] ord [info] INFO redlib > Evaluating instance info.
2024-11-18T19:48:11.229 app[8749e0b0249d28] ord [info] INFO redlib > Creating OAUTH client.
2024-11-18T19:48:11.229 app[8749e0b0249d28] ord [info] INFO redlib::oauth > [🔄] Spoofing Android client with headers: {"User-Agent": "Reddit/Version 2023.09.0/Build 812015/Android 14", "Client-Vendor-Id": "c0ca74b5-de4c-4142-8e5f-76edc3600c7b", "X-Reddit-Device-Id": "c0ca74b5-de4c-4142-8e5f-76edc3600c7b"}, uuid: "c0ca74b5-de4c-4142-8e5f-76edc3600c7b", and OAuth ID "ohXpoqrZYub1kg"
2024-11-18T19:48:11.230 app[8749e0b0249d28] ord [info] TRACE redlib::oauth > Sending token request...
2024-11-18T19:48:11.292 app[8749e0b0249d28] ord [info] TRACE redlib::oauth > Received response with status 200 OK and length Some("1308")
2024-11-18T19:48:11.292 app[8749e0b0249d28] ord [info] TRACE redlib::oauth > Serializing response...
2024-11-18T19:48:11.292 app[8749e0b0249d28] ord [info] TRACE redlib::oauth > Accessing relevant fields...
2024-11-18T19:48:11.292 app[8749e0b0249d28] ord [info] INFO redlib::oauth > [✅] Success - Retrieved token "eyJhbGciOiJSUzI1NiIsImtpZCI6IlNI...", expires in 86399
2024-11-18T19:48:11.292 app[8749e0b0249d28] ord [info] INFO redlib::oauth > [✅] Successfully created OAuth client
2024-11-18T19:48:11.292 app[8749e0b0249d28] ord [info] INFO redlib::oauth > [⏳] Waiting for 86279s seconds before refreshing OAuth token...
2024-11-18T19:48:11.293 app[8749e0b0249d28] ord [info] Running Redlib v0.35.1 on [::]:8080!
2024-11-18T19:48:11.360 app[8749e0b0249d28] ord [info] 2024/11/18 19:48:11 INFO SSH listening listen_address=[fdaa:0:80b4:a7b:191:6990:3f1c:2]:22 dns_server=[fdaa::3]:53
2024-11-18T19:48:13.710 app[8749e0b0249d28] ord [info] ERROR redlib::client > Got an invalid response from reddit expected value at line 1 column 1. Status code: 403 Forbidden
2024-11-18T19:48:13.710 app[8749e0b0249d28] ord [info] ERROR redlib::utils > Error page rendered: Failed to parse page JSON data: expected value at line 1 column 1

@cameronj86

I've been self-hosting a private teddit instance for over a year that was blocked this morning, so the issue appears to be widespread FYI.

Gave a shot @ redlib just now:

  1. Confirmed
  2. Only settings page can be viewed
  3. ?
  4. Comcast

@Cyrix126

I've been getting the error for a few hours.
Failed to parse page JSON data: expected value at line 1 column 1 | /r/popular/hot.json?&raw_json=1&geo_filter=GLOBAL

Getting on reddit without redlib works fine from same ip.
Restarting doesn't solve the issue.
This is on a self hosted instance.

TRACE log:

 ERROR redlib::client > Got an invalid response from reddit expected value at line 1 column 1. Status code: 403 Forbidden
 ERROR redlib::utils  > Error page rendered: Failed to parse page JSON data: expected value at line 1 column 1

@WreckingBANG

WreckingBANG commented Nov 18, 2024

Same here for about an hour. Private instance, only used by me. Reddit probably changed something. I restarted my modem to get a new IP to be sure, but it's still not working, so I don't think it's IP bans.

@grngxd

grngxd commented Nov 18, 2024

Same here, can't view any pages, but only on SOME public instances.

@sigaloid
Member Author

sigaloid commented Nov 18, 2024

Yeah, this is a widespread thing now. This issue was intended for actual IP bans (where it would work on any other IP except one), but now there's actually a server-side change. Triaging now (3x in the last 2 months, sigh). Everyone can ignore the questions about IPs if you're here to report the current outage 😄 Actually, I'm going to file a new issue.

#324 for the incoming flood of outages 😄

@OxyMagnesium

Given that there seem to be IP-unrelated issues also going on right now, it might also be worth adding a step to try curling https://www.reddit.com on the instance host to determine whether it's an IP ban. I get this (image courtesy of @np22-jpg) when I do that on my host as of an hour or two ago, which does indicate an IP ban.

For the other questions:

  1. Confirmed.
  2. All requests, whether through redlib or direct to reddit.com return 403.
  3. IPv4.
  4. It's a commercial AS. I can provide details if it'll help.

This was a public instance, with somewhere around 200k requests/day.

@maxysoft

Yesterday my instance was working; today it isn't anymore. Trying to curl reddit.com returns:
You've been blocked by network security.
(but from what I remember, it's always been like this)

The instance seems to be completely blocked.

  • I'm on the latest version
  • Tried ipv4 and ipv6
  • Hetzner ASN

@tcsenpai

I just want to add that I'm running into the same problem. Is there a way to specify a proxy server (or, even better, something like gluetun) for Redlib? I am using the standalone built binary.

@sigaloid
Member Author

It's not an IP ban issue, it's a widespread problem affecting every Redlib user. Proxy won't help. #324

@maxysoft

> It's not an IP ban issue, it's a widespread problem affecting every Redlib user. Proxy won't help. #324

I can confirm that. Just tried redlib from home (so residential ip) and still doesn't work. Seems that they're blocking redlib specifically :(

@AtmosphericIgnition

I use Redlib exclusively from a residential IP, and my instance is also showing this error.

@graysonlee123

Chiming in to say I'm getting this issue on a residential instance as well.

@sigaloid
Member Author

For everyone commenting about the JSON error: that's a different issue, #324, which is now fixed. Update here if you're experiencing the same error AFTER pulling the LATEST docker image.

@tcsenpai

I can confirm updating to the recently pushed version works for me too.

Specifically, from v0.35.1 the JSON error disappears.

My step by step solution (I use redlib as a binary for a systemd service):

git pull
sudo systemctl stop redlib
cargo build --release
sudo rm -f /usr/bin/redlib
sudo cp target/release/redlib /usr/bin/redlib
sudo systemctl start redlib

@halictuz

I have the latest version on my server, but I always get these errors:

Attaching to redlib
redlib | Starting Redlib...
redlib | ERROR redlib::oauth > Failed to create OAuth client in Elapsed(()). Retrying in 5 seconds...
redlib | ERROR redlib::oauth > Failed to create OAuth client in Elapsed(()). Retrying in 5 seconds...

So the container doesn't even start correctly, and therefore the site shows nothing but an HTTP ERROR 502.

@OxyMagnesium

Okay it seems I was wrong about direct requests to reddit.com being indicative of an IP ban, at least for the endpoints Redlib accesses. When Redlib is not being blocked (like now), requests through Redlib on my instance host work, but direct requests to reddit.com from the same host are blocked. I don't think it's a problem with curl either because if I proxy requests generated from my desktop's browser through my instance host, they still receive the "blocked by network security" 403 page.

Also just to echo everyone else: thank you so much @sigaloid for all your hard work 😄

@sigaloid
Member Author

> requests through Redlib on my instance host work, but direct requests to reddit.com from the same host are blocked

Wait, that's huge! You mean that Redlib works even though the browser doesn't? I think that means Redlib no longer triggers the IP bans directly (and is no longer being blocked by them); it's just a residual ban that will eventually expire.

@OxyMagnesium

That's what I'm seeing, yes. Specifically:

  • Residential connection:

    • curl https://www.reddit.com works
    • Desktop browser request to www.reddit.com works
    • Redlib as of the current time and latest commit works
  • Commercial (former public instance host) connection:

    • curl https://www.reddit.com does NOT work
    • Desktop browser request to www.reddit.com does NOT work
    • Redlib as of the current time and latest commit DOES work

But to be clear, it seems to me that Redlib on this host was never affected by the IP ban in the first place -- I just didn't realize that it could be blocked when requesting www.reddit.com while still being allowed through on the endpoints Redlib is accessing.

@maxysoft mentioned that their instance host has never been able to curl www.reddit.com, but Redlib worked previously. I think I might have been in the same situation without realizing, and the breakage today was purely due to the TLS fingerprinting issue. I'm not sure how the blocking of direct requests relates (or doesn't relate) to what people were seeing earlier in the thread.
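
Editorial note: the observations in this comment and earlier in the thread suggest a rough way to read the two checks together. The following is a minimal sketch; the function name and the wording of each outcome are illustrative only and not part of Redlib, and the inputs are whatever you observed by hand (for example with curl and by loading the instance's home page).

```python
def interpret(direct_ok: bool, redlib_ok: bool) -> str:
    """Map two manual observations from the same host to a rough reading.

    direct_ok -- a direct request to https://www.reddit.com succeeded
    redlib_ok -- the Redlib instance on that host can render pages
    """
    if redlib_ok and direct_ok:
        return "no block observed"
    if redlib_ok and not direct_ok:
        # The commercial-host case reported above: www.reddit.com is blocked,
        # yet the endpoints Redlib uses still answer.
        return "direct access blocked, Redlib endpoints still reachable"
    if direct_ok:
        # Seen during the outage tracked in #324: reddit.com loads fine,
        # but Redlib gets 403s until the instance is updated.
        return "Redlib-specific failure, check the instance is up to date"
    return "ambiguous, compare against an instance on another IP"


if __name__ == "__main__":
    for direct in (True, False):
        for redlib in (True, False):
            print(direct, redlib, "->", interpret(direct, redlib))
```

The last case stays ambiguous on purpose: as this comment points out, a blocked direct request does not by itself show that Redlib is affected by an IP ban.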

@halictuz

halictuz commented Nov 20, 2024

My rpi4 at home works again with the latest arm fix. (public ip from home)

Public Server hosted at Hetzner (ARM) with latest version still does not work. The docker container doesn't even start at all.

redlib | Starting Redlib...
redlib | ERROR redlib::oauth > Failed to create OAuth client in Elapsed(()). Retrying in 5 seconds...
redlib | ERROR redlib::oauth > Failed to create OAuth client in Elapsed(()). Retrying in 5 seconds...

@2bc4

2bc4 commented Nov 20, 2024

> requests through Redlib on my instance host work, but direct requests to reddit.com from the same host are blocked

Confirming this here as well. If I route redlib with the latest commit through a VPN with a blocked IP it still works.

@WreckingBANG

Everything is working for me too on my private residential instance on the latest image. 👍

@maxexcloo

maxexcloo commented Nov 21, 2024

This might be helpful for some: use Cloudflare WARP to proxy the Redlib instance:

services:
  cloudflare:
    cap_add:
      - NET_ADMIN
    image: caomingjun/warp
    ports:
      - 8080:8080
    restart: unless-stopped
    sysctls:
      - net.ipv4.conf.all.src_valid_mark=1
      - net.ipv6.conf.all.disable_ipv6=0
    volumes:
      - cloudflare:/var/lib/cloudflare-warp
  service:
    image: quay.io/redlib/redlib
    network_mode: service:cloudflare
    restart: unless-stopped
volumes:
  cloudflare:

@sigaloid
Member Author

Okay, it's come to my attention that there were some deeper server-side changes that are causing these IP bans. This has caused serious issues for the popular instances, as there's no rigorous rate limit bypass mechanism. The main issue is this rate-limiting effect, so popular instances are hit hardest.

Unfortunately, it looks like a lot of work is necessary to get around this restriction. It may be radio silence (and the current status quo: local instances will probably work, but instances that reach the rate limit (100 every 10 minutes) may get constant errors) until I can get a real, overhauled solution up and running. I can't put a timeline on it, unfortunately. It's possible that public instances will work while local instances built from the repo's head won't bypass the rate limiting. Once I'm more sure of its stability, this new version will of course be published here, but only once I'm actually sure it's resilient to the issues it is facing now.

@donslice

Sheesh. They really don't want us using this. As always, thanks for your continued efforts.

@sigaloid
Member Author

The issue turned out to be a lot easier to fix than I initially thought... 😨 oops. I wanted to make sure that was all that went wrong before pushing it. We should be at a point where IP bans aren't correlated to IPs. If it still is, it's probably a unique case with your IP and it shouldn't be as widespread. If I get tons of reports about it, I can open a new issue.

Public instance operators, please update!

@halictuz

Works fine so far on my public instance, which did not work at all before; it only worked from my homelab. Thanks for the hard work and effort. Appreciate it. Users gonna be happy, I guess.

@ggtylerr

ggtylerr commented Dec 5, 2024

Unfortunately it looks like Reddit has blocked my NYC-1 instance again - unsure if it's an IP block or what, but CAL-1 also occasionally gets it.

