Unable to synchronize repos with pulp, getting 503 error

I am trying to create a local mirror of the AlmaLinux 9 repos, but almost all of them fail with errors like this:

"description": "503, message='Service Temporarily Unavailable', url=URL('https://repo.almalinux.org/vault/9/AppStream/Source/Packages/boom-boot-1.3-3.el9.src.rpm')"

I don’t see the same issue if I just wget the package. Also, since the synchronization task finds the repo and starts pulling packages, I assume the repo feed URL is configured correctly. For the above example, the feed is: Index of /almalinux/9/AppStream/x86_64/os/

Another odd thing is that some of the artifacts (rpm packages in this case) do download, but maybe there is some kind of rate limit? In the above case it looks like it got through about 120 before it failed. Also, downloading the metadata worked/completed.

  "progress_reports": [
    {
      "message": "Downloading Metadata Files",
      "code": "sync.downloading.metadata",
      "state": "completed",
      "total": null,
      "done": 5,
      "suffix": null
    },
    {
      "message": "Downloading Artifacts",
      "code": "sync.downloading.artifacts",
      "state": "failed",
      "total": null,
      "done": 120,
      "suffix": null
    },
    {
      "message": "Associating Content",
      "code": "associating.content",
      "state": "canceled",
      "total": null,
      "done": 0,
      "suffix": null
    },
    {
      "message": "Parsed Packages",
      "code": "sync.parsing.packages",
      "state": "completed",
      "total": null,
      "done": 1851,
      "suffix": null
    },
    {
      "message": "Parsed Advisories",
      "code": "sync.parsing.advisories",
      "state": "completed",
      "total": 0,
      "done": 0,
      "suffix": null
    }
  ],

Pulp3 versions I’m using:

    "versions": [
        {
            "component": "core",
            "version": "3.17.3"
        },
        {
            "component": "rpm",
            "version": "3.17.4"
        },
        {
            "component": "file",
            "version": "1.10.2"
        },
        {
            "component": "deb",
            "version": "2.17.0"
        }
    ],

I think setting a rate limit helps, as does lowering download-concurrency. My question then is: what is the recommended number of downloads per second for the main mirror?
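
For anyone landing here later, this is a rough sketch of the request body I mean, built in Python. It assumes the RPM remote exposes the `rate_limit`, `download_concurrency` and `max_retries` fields and that you PATCH it to the remote's href (something like /pulp/api/v3/remotes/rpm/rpm/<uuid>/); check your own API docs for your Pulp version. The helper name and the numbers are placeholders I made up, not recommended values for repo.almalinux.org.

```python
import json


def build_remote_patch(rate_limit=5, download_concurrency=2, max_retries=5):
    """Build a JSON body for PATCHing a Pulp remote's download settings.

    Field names are assumed to match the Pulp remote API; the default
    values are illustrative only.
    """
    if rate_limit < 1 or download_concurrency < 1 or max_retries < 0:
        raise ValueError("rate_limit/download_concurrency must be >= 1, max_retries >= 0")
    return {
        "rate_limit": rate_limit,                      # requests per second
        "download_concurrency": download_concurrency,  # parallel downloads
        "max_retries": max_retries,                    # retries per failed download
    }


if __name__ == "__main__":
    # Print the body so it can be pasted into curl, httpie, or similar.
    print(json.dumps(build_remote_patch(), indent=2))
```

Starting low and raising the values until the 503s come back is probably the only way to find the limit unless the mirror operators publish one.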

I haven’t had any problems synching using Pulp, but I’ve been using mirror.vtti.vt.edu, apparently. Do you have this problem with mirrors, or just with repo.almalinux.org?

I haven’t tried anything but the main repo. I’m guessing the other mirrors would be less limited, but the consumer asked us to use the main mirror, so I’m trying to stick with that.

All I see documented for synching the whole main repo is rsync – will Pulp accept an rsync url as the remote? I guess otherwise you’re waiting on an answer regarding HTTP from those who run the main repo.

I’ve seen that when using reposync (not on AlmaLinux); it seems to be a timeout issue, as if the HTTP client expects an instant response but it’s on a slow link or talking to a busy site.

rsync is much better suited to the task.
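
If you do have to stay on HTTP, the usual workaround for transient 503s is to retry with an increasing delay rather than failing on the first one. A minimal sketch of that idea in Python follows; it is not what Pulp or reposync do internally. `fetch` is a hypothetical callable you supply that returns a `(status, body)` tuple, and the delay values are arbitrary.

```python
import time


def backoff_delays(retries, base=1.0, cap=30.0):
    """Exponential delays: base, 2*base, 4*base, ... each capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(retries)]


def fetch_with_retry(fetch, retries=4, base=1.0, cap=30.0, sleep=time.sleep):
    """Call fetch() and retry while it returns HTTP 503.

    `fetch` is a caller-supplied function returning (status, body).
    Gives up after `retries` extra attempts and returns the last result.
    """
    status, body = fetch()
    for delay in backoff_delays(retries, base, cap):
        if status != 503:
            break
        sleep(delay)  # wait before hitting the busy server again
        status, body = fetch()
    return status, body
```

The `sleep` parameter is only there so the waiting can be swapped out when trying the function without real delays.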