Why Python's pip search isn't working: We speak to infrastructure director about ongoing traffic overload

Last December, the Python development team overseeing the Python Package Index (PyPI) temporarily disabled the search endpoint on its XML-RPC API because its infrastructure was being overwhelmed by "abusive clients." The upshot is that searching for Python packages with pip, e.g. pip search ascii or pip3 search png, isn't …

  1. Anonymous Coward

    pip3 worked fine for me last week.....

    pip3 worked fine for me last week when I needed the pynput package. Strange!

    1. Yet Another Anonymous coward Silver badge

      Re: pip3 worked fine for me last week.....

      pip is working, they just turned off search.

  2. Missing Semicolon Silver badge
    FAIL

    Devops, Web3.0, Agile, CI/CD

    All of this stuff works by pulling things from public servers. Linux distribution mirrors (Ubuntu, Debian), PyPI, Docker, npm: all of these get used as if they were local storage for CI/CD pipelines in Jenkins and similar.

    A standard procedure is to pull an image from Docker, update it from the distribution mirrors, install a load of Python, npm and Maven libraries, do some tests, then throw it away. Repeat for each test, on each branch, for each project, in each company.

    Until the software providers start throttling anonymous pulls, and insisting on paid accounts, this will continue. It's basically laziness, because hip Agile Devops engineers don't need to worry about somebody else's costs, so they don't stand up local mirrors of these things that would cost them money and time.

    One day we will all lose when even updating our home desktops will cost. TIWWCHNT.

    1. Tom 38

      Re: Devops, Web3.0, Agile, CI/CD

      I'm a hip agile devops engineer; we just stick a caching http proxy in front of our CI pipelines.

      As in TFA, the problem with pip search is that it can't easily be cached like that.
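
A minimal sketch of why a caching proxy helps with package downloads but not with search, assuming the search calls go out as HTTP POSTs (as noted elsewhere in this thread) and that the proxy keys its cache on method plus URL. The `cache_key` helper and both URLs are illustrative assumptions, not the configuration of any particular proxy.

```python
from typing import Optional


def cache_key(method: str, url: str) -> Optional[str]:
    """Return a cache key for a request, or None if it is not cached.

    Simplifying assumption: only GET and HEAD responses are cacheable,
    and the request body is never part of the key.
    """
    verb = method.upper()
    if verb in ("GET", "HEAD"):
        return f"{verb} {url}"
    return None


# A package index page fetched by an install: one stable URL, so cacheable.
install_key = cache_key("GET", "https://pypi.org/simple/pynput/")

# A search call: every query is a POST to the same URL, with the query
# hidden in the body, so this simple proxy has nothing to key on.
search_key = cache_key("POST", "https://pypi.org/pypi")

print(install_key)
print(search_key)
```

Under these assumptions the install request gets a stable key and the search request gets none, so every search falls through to the origin server.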

    2. bombastic bob Silver badge
      Linux

      Re: Devops, Web3.0, Agile, CI/CD

      Well, I suspect that open source OS and package mirrors provide their services in support of open source in general, not exclusively one particular distro, and if their repos have the source or binaries needed to support something similar, it's really just all part of the gig. My guess is that at some point it will all balance out.

      Or we could host it on GitHub (or a similar service) instead, one that provides storage and bandwidth for free to publicly visible projects.

    3. Claptrap314 Silver badge

      Re: Devops, Web3.0, Agile, CI/CD

      No even vaguely sane CI/CD pipeline does a keyword search to find the libraries/gems/eggs/libraries/frobnitzs needed. No even vaguely sane app would, either.

      So either there are some idiot devs who have distributed a popular package that's doing something really REALLY stupid, or there is some evil actor out there. It hardly matters which.

      But it seems to me that the devs bear more than a bit of the blame. It's arguably a violation of standards to use POST for searches. In the end, the devs for PIP put up a vulnerable endpoint, and somehow this wasn't really pointed out until less than a year ago.

      1. shayneoneill

        Re: The next generation will attempt to port the kernel to Javascript...

        This is an XML-RPC endpoint, and it's an *ancient* one. Yes, it uses POST, but this comes from an era prior to REST (as we know it today), which means the server isn't deciding whether it's a GET/POST/DELETE or whatever from the HTTP method, but rather from the implementation of the RPC function.

        Arguably it's a bad use of POST; however, not using POST would violate the standard.
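
To illustrate the point about the operation living in the RPC payload, not the HTTP method, here is a small sketch using Python's standard `xmlrpc.client` module. The method name `search` and the query dictionary are assumptions chosen for illustration and are not taken from the article; nothing is sent over the network.

```python
import xmlrpc.client

# Serialize a hypothetical search call into an XML-RPC request body.
# In XML-RPC this body is what gets sent in the HTTP POST.
body = xmlrpc.client.dumps(({"name": "ascii"},), methodname="search")
print(body)

# The receiving side recovers the operation name and arguments by parsing
# the body, not by looking at the HTTP verb or the URL.
params, method = xmlrpc.client.loads(body)
print(method, params)
```

The printed body contains a `<methodName>` element, which is where an XML-RPC server looks to decide what to do; the HTTP layer only ever sees a POST to one URL.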

  3. AndrueC Silver badge
    Happy

    I remember the Peripheral Interchange Program. Seems like that was less trouble :)

  4. CrackedNoggin Bronze badge

    Surely it's >meant< to be for human search - the search box at pypi (dot) org serves that purpose adequately.

    One possible reason the automated scripts using it are not getting cleaned up is that the search is a non-essential call which was coded not to abort on error. So the failure just takes a few ms more, which nobody notices.

    One server-side approach to fix it would be to not reply at all, so the client would hang until timeout. That is more likely to get someone's attention.

    Also, rewrite the pip program to remove the "search" command - although that wouldn't help with older versions which will be around for a long time.

  5. foxyshadis

    I've been using both for years and this is the first that I found out the teams aren't related at all. Go me? Well, there are only so many hoods I can look under out of sheer curiosity, rather than when they give out and stop running.

    I would've thought the pip team would do whatever it took to burn down anything Webservices/XML-RPC the minute any alternative appeared. REST is 99% of the functionality in 5% of the overhead.
