
Service responds with 500 status when URL without scheme is requested #718

@robertknight

While visiting https://via.hypothes.is/https://en.wikipedia.org/wiki/Diplodocus, the service responded with a 500 status when attempting to proxy some of the page's images.

There are two separate issues:

  • Proxying these images fails in the first place
  • The response has a 500 status, yet its body reports a 404 error

This issue is about the latter problem.

Steps to reproduce:

 curl -H 'Referer: https://viahtml.hypothes.is' -i 'https://viahtml.hypothes.is/proxy///upload.wikimedia.org/wikipedia/commons/thumb/3/31/Diplodocus_teeth_Smithsonian.png/500px-Diplodocus_teeth_Smithsonian.png'

Note the URL being requested here (//upload.wikimedia.org/wikipedia/commons/thumb/3/31/Diplodocus_teeth_Smithsonian.png/500px-Diplodocus_teeth_Smithsonian.png) is missing a scheme. The correct URL should be https://viahtml.hypothes.is/proxy/https://upload.wikimedia.org/wikipedia/commons/thumb/3/31/Diplodocus_teeth_Smithsonian.png/500px-Diplodocus_teeth_Smithsonian.png.

Expected result:

A response with a 4xx status, since the URL is invalid. Alternatively the service might infer that the scheme should be HTTPS. I haven't tracked down why this particular URL is being generated and whether it "should" be accepted. In any case, it shouldn't produce a 5xx error.
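
Both options above can be sketched in a few lines of Python. This is only an illustration of the expected behaviour, not Via's actual code: `normalize_proxied_url`, `InvalidProxyURL` and the `default_scheme` parameter are hypothetical names, and inferring HTTPS for `//host/path` URLs is an assumption rather than a decision made in this issue.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


class InvalidProxyURL(ValueError):
    """Hypothetical error that a view would translate into a 400 response."""


def normalize_proxied_url(raw_url: str, default_scheme: str = "https") -> str:
    """Return a proxyable absolute URL, or raise InvalidProxyURL."""
    # Protocol-relative URLs ("//host/path") name a host but no scheme.
    # Assumption: treat them as HTTPS instead of rejecting them.
    if raw_url.startswith("//"):
        raw_url = f"{default_scheme}:{raw_url}"

    parts = urlsplit(raw_url)
    # Anything without a supported scheme and a host is a client error (4xx),
    # never a server error (5xx).
    if parts.scheme not in ALLOWED_SCHEMES or not parts.netloc:
        raise InvalidProxyURL(f"Unsupported or malformed URL: {raw_url!r}")
    return raw_url
```

With this sketch, the scheme-less URL from the reproduction step would be rewritten to its `https://` form, while a bare `upload.wikimedia.org/...` (no `//` prefix, no scheme) would be rejected as a client error.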

Actual result:

The response has a 500 status, but its body says "404 Not Found":

HTTP/2 500 
date: Thu, 19 Sep 2024 10:11:05 GMT
content-type: text/html
content-length: 1268
cache-control: no-store
referrer-policy: no-referrer-when-downgrade
x-robots-tag: noindex, nofollow
x-abuse-policy: https://web.hypothes.is/abuse-policy/
x-complaints-to: https://web.hypothes.is/report-abuse/
cf-cache-status: BYPASS
strict-transport-security: max-age=15552000; includeSubDomains; preload
x-content-type-options: nosniff
server: cloudflare
cf-ray: 8c58ca242e1863d6-LHR
alt-svc: h3=":443"; ma=86400

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta http-equiv="content-type" content="text/html; charset=UTF-8;charset=utf-8"/>
        <meta name="viewport" content="width=device-width, initial-scale=1">

        <title>Via Error</title>

        <link rel="stylesheet" href="/static/css/bootstrap.min.css"/>
<link rel="stylesheet" href="/static/css/font-awesome.min.css">
<link rel="stylesheet" href="/static/css/base.css">

<script src="/static/js/jquery-latest.min.js"></script>
<script src="/static/js/bootstrap.min.js"></script>
                            </head>

    <body>
                <header>
   </header>
        
        <section>
        <div class="container text-danger">
    <div class="row justify-content-center">
        <h2 class="display-2">Via Error</h2>
    </div>
    <div class="row">
        <div class="col-12 text-center">
        
            <p class="lead">None</p>

                            <p class="lead">Error Details:</p>
                <pre>Internal Error: 404 Not Found: The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.</pre>
                            </div>
    </div>
</div>
        </section>

                            </body>
</html>
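
The "Internal Error: 404 Not Found" text suggests that a 404-style exception reached a generic error handler which always answers with 500, though that is a guess and has not been confirmed in the code. A minimal sketch of a handler that keeps client-error codes intact might look like this; `UpstreamHTTPError` and `status_for_exception` are hypothetical names, not part of Via.

```python
from http import HTTPStatus


class UpstreamHTTPError(Exception):
    """Hypothetical exception carrying an HTTP status code."""

    def __init__(self, code: int, description: str = ""):
        super().__init__(description)
        self.code = code


def status_for_exception(exc: Exception) -> int:
    """Pick the response status for an exception caught by the error view."""
    code = getattr(exc, "code", None)
    # Preserve 4xx codes so a "404 Not Found" body is not sent with a 500 status.
    if isinstance(code, int) and 400 <= code < 500:
        return code
    # Everything else is treated as a genuine server-side failure.
    return HTTPStatus.INTERNAL_SERVER_ERROR.value
```

Under these assumptions, the status line and the error page body would agree: a wrapped 404 is reported as 404, and only unexpected failures surface as 500.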
