Handling Redirects

The goal of this little utility is to give us a clear visual overview of exactly what happens when we visit a link that redirects.

We want to be able to give our code a starting URL, and get back an array with one entry for each link that was actually visited on the way to the final URL.

So far we have the ability to visit a URL, but because we set up our fetch call to explicitly require manual handling of redirects, right now things kinda come to a halt if the requested URL does redirect.

Here’s the fetcher code for reference:

import { FetcherResponse } from "./types";

export const fetcher = async (href: string): Promise<FetcherResponse> => {
  const { url, status, statusText, ok, headers } = await fetch(href, {
    redirect: "manual",
  });

  const headersObject = Object.fromEntries(headers);

  return { url, status, statusText, ok, headers: headersObject };
};

And here’s what happens if we visit a link that should redirect:

We have a couple of issues here:

  1. We hardcoded redirected: false
  2. We’re not actually following the redirect.

Hopefully we can fix them both in one go.

A Recursive Approach

The process of fetching a URL is always the same.

It doesn’t matter whether we provided the URL, or the URL was provided from a location header in the response.

We start by calling visit with a URL:

export const visit = async (
  href: string,
  requests: VisitedURL[] = []
): Promise<VisitedURL[]> => {

There is a second parameter called requests, which is an array of previous requests. If we don’t provide a value here, it defaults to an empty array. When calling visit manually, we would likely never pass in a second parameter.

Where this parameter becomes useful is if the visit function determines there is another URL that it has to visit. It can then append the current request information to the requests array, and call itself with the new URL and that updated array of requests.

That should mean we can use a recursive approach to find the full link journey, regardless of how many redirects are actually involved.

Although, that said, we likely do not want to redirect indefinitely. So we will add in a bit of logic to ensure we stop at some high value – such as after 50 redirects. That should almost never happen in the real world, unless something has gone very wrong.
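As a rough sketch of how such a cap might look, the variant below takes the fetcher as an argument so it can run on its own. The names visitWithLimit, FetcherFn and maxRedirects are hypothetical, the type shapes are assumptions declared only for this example, and simply stopping at the cap (rather than throwing) is one possible choice, not necessarily the one we end up with.

```typescript
// Hypothetical shapes, declared locally so the sketch is self-contained.
type FetcherResponse = {
  url: string;
  status: number;
  statusText: string;
  ok: boolean;
  headers: Record<string, string>;
};
type VisitedURL = FetcherResponse & { redirected: boolean };
type FetcherFn = (href: string) => Promise<FetcherResponse>;

// Hypothetical variant of visit: the fetcher is passed in, and maxRedirects
// caps how many location headers we are willing to follow.
const visitWithLimit = async (
  href: string,
  fetcher: FetcherFn,
  maxRedirects = 50,
  requests: VisitedURL[] = []
): Promise<VisitedURL[]> => {
  const result = await fetcher(href);
  const next = result.headers.location;
  const updated = [...requests, { ...result, redirected: Boolean(next) }];

  // Every earlier entry in requests was a redirect we followed, so its
  // length doubles as the redirect count.
  if (!next || requests.length >= maxRedirects) {
    return updated;
  }

  return visitWithLimit(next, fetcher, maxRedirects, updated);
};
```

With a cap of 2 and a fetcher that always answers with a location header, this sketch makes three requests and then returns, instead of looping forever.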

How Do We Know When We Should Redirect?

There are at least a couple of ways we could determine if our fetch request encountered a redirect.

One way is to look at the HTTP response code. All of the 3xx codes indicate some kind of redirection occurred.

A better way, for us at least, is to look in the response headers. This will contain a bunch of information that is very specific to the current request, but the one header we care about is the location.

If there is a location header then we have a possible next URL to visit.

That is the check we shall make.
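To make the difference between the two checks concrete, here is a minimal sketch. ResponseLike, nextLocation and isRedirectStatus are hypothetical names used only for illustration. The lower case location key relies on fetch's Headers object yielding lower case names when it is iterated, which is what Object.fromEntries(headers) in our fetcher does.

```typescript
// Hypothetical minimal shape: only the fields these checks need.
type ResponseLike = {
  status: number;
  headers: Record<string, string>;
};

// Header-based check: returns the next URL to visit, or undefined when the
// response gives us nowhere else to go.
const nextLocation = (response: ResponseLike): string | undefined => {
  const target = response.headers.location;
  return target ? target : undefined;
};

// Status-based check, shown for comparison only.
const isRedirectStatus = (status: number): boolean =>
  status >= 300 && status < 400;
```

A 304 Not Modified response shows why the header check suits us better: isRedirectStatus(304) is true, yet such a response typically carries no location header, so nextLocation returns undefined and there is nothing to follow.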

Let’s write a test to cover this:

  test("should recursively call the visit function if a valid location header exists", async () => {
    const validUrl = "https://some.valid.url";
    const nextUrl = "https://next.url";

    const fetcherSpy = jest
      .spyOn(Fetcher, "fetcher")
      .mockResolvedValueOnce({
        ok: false,
        url: validUrl,
        status: 307,
        statusText: "Temporary Redirect",
        headers: { location: nextUrl, c: "d" },
      })
      .mockResolvedValueOnce({
        ok: true,
        url: nextUrl,
        status: 200,
        statusText: "OK",
        headers: { a: "b" },
      });

    expect(await visit(validUrl)).toEqual([
      {
        ok: false,
        url: validUrl,
        status: 307,
        statusText: "Temporary Redirect",
        headers: { location: nextUrl, c: "d" },
        redirected: true,
      },
      {
        ok: true,
        url: nextUrl,
        status: 200,
        statusText: "OK",
        headers: { a: "b" },
        redirected: false,
      },
    ]);

    expect(fetcherSpy).toHaveBeenCalledTimes(2);
    expect(fetcherSpy.mock.calls).toEqual([[validUrl], [nextUrl]]);
  });

There’s a lot happening here, so let’s break it down.

Like in previous tests, jest.spyOn(Fetcher, "fetcher") creates a spy on the fetcher method of the Fetcher object.

We said above that the way the visit function should work is by recursively calling itself.

We’re going to set up our test so that we make an initial call to visit with the URL in the variable validUrl. This happens on line 22.

Internally, this will call our fetcher.

By using the chained syntax on lines 7 through 20, we tell our fetcherSpy how it should respond to the first call to fetcher (lines 7-13), and then how it should respond to the second call (lines 14-20).

In the first mocked response we will return some fake but realistic-looking data, the most important piece of which, for this particular test, is the location header on line 12.

Internally our visit function will need to be updated to contain logic that says, hey, I just noticed a location header, let’s use that as the new URL and call the visit function again.

On line 41 we assert that we did indeed recursively call the visit function twice, as the fetcher is invoked once per visit.

expect(fetcherSpy).toHaveBeenCalledTimes(2);

On line 42 we explicitly check that our fetcher function was called with the expected URLs.

expect(fetcherSpy.mock.calls).toEqual([[validUrl], [nextUrl]]);

// a good way of figuring out this stuff is to:
console.log(fetcherSpy.mock.calls);
// add this in your unit test code, and it will dump out as part of your test output

And on lines 23-38 we cover off the anticipated response data that we should get from our visit function, if everything behaves the way we would like.

[
      {
        ok: false,
        url: validUrl,
        status: 307,
        statusText: "Temporary Redirect",
        headers: { location: nextUrl, c: "d" },
        redirected: true,
      },
      {
        ok: true,
        url: nextUrl,
        status: 200,
        statusText: "OK",
        headers: { a: "b" },
        redirected: false,
      },
    ]

Right now though, this test fails:

We can no longer get away with hardcoding the redirected value to false.

Let’s work now to make this pass.

Implementing The Recursive visit Call

Here’s a first pass at making this test go green:

  try {
    const result = await fetcher(href);

    if (result.headers.location) {
      const updatedRequests = [...requests, { ...result, redirected: true }];

      return await visit(result.headers.location, updatedRequests);
    }

    return [
      ...requests,
      {
        ...result,
        redirected: false,
      },
    ];
  } catch (e) {
    // removed for brevity
  }

There are two changes here.

The first is that if we got a location header on the response then:

  • Create a new array of updatedRequests by taking any previous requests, and adding in the current result along with a redirected value of true. We know this must be true, or we wouldn’t have the location header.
  • Then, call the visit function again recursively.

The second is a fix: I had forgotten to include any previous requests when returning the result if we didn’t redirect, so that return value now spreads in the requests array first.

That should now pass:

I’m really not keen on that conditional. It spills out implementation details from the fetcher in a way that makes me unhappy.

We could refactor that, but let’s try using the implementation right now and see if it really does work.

Road Test

Previously we set up our code so that we can either provide a URL from the command line, or it will call the default:

// index.ts

import { visit } from "./visit";

const defaultHref = "https://codereviewvideos.com";

const url = process.argv[2] ?? defaultHref;

(async () => {
  try {
    const journey = await visit(url);
    console.log(journey);
  } catch (e) {
    console.error(e);
  }
})();

We will need to compile and run the code.

# from your project root dir
node ./node_modules/.bin/tsc

This should, if you are following along with the way I’ve been working, spit out lots of JavaScript files in your ./dist directory.

You can then call the index.js file using Node.

I will use:

node dist/v2/index.js https://codereviewvideos.com/typescript-tuple

It’s a little hard to see.

What’s happening here is…

It is working!

Hurrah.

However, it sort of … hangs.

Whilst the code does what we expect, it doesn’t exit / return in a timely manner.

That’s one out of one. What if we try another URL:

node dist/v2/index.js https://aka.ms/new-console-template

Well, that doesn’t work, even though it looks like it should:

That one is interesting because I never expected that a location would not be a fully qualified URL:

  • https://aka.ms/new-console-template
  • location: 'https://learn.microsoft.com/dotnet/core/tutorials/top-level-templates'
  • location: '/en-us/dotnet/core/tutorials/top-level-templates'
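One possible way to cope with a relative location, sketched here as an assumption rather than as the fix we will settle on, is to resolve it against the URL that returned it using the standard URL constructor. The helper name resolveLocation is hypothetical.

```typescript
// Hypothetical helper: resolve a location header against the URL of the
// response it came from. An absolute location ignores the base entirely,
// while a relative one inherits the base's scheme and host.
const resolveLocation = (locationHeader: string, currentUrl: string): string =>
  new URL(locationHeader, currentUrl).href;

// The relative location from the journey above:
const resolved = resolveLocation(
  "/en-us/dotnet/core/tutorials/top-level-templates",
  "https://learn.microsoft.com/dotnet/core/tutorials/top-level-templates"
);
// resolved is "https://learn.microsoft.com/en-us/dotnet/core/tutorials/top-level-templates"
```

Because absolute locations pass through unchanged, a helper like this could be applied to every location header without first checking which kind it is.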

OK, so it kinda works. But there are bugs. Let’s continue on, and fix them.
