Methodology 4 min read 5 March 2026

How We Use the Wayback Machine to Verify Charter Booking History

The Wayback Machine CDX API is a free, publicly accessible tool that lets us build 24-month booking history for any charter listing. Here's how it works.

The Wayback Machine at the Internet Archive is best known for preserving historical versions of websites. For charter due diligence, it serves a different purpose: it provides a free, independently generated record of what a charter listing calendar showed on any given date — going back years. This article explains exactly how we use it.

The CDX API: What It Is

The Wayback Machine CDX API (web.archive.org/cdx/search/cdx) is a free, unauthenticated HTTP API that returns an index of the saved snapshots for a URL. For a given Boatsetter or Sailo listing URL, it returns the timestamps, HTTP status codes, and content digests that together identify every capture the Wayback Machine holds for that URL.
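As a rough illustration of such a query, the sketch below builds a CDX request and parses the JSON response using only the Python standard library. It assumes the publicly documented endpoint web.archive.org/cdx/search/cdx and its `url`, `output`, `from`, `to`, `fl`, and `filter` parameters. The function names, the default date range, and the choice of fields are illustrative assumptions, not a description of our production pipeline.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Publicly documented CDX server endpoint (assumed here).
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(listing_url, start="20240301", end="20260301"):
    """Build a CDX query URL for one listing (date range is illustrative)."""
    params = {
        "url": listing_url,
        "output": "json",            # first row of the response is a header
        "from": start,               # yyyyMMdd lower bound
        "to": end,                   # yyyyMMdd upper bound
        "fl": "timestamp,original,statuscode,digest",
        "filter": "statuscode:200",  # keep successful captures only
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx_json(body):
    """Turn the CDX JSON body (header row + data rows) into dicts."""
    rows = json.loads(body) if body.strip() else []
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def snapshot_url(record):
    """Public replay URL for one capture."""
    return f"https://web.archive.org/web/{record['timestamp']}/{record['original']}"


def fetch_snapshots(listing_url):
    """Network call; not executed on import."""
    with urlopen(build_cdx_query(listing_url), timeout=30) as resp:
        return parse_cdx_json(resp.read().decode("utf-8"))
```

Each record's replay URL from `snapshot_url` is the address a third party would open to inspect that capture.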

Major charter platform listings are crawled by the Wayback Machine every 3–21 days, depending on the platform and the listing's traffic. A popular Boatsetter listing in Split, Croatia may have 80–120 snapshots available over a 24-month period (about 730 days), or roughly one snapshot every 6–9 days on average.
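Snapshot spacing for a specific listing can be checked directly from the CDX timestamps. The helper below is a minimal sketch: the function name is hypothetical, and it assumes timestamps in the 14-digit `YYYYMMDDhhmmss` form that the CDX index uses.

```python
from datetime import datetime


def mean_interval_days(timestamps):
    """Average gap in days between consecutive captures."""
    times = sorted(datetime.strptime(t, "%Y%m%d%H%M%S") for t in timestamps)
    if len(times) < 2:
        raise ValueError("need at least two snapshots")
    span_days = (times[-1] - times[0]).total_seconds() / 86400
    # n captures define n - 1 gaps
    return span_days / (len(times) - 1)
```

A listing whose average gap is well above the typical range is a candidate for lower-confidence inferences.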

Extracting Calendar State From Snapshots

Each Wayback snapshot is a frozen HTML capture of the listing page at a specific point in time. For charter platforms that render availability calendars in server-side HTML rather than loading them with JavaScript, the snapshot contains the booking state for all visible calendar months. We parse these snapshots to extract: (1) which dates were marked as booked; (2) which dates were marked as available; (3) which dates were blocked (maintenance, owner use).
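The extraction step might look something like the following sketch. Calendar markup differs from platform to platform, so the `data-date` attribute and the `booked`, `available`, and `blocked` class names here are purely hypothetical stand-ins. A real parser would need selectors written against each platform's actual HTML.

```python
from html.parser import HTMLParser

# Hypothetical class names; real platforms use their own markup.
STATE_CLASSES = {"booked", "available", "blocked"}


class CalendarParser(HTMLParser):
    """Collects {date: state} from elements carrying a data-date attribute."""

    def __init__(self):
        super().__init__()
        self.states = {}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        day = attr.get("data-date")  # hypothetical attribute name
        if not day:
            return
        classes = set((attr.get("class") or "").split())
        matched = classes & STATE_CLASSES
        # Record the day only when exactly one state class is present.
        if len(matched) == 1:
            self.states[day] = matched.pop()


def extract_calendar_state(html):
    """Return the calendar state found in one archived page."""
    parser = CalendarParser()
    parser.feed(html)
    parser.close()
    return parser.states
```

Days with no recognizable state, or with conflicting state classes, are skipped rather than guessed.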

Building a Booking History

By sequencing snapshots chronologically and applying state logic, we build a probabilistic booking history for any listed vessel. For example, a date that was 'available' in a January snapshot and 'booked' in a March snapshot was booked at some point in that window. The confidence of each booking inference increases with snapshot density: the more snapshots fall in the window, the narrower the uncertainty range on the booking date.
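The state logic can be sketched in a few lines. This is a simplified illustration under stated assumptions, not our full inference model: each snapshot is represented as a `(capture_date, {calendar_day: state})` pair, and the function name and output fields are hypothetical.

```python
from datetime import date


def infer_booking_windows(snapshots):
    """Find available -> booked transitions between consecutive sightings.

    snapshots: iterable of (capture_date, {calendar_day: state}) pairs.
    Returns one record per transition, bounded by the two captures
    between which the booking must have been made.
    """
    ordered = sorted(snapshots, key=lambda snap: snap[0])
    last_seen = {}  # calendar_day -> (capture_date, state)
    windows = []
    for captured, states in ordered:
        for day, state in states.items():
            previous = last_seen.get(day)
            if previous and previous[1] == "available" and state == "booked":
                windows.append({
                    "day": day,
                    "earliest": previous[0],  # last capture showing 'available'
                    "latest": captured,       # first capture showing 'booked'
                    "gap_days": (captured - previous[0]).days,
                })
            last_seen[day] = (captured, state)
    return windows
```

The `gap_days` value is the width of the window in which the booking occurred, so a smaller gap means a tighter estimate of the booking date.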

Snapshot density and inference confidence

Snapshot interval    Date uncertainty    Confidence level
< 7 days             ± 3–4 days          High
7–14 days            ± 5–7 days          Medium–High
14–30 days           ± 10–15 days        Medium
> 30 days            ± 15+ days          Low
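The table translates into a simple lookup. The sketch below is one possible encoding. The table leaves the boundary values (exactly 7, 14, or 30 days) ambiguous, so assigning each to the band shown here is an assumption, as is the function name.

```python
def confidence_level(interval_days):
    """Map the gap between two snapshots to the table's confidence bands.

    Boundary handling (7 and 14 go to the wider band, 30 stays in
    'Medium') is an assumption; the table does not specify it.
    """
    if interval_days < 0:
        raise ValueError("interval must be non-negative")
    if interval_days < 7:
        return "High"
    if interval_days < 14:
        return "Medium–High"
    if interval_days <= 30:
        return "Medium"
    return "Low"
```

Applied to the gap between the last 'available' snapshot and the first 'booked' snapshot, this gives each inferred booking its confidence label.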

What the Data Can and Cannot Show

Wayback Machine data can show approximate booking dates and durations for the past 6–36 months, seasonal patterns across multiple years, and changes in listed pricing and listing content over time. It cannot show actual revenue (snapshots record availability, not the price paid), cancellations that were re-booked before the next snapshot, or bookings on platforms that render calendars in JavaScript without server-side HTML (some newer platform implementations).

Use case in practice

A bank reviewing a charter loan application receives a third-party Charter Pulse report showing 22 months of Wayback-derived booking history for the specific vessel, with source snapshot URLs for every inferred booking. The bank can independently verify each data point by visiting the archived URL.

Why This Matters for Due Diligence

The defining property of Wayback Machine data is independence: it is generated by the Internet Archive, not by the charter operator, the management company, or the booking platform. Each snapshot URL is publicly accessible and permanently archived. This makes it the closest available equivalent to an audit trail for charter booking history — verifiable by any party with internet access.