Introducing Wayback Machine Transforms
We are thrilled to announce that we’re adding an integration of the Wayback Machine to Maltego! The Internet Archive’s Wayback Machine is a digital library of over 330 billion web pages and other internet artifacts. Its many applications include monitoring changes to websites, finding deleted social media posts, and locating or tracing bad actors who attempt to conceal their online footprints, all of which make it an incredibly useful tool for investigators.
The new Wayback Machine Transforms return archived Snapshots of the web resources available for a given input. Currently, you can use maltego.Domain, maltego.URL, maltego.File, maltego.Website, maltego.Image and maltego.Document as input Entities. You can also filter Snapshots by the HTTP status code returned to the Wayback Machine when the page was archived, which is useful for investigating webpage redirects.
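For readers curious about what a status-code filter looks like outside Maltego, the Internet Archive exposes a public CDX endpoint for listing captures. The sketch below only builds a query URL against that endpoint. It is not a description of how the Transforms are implemented (this article does not say), and the helper name `build_cdx_query` and the `example.com` input are made up for illustration.

```python
from urllib.parse import urlencode

# Public capture index of the Wayback Machine (the CDX server).
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(url, status_code=None, limit=None):
    """Build a CDX query URL, optionally keeping only one HTTP status code.

    Hypothetical helper for illustration; not part of Maltego.
    """
    params = [("url", url), ("output", "json")]
    if status_code is not None:
        # CDX filters take the form field:regex, e.g. statuscode:301
        params.append(("filter", f"statuscode:{status_code}"))
    if limit is not None:
        params.append(("limit", str(limit)))
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


# Captures of example.com that were archived as permanent redirects.
redirect_query = build_cdx_query("example.com", status_code=301, limit=10)
print(redirect_query)
```

Fetching the printed URL returns one row per capture, so a redirect-only view is just a matter of changing the `status_code` argument.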
During online research, investigators frequently hit dead ends when web resources have been moved, modified or deleted. This new set of Transforms aims to clear the way to successful investigations!
To give you an idea of what is possible with these new Transforms, we will walk you through three quick use cases.
Use Case 1: Message Board Chatter around the 2019 El Paso Shooter
For this example, let’s assume you want to research the background of the tragic 2019 El Paso shooting to better understand the mechanisms and conditions that fed into far-right extremism. Knowing that many past right-wing shooters were active on 8chan, most frequently on its /pol/ board (a descendant of 4chan’s /pol/), you decide to investigate. However, you quickly find that the original 8chan domain (8ch.net) was removed from the web after the El Paso shooting. So where do you go from here?
Searching through 8chan
Let’s start by inserting 8ch.net as a maltego.Domain Entity. Select the Entity and run the Transform To Snapshots between Dates [Wayback Machine]. This opens a popup asking for a date range, which lets us look for posts from the day of the shooting.
After combing through the Snapshots from the /pol/ board, you find a post with a file named “P. Crusius - Notification ….pdf”. Since Patrick Crusius has been identified as the shooter, you decide to further investigate this post.
Although the .pdf file itself was not saved by the Wayback Machine, you can copy its name and insert it into Maltego as a maltego.Phrase Entity. You can then run the Transforms To Files (Office) [using Search Engine] and To Website [using Search Engine] (from the CTAS Hub Item) to look for the file elsewhere.
In the maltego.Website and maltego.Document Entities returned, you can find several copies of the shooter’s manifesto and a “Student Code of Conduct” warning from his university that he appears to have mistakenly included in the post.
You could now continue with your investigation by mapping the spread of the manifesto through 8chan or digging up the shooter’s online footprint.
Use Case 2: Disputing Patents
Another way to use the Wayback Machine Transforms is to search for prior art when submitting or disputing patents. Evidence from the Internet Archive has been successfully used to establish what was common general knowledge at the time, which in turn can help determine the novelty of the invention described in the patent.
In the case of Dyno Nobel Inc (Dyno) v Orica Explosives Technology Pty Ltd (Orica), Dyno submitted as evidence printouts of snapshots from the Wayback Machine of a website (i-konsystem.com), operated on behalf of Orica, in an attempt to revoke the patent “Method of blasting multiple layers or levels of rock” filed by Orica. With the Snapshots, Dyno attempted to establish the common general knowledge at the time of filing.
Establishing Common General Knowledge
To investigate Dyno’s claim, insert the website in question (i-konsystem.com) as a maltego.Website Entity. Next, with the slider set to 256, select the website and run the To Snapshots between Dates [Wayback Machine] Transform. Since the patent was filed on 13 October 2004, we search for all Snapshots between 12 October 2003 and 12 October 2004.
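As a rough illustration of the same date-bounded search outside Maltego, the sketch below builds a query for the Internet Archive's public CDX endpoint using the dates from this example. The function name `snapshots_between` is hypothetical, and mapping the slider value of 256 onto the CDX `limit` parameter is an assumption made for the example, not a statement about how the Transform works.

```python
from datetime import date
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def snapshots_between(url, start, end, limit=256):
    """Build a CDX query for captures of a site within a date range.

    Hypothetical helper; the default limit mirrors the slider value
    used in the walkthrough (an assumption for illustration only).
    """
    params = {
        "url": url,
        "matchType": "domain",             # include the host and its subdomains
        "from": start.strftime("%Y%m%d"),  # CDX timestamps: yyyyMMdd[hhmmss]
        "to": end.strftime("%Y%m%d"),
        "output": "json",
        "limit": limit,
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


# The year leading up to the patent filing date.
query = snapshots_between("i-konsystem.com", date(2003, 10, 12), date(2004, 10, 12))
print(query)
```

Keeping the date arithmetic in `date` objects rather than hand-written strings makes it harder to slip an off-by-one day into the search window.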
Among the results, you can find over 40 PDFs containing details about their blasters and detonators, specific projects where they were used and even trade magazine articles detailing the current state of blasting systems at that time (which could be considered a description of common general knowledge!). To get a clearer picture of which documents may have warranted the court’s attention, we can run the PDF documents through IBM Watson to extract Entities.
From this, a rather large graph is returned with different Entities and keywords that were extracted from the document content. Using Ctrl+F to search by phrase, we can quickly select relevant matches.
Finally, to isolate the relevant documents, we click Add Parents in the selection toolbar, after which we invert and delete our selection to clean up the graph.
As shown, we were able to quickly find, explore, and narrow down content in a visual way to aid us in discovering relevant evidence for this case.
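The select, add parents, invert, delete sequence can be thought of as a few set operations on the graph. The sketch below models that idea on a tiny made-up graph; the file names and keywords are invented for illustration, and `isolate_relevant` is a hypothetical helper, not a Maltego API.

```python
def isolate_relevant(parents_of, all_nodes, matches):
    """Model 'select matches, Add Parents, invert, delete' as set operations.

    parents_of maps each extracted Entity to the documents it came from.
    Returns (kept, deleted).
    """
    selection = set(matches)
    for node in matches:
        selection |= parents_of.get(node, set())  # Add Parents
    inverted = set(all_nodes) - selection         # Invert Selection
    kept = set(all_nodes) - inverted              # what remains after Delete
    return kept, inverted


# Made-up example graph: three documents and the keywords extracted from them.
parents_of = {
    "blasting": {"a.pdf", "b.pdf"},
    "detonator": {"a.pdf"},
    "weather": {"c.pdf"},
}
all_nodes = {"a.pdf", "b.pdf", "c.pdf", "blasting", "detonator", "weather"}

kept, deleted = isolate_relevant(parents_of, all_nodes, matches={"blasting"})
print(sorted(kept))
print(sorted(deleted))
```

Note that in this model an unselected keyword is removed even when its parent document is kept, which is the point of inverting the selection: only the matches and their source documents survive.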
Use Case 3: Investigating Amaq News Agency
For this final use case, let us assume you want to gather material on past ISIS propaganda videos, hoping to find an unintentional information leak or other insights. Knowing that the Amaq News Agency is largely considered to be ISIS’s “state media”, you attempt to browse the website and find it to be down (the Belgian police in cooperation with Europol took it down in November 2019).
Note that this use case requires Transforms from Farsight DNSDB, one of our Hub Partners. To follow along with this example, you will need to have Farsight’s Hub Item installed.
Find the previous Domain Names
To find the agency’s archived content, insert the second-level name of their last known domain (amaqagency) as a maltego.Phrase into an empty graph. Now run the [DNSDB] lookup $phrase.* Transform.
This Transform will return all domains with the same second-level domain (amaqagency), by switching out the top-level domain. It returns 49 maltego.DNSName Entities and 17 maltego.Domain Entities. Let’s go ahead and explore them!
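To make the `$phrase.*` pattern concrete, here is a minimal sketch of the matching idea using Python's standard `fnmatch` module. It only illustrates the wildcard and does not query DNSDB; apart from amaqagency.ch, which appears later in this article, the candidate names are placeholders rather than lookup results, and `matches_phrase` is a hypothetical helper.

```python
from fnmatch import fnmatch


def matches_phrase(name, phrase):
    """Return True if name looks like '<phrase>.<anything>'.

    Illustrates the '$phrase.*' wildcard only; not a DNSDB client.
    """
    normalized = name.lower().rstrip(".")  # tolerate case and a trailing dot
    return fnmatch(normalized, f"{phrase.lower()}.*")


# Placeholder candidates (not DNSDB output).
candidates = [
    "amaqagency.ch",
    "AmaqAgency.example.",
    "notamaqagency.ch",
    "example.org",
]

hits = [name for name in candidates if matches_phrase(name, "amaqagency")]
print(hits)
```

Because the pattern is anchored at the start of the name, look-alike names such as the third candidate are not matched.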
Explore Archived Content from the Domains
Select the Domain Entities returned and run the To Snapshots [Wayback Machine] Transform. This returns 151 maltego.wayback.Snapshot Entities (including various Snapshots of videos) and 30 maltego.wayback.ImageSnapshot Entities.
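For a sense of how capture listings can be split into page-like and image results, the sketch below parses rows in the JSON layout returned by the Internet Archive's public CDX endpoint (a header row followed by one row per capture) and separates them by MIME type. The sample rows are made up, the helper names are hypothetical, and splitting on MIME type is an assumption about how one might separate images; the article does not describe how the Transform decides between Snapshot and ImageSnapshot Entities.

```python
def rows_to_records(cdx_json):
    """Turn CDX JSON output (header row + data rows) into a list of dicts."""
    if not cdx_json:
        return []
    header, *rows = cdx_json
    return [dict(zip(header, row)) for row in rows]


def split_by_kind(records):
    """Separate image captures from everything else by MIME type."""
    images = [r for r in records if r["mimetype"].startswith("image/")]
    others = [r for r in records if not r["mimetype"].startswith("image/")]
    return others, images


def replay_url(record):
    """Build the Wayback Machine replay URL for one capture."""
    return f"https://web.archive.org/web/{record['timestamp']}/{record['original']}"


# Made-up sample in the CDX JSON layout (not real capture data).
sample = [
    ["urlkey", "timestamp", "original", "mimetype", "statuscode", "digest", "length"],
    ["org,example)/", "20190101000000", "http://example.org/", "text/html", "200", "PLACEHOLDER1", "1000"],
    ["org,example)/logo.png", "20190102000000", "http://example.org/logo.png", "image/png", "200", "PLACEHOLDER2", "2000"],
    ["org,example)/clip.mp4", "20190103000000", "http://example.org/clip.mp4", "video/mp4", "200", "PLACEHOLDER3", "3000"],
]

others, images = split_by_kind(rows_to_records(sample))
print(len(others), len(images))
print(replay_url(images[0]))
```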
Some relevant content can be found in these Snapshots, specifically via the amaqagency.ch Entity. The Wayback Machine captured 61 videos from this domain, ranging from “martyrdom operations” (what the Amaq News Agency calls suicide bombings) and the “assault and capture” of territories to videos of the Aleppo night life and road repair.
Other content includes an infographic on how to download a Firefox and Chromium add-on in order to access the Amaq News Agency website at its latest URL (since its sites are constantly being taken down), invites to their Telegram groups and even an image from when Anonymous took control of one of their websites.
To explore Amaq’s internet presence further, you could search for other pages that link to the returned images (with our TinEye Transforms), or explore connections by matching Tracking Codes, DNS servers or IPs (with our CTAS Transforms).
We hope this introduction to our new Transforms will inspire you to try them out!