Metadata Hygiene: Data That Exposes You

Metadata reveals who you contact and when — even when content is encrypted. Learn to strip EXIF data, document metadata, and limit your network footprint.

By Dark Web Insight Research Desk | 6 min read

James Clapper, the former Director of National Intelligence, testified before the U.S. Senate in 2013 that the NSA used metadata — not message content — to map social networks and identify surveillance targets. Signal encrypts your messages; it does not encrypt the fact that you sent them. Metadata hygiene is the practice of auditing and controlling what that secondary layer reveals about you.

What Metadata Is

Metadata is data about data. It describes the circumstances of a communication or file without describing the content itself. Think of it as the envelope rather than the letter inside.

Examples across different contexts:

  • Email: sender address, recipient address, timestamps, mail server routing chain, originating IP address
  • Phone calls: who called whom, call duration, frequency over time, cell tower location data
  • Photos: GPS coordinates of where the photo was taken, exact timestamp, device model and serial number, camera settings
  • Documents: author name from Office account, revision history, last-saved timestamp, printer information
  • Web requests: IP address, user agent string, session cookies, referrer URL

None of this is the content of your communication. All of it can identify you, your location, and your associations — often more reliably than the content itself.
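To make the envelope analogy concrete, here is a minimal sketch that reads the headers of an email whose body is encrypted. The message, addresses, host names, and IP addresses are all hypothetical (they use reserved example ranges), and `envelope_metadata` is an illustrative helper name, not part of any mail tool.

```python
from email import message_from_string

# Hypothetical message: every address, host name, and IP below is made up.
RAW_MESSAGE = """\
Received: from mail.example.org (mail.example.org [203.0.113.7])
\tby mx.example.net with ESMTPS; Tue, 03 Mar 2026 09:14:02 +0000
Received: from laptop.local (unknown [198.51.100.23])
\tby mail.example.org with ESMTPSA; Tue, 03 Mar 2026 09:14:01 +0000
From: alice@example.org
To: bob@example.net
Date: Tue, 03 Mar 2026 09:14:00 +0000
Subject: (encrypted)
X-Mailer: ExampleMail 4.2

-----BEGIN PGP MESSAGE-----
...
-----END PGP MESSAGE-----
"""


def envelope_metadata(raw: str) -> dict:
    """Collect the header fields that stay readable even when the body is encrypted."""
    msg = message_from_string(raw)
    return {
        "from": msg["From"],
        "to": msg["To"],
        "date": msg["Date"],
        "client": msg["X-Mailer"],
        # One Received header per mail server hop, newest first.
        "hops": [" ".join(h.split()) for h in msg.get_all("Received", [])],
    }


if __name__ == "__main__":
    for key, value in envelope_metadata(RAW_MESSAGE).items():
        print(key, "->", value)
```

The body stays opaque, yet the headers still name both parties, the sending time, the mail client, and (in the oldest Received line) the IP address the message was submitted from.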

Metadata in Photos (EXIF)

JPEG, PNG, and RAW files embed metadata in a structure called EXIF (Exchangeable Image File Format). A photograph taken with a smartphone in 2026 routinely includes:

  • GPS latitude and longitude accurate to a few meters
  • Altitude and compass bearing
  • Exact timestamp (not rounded — often millisecond precision)
  • Device manufacturer, model, and software version
  • Sometimes: camera serial number

Journalists and activists have been physically located via EXIF data embedded in photos they shared online. In a widely reported 2012 case, a member of a hacktivist group posted an unstripped photo whose embedded GPS coordinates gave investigators a location they used to identify him.
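EXIF stores each GPS coordinate as three rational numbers (degrees, minutes, seconds) plus a hemisphere reference letter. The sketch below shows how little work it takes to turn those values into a map-ready position. The sample values are hypothetical and point at a public landmark; `dms_to_decimal` is an illustrative name.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF-style (numerator, denominator) degree/minute/second pairs
    and a hemisphere letter (N, S, E, W) into signed decimal degrees."""
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):
        value = -value
    return float(value)


if __name__ == "__main__":
    # Hypothetical tag values, in the shape EXIF uses for GPSLatitude/GPSLongitude.
    lat = dms_to_decimal([(48, 1), (51, 1), (2958, 100)], "N")
    lon = dms_to_decimal([(2, 1), (17, 1), (4020, 100)], "E")
    print(round(lat, 6), round(lon, 6))
```

Seconds recorded to two decimal places, as in this sample, resolve to well under a meter on the ground, so the stored precision usually exceeds the accuracy of the GPS fix itself.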

Stripping EXIF data:

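  • ExifTool (Phil Harvey, open-source): the definitive command-line tool. exiftool -all= filename.jpg removes all metadata it can write. By default it keeps the untouched original as filename.jpg_original, so delete that backup or add -overwrite_original. Available on Linux, macOS, Windows.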
  • MAT2 (Metadata Anonymisation Toolkit 2): supports photos, documents, and audio files. Included in Tails OS by default.
  • Signal: automatically strips EXIF data from photos sent through the app before transmission. One of the few messaging apps that does this.
  • ImageMagick: mogrify -strip filename.jpg removes EXIF. Available on all platforms.

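Verify removal after processing: exiftool filename.jpg should show no GPS, capture timestamp, or device tags. The File entries that remain (such as File Modification Date/Time) come from the file system, not from data embedded in the image.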
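As a second opinion on the verification step, the following sketch walks the segment headers of a JPEG and reports the metadata containers it finds before the compressed image data. It is a simplified check under stated assumptions, not a replacement for ExifTool: it only recognizes EXIF, XMP, Photoshop/IPTC, and comment segments, and the function name `metadata_segments` is illustrative.

```python
import struct
import sys

# (marker byte, payload prefix) -> label for common metadata containers.
METADATA_LABELS = {
    (0xE1, b"Exif\x00\x00"): "EXIF",
    (0xE1, b"http://ns.adobe.com/xap/1.0/\x00"): "XMP",
    (0xED, b"Photoshop 3.0\x00"): "IPTC/Photoshop",
}


def metadata_segments(data: bytes) -> list[str]:
    """Return labels of metadata-bearing segments found before the image data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing SOI marker)")
    found = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:  # Start of Scan: compressed image data follows
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2 : pos + 4])
        payload = data[pos + 4 : pos + 2 + length]
        if marker == 0xFE:
            found.append("COM")
        for (seg_marker, prefix), label in METADATA_LABELS.items():
            if marker == seg_marker and payload.startswith(prefix):
                found.append(label)
        pos += 2 + length
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as fh:
        print(metadata_segments(fh.read()) or "no metadata segments found")
```

An empty result means none of these containers is present; it does not prove the file is free of every identifying detail, so treat it as a quick cross-check alongside the exiftool output.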

Metadata in Documents

Word documents (.docx), PDFs, and spreadsheets carry metadata that has exposed confidential sources in documented cases. Microsoft Office files embed:

  • Author name (pulled from the Office account)
  • Revision history, including deleted text recoverable via Track Changes
  • Total editing time
  • Last-saved timestamp and the path on disk where the file was saved (which may include a username or folder structure)
  • Printer name and identifier

The 2003 UK government Iraq dossier controversy involved a Word document whose revision history revealed the names of government officials who had edited the file — information not in the published version.

Stripping document metadata:

  • MAT2: handles .docx, .pdf, .odt, and more. The safest option.
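  • LibreOffice: File → Properties → General → Reset Properties clears basic metadata. A "Remove personal information on saving" checkbox is available under Tools → Options → LibreOffice → Security → Options.
  • For PDFs: prefer MAT2. exiftool -all= document.pdf hides the metadata, but ExifTool itself warns that PDF edits are reversible because the original data stays in the file; if you use it, rewrite the PDF afterward (for example with qpdf --linearize). Note that some PDFs contain embedded fonts or form data that may still carry identifying information.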
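A .docx file is a ZIP archive, and its author fields conventionally live in an XML part named docProps/core.xml. The sketch below reads those fields so you can audit a document before and after cleaning it. It assumes that conventional part name, reads only a handful of fields, and uses the illustrative name `docx_core_properties`; other parts such as docProps/app.xml (which holds editing time) are not covered.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the Office Open XML core-properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "revision": "cp:revision",
}


def docx_core_properties(path_or_file) -> dict:
    """Return the identifying core properties found in a .docx file, if any."""
    with zipfile.ZipFile(path_or_file) as archive:
        try:
            xml_bytes = archive.read("docProps/core.xml")
        except KeyError:
            return {}  # no core-properties part in this archive
    root = ET.fromstring(xml_bytes)
    result = {}
    for name, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            result[name] = element.text
    return result
```

Calling `docx_core_properties("report.docx")` on a hypothetical file returns a dictionary of whatever fields are populated; after a successful MAT2 pass you would expect it to come back empty or nearly so.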

Communication Metadata

Phone call records (who you called, when, and for how long) can be obtained from your carrier by the US government without a warrant under the third-party doctrine (Smith v. Maryland, 1979). The NSA's bulk telephone metadata program, revealed by Edward Snowden in 2013 and run under Section 215 of the Patriot Act, collected call records from US carriers. PRISM and upstream collection, disclosed in the same leaks, are separate programs aimed at internet communications.

End-to-end encryption of call content (as Signal provides) does not protect call metadata. Signal knows the timestamp of your last connection — it does not know who you called, but your carrier still records that your phone established a data connection to Signal's servers at specific times.

Practical steps: use Signal for both calls and messages, so that your carrier sees only encrypted traffic to Signal's servers rather than a record of who you contacted. Reduce the frequency of non-Signal contact with sensitive associates. Consider a dedicated device for high-sensitivity communications.

Network Metadata

Your IP address is attached to every connection you make. Your ISP maintains logs correlating IP addresses to subscriber accounts, typically retained for 12 months or longer depending on jurisdiction. Law enforcement can subpoena these records.

Tor Browser hides your destination from your ISP by routing traffic through three relays. Your ISP sees that you connected to the Tor network; it does not see where you went. A VPN hides your destination from your ISP but exposes it to the VPN provider — trading one trust relationship for another.

Time-of-access patterns remain a risk even over Tor. If you access a particular .onion site every day at 9:00 AM, an adversary who can observe both your connection to the Tor network and the traffic arriving at the onion service can match the two by timing and volume, a technique called traffic correlation (also known as an end-to-end confirmation attack). Onion service connections never leave the Tor network, so no exit node is involved. Good OPSEC includes varying access times and session lengths. Reviewing common Tor mistakes can help you avoid the most common network metadata errors.
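A toy model shows why timing alone is informative. Suppose an observer has the times at which a service received requests and the times at which each of several candidate users connected to Tor; the sketch below scores how well one candidate's timeline lines up with the service's. Everything here is a simplifying assumption: the function name `correlation_score`, the fixed matching window, and the sample timelines are hypothetical, and real correlation attacks also use traffic volume and packet timing.

```python
from bisect import bisect_left

DAY = 86400   # seconds per day
HOUR = 3600   # seconds per hour


def correlation_score(client_times, service_times, window_s=5.0):
    """Fraction of service-side events that have a client-side event
    within window_s seconds (timestamps in seconds)."""
    if not service_times:
        return 0.0
    client_sorted = sorted(client_times)
    matched = 0
    for t in service_times:
        # First client event at or after the start of the matching window.
        i = bisect_left(client_sorted, t - window_s)
        if i < len(client_sorted) and client_sorted[i] <= t + window_s:
            matched += 1
    return matched / len(service_times)


if __name__ == "__main__":
    # Hypothetical five-day observation: the service sees a request just after 9:00 each day.
    service = [d * DAY + 9 * HOUR + 2 for d in range(5)]
    # Candidate A connects to Tor at 9:00 sharp every day.
    candidate_a = [d * DAY + 9 * HOUR for d in range(5)]
    # Candidate B connects at unrelated times.
    candidate_b = [d * DAY + h * HOUR for d, h in enumerate([3, 14, 9, 21, 6])]
    print("A:", correlation_score(candidate_a, service))
    print("B:", correlation_score(candidate_b, service))
```

In this sample candidate A matches every service event while candidate B matches only one of five, which is the kind of separation a fixed daily habit hands to an observer.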

Frequently Asked Questions

What is metadata and why does it matter for privacy?

Metadata is information about your communications and files that describes context rather than content. It reveals who you contact, when, from where, and how often — which intelligence agencies and law enforcement have found more useful than message content for identifying networks and building cases. Encrypting content while ignoring metadata provides partial protection at best.

How do I remove EXIF data from photos?

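Install ExifTool and run exiftool -all= yourphoto.jpg, then delete the yourphoto.jpg_original backup it leaves behind (or add -overwrite_original). Verify the result with exiftool yourphoto.jpg: the output should contain no GPS, capture timestamp, or device entries. On mobile, use the Scrambled Exif app (Android); on iOS, the share sheet's Options screen lets you turn off Location before sharing a photo, which removes GPS data but not other EXIF fields. Signal strips EXIF automatically from photos sent through the app.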

Does Signal remove metadata from images?

Yes. Signal strips EXIF data from images before sending them. It does not strip metadata from files sent as documents (using the "File" send option rather than the photo attachment flow). If you are sending a document that may contain embedded metadata, strip it with MAT2 before attaching.