Web tracking

17 results across all content

Publications (6)

2025 · Preprint

Every Keystroke You Make: A Tech-Law Measurement and Analysis of Event Listeners for Wiretapping

arXiv Preprint (arXiv)

Shaoor Munir, Nurullah Demir, Qian Li, Konrad Kollnig, Zubair Shafiq

TL;DR: 38.52% of top websites install third-party keystroke listeners. We connect this invasive tracking to U.S. wiretapping laws.

We conduct a technical and legal analysis connecting JavaScript event listeners used by third-party trackers to U.S. wiretapping laws. Using an instrumented web browser to analyze the top-million websites, we discovered that 38.52% of websites installed third-party event listeners to intercept keystrokes, and that at least 3.18% of websites transmitted intercepted information to a third-party server. We demonstrate that captured data, such as email addresses entered in form fields, are leveraged for unsolicited marketing campaigns. We map this invasive tracking technique against federal and California wiretapping statutes, bridging the gap between emerging technical practices and decades-old legal frameworks designed to protect electronic communications privacy.

2025 · Preprint

SoK: Advances and Open Problems in Web Tracking

arXiv Preprint (arXiv)

Yash Vekaria, Yohan Beugin, Shaoor Munir, Gunes Acar, Nataliia Bielova, Steven Englehardt, Umar Iqbal, Alexandros Kapravelos, Pierre Laperdrix, Nick Nikiforakis, Jason Polakis, Franziska Roesner, Zubair Shafiq, Sebastian Zimmeck

TL;DR: Comprehensive systematization of web tracking research: techniques, defenses, and regulations shaping the evolving privacy landscape.

This paper consolidates research on web tracking by examining technical mechanisms, countermeasures, and regulations that shape the modern and rapidly evolving web tracking landscape. We synthesize fragmented literature across tracking techniques, defenses against tracking, and regulatory compliance. The field is experiencing transformative change due to industry shifts in advertising, browser adoption of anti-tracking features, and increased privacy regulation enforcement. We identify open research challenges and propose future directions for researchers, practitioners, and policymakers studying web tracking practices.

2024 · Report

Web Almanac 2024: Privacy Chapter

HTTP Archive Web Almanac (Web Almanac)

Yash Vekaria, Benjamin Standaert, Max Ostapenko, Abdul Haddi Amjad, Yana Dimova, Shaoor Munir, Chris Böttger, Umar Iqbal

TL;DR: Annual comprehensive analysis of privacy on the web: tracking prevalence, evasion techniques, browser policies, and regulatory compliance.

The Privacy chapter of the 2024 Web Almanac examines online tracking prevalence, privacy protection mechanisms, evasion techniques used by trackers, browser privacy policies, Privacy Sandbox proposals, and regulatory compliance across the web. This annual report provides data-driven insights into how privacy practices are evolving across the internet, analyzing millions of websites to understand the current state of web privacy.

2024 · Conference · Top-Tier · Best Artifact Award

Blocking Tracking JavaScript at the Function Granularity

ACM SIGSAC Conference on Computer and Communications Security (CCS) · 19% acceptance

Abdul Haddi Amjad, Shaoor Munir, Zubair Shafiq, Muhammad Ali Gulzar

TL;DR: Not.js blocks tracking JavaScript at function-level granularity with 94% precision and 98% recall, without breaking websites.

Modern websites extensively rely on JavaScript to implement both functionality and tracking. Existing privacy-enhancing content blocking tools struggle against mixed scripts, which simultaneously implement both functionality and tracking, because blocking the script would break functionality and not blocking it would allow tracking. We propose Not.js, a fine-grained JavaScript blocking tool that operates at function-level granularity. Not.js's strengths lie in analyzing the dynamic execution context, including the call stack and calling context of each JavaScript function, and then encoding this context to build a rich graph representation. Not.js trains a supervised machine learning classifier on a webpage's graph representation to first detect tracking at the JavaScript function level and then automatically generate surrogate scripts that preserve functionality while removing tracking. Our evaluation of Not.js on the top 10K websites demonstrates that it achieves high precision (94%) and recall (98%) in detecting tracking JavaScript functions, outperforming the state of the art while being robust against off-the-shelf JavaScript obfuscation. Fine-grained detection of tracking functions allows Not.js to automatically generate surrogate scripts that remove tracking JavaScript functions without causing major breakage. Our deployment of Not.js shows that mixed scripts are present on 62.3% of the top 10K websites, with 70.6% of the mixed scripts being third-party scripts that engage in tracking activities such as cookie ghostwriting. We share a sample of the tracking functions detected by Not.js within mixed scripts not currently on filter lists with filter list authors, who confirm that these scripts are not blocked due to potential functionality breakage, despite being known to implement tracking.

2024 · Conference · Top-Tier

PURL: Safe and Effective Sanitization of Link Decoration

USENIX Security Symposium (USENIX Security) · 17% acceptance

Shaoor Munir, Patrick Lee, Umar Iqbal, Zubair Shafiq, Sandra Siby

TL;DR: PURL uses ML to sanitize tracking information from URL parameters while preserving website functionality.

While privacy-focused browsers have taken steps to block third-party cookies and browser fingerprinting, novel tracking methods that bypass existing defenses continue to emerge. Since trackers need to exfiltrate information from the client side to the server side through link decoration regardless of the tracking technique they employ, a promising orthogonal approach is to detect and sanitize tracking information in decorated links. We present PURL, a machine-learning approach that leverages a cross-layer graph representation of webpage execution to safely and effectively sanitize link decoration. Our evaluation shows that PURL significantly outperforms existing countermeasures in terms of accuracy and reducing website breakage while being robust to common evasion techniques. We use PURL to perform a measurement study on the top-million websites. We find that link decorations are widely abused by well-known advertisers and trackers to exfiltrate user information collected from browser storage, email addresses, and scripts involved in fingerprinting.
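
To make "sanitizing a decorated link" concrete, here is a minimal Python sketch. It is not PURL's method: PURL learns which decorations to remove from a graph representation of page execution, whereas this sketch uses a small, hand-picked parameter list (`TRACKING_PARAMS`, an assumption made purely for illustration), and the function name `sanitize_link` is likewise hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: a hand-picked set of query parameters often associated with
# tracking. PURL does not use a fixed list; this is only an illustration.
TRACKING_PARAMS = {"gclid", "fbclid", "utm_source", "utm_medium", "utm_campaign"}


def sanitize_link(url: str) -> str:
    """Return the URL with any query parameter in TRACKING_PARAMS removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    # Rebuild the URL, leaving scheme, host, path, and fragment untouched.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(sanitize_link("https://example.com/item?id=42&fbclid=abc123&utm_source=news"))
# prints https://example.com/item?id=42
```

A fixed list like this misses renamed or site-specific parameters and can strip parameters a site needs, which is the kind of breakage and evasion the abstract says a learned approach aims to avoid.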

2023 · Conference · Top-Tier

COOKIEGRAPH: Measuring and Countering First-Party Tracking Cookies

ACM SIGSAC Conference on Computer and Communications Security (CCS) · 19% acceptance

Shaoor Munir, Sandra Siby, Umar Iqbal, Steven Englehardt, Zubair Shafiq, Carmela Troncoso

TL;DR: First-party tracking cookies exist on 89.86% of websites. CookieGraph detects them with 90% accuracy without breaking SSO.

As third-party cookie blocking is becoming the norm in mainstream web browsers, advertisers and trackers have started to use first-party cookies for tracking. To understand this phenomenon, we conduct a differential measurement study with versus without third-party cookies. We find that first-party cookies are used to store and exfiltrate identifiers to known trackers even when third-party cookies are blocked. As opposed to third-party cookie blocking, first-party cookie blocking is not practical because it would result in major breakage of website functionality. We propose CookieGraph, a machine learning-based approach that can accurately and robustly detect and block first-party tracking cookies. CookieGraph detects first-party tracking cookies with 90.18% accuracy, outperforming the state-of-the-art CookieBlock by 17.31%. We show that CookieGraph is robust against cookie name manipulation, while CookieBlock's accuracy drops by 15.87%. While blocking all first-party cookies results in major breakage on 32% of the sites with SSO logins, and CookieBlock reduces it to 10%, we show that CookieGraph does not cause any major breakage on these sites. Our deployment of CookieGraph shows that first-party tracking cookies are used on 89.86% of the top-million websites. We find that 96.61% of these first-party tracking cookies are in fact ghostwritten by third-party scripts embedded in the first-party context. We also find evidence of first-party tracking cookies being set by fingerprinting scripts. The most prevalent first-party tracking cookies are set by major advertising entities such as Google, Facebook, and TikTok.
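
The notion of a "ghostwritten" first-party cookie can be illustrated with a short Python sketch. It is not CookieGraph's classifier; it only checks whether the script that set a cookie in the page's first-party context was served from a different site. The helper names (`naive_site`, `is_ghostwritten`) are hypothetical, and treating the last two host labels as the site is a simplification; a real tool would consult the Public Suffix List.

```python
from urllib.parse import urlsplit


def naive_site(host: str) -> str:
    """Approximate the registrable domain as the last two host labels.

    Simplification: this is wrong for suffixes such as co.uk; a real
    tool would use the Public Suffix List instead.
    """
    return ".".join(host.lower().split(".")[-2:])


def is_ghostwritten(page_url: str, setter_script_url: str) -> bool:
    """True if a first-party cookie was set by a script from another site."""
    page_site = naive_site(urlsplit(page_url).hostname or "")
    script_site = naive_site(urlsplit(setter_script_url).hostname or "")
    return page_site != script_site


print(is_ghostwritten("https://shop.example.com/cart", "https://cdn.tracker.net/t.js"))
# prints True
print(is_ghostwritten("https://shop.example.com/cart", "https://static.example.com/app.js"))
# prints False
```

This check says nothing about whether the cookie is actually used for tracking; it only captures who wrote it.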

Talks (7)

Evaluating Large Language Models as a Defense Against Online Tracking

Ad-Filtering Dev Summit 2024 · October 2024

Exploring how LLMs can be leveraged to detect and block tracking JavaScript at function-level granularity, enabling fine-grained privacy protection while preserving website functionality.

PURL: Safe and Effective Sanitization of Link Decoration

USENIX Security 2024 · August 2024

Presenting a machine-learning approach that uses cross-layer graph representation of webpage execution to safely and effectively sanitize tracking information in decorated links.

Beyond Third-Party Cookies: Safeguarding User Data from Storage and Exfiltration with CookieGraph and PURL

IMDEA Networks · November 2023

A comprehensive talk covering two complementary approaches to combat emerging tracking techniques: CookieGraph for first-party cookie tracking and PURL for link decoration tracking.

COOKIEGRAPH: Measuring and Countering First-Party Tracking Cookies

ACM CCS 2023 · November 2023

Presenting a machine learning-based approach that accurately detects and blocks first-party tracking cookies that are increasingly used as third-party cookies become blocked by browsers.

What you don't remove can track you: Measuring and detecting tracking decorations

Ad-Filtering Dev Summit 2023 · October 2023

Discussing how link decorations are abused by advertisers and trackers to exfiltrate user information, and how PURL can detect and sanitize these tracking decorations.

COOKIEGRAPH: Measuring and Countering First Party Tracking Cookies

Ad-Filtering Dev Summit 2022 · October 2022

Early presentation of CookieGraph research showing how first-party tracking cookies are used on 89.86% of top websites, with 96.61% being ghostwritten by third-party scripts.

First-Party Tracking Cookies

DataSkeptic Podcast · September 2022

A podcast discussion explaining first-party tracking cookies, how they differ from third-party cookies, and the implications for user privacy as browsers block third-party cookies.

Teaching (1)

ECS 152A · Fall 2022

Computer Networks

UC Davis

Delivered guest lectures on web tracking and privacy, and conducted lab sessions on network protocols.

Media Coverage (3)