Canvas fingerprinting is a technique brought to light by Mowery and Shacham [1], [2] in their paper entitled “Pixel Perfect: Fingerprinting Canvas in HTML5”. It relies on the canvas API to generate images whose rendering differs depending on the user's OS, browser, and device, which makes it suitable for tracking on the web. Several studies have looked at its adoption among popular websites [3]. In particular, Englehardt and Narayanan [4] showed that, in 2016, canvas fingerprinting was used by 1.6% of the sites in the Alexa top 1M.
How are canvas fingerprints crafted?
The instructions used to generate a canvas fingerprint need to be chosen carefully depending on the desired goal. In the case of tracking and fraud detection, the goal is to obtain a canvas that is as unique and stable as possible in order to distinguish the maximum number of users. In a paper I published with other researchers [5], we showed that canvas fingerprints such as the one collected by the popular FingerprintJS2 library tend to remain stable for more than 290 days for the majority of browsers.
When canvas fingerprinting is used for crawler detection, one does not necessarily want a unique canvas. In their Picasso paper, Bursztein et al. [6] craft canvas fingerprints that are identical across devices of the same kind. Their goal is not to obtain a unique canvas but a canvas that is the same for a given device model. Using this kind of canvas fingerprint, they are able to detect emulated devices as well as users who lie about their identity by modifying their user agent.
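The idea of checking a claimed identity against a device-class canvas can be illustrated with a minimal sketch. This is not the Picasso protocol itself: the lookup table, the class labels, the hash values, and the function name are all hypothetical and only show the comparison step.

```python
from typing import Dict

# Hypothetical reference table: device class -> canvas hash expected for it.
# The labels and hash values are placeholders, not real measurements.
EXPECTED_CANVAS_HASH: Dict[str, str] = {
    "iPhone/Safari": "a3f1",
    "Windows/Chrome": "9c0e",
}


def claimed_class_is_consistent(claimed_class: str, observed_hash: str) -> bool:
    """Return True when the observed canvas hash matches the claimed class."""
    expected = EXPECTED_CANVAS_HASH.get(claimed_class)
    # An unknown class cannot be verified, so it is treated as inconsistent.
    return expected is not None and expected == observed_hash
```

With this table, a client whose user agent claims "iPhone/Safari" but whose canvas hashes to "9c0e" would be flagged, since that hash is the one expected for "Windows/Chrome".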
Use of canvas fingerprinting on the web
We crawl the Alexa top 500K to study how the use of canvas fingerprinting has evolved since the 2016 study by Englehardt and Narayanan [4]. The methodology is the following: for each website in the Alexa top 500K, we visit the home page and wait for the DOMContentLoaded event to be triggered. The crawler then waits for 15 seconds and records calls to functions used for canvas fingerprinting, such as fillText, which writes text to a canvas, and toDataURL, which returns the content of a canvas encoded as a data URL. Whenever a canvas-related function is called, we store in a database the name of the function, its arguments, and the URL of the script that called it.
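The storage step can be sketched as follows. This is only an illustration of the kind of record described above, not the crawler's actual schema: the table name, the column names, the extra `site` column, and the example values are assumptions.

```python
import json
import sqlite3
from typing import Any, List


def create_call_log(conn: sqlite3.Connection) -> None:
    """Create a table holding one row per recorded canvas API call."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS canvas_calls (
            site TEXT NOT NULL,           -- assumed: crawled website
            function_name TEXT NOT NULL,  -- e.g. fillText, toDataURL
            arguments TEXT NOT NULL,      -- call arguments, JSON-encoded
            script_url TEXT NOT NULL      -- script that made the call
        )
        """
    )


def record_call(conn: sqlite3.Connection, site: str, function_name: str,
                arguments: List[Any], script_url: str) -> None:
    """Store the function name, its arguments, and the calling script URL."""
    conn.execute(
        "INSERT INTO canvas_calls VALUES (?, ?, ?, ?)",
        (site, function_name, json.dumps(arguments), script_url),
    )


# Example with made-up values, using an in-memory database.
conn = sqlite3.connect(":memory:")
create_call_log(conn)
record_call(conn, "example.com", "fillText",
            ["Cwm fjordbank glyphs vext quiz", 2, 15],
            "https://example.com/fp.js")
record_call(conn, "example.com", "toDataURL", [], "https://example.com/fp.js")
```

Encoding the arguments as JSON keeps a single text column while still allowing the first argument of fillText to be inspected later.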
Once the 500K websites have been crawled, we analyze the database to obtain a list of scripts doing canvas fingerprinting. To avoid classifying every use of the canvas API as canvas fingerprinting, we consider only scripts that called the fillText function with a string of at least 7 characters. In total, we identify 3,825 sites (0.77%) among the Alexa top 500K that use canvas fingerprinting on their home page. This is lower than the 1.6% of the top 1M websites reported by Englehardt and Narayanan [4], which seems to indicate a decrease in the use of canvas fingerprinting. Nevertheless, further crawls should be conducted, since we only crawled the home page of the 500K websites and fingerprinting may be used on critical pages, such as login pages, that we did not crawl.
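The filtering rule and the resulting percentage can be expressed as a short sketch. The `CanvasCall` record and the function names are hypothetical; only the fillText condition, the 7-character threshold, and the 3,825 and 500K figures come from the text.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class CanvasCall(NamedTuple):
    """One recorded canvas API call (hypothetical record layout)."""
    site: str
    function_name: str
    arguments: Tuple
    script_url: str


MIN_TEXT_LENGTH = 7  # threshold stated in the text


def fingerprinting_scripts(calls: Iterable[CanvasCall]) -> Set[str]:
    """Return URLs of scripts that called fillText with a long enough string."""
    scripts = set()
    for call in calls:
        if call.function_name != "fillText" or not call.arguments:
            continue
        text = call.arguments[0]  # first argument of fillText is the text
        if isinstance(text, str) and len(text) >= MIN_TEXT_LENGTH:
            scripts.add(call.script_url)
    return scripts


def share_of_sites(flagged_sites: int, crawled_sites: int) -> float:
    """Percentage of crawled sites that were flagged."""
    return 100 * flagged_sites / crawled_sites
```

For the numbers above, `share_of_sites(3825, 500_000)` evaluates to 0.765, which is the 0.77% reported in the text after rounding.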
Diversity of canvas fingerprints
In total, we encountered 69 different canvas on 3,825 distinct websites. The bar chart below presents the distribution of the number of websites on which each canvas is present. We observe that 24.6% of the canvas (17 canvas) are present on a single website only. On the other hand, only 5.8% of the canvas (4 canvas) are present on more than 500 websites. The 2 most used canvas are present on 1,241 and 900 websites, respectively. Thus, while we encountered 69 different canvas, most websites rely on one of a handful of popular canvas fingerprints.
We now present the main canvas fingerprints encountered during our crawl. We order them from the most popular to the least popular. You can download an archive containing all the canvas: archive.
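The percentages above can be rechecked from the per-canvas counts listed in the table of this section. The sketch below copies those counts into a list; the variable and function names are illustrative only.

```python
from typing import Callable, List

# Number of websites on which each of the 69 canvas was observed,
# copied from the table in this section (most popular first).
SITES_PER_CANVAS: List[int] = (
    [1241, 900, 556, 556, 182, 174, 125, 87, 77, 57, 56, 55, 46, 33, 30,
     24, 24, 24, 19, 17, 14, 12]
    + [9] * 10 + [8] * 2 + [7, 6] + [5] * 3 + [4]
    + [3] * 6 + [2] * 6 + [1] * 17
)


def share(counts: List[int], predicate: Callable[[int], bool]) -> float:
    """Percentage of canvas whose website count satisfies the predicate."""
    matching = sum(1 for count in counts if predicate(count))
    return 100 * matching / len(counts)


single_site = share(SITES_PER_CANVAS, lambda count: count == 1)
over_500_sites = share(SITES_PER_CANVAS, lambda count: count > 500)
```

Rounded to one decimal, `single_site` is 24.6 (17 of 69 canvas) and `over_500_sites` is 5.8 (4 of 69 canvas), matching the figures in the text.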
Number of websites where the canvas is present | Canvas |
---|---|
1241 | |
900 | |
556 | |
556 | |
182 | |
174 | |
125 | |
87 | |
77 | |
57 | |
56 | |
55 | |
46 | |
33 | |
30 | |
24 | |
24 | |
24 | |
19 | |
17 | |
14 | |
12 | |
9 | |
9 | |
9 | |
9 | |
9 | |
9 | |
9 | |
9 | |
9 | |
9 | |
8 | |
8 | |
7 | |
6 | |
5 | |
5 | |
5 | |
4 | |
3 | |
3 | |
3 | |
3 | |
3 | |
3 | |
2 | |
2 | |
2 | |
2 | |
2 | |
2 | |
1 | |
1 | |
1 | |
1 | |
1 | |
1 | |
1 | |
1 | |
1 | |
1 | |
1 | |
1 | |
1 | |
1 | |
1 | |
1 | |
1 | |
References
1. K. Mowery and H. Shacham, “Pixel perfect: Fingerprinting canvas in HTML5,” in Proceedings of W2SP, 2012, pp. 1–12.
2. A. Lerner, A. K. Simpson, T. Kohno, and F. Roesner, “Internet Jones and the raiders of the lost trackers: An archaeological study of web tracking from 1996 to 2016,” in 25th USENIX Security Symposium (USENIX Security 16), 2016.
3. G. Acar, C. Eubank, S. Englehardt, M. Juarez, A. Narayanan, and C. Diaz, “The web never forgets: Persistent tracking mechanisms in the wild,” in Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, 2014, pp. 674–689.
4. S. Englehardt and A. Narayanan, “Online tracking: A 1-million-site measurement and analysis,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016, pp. 1388–1401.
5. A. Vastel, P. Laperdrix, W. Rudametkin, and R. Rouvoy, “FP-STALKER: Tracking browser fingerprint evolutions,” in 2018 IEEE Symposium on Security and Privacy (SP), 2018, pp. 728–741.
6. E. Bursztein, A. Malyshev, T. Pietraszek, and K. Thomas, “Picasso: Lightweight device class fingerprinting for web clients,” in Proceedings of the 6th Workshop on Security and Privacy in Smartphones and Mobile Devices, 2016, pp. 93–102.