Issue No. 06 - Nov.-Dec. (2014 vol. 18)
ISSN: 1089-7801
pp: 52-59
Andrew G. West, Verisign Labs
Adam J. Aviv, US Naval Academy
ABSTRACT
Publicly posted URLs sometimes contain a wealth of information about the identities and activities of the users who share them. URLs often utilize query strings -- that is, key-value pairs appended to the URL path -- to pass session parameters and form data. Although often benign and necessary to render the Web page, query strings sometimes contain tracking mechanisms, usernames, email addresses, and other information that users might not wish to publicly reveal. In isolation, this isn't particularly problematic, but the growth of Web 2.0 platforms such as social networks and microblogging means URLs, which are often copied and pasted from Web browsers, are increasingly publicly broadcast. To study URL sharing's privacy ramifications, the authors ran a measurement study that looked at 892 million user-submitted URLs, many disseminated in semipublic forums. That corpus contained a trove of personal information, including 1.7 million email addresses. In the most egregious examples, query strings contain plaintext usernames and passwords for administrative and sensitive accounts. The authors identify data leakage via both key-driven and value-driven analysis using manual inspections and automatic detection logic. Additionally, they analyze the click-through rates of sensitive URLs, examine geographical and mobile behavior patterns, and measure the broader statistical properties of key-value pairs. Finally, they propose a CleanURL service that can "scrub" URLs of privacy-violating content.
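To make the key-driven and value-driven distinction concrete, the following is a minimal Python sketch of query-string scrubbing. It is not the authors' CleanURL implementation or detection logic: the key list `SENSITIVE_KEYS`, the simplified email pattern `EMAIL_RE`, and the function names `is_sensitive` and `scrub_url` are all assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical key list for illustration; the paper's actual key set is not given here.
SENSITIVE_KEYS = {"user", "username", "email", "password", "pass", "pwd"}

# Deliberately simplified email pattern (an assumption, not a full address grammar).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_sensitive(key, value):
    """Flag a pair if its key looks sensitive (key-driven)
    or its value looks like an email address (value-driven)."""
    if key.lower() in SENSITIVE_KEYS:
        return True
    return EMAIL_RE.match(value) is not None


def scrub_url(url):
    """Return the URL with flagged key-value pairs dropped from its query string."""
    parts = urlsplit(url)
    # parse_qsl percent-decodes keys and values, so %40 is seen as '@'.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(k, v) for k, v in pairs if not is_sensitive(k, v)]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    print(scrub_url("http://example.com/page?id=42&email=alice%40example.com&pwd=secret"))
    print(scrub_url("http://example.com/?ref=bob%40example.org&q=cats"))
```

In the first call the pairs are dropped because of their keys; in the second, `ref` is an innocuous key, but its value matches the email pattern, so only a value-driven check catches it. Removing a parameter can change what the page renders, so a real service would presumably need to verify that the scrubbed URL still resolves to the same content.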
INDEX TERMS
Privacy, Entropy, Electronic mail, Mobile communication, Internet, Mobile handsets, Uniform resource locators, Computer security, Query processing, URL sanitization, URL security, URL privacy, query string tracking, Internet computing, Internet security
CITATION
Andrew G. West, Adam J. Aviv, "Measuring Privacy Disclosures in URL Query Strings", IEEE Internet Computing, vol. 18, no. 6, pp. 52-59, Nov.-Dec. 2014, doi:10.1109/MIC.2014.104