Did I really agree to that?: making sense of policies you didn’t read with models that actually did

Shikha Soneji; Mitchell Hoesing; Sujay Koujalgi; Jonathan Dodge

doi:10.47989/ir31iConf64157

Authors

Shikha Soneji Pennsylvania State University
Mitchell Hoesing Pennsylvania State University
Sujay Koujalgi Pennsylvania State University
Jonathan Dodge Pennsylvania State University

DOI:

https://doi.org/10.47989/ir31iConf64157

Keywords:

Privacy policy, Terms of service, Classification

Abstract

Introduction. While Privacy Policies and Terms of Service (ToS) are intended to inform users, in practice they often overwhelm, mislead, and confuse. This work investigates automated techniques for analyzing such legal documents, with the goal of supporting user comprehension and regulatory auditing.

Method. We use expert-driven annotations from Terms of Service; Didn’t Read (ToS;DR) to train classification models that assign case labels to individual sentences. We also develop a classifier to distinguish document types: Privacy Policy, ToS, or other legal text.

Analysis. Models were evaluated using F1 score, and we compared traditional fine-tuned models (RoBERTa, PrivBERT) against GPT-4 Turbo. We then applied the best-performing models to real-world policies to uncover conceptual overlaps between Privacy Policies and ToS.

Results. Our case classifier achieved a 0.73 F1 score, while the document-type classifier reached 0.79. GPT-4 Turbo performed worse on case classification (0.58 F1). We found that GDPR-relevant clauses often appear in both Privacy Policies and ToS, blurring the distinction between the two and raising the risk of user misinterpretation and regulatory non-compliance.

Conclusion. By surfacing hidden structures and overlapping clauses, our system enhances transparency and supports digital literacy by making complex documents more accessible. This work lays the foundation for tools that promote user agency and platform accountability.

