Mapping the Geometry of Law Using Natural Language Processing

Sandeep Bhupatiraju; Daniel Chen; Kannan Venkataramanan

doi:10.62355/ejels.18073

Authors

Sandeep Bhupatiraju, World Bank
Daniel L. Chen, Toulouse School of Economics, https://orcid.org/0000-0002-5774-2211
Kannan Venkataramanan, World Bank

Keywords:

natural language processing, legal language, judicial reasoning, judicial behavior, embeddings

Abstract

Judicial documents and judgments are a rich source of information about legal cases, litigants, and judicial decision-makers. Natural language processing (NLP) approaches have recently received much attention for their ability to extract implicit information from text. NLP researchers have developed data-driven representations of text as dense vectors that encode relations among the texts they represent. In this study, we apply the Doc2Vec model to legal language to understand judicial reasoning and to identify implicit patterns in judgments and among judges. In an application to federal appellate courts, we show that these vectors encode information that distinguishes courts across time and legal topics. We use the Doc2Vec document embeddings to study these patterns and to train a classifier that predicts which cases have a high chance of being appealed to the Supreme Court of the United States (SCOTUS). There are no existing benchmarks, and we present the first results on this task at scale. Furthermore, we analyze the general writing and judgment patterns of prominent judges using deep learning-based autoencoder models. Overall, we observe that Doc2Vec document embeddings capture important legal information and are helpful in downstream tasks.
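
As a rough illustration of what it means for document vectors to distinguish courts, the sketch below averages the vectors in each group and compares one document to each group centroid by cosine similarity. It is not the authors' pipeline. The three-dimensional vectors and the labels `court_A` and `court_B` are hypothetical toy values standing in for embeddings that a trained Doc2Vec model would produce, and the helper names `cosine` and `centroids` are introduced here only for illustration.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroids(vectors, labels):
    """Average the vectors that share a label, one centroid per label."""
    groups = defaultdict(list)
    for vec, label in zip(vectors, labels):
        groups[label].append(vec)
    return {
        label: [sum(column) / len(vecs) for column in zip(*vecs)]
        for label, vecs in groups.items()
    }


# Hypothetical toy embeddings (assumption: real ones would come from Doc2Vec
# and have far more dimensions).
doc_vectors = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.2],
    [0.0, 0.8, 0.3],
]
doc_labels = ["court_A", "court_A", "court_B", "court_B"]

group_vecs = centroids(doc_vectors, doc_labels)

# Compare the first document with its own group and with the other group.
same = cosine(doc_vectors[0], group_vecs["court_A"])
other = cosine(doc_vectors[0], group_vecs["court_B"])
print(f"similarity to own court:   {same:.3f}")
print(f"similarity to other court: {other:.3f}")
```

If embeddings carry court-level information, a document should on average sit closer to its own group's centroid than to another group's. The same grouping could be done by year or legal topic instead of court.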
