Querying texts with the AntConc corpus software
DOI:
https://doi.org/10.36002/litera.v12i1.4987
Keywords:
AntConc, Corpus Linguistics, Indonesian, Reduplication, Regular Expressions
Abstract
AntConc is an open-source corpus linguistics tool that offers basic corpus-linguistic analytical techniques. One of its most powerful features is its range of pattern-searching methods. This paper describes these search methods and illustrates how they can be used to query phonological, morphological, and syntactic phenomena. It also discusses the more advanced search method, Regular Expressions (RegEx), and illustrates its use for querying a rather complex morphological phenomenon in Indonesian, namely reduplication. RegEx search output is compared with output from the basic Wildcard search feature to highlight the sophistication and flexibility of RegEx in producing more targeted and specific results. Overall, the paper underscores the importance of mastering AntConc's different query methods, which vary in complexity, for investigating a wider range of data-intensive linguistic questions.
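The contrast between a broad wildcard-style query and a targeted RegEx query for reduplication can be sketched outside AntConc with Python's `re` module. This is only an illustration under stated assumptions: the sample sentence, the function names, and both patterns are hypothetical and are not the queries used in the paper, and AntConc's own Wildcard syntax is only approximated here by a regular expression that accepts any hyphenated word pair. The key idea is the backreference `\1`, which requires the part after the hyphen to repeat the part before it.

```python
import re

# Hypothetical sample sentence (not taken from the paper's corpus).
SAMPLE = "Di pasar, anak-anak berjalan-jalan membeli sayur-mayur dan buku-buku."

# Approximation of a broad wildcard-style search: any two word strings
# joined by a hyphen, whether or not the two halves are identical.
HYPHENATED = re.compile(r"\b\w+-\w+\b")

# Targeted RegEx: the backreference \1 only matches if the second half
# repeats the captured first half, i.e. full reduplication.
FULL_REDUPLICATION = re.compile(r"\b(\w+)-\1\b", re.IGNORECASE)


def find_hyphenated(text: str) -> list[str]:
    """Return every hyphenated word pair (the broad, wildcard-like result)."""
    return HYPHENATED.findall(text)


def find_full_reduplication(text: str) -> list[str]:
    """Return only the forms whose two halves are identical."""
    return [m.group(0) for m in FULL_REDUPLICATION.finditer(text)]


if __name__ == "__main__":
    print(find_hyphenated(SAMPLE))
    # ['anak-anak', 'berjalan-jalan', 'sayur-mayur', 'buku-buku']
    print(find_full_reduplication(SAMPLE))
    # ['anak-anak', 'buku-buku']
```

In this sketch the broad pattern also returns the affixed form berjalan-jalan and the sound-changing form sayur-mayur, whereas the backreference pattern keeps only the fully reduplicated forms. Capturing those other reduplication types would need separate, more elaborate patterns.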
Copyright (c) 2025 LITERA : Jurnal Bahasa Dan Sastra

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.