Monolingual Concordancers
2023-07-13 09:25:41    etogether.net    Source: Internet


A concordancer is a tool that retrieves all the occurrences of a particular search pattern, together with their immediate contexts, and displays them in an easy-to-read format. Some concordancers operate by searching through the entire corpus from beginning to end every time a search pattern is entered. Others work by first creating an index of all the words in the corpus along with a record of the location of each occurrence (e.g., line number), as illustrated in figure 1. The index must be created prior to conducting any searches in a given corpus, but once created, it can be consulted during all subsequent searches for particular items in that corpus.


The advantage of a full-text search is that no pre-processing is required and the corpus can be easily modified (e.g., texts can be added or removed); however, the larger the corpus, the longer a search will take. In contrast, indexed searching requires the preparation of an index ahead of time, but once the index has been created, searches can be conducted relatively quickly, even if the corpus is very large. However, if any changes are made to the corpus, such as the addition or removal of a text, a new index must be created.
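
The trade-off between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the way any particular concordancer is implemented: the four-line sample corpus is invented, the function names (tokenize, full_text_search, build_index, indexed_search) are hypothetical, and locations are recorded as line numbers, as in figure 1.

```python
import re
from collections import defaultdict

def tokenize(line):
    """Split a line into lower-case word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z0-9'-]+", line.lower())

def full_text_search(lines, word):
    """Scan the whole corpus on every query; no pre-processing is needed."""
    word = word.lower()
    return [n for n, line in enumerate(lines, start=1) if word in tokenize(line)]

def build_index(lines):
    """Map each word to the line numbers where it occurs (built once, up front)."""
    index = defaultdict(list)
    for n, line in enumerate(lines, start=1):
        for token in tokenize(line):
            # Record each line number only once per word.
            if not index[token] or index[token][-1] != n:
                index[token].append(n)
    return dict(index)

def indexed_search(index, word):
    """Answer a query with a single dictionary lookup."""
    return index.get(word.lower(), [])

# Invented sample corpus, one text line per entry.
corpus = [
    "A macro virus spreads through documents.",
    "Install virus protection software.",
    "The scanner checks each file.",
    "Each virus signature is stored in a database.",
]

index = build_index(corpus)
print(full_text_search(corpus, "virus"))
print(indexed_search(index, "virus"))

# If the corpus changes, the old index no longer reflects it and must be rebuilt.
corpus.append("A new virus was reported today.")
index = build_index(corpus)
print(indexed_search(index, "virus"))
```

Both searches return the same line numbers. The difference is that full_text_search rereads every line for each query, whereas indexed_search pays that cost once in build_index and must pay it again whenever a text is added or removed.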


Once a search has been conducted, the results are displayed for the user. The most common display format is known as a KWIC ("key word in context") display. In a KWIC display, all occurrences of the search pattern are lined up in the centre of the screen. The extent of the context on either side of the search pattern is variable and can often be specified by the user.
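
A KWIC display of this kind can be approximated as follows. The sketch assumes the corpus is available as a single string, measures the context in characters rather than words, and uses a hypothetical function name (kwic) and an invented sample text; real concordancers may define the context window differently.

```python
import re

def kwic(text, pattern, width=30):
    """Return one display row per occurrence, with the search pattern aligned."""
    text = " ".join(text.split())  # collapse line breaks and repeated spaces
    rows = []
    for match in re.finditer(r"\b" + re.escape(pattern) + r"\b", text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so every hit starts in the same column.
        rows.append(f"{left:>{width}} {match.group(0)} {right}".rstrip())
    return rows

# Invented sample text.
sample = (
    "A macro virus spreads through documents. Install virus protection "
    "software so that each virus signature can be checked."
)

for row in kwic(sample, "virus", width=25):
    print(row)
```

The width parameter plays the role of the user-specified context: a larger value shows more text on either side of the search pattern.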



Figure 1 A sample extract from an index indicating the words contained in the corpus and the location of each occurrence.



The KWIC display in figure 2 shows the concordance produced for the search pattern virus.


As with word-frequency lists, these contexts can be sorted in a variety of ways, such as order of appearance in the corpus (as shown in figure 2), or alphabetically according to the words preceding or following the search pattern, as illustrated in figures 3 and 4.


Sorting the data helps to reveal patterns that might otherwise go undetected. For instance, in the KWIC display sorted according to the word preceding the search pattern (figure 3), a cluster of contexts for the multi-word unit "macro virus" is revealed. Similarly, in the KWIC display sorted by the word following the search pattern (figure 4), clusters for the multi-word units "virus protection" and "virus signature" come to light.
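
The effect of sorting can be reproduced with a short sketch. It assumes each hit is stored as a tuple of (left context words, keyword, right context words); the function names and the sample sentence are invented for illustration and are not taken from the figures.

```python
def concordance(tokens, keyword, span=3):
    """Collect (left words, keyword, right words) for each hit, in corpus order."""
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword:
            hits.append((tokens[max(0, i - span):i], token, tokens[i + 1:i + 1 + span]))
    return hits

def sort_by_preceding(hits):
    """Order hits alphabetically by the word immediately before the keyword."""
    return sorted(hits, key=lambda hit: hit[0][-1].lower() if hit[0] else "")

def sort_by_following(hits):
    """Order hits alphabetically by the word immediately after the keyword."""
    return sorted(hits, key=lambda hit: hit[2][0].lower() if hit[2] else "")

# Invented sample text, already split into words.
tokens = (
    "the macro virus was found but virus protection failed because a macro "
    "virus hid while the virus signature changed and virus protection lagged"
).split()

hits = concordance(tokens, "virus")

for left, key, right in sort_by_preceding(hits):
    print(f"{' '.join(left):>20} {key} {' '.join(right)}")
print()
for left, key, right in sort_by_following(hits):
    print(f"{' '.join(left):>20} {key} {' '.join(right)}")
```

In the first listing the two "macro virus" contexts end up on adjacent rows; in the second, the two "virus protection" contexts do, which is the kind of clustering that makes multi-word units easy to spot.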



