QLVL-bibliography-publications_abstracts


This publication list was last updated on 16-06-2015.
Heylen, Kris; De Hertog, Dirk (2015)
Automatic Term Extraction
This chapter focuses on computational approaches to the automatic extraction of terms from domain specific corpora. The different subtasks of Automatic Term Extraction are presented in detail, including corpus compilation, unithood, termhood and variant detection, and system evaluation.

Heylen, Kris; Wielfaert, Thomas; Speelman, Dirk; Geeraerts, Dirk (2015)
Monitoring Polysemy. Word Space Models as a Tool for Large-Scale Lexical Semantic Analysis
This paper demonstrates how token-level word space models (a distributional semantic technique that was originally developed in statistical natural language processing) can be developed into a heuristic tool to support lexicological and lexicographical analyses of large amounts of corpus data. The paper provides a non-technical introduction to the statistical methods and illustrates with a case study analysis of the Dutch polysemous noun 'monitor' how token-level word space models in combination with visualisation techniques allow human analysts to identify semantic patterns in an unstructured set of attestations. Additionally, we show how the interactive features of the visualisation make it possible to explore the effect of different contextual factors on the distributional model.

Tummers, José; Sterckx, Lieve; Vanhoren, Dominique (2015)
(On)nut van Nederlandse taalbeheersing in het hoger onderwijs
Higher education offers various initiatives to bring the writing skills of incoming students up to standard, especially for writing business and academic texts. After a decade, the time has come to measure the efficiency of these activities. This contribution maps the evolution of the written business language proficiency of first-year students in the professional bachelor’s programme in Business Management (Bedrijfsmanagement) at UC Leuven-Limburg.

Zenner, Eline; Speelman, Dirk; Geeraerts, Dirk (2015)
A sociolinguistic analysis of borrowing in weak contact situations: English loanwords and phrases in expressive utterances in a Dutch reality TV show
This paper presents a quantitative corpus-based variationist analysis of the English insertions used by Belgian Dutch and Netherlandic Dutch participants in the reality TV show "Expeditie Robinson". The data consists of manual transcriptions of 35 hours of recordings for 46 speakers from 3 seasons of the show. Focusing on the expressive utterances in the corpus, we present a mixed-effect logistic regression analysis to determine which of a variety of speaker-related and context-related features can help explain the occurrence of English insertions in Dutch. Results show a strong impact of typical variationist variables such as gender, age and location, but more situational features like emotional charge and topic of the conversation also prove relevant. Overall, in its combined focus on (a) oral corpora of spontaneous language use, (b) social patterns in the use of English, and (c) inferential statistical modeling, this paper presents new perspectives on the study of anglicisms in weak contact settings.

Bertels, Ann; Speelman, Dirk (2014)
Clustering for Semantic Purposes: Exploration of Semantic Similarity in a Technical Corpus
This paper presents an innovative approach, within the framework of distributional semantics, for the exploration of semantic similarity in a technical corpus. In complement to a previous quantitative semantic analysis conducted in the same domain of machining terminology, this paper sets out to discover fine-grained semantic distinctions in an attempt to explore the semantic heterogeneity of a number of technical items. Multidimensional scaling analysis (MDS) was carried out in order to cluster first-order co-occurrences of a technical node with respect to shared second-order and third-order co-occurrences. By taking into account the association values between relevant first and second-order co-occurrences, semantic similarities and dissimilarities between first-order co-occurrences could be determined, as well as proximities and distances on a graph. In our discussion of the methodology and results of statistical clustering techniques for semantic purposes, we pay special attention to the linguistic and terminological interpretation.

Ghesquière, Lobke; Brems, Lieselotte; Van de Velde, Freek (2014)
Intersubjectivity and intersubjectification: Typology and operationalization
In this paper we present our views on intersubjectivity and intersubjectification with reference to case studies on adjectives, hedges, tags, honorifics, etc. Building on Diessel’s notion of “joint attention” and Traugott’s approach to intersubjectivity, we propose a distinction between three types of intersubjectivity: attitudinal, responsive, and textual. We evaluate and propose formal recognition criteria to operationalize this essentially semantic typology, such as left versus right periphery and prosodic features. In addition, we address the issue of directionality between subjectification and intersubjectification. Rather than seeing subjectivity as a prerequisite for intersubjectivity, we argue that in our typology intersubjective meanings of constructions may diachronically precede subjective ones.

Heylen, Kris; Steurs, Frieda (2014)
Translating legal and administrative language: How to deal with legal terms and their flexible meaning potential
Compared to other Languages for Specific Purposes (LSPs), the terminology of the legal domain poses a number of specific challenges to translation. Firstly, most legal terms refer to abstract concepts and are not defined through referential properties (contrary to e.g. machine parts in a technical domain), but rather intentionally, using other abstract concepts. Secondly, because the law deals with all aspects of everyday life, legal terminology shows a considerable overlap with Language for General Purposes (LGP). Thirdly, the legal jargon often uses a number of near-synonyms (e.g. violation, breach, infringement or null and void) for the same concept (Gozdz-Roszkowski 2013). These properties potentially cause semantic vagueness and necessitate that an onomasiological approach to legal terminology (first defining a concept and then listing the terms) be supplemented with a semasiological approach that investigates how near-synonymous terms and terms shared with LGP realize their flexible meaning potential in specific contexts. In a translation setting, this problem of semantic instability is further aggravated because the terms’ contextual nuances do not only have to be adequately rendered in another language but the terms’ translational equivalents also have their own ambiguity in a different legal system, whose concepts might be similar but often not completely equivalent. In this paper, we look at a case study of the translation of the Statutes and Regulations of the University of Leuven (Organiek Reglement KU Leuven) from Dutch and the Belgian-Flemish legal system into English with different legal systems present in the background (UK Law, American Law, European Law). First we analyse the ambiguity of the terms in the source text language-internally through a corpus-based collocation analysis of terms extracted from the document and compared with their use in the official decrees of the Flemish government pertaining to higher education.
Secondly, we analyse the consistency with which ambiguous terms are translated in two (independent) English translations of the Statutes and Regulations as well as translations of other university policy documents.

Levshina, Natalia; Heylen, Kris (2014)
A radically data-driven Construction Grammar: Experiments with Dutch causative constructions
In this paper we propose a novel, radically data-driven approach to constructional semantics. It is based on Semantic Vector Space models, which are commonly used in computational linguistics to model the semantic relationships between words on the basis of their distribution in a large corpus. In a case study of the near-synonymous Dutch causative constructions with doen ‘do’ and laten ‘let’ we show this method in action by testing a variety of distributional models and clustering options applied to the constructional collexemes. The method opens new perspectives for generating hypotheses about constructional semantics, providing a quick estimation of large amounts of data. The paper also contributes to bridging the gap between the neostructuralist distributional approaches still predominant in computational linguistics, on the one hand, and the non-reductionist constructionist approaches to grammar, on the other hand.

Pijpops, Dirk; Van de Velde, Freek (2014)
A multivariate analysis of the partitive genitive in Dutch. Bringing quantitative data into a theoretical discussion
This article takes a usage-based perspective on the partitive genitive construction in Dutch (iets moois, ‘something beautiful’), which has previously drawn scholarly attention from a theoretical perspective, due to the challenges it presents to Dutch nominal morphosyntax. We will argue that a good understanding of the construction at issue cannot circumvent the enormous variation in the expression of the genitive marker. Within the wide variation space, regular patterns can be discerned, which we uncovered by using mixed-effects logistic regression. This approach allows us to assess the precise contribution of internal factors (e.g. length of the adjective, or the type of quantifier) and external factors (e.g. regional variety, or register), as well as their interactions. This article has three objectives then: first, it wants to contribute to the description of Dutch syntax, second it aspires to advance methodological standards in grammatical investigation, and third, it makes a theoretical plea for a usage-based perspective, with full recognition of variation.

Siegel, Jeff; Szmrecsanyi, Benedikt; Kortmann, Bernd (2014)
Measuring analyticity and syntheticity in creoles
Creoles (here including expanded pidgins) are commonly viewed as being more analytic than their lexifiers and other languages in terms of grammatical marking. The purpose of the study reported in this article was to examine the validity of this view by measuring the frequency of analytic (and synthetic) markers in corpora of two different English-lexified creoles — Tok Pisin and Hawai‘i Creole — and comparing the quantitative results with those for other language varieties. To measure token frequency, 1000 randomly selected words in each creole corpus were tagged with regard to word class, and categorized as being analytic, synthetic, both analytic and synthetic, or purely lexical. On this basis, an Analyticity Index and a Syntheticity Index were calculated. These were first compared to indices for other languages and then to L1 varieties of English (e.g. standard British and American English and British dialects) and L2 varieties (e.g. Singapore English and Hong Kong English). Type frequency was determined by the size of the inventories of analytic and synthetic markers used in the corpora, and similar comparisons were made. The results show that in terms of both token and type frequency of grammatical markers, the creoles are not more analytic than the other varieties. However, they are significantly less synthetic, resulting in much higher ratios of analytic to synthetic marking. An explanation for this finding relates to the particular strategy for grammatical expansion used by individuals when the creoles were developing.
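
The index calculation described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the tag labels, the normalisation to markers per 1000 words, and the decision to count tokens tagged as both analytic and synthetic towards both indices are assumptions, and the sample data are made up.

```python
from collections import Counter

# Hypothetical tag labels; the study's actual tagset is not given in the abstract.
ANALYTIC, SYNTHETIC, BOTH, LEXICAL = "analytic", "synthetic", "both", "lexical"


def marking_indices(tags, per=1000):
    """Return (analyticity, syntheticity) as marker tokens per `per` words.

    Assumption: tokens tagged "both" count towards both indices.
    """
    if not tags:
        raise ValueError("need at least one tagged token")
    counts = Counter(tags)
    scale = per / len(tags)
    analyticity = (counts[ANALYTIC] + counts[BOTH]) * scale
    syntheticity = (counts[SYNTHETIC] + counts[BOTH]) * scale
    return analyticity, syntheticity


# Toy sample of 10 tagged tokens (made-up data, not from the study).
sample = [ANALYTIC] * 4 + [SYNTHETIC] * 1 + [BOTH] * 1 + [LEXICAL] * 4
ai, si = marking_indices(sample)
ratio = ai / si  # ratio of analytic to synthetic marking
```

Normalising to a fixed sample size keeps the two indices comparable across corpora of different lengths, which is what makes the cross-variety comparison in the abstract possible.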

Steurs, Frieda; Kockaert, Hendrik J. (2014)
Language planning and domain dynamics: Challenges in term creation
This paper deals with the challenges encountered in the creation of terminology for specialized, technical and scientific communication, and focuses especially on the phenomenon of borrowing. Next to term creation in the source language, another challenge is raised by the need for multilingual communication, where terms have to be translated and equivalents have to be found in different target languages. How do pluricentric languages behave in the choice of terms? In the second part of the paper, we draw attention to the very specific challenges of domain dynamics and domain loss in specialized communication.

Tummers, José; Speelman, Dirk; Geeraerts, Dirk (2014)
Spurious Effects in Variational Corpus Linguistics: Identification and Implications of Confounding
In this paper, the methodological issues of confounding and spurious effects will be tackled. Those phenomena will be proven to be a corollary of the unbalanced and uncontrolled nature of corpus data. Analyzing a variational case study – the alternation between inflected and uninflected attributive adjectives in Dutch – it will be demonstrated how confounding variables alter the impact of explanatory variables on the response variable, resulting in spurious marginal effects in the bivariate analyses. We will perform a Multiple Correspondence Analysis to unveil the association patterns between explanatory variables in the data matrix. Those patterns will be shown to induce the spurious marginal effects. Based on these findings, we will argue for a consistent use of multivariate analyses in variational corpus linguistics to model the true conditional effect of explanatory variables, completed by an analysis of the database patterns to gain insight into the underlying associations between explanatory variables.

Van de Velde, Freek (2014)
The discourse motivation for split-ergative alignment in Dutch nominalisations (and elsewhere)
Dutch nominalisations of the type het eten van vlees (‘the eating of meat’) have ergative alignment. The alignment is functionally motivated, in that it is a natural consequence of the flow of discourse. The functional account that is put forward here draws on the notion of Preferred Argument Structure (Du Bois 1987) and on the distinction between foregrounded and backgrounded discourse (Hopper & Thompson 1980). Support for this account comes from other domains of ergativity in Dutch, such as causativised predicates and participial constructions and from the observation that the alignment in Dutch nominalisations is in fact split-ergative. The present study adduces corpus evidence to corroborate the claims. In the last section, the analysis is cast in a Functional Discourse Grammar model (Hengeveld & Mackenzie 2008), including its hitherto underdescribed Contextual Component.

Van de Velde, Freek (2014)
Degeneracy: the maintenance of constructional networks
Language is a complex adaptive system. One of the properties of such systems is that they rely on what in biology is called ‘degeneracy’, a technical term for the phenomenon that structurally different elements can fulfill the same function. In this article, it is suggested that degenerate strategies help languages sustain instability in times of syntactic changes. Taking a Construction Grammar approach, it is shown that so-called horizontal relations in constructional networks -- in which related constructions in a functional domain are mutually defined by differential values they take on a set of grammatical parameters -- can be transmitted through time, even if the specific grammatical parameters on which they are defined are under threat. Evidence is drawn from two different domains: argument realisation in experience processes and adverbial subordination.

Van de Velde, Freek; Weerman, Fred (2014)
The resilient nature of adjectival inflection in Dutch
The rich Germanic adjectival inflection dramatically eroded in the history of Dutch as part of a general process of deflection. At present, Dutch is left with what appears to be a vestigial structure: an alternation between an inflected form in schwa and an uninflected form, the distribution of which is semantically ill-motivated. As a consequence, one might expect that the inflection is moribund and that Dutch will follow its West Germanic neighbor English in doing away with this dysfunctional piece of morphology. This is not what is happening, though. Dutch adjectival inflection is remarkably resilient. In this article it is argued that this resilience is due to refunctionalization. The inflectional schwa is turning into a transparent marker of attributive adjectives, and comes to function as a watershed between the modification and determination zone in the noun phrase. This account explains many erratic inflectional patterns in non-standard language. The whole reanalysis is a long-term process, which has not yet come to completion, but is supported by a detailed investigation of corpus data.

Van linden, An; Van de Velde, Freek (2014)
(Semi-)autonomous subordination in Dutch: Structures and semantic-pragmatic values
This article presents an analysis of autonomous and semi-autonomous subordination patterns in Dutch, some of which have so far gone unnoticed. It proposes a four-way classification of such constructions with the general subordinator dat (‘that’), drawing on Internet Relay Chat corpus data of Flemish varieties. Generalizing over the four types and their various subtypes distinguished here, we find that they all share the semantic property of expressing interpersonal meaning, and most of them also have exclamative illocutionary force. We propose a diachronic explanation for this shared semantic-pragmatic value in terms of the concept of hypoanalysis, and assess to what extent our proposal meshes with extant ellipsis accounts of the patterns studied.

Bertels, Ann; Speelman, Dirk (2013)
‘Keywords Method’ versus ‘Calcul des Spécificités’: a Comparison of Tools and Methods
This paper explores two tools and methods for keyword extraction. As several tools are available, it makes a comparison of two widely used tools, namely Lexico3 (Lamalle et al. 2003) and WordSmith Tools (Scott 2013). It shows the importance of keywords and discusses recent studies involving keyword extraction. Since no previous study has attempted to compare two different tools, used by different language communities and which use different methodologies to extract keywords, this paper aims at filling the gap by comparing not only the tools and their practical use, but also the underlying methodologies and statistics. By means of a comparative study on a small test corpus, this paper shows major similarities and differences between the tools. The similarities mainly concern the most typical keywords, whereas the differences concern the total number of significant keywords extracted, the granularity of both probability value and typicality coefficient and the type of the reference corpus.
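
As an illustration of the kind of statistic that underlies keyword extraction, the sketch below computes a log-likelihood keyness score for one word in a study corpus against a reference corpus. It is a generic two-term formulation chosen for illustration; whether it matches the settings of either tool as used in the paper is an assumption, and the function name and example counts are hypothetical.

```python
import math


def log_likelihood_keyness(freq_study, freq_ref, size_study, size_ref):
    """Log-likelihood (G2) score for one word, study corpus vs. reference corpus.

    Compares observed frequencies with the frequencies expected if the word
    were equally common (relative to corpus size) in both corpora.
    """
    total = size_study + size_ref
    combined = freq_study + freq_ref
    expected_study = size_study * combined / total
    expected_ref = size_ref * combined / total
    g2 = 0.0
    for observed, expected in ((freq_study, expected_study),
                               (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


# Made-up counts: a word seen 50 times in a 1,000-word study corpus
# and 10 times in a 10,000-word reference corpus.
score = log_likelihood_keyness(50, 10, 1000, 10000)
# A word with the same relative frequency in both corpora scores zero.
neutral = log_likelihood_keyness(10, 100, 1000, 10000)
```

A higher score means the word's frequency differs more strongly between the two corpora; the score alone does not say in which corpus the word is overrepresented, so the relative frequencies need to be compared separately.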

De Hertog, Dirk; Heylen, Kris; Speelman, Dirk (2013)
Stable Lexical Marker Analysis: a corpus-based identification of lexical variation
Research questions that deal with mutual intelligibility and that investigate language attitudes in pluricentric languages rely on a correct assessment of the loci of divergence, differences in word choice being one of the most salient. Quantitative corpus-based methods can aid researchers to identify this lexical variation. This paper will focus on the language-independent method of Stable Lexical Marker Analysis (SLMA, Speelman et al. 2008) to find variety-specific words in representative corpora. The method is based on the keyword-analysis approach (Scott, 1997) but allows a graded rather than a categorical assessment of markedness and includes a mechanism to circumvent topical bias in corpora. The paper discusses further improvements to SLMA in order to deal with gradedness and offers a quantitative and qualitative analysis of results from a case study on the identification of lexical markers for Netherlandic and Belgian Dutch.

De Smet, Hendrik; Van de Velde, Freek (2013)
Serving two masters: form-function friction in syntactic amalgams
This paper examines two cases of so-called syntactic amalgams. In syntactic amalgams a particular string that is shared by two constructions is exploited to combine them, in such a way that one of the constructions functions as a modifier of the other. Typical examples are after God knows how many years (< after many years + God knows how many years) and a big enough house (< a big house + big enough). In formal theories, these kinds of constructions have been insightfully described as ‘grafts’. However, the exact process through which these amalgams arise remains unexplored. When studied closely, these processes reveal form–function friction not fully accounted for by the graft metaphor. Syntactic amalgams typically serve a subjective function and have been recruited for this purpose. However, because they consist of a syntagm that is still internally parsable, they tend to resist full reanalysis. More precisely, their original syntax continues to constrain their use. As such, amalgams get caught between their original syntax, which remains transparent, and their new function, which suggests a new syntactic status. This appears clearly from contrastive studies of amalgams in Dutch and English that are functionally similar but whose use is constrained in different ways due to structural differences between the two languages. Our first case study deals with the Dutch and English amalgam wie weet / who knows. A contrastive analysis of the development of the respective items shows both the conservative effect of the origin of change and the attraction exerted by the target of change. The second case we discuss in detail involves so-called transparent free relatives. A contrastive analysis shows the role of the overall grammar of a language in licensing change, in this case with Dutch word order posing more difficulties to the new focusing function of transparent free relatives. 
In general, both case studies show the formation of syntactic amalgams to be sensitive to system pressures both in the course of their development and in the eventual outcome of change.

Deveneyns, Annelies; Tummers, José (2013)
Zoek de fout: een foutenclassificatie als aanzet tot gerichte remediëring Nederlands in het hoger professioneel onderwijs in Vlaanderen
Despite the hype surrounding language proficiency in higher education in Flanders, relatively little research has been conducted into students’ written proficiency in Dutch, and spelling and vocabulary remain the primary points of attention in Dutch communication courses. This exploratory study aims to map the most typical and frequent errors made by students in professional higher education and what this means for teaching practice.

Grieve, Jack; Asnaghi, Costanza; Ruette, Tom (2013)
Site-Restricted Web Searches for Data Collection in Regional Dialectology
This article presents a new method for data collection in regional dialectology based on site-restricted web searches. The method measures the usage and determines the distribution of lexical variants across a region of interest using common web search engines, such as Google or Bing. The method involves estimating the proportions of the variants of a lexical alternation variable over a series of cities by counting the number of webpages that contain the variants on newspaper websites originating from these cities through site-restricted web searches. The method is evaluated by mapping the 26 variants of 10 lexical variables with known distributions in American English. In almost all cases, the maps based on site-restricted web searches align closely with traditional dialect maps based on data gathered through questionnaires, demonstrating the accuracy of this method for the observation of regional linguistic variation. However, unlike collecting dialect data using traditional methods, which is a relatively slow process, the use of site-restricted web searches allows for dialect data to be collected from across a region as large as the United States in a matter of days.
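
The proportion estimate described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the page counts are made up, the city and variant names are placeholders, and the query string built by the hypothetical `build_query` helper (using the `site:` search operator) is only one plausible form, since the abstract does not give the exact queries.

```python
def build_query(variant, domain):
    """Build a site-restricted query string for one variant (hypothetical format)."""
    return f'"{variant}" site:{domain}'


def variant_proportions(page_counts):
    """Map each city to the share of matching pages accounted for by each variant.

    `page_counts` maps city -> {variant: number of matching pages}.
    Cities with no hits at all are skipped, since no proportion is defined.
    """
    result = {}
    for city, counts in page_counts.items():
        total = sum(counts.values())
        if total == 0:
            continue
        result[city] = {variant: n / total for variant, n in counts.items()}
    return result


# Made-up page counts for illustration only.
counts = {
    "City A": {"variant1": 30, "variant2": 10},
    "City B": {"variant1": 5, "variant2": 15},
    "City C": {"variant1": 0, "variant2": 0},
}
props = variant_proportions(counts)
query = build_query("variant1", "example.com")
```

Working with proportions rather than raw page counts corrects for differences in the size of the newspaper websites, so that cities can be compared on the same scale when mapped.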

Szmrecsanyi, Benedikt (2013)
Grammatical Variation in British English Dialects: A Study in Corpus-Based Dialectometry
Variation within the English language is a vast research area, of which dialectology, the study of geographic variation, is a significant part. This book explores grammatical differences between British English dialects, drawing on authentic speech data collected in over 30 counties. In doing so it presents a new approach known as 'corpus-based dialectometry', which focuses on the joint quantitative measurement of dozens of grammatical features to gauge regional differences. These features include, for example, multiple negation (e.g. don't you make no damn mistake), non-standard verbal-s (e.g. so I says, What have you to do?), or non-standard weak past tense and past participle forms (e.g. they knowed all about these things). Utilizing state-of-the-art dialectometrical analysis and visualization techniques, the book is original both in terms of its fundamental research question ('What are the large-scale patterns of grammatical variability in British English dialects?') and in terms of its methodology.

Szmrecsanyi, Benedikt (2013)
The great regression: genitive variability in Late Modern English news texts
Utilizing the variationist method, this contribution is concerned with the alternation between the s-genitive (the president’s speech) and the of-genitive (the speech of the president) in Late Modern English news prose as sampled in ARCHER. A frequency analysis reveals that text frequencies of the s-genitive collapsed in the early 19th century, but recovered afterwards. Linear regression analysis indicates that slightly over half of this frequency variability is induced by “environmental” changes in the news genre habitat, such as varying input frequencies of human possessors. To investigate the remaining variability, we fit a logistic regression model and show that genitive choice grammars changed genuinely in regard to four language-internal conditioning factors: POSSESSOR ANIMACY, GENITIVE RELATION, POSSESSUM LENGTH, and POSSESSOR THEMATICITY. Applying customary grammaticalization diagnostics, we conclude that while the s-genitive was subject to grammaticalization in the 19th century, it actually degrammaticalized during the 20th century.

Szmrecsanyi, Benedikt (2013)
Diachronic Probabilistic Grammar
The paper sketches a novel, usage-based framework – Diachronic Probabilistic Grammar (DPG) – to analyze variation and change in diachrony. The approach builds on previous work in the Probabilistic Grammar tradition (see, for example, Bresnan 2007; Bresnan and Ford 2010) demonstrating, based on converging experimental and observational evidence, that syntactic knowledge is to some extent probabilistic, and that language users have excellent predictive abilities. What takes center stage in the approach is how contextual predictors (such as, for example, the principle of end weight) constrain linguistic variation. DPG is specifically interested in the extent to which such probabilistic constraints are (un)stable in the course of time. To highlight the diagnostic potential of the DPG framework, the paper explores three case studies: the development of the alternation between non-finite and finite complementation in the Late Modern English period, recent changes in the genitive alternation in the late 20th century, and a cross-constructional analysis of parallelisms in the development of the genitive and the dative alternation in the Late Modern English period.

Van de Velde, Freek (2013)
External possessors and related constructions in Functional Discourse Grammar
This chapter is concerned with indirect object (or dative) external possessors and related constructions in Dutch. The related constructions are on the one hand other possessive constructions and on the other hand other indirect object constructions. The central question is what the semantic and pragmatic contribution of these constructions is and how these constructions can be adequately modeled in Functional Discourse Grammar. To account for the semantic-syntactic characteristics, the division between the Representational Level and the Interpersonal Level in Functional Discourse Grammar is invoked.

Van de Velde, Freek; van der Horst, Joop (2013)
Homoplasy in diachronic grammar
The application of evolutionary thinking to language change has a long tradition, and especially in functional approaches it is currently widely accepted that certain mechanisms can be fruitfully used to describe both biological and linguistic processes. In this article, the evolutionary concept of homoplasy, the recurrence of similar traits in unrelated lineages, is applied to language change. Extending the earlier application of the concept by Lass (1997), homoplasy is here argued to operate not only on the phonological level, but on the morphosyntactic level as well, and not only between languages but also within languages, at the level of constructions. The idea is that phenotypic resemblance in constructions may hide etymological differences. In other words: what looks the same from a synchronic perspective may derive from multiple source constructions historically. On the basis of four case studies in Dutch diachronic morphosyntax, it is shown that homoplasy can offer an insightful account of some long-standing puzzles.

Bertels, Ann; De Hertog, Dirk; Heylen, Kris (2012)
Etude sémantique des mots-clés et des marqueurs lexicaux stables dans un corpus technique
This article presents the results of a quantitative semantic analysis of typical lexical units in a specialised technical corpus of metalworking machinery in French. The study aims to find out whether and to what extent the keywords of the technical corpus are monosemous. A simple regression analysis, used to examine the correlation between typicality rank and monosemy rank of the keywords, points out some statistical and methodological problems, notably a frequency bias. In order to overcome these problems, we adopt an alternative approach for the identification of typical lexical units, called Stable Lexical Marker Analysis (SLMA). We discuss the quantitative and statistical results of this approach with respect to the correlation between typicality rank and monosemy rank.

De Hertog, Dirk; Heylen, Kris; Kockaert, Hendrik J.; Speelman, Dirk (2012)
The prevalence of multiword term candidates in a legal corpus
Many approaches to term extraction focus on the extraction of multiword units, assuming that multiword units comprise the majority of terms in most subject fields. However, this supposed prevalence of multiword terms has gone largely untested in the literature. In this paper, we perform a quantitative corpus-based analysis of the claim that multiword units are more technical than single word units, and that multiword units are more widespread in specialized domains. As a case study, we look at Dutch terminology from the Belgian legal domain. First, the relevant units are extracted using linguistic filters and an algorithm to identify Dutch compounds and multiword units. In a second step, we calculate for all units an association measure that captures the degree to which a linguistic unit belongs to the domain. Thirdly, we analyze the relationship between the units' technicality, frequency and their status as a simplex, compound or multiword unit.

Deveneyns, Annelies; Tummers, José (2012)
Learning from Natives’ Errors: An Analysis of Students’ Errors in Written Language
There is a growing concern in Flanders about the deterioration of native (written) language proficiency amongst youngsters. In this paper, we will outline a study of the errors in texts written in Dutch by bachelor students. We will pinpoint the most acute and most frequent errors in order to develop adapted language material to bridge the gap between the actual and the desired level of proficiency. An error coding scheme was designed that, in line with learner corpus research, combines linguistic information (spelling; lexicon; syntax; textual structure) and error information (erroneous use; omission; redundancy). The most widespread and recurrent errors belong to the categories textual grammar (especially referential coherence), syntax, punctuation and lexical use. Those errors typically cause interpretative problems which interrupt the reading process. The results are the starting point of a usage-based remediation process of the students’ written language proficiency by creating a growing awareness of correct formal language use.

Ghesquière, Lobke; Brems, Lieselotte; Van de Velde, Freek (2012)
Intersubjectivity and intersubjectification: Typology and operationalization
In this paper we present our views on intersubjectivity and intersubjectification with reference to case studies on adjectives, hedges, tags, honorifics, etc. Building on Diessel’s notion of “joint attention” and Traugott’s approach to intersubjectivity, we propose a distinction between three types of intersubjectivity: attitudinal, responsive, and textual. We evaluate and propose formal recognition criteria to operationalize this essentially semantic typology, such as left versus right periphery and prosodic features. In addition, we address the issue of directionality between subjectification and intersubjectification. Rather than seeing subjectivity as a prerequisite for intersubjectivity, we argue that in our typology intersubjective meanings of constructions may diachronically precede subjective ones.

Heylen, Kris; Speelman, Dirk; Geeraerts, Dirk (2012)
Looking at word meaning. An interactive visualization of Semantic Vector Spaces for Dutch synsets
In statistical NLP, Semantic Vector Spaces (SVS) are the standard technique for the automatic modeling of lexical semantics. However, it is largely unclear how these black-box techniques exactly capture word meaning. To explore the way an SVS structures the individual occurrences of words, we use a non-parametric MDS solution of a token-by-token similarity matrix. The MDS solution is visualized in an interactive plot with the Google Chart Tools. As a case study, we look at the occurrences of 476 Dutch nouns grouped in 214 synsets.

Kockaert, Hendrik J.; Segers, Winibert (2012)
L’assurance qualité des traductions : items sélectionnés et évaluation assistée par ordinateur
Inspired by the EMT quality assurance criteria, and by the growing importance of translation quality assurance in the world of translation services providers, this paper aims in the first place at linking these two developments to the use of recently designed and successfully implemented formats of hybrid learning environments in university curricula. Secondly, this paper proposes to adopt an item-based evaluation method when it comes to evaluating translations for the purpose of pursuing an evaluator independent equality and objectivity in e.g. entrance exams at international institutions. At the same time, the commonly used analytical evaluation method is revisited so that it can be implemented in a quality assurance tool with the purpose of offering an exhaustive and identical feedback to all candidates and students. This revisited analytical method is hoped to provide a feedback instrument, which complements nicely the hybrid learning format and the item-based evaluation method.

Van de Velde, Freek (2012)
Degeneracy: the maintenance of constructional networks
Language is a complex adaptive system. One of the properties of such systems is that they rely on what in biology is called ‘degeneracy’, a technical term for the phenomenon that structurally different elements can fulfill the same function. In this article, it is suggested that degenerate strategies help languages sustain instability in times of syntactic changes. Taking a Construction Grammar approach, it is shown that so-called horizontal relations in constructional networks -- in which related constructions in a functional domain are mutually defined by differential values they take on a set of grammatical parameters -- can be transmitted through time, even if the specific grammatical parameters on which they are defined are under threat. Evidence is drawn from two different domains: argument realisation in experience processes and adverbial subordination.

Van de Velde, Freek (2012)
PP extraction and extraposition in Functional Discourse Grammar
This article inquires into the nature of ‘attributive’ prepositional phrases from a Functional Discourse Grammar (FDG) perspective. On the basis of the observation that such prepositional phrases can easily be separated from their host noun phrases by extraposition or extraction, it is argued that they do not belong to the noun phrase syntactically, as discontinuity is vital in determining the constituency boundaries. The idea that attributive prepositional phrases are really independent clause-level modifiers goes counter to what is generally assumed in most syntactic frameworks, but it can be shown that the arguments that are traditionally given in favour of shared constituency do not adequately distinguish between syntactic, semantic and pragmatic association between language units. The layered structure of the FDG model, on the other hand, makes it possible to tease those different types of association apart, so that we can recognise the semantic link at the Representation Level, while at the same time accounting for the loose syntactic connection between the noun and the prepositional phrase at the Morphosyntactic Level.

Ghesquière, Lobke; Van de Velde, Freek (2011)
A corpus-based account of the development of English such and Dutch zulk: Identification, intensification and (inter)subjectification
On the basis of synchronic English language material, Bolinger (1972) has put forward the hypothesis that intensifying meanings or “degree words” often develop from identifying expressions. This paper will empirically test Bolinger’s hypothesis by means of in-depth diachronic study of the development of such — one of Bolinger’s central examples — and of its Dutch cognate zulk in historical text corpora. To this aim, a detailed cognitive-functional account will first be provided of the (differences between the) identifying and intensifying uses of such and zulk, with attention for diachronic changes affecting the syntax and semantics of these uses, cross-linguistically as well as language-specifically. It will be shown that, as predicted by Bolinger (1972), the proportion of identifying uses decreases over time in favor of the intensifying uses both in English and Dutch. The comparison between such and zulk will, however, show that, despite the close relation between these two languages, the development does not run strictly parallel in English and Dutch, thus endorsing a view that language change does not necessarily follow predetermined pathways. We will argue that minute differences in the syntax of such and zulk steer the diachronic course these elements follow. Finally, Bolinger’s shift from identification to intensification will be discussed in terms of its relation to existing (inter)subjectification hypotheses.

Gillaerts, Paul; Van de Velde, Freek (2011)
Metadiscourse on the move: The CEO's letter revisited
In this article we combine the theoretical frameworks of genre and metadiscourse. The genre at stake is the CEO’s letter, on which we have conducted a corpus study looking at the CEO’s letters from a Belgian bank. Compared to the results in a study by Hyland (1998) on the same genre, we have found fewer interactives and more interactionals; fewer hedges, but more boosters and self-mentions. The Belgian CEO’s letters clearly show more credibility appeals and fewer rational appeals. Diachronically, the metadiscourse in the CEO’s letters of the bank shows a dynamic pattern that reflects the economic context. In the good years self-mentions and boosters sharply increase; in the bad years there is a rise in transitions and engagement markers. Apparently, in good times the bank displays its self-assurance and in bad times it tries to be more coherent. As for the generic integrity of the CEO’s letters, they prove to remain a fairly stable genre in terms of their dialogical, interpersonal nature, keeping a strong ethos/sender orientation, which can be explained by the lack of unity of the bank group.

Gillaerts, Paul; Van de Velde, Freek (2011)
Jaarverslagen op de golven van de conjunctuur
Financial annual reports give an account of an organisation's performance over the past financial year and offer shareholders a reasoned forecast of expectations for the coming financial year. In practice, however, financial annual reports turn out to be largely PR instruments as well, aimed at a wider audience. An analysis of the CEO's letters of Fortis over more than a decade shows that the writers of CEO's letters adapt their style to the economic circumstances.

Van de Velde, Freek (2011)
Left-peripheral expansion of the English NP
This article is concerned with peripheral modifiers in the English noun phrase. It is argued that this kind of modification is an Early Modern English innovation. Later, in the nineteenth century, the slot underwent a rapid extension on both the type and the token levels, as is shown by historical corpus inquiry. To account for the diachronic processes involved, a constructional, usage-based approach is used, with an onomasiological rather than a semasiological perspective on grammaticalisation.

Van de Velde, Freek (2011)
Anaphoric adjectives becoming determiners. A corpus-based account
Standard accounts of determiners typically deal with the few well-known elements that fall under this category: articles, demonstratives, possessives and (some) quantifiers. It can be shown, however, that the determiner slot in Dutch can also be occupied by certain elements that do not regularly feature in reference grammars, namely the anaphoric adjectives like voornoemd (“aforementioned”). Their syntax is subject to variation in Present-day Dutch, and possibly to change as well: a corpus study reveals that they are increasingly used as unequivocal determiners, irrespective of their token frequency.

De Hertog, Dirk; Heylen, Kris; Kockaert, Hendrik J. (2010)
A variational linguistics approach to term extraction
This paper describes how a toolset developed within variational linguistics for the purpose of identifying regional lexical variants can be used in the field of term extraction. The notion of stable lexical marker analysis will be introduced as a method to quantify termhood as a function of both high relative frequency and uniform dispersion of single word units in a specialised domain. As such, the work is an extension of so-called contrastive approaches to term extraction. The Belgian financial legal domain will serve as a case study and its results will be used to investigate how the method works and how it relates to approaches striving for the same goal.

Diepeveen, Janneke; Van de Velde, Freek (2010)
Adverbial morphology: How Dutch and German are moving away from English
English marks the distinction between adjectives and adverbs with an adverbial suffix, whereas Dutch and German allow adjectives to be used adverbially without extra morphology. This may give rise to the idea that English, like Latin, is more specific in its classification of various types of modifiers. We propose an alternative analysis: Dutch and German draw a different dividing line, between attributive modifiers (NP-level) on the one hand, and predicative and adverbial modifiers (clause-level) on the other. To this end, they use adjectival inflection instead of derivational morphology. We describe how the adverbial systems in these three West-Germanic languages have developed and we try to explain the changes that have occurred.

Gillaerts, Paul; Van de Velde, Freek (2010)
Interactional metadiscourse in research article abstracts
This paper deals with interpersonality in research article abstracts analysed in terms of interactional metadiscourse. The evolution in the distribution of three prominent interactional markers comprised in Hyland’s (2005a) model, viz. hedges, boosters and attitude markers, is investigated in three decades of abstract writing in the field of applied linguistics in the broad sense. On the basis of a quantitative corpus survey of abstracts in Journal of Pragmatics, two major points are made. One is that the distribution of hedges, boosters and attitude markers in abstracts, when compared with their distribution in research articles, supports the idea that abstracts are not just pale reflections of the full-length articles, but rather have a specific make-up, which can plausibly be linked to their function. The second point is that the use of interactional metadiscourse in abstracts has undergone interesting changes in the course of the past 30 years. On the whole, the degree of interpersonality realised by hedges, boosters and attitude markers diminishes over time, though notable differences exist with regard to the subcategories in the interactional domain. In the discussion section, we try to arrive at an explanation for the changes that have occurred, taking genre, discourse community, research practice and rhetoric strategy into account.

Van de Velde, Freek (2010)
The emergence of the determiner in the Dutch NP
This article inquires into the diachrony of the determiner in Dutch. First, it is argued that the determiner is an emergent syntactic category, and that it must be consequently excluded from universal grammar. Second, it is argued that languages that do have a determiner slot in the NP differ considerably with regard to which lexemes they allow in this function. On the basis of these two observations, an in-depth usage-based analysis of the emergence of the Dutch determiner is undertaken. It seems that over the centuries, the determiner projection consolidates its position in Dutch. It first cropped up in Old Dutch, and was further elaborated in Middle Dutch, Modern Dutch and Present-day Dutch by the recruitment of ever new slotfillers. Difficulties in the demarcation of the determiner phrase and the notoriously elusive syntax of some adjectives are claimed to be due to diachronic instability: what is e.g. conveniently but somewhat misleadingly called postdeterminers can be argued to be an instable syntactic category that represents an intermediate stage in the diachronic process. Evidence will be drawn from (quantitative) corpus inquiry.

Van de Velde, Freek (2010)
Ontwikkelingen in de linkerperiferie van de nominale constituent
This article argues that the Modern Dutch noun phrase (NP/DP) has acquired a new slot in its left periphery. This slot contains interpersonal modifiers, like focus particles, modal adverbs and other epistemic modifiers. On the basis of historical corpus inquiry it is shown how this slot has developed and how it has been accommodating more and more complex elements in the course of time. This diachronic evolution has consequences for the synchronic description of the noun phrase. Any synchronic structural description of the noun phrase, at whatever stage in the history of the language, has temporary status only.

Gillaerts, Paul; Van de Velde, Freek (2009)
"Het lijkt met name ook een goede strategie": interactionele metadiscursiviteit in abstracts van onderzoeksartikelen
Research article abstracts appear to form an objective, information-oriented and highly static, predictable genre. Research by, among others, Hyland (2005a) on metadiscourse in research articles has shown that even academic texts are not purely informative in character, but are also persuasive and in that sense show subjective traits as well. Does this persuasiveness also hold for abstracts? In this article we examine the interactional metadiscourse of abstracts in two journals in the field of applied linguistics in the broad sense: Tijdschrift voor Taalbeheersing and Journal of Pragmatics. We show that interactional metadiscourse can also be found in abstracts and that it displays a certain development. Abstracts turn out to be much less objective and static than expected. Cultural differences also turn out not to be entirely absent, despite the globalisation of scientific research.

Van de Velde, Freek (2009)
De nominale constituent
The noun phrase is an underexposed domain of syntax, certainly in historical linguistics. That is unjustified, because there is far from any consensus on its syntactic structure, and the changes that have occurred in the course of history are numerous. This book argues that a major tendency lies behind these diverse changes: the Dutch noun phrase is the result of a centuries-long process of stepwise expansion on the left-hand side, with a number of clearly distinct slots in the prefield. This view makes it possible to give a coherent explanation of a number of seemingly very disparate language changes. The study spans several centuries of language history. The emphasis is naturally on Old, Middle and Modern Dutch, but where possible the study goes back even further, to the reconstructed stages of Proto-Germanic and Proto-Indo-European. The argumentation draws on data from various languages (among others Hittite, Sanskrit, Greek, Latin, Gothic, Old English and Old High German) and makes use of a variety of techniques, from theoretical research to quantitative corpus research. In addition, extensive reference is made to the existing international literature.

Van de Velde, Freek (2009)
Do we need the category of postdeterminer in the NP?
The structural make-up of the English NP is a matter of long-standing debate. In this paper, a closer look is taken at a notably intricate part of it, viz. the determiner, and more specifically where it fuzzily borders on the adjective. It will be argued that any attempt to resolve the indeterminacy issues associated with this boundary needs to take the diachrony of NP syntax as a vantage point. More specifically, it will be claimed that what are often conveniently but somewhat misleadingly called postdeterminers are in fact elements undergoing a diachronic transition from adjective slotfiller to determiner slotfiller. The postdeterminer slot is hence not a stable position.

Van de Velde, Freek (2009)
The rise of peripheral modifiers in the noun phrase
As is not always recognised in the scholarly literature, the template of the noun phrase (NP/DP) in Dutch and English contains a slot for peripheral modifiers, which precedes the determiner slot and hosts adverbial modifiers. In this article I argue that this slot is of recent date (it is an Early Modern Dutch and Early Modern English innovation, probably as a result of a reanalysis process) and that it has steadily been on the increase, both on the token level and on the type level. The analysis is based on historical corpus data. On a theoretical note, the present study shows that the NP slot should not be viewed as a fixed, stable constellation, but can change over time.

Bertels, Ann (2008)
Automatiser et quantifier l’analyse sémantique du français technique
This article presents the methodology used to automate and quantify the semantic analysis of the specific vocabulary of a corpus of technical French. The texts in the corpus belong to the technical domain of machine tools for metal machining. The main objective of the study is to verify whether the (most) specific lexical units of this technical domain are the (most) monosemous. Since the semantic analysis covers some 5000 lexical units of the technical corpus, automation and quantification are indispensable. To this end, we developed a monosemy measure, implementing monosemy in terms of semantic homogeneity. The monosemy measure is based on the formal overlap of the second-order co-occurrents of a base word, in this case a specific lexical unit. In this article, we explain the methodology of the co-occurrence analysis and its relevance for the development of the monosemy measure. We discuss the first results of the quantitative semantic analysis, as well as the results of the statistical analysis that aims to answer the main question of the correlation between specific lexical units and monosemous lexical units. Finally, we fine-tune the monosemy measure by integrating additional linguistic information, such as indications of lexical class.

Peirsman, Yves; Heylen, Kris; Geeraerts, Dirk (2008)
Size matters: tight and loose context definitions in English word space models
Word Space Models use distributional similarity between two words as a measure of their semantic similarity or relatedness. This distributional similarity, however, is influenced by the type of context the models take into account. Context definitions range on a continuum from tight to loose, depending on the size of the context window around the target or the order of the context words that are considered. This paper investigates whether two general ways of loosening the context definition (extending the context size from one to ten words, and taking into account second-order context words) produce equivalent results. In particular, we will evaluate the performance of the models in terms of their ability (1) to discover semantic word classes and (2) to mirror human associations.

Van der Horst, Joop; Van de Velde, Freek (2008)
Het voorzetsel diachronisch
The preposition has undergone greater changes in its history than is often thought. Naturally, there were and are the many semantic changes of individual prepositions, but more interesting are the changes that the preposition 'as such', that is, the category, has undergone over many centuries. It is generally assumed that our 'old' Indo-European prepositions originated from adverbs. Less is known, however, about their further syntactic development, i.e. how their valency developed from one-place to two-place. That development turns out to be far from having come to a halt. At present, the 'younger' syntactic valency seems to be overtaking the 'older' one in strength, as is apparent from a number of new constructions.

Van Gijsel, Sofie; Speelman, Dirk; Geeraerts, Dirk (2008)
Style shifting in commercials
This paper presents a quantitative analysis of style shifting in a corpus of Flemish radio and television commercials. Previous research draws attention to styling processes in advertising language, as discursive actions indexing social meanings. It will be shown that the exploitation of different stylistic varieties in our corpus can be analyzed along the same lines. The analysis presented here focuses on the use of 'tussentaal' (literally: 'in-between language') in the corpus, which is an informal variety of spoken Belgian Dutch, situated between the regional dialects and standard Belgian Dutch. In order to give a detailed account of the stylization processes, the style shifts between standard Dutch and tussentaal within a single spot are investigated. Furthermore, it will be argued that complementing the quantification of stylistic features with a statistical analysis considerably improves the analysis. More specifically, using a linear regression, the effect of a number of sociovariational factors on the style shifts in the commercials is investigated. The significant factors are then interpreted sociolinguistically, drawing on the concepts of stylization and audience design. Finally, it will be shown that the analysis can be extended to incorporate multilingualism in the commercials.

Van de Velde, Freek (2007)
Interpersonal modification in the English noun phrase
This article gives an overview of the various interpersonal modifiers in the English noun phrase, several of which seem to have been overlooked in formal as well as functional grammars. On the theoretical side, this article is concerned with how to bring the constructions at issue under a Functional Discourse Grammar representation (Hengeveld and Mackenzie 2006, in prep.).

Van de Velde, Freek (2006)
Herhaalde exaptatie: een diachrone analyse van de Germaanse adjectiefflexie
In this article, it is argued that Germanic adjectival inflexion underwent a series of exaptation processes: due to various changes in the overall structure of noun phrases, adjectival inflexion suffered successive function loss and function renewal restoring isomorphism. As a consequence, the traditional view on one of the most conspicuous features of Germanic, namely the difference between weak and strong adjectival inflexion, can be argued to be a gross oversimplification of the historical facts.

Heylen, Kris (2005)
Zur Abfolge (pro)nominaler Satzglieder im Deutschen : eine korpusbasierte Analyse der relativen Abfolge von nominalem Subjekt und pronominalem Objekt im Mittelfeld
Word order variation in German, mapped with the help of new linguistic research methods. Anyone who occasionally reads German texts will certainly have noticed that word order in German is considerably freer than in Dutch. More often than in Dutch, constituents in German can occupy different positions relative to each other, as in the following sentences, which all mean 'Because Thomas overslept, father takes him to school by car'. Dutch: (a) Omdat Thomas zich verslapen heeft, brengt vader hem met de auto naar school. (b) Omdat zich Thomas verslapen heeft, brengt hem vader met de auto naar school. German: (a) Weil Thomas sich verschlafen hat, bringt der Vater ihn im Auto zur Schule. (b) Weil sich Thomas verschlafen hat, bringt ihn der Vater im Auto zur Schule. Whereas in Dutch only variant (a) is a well-formed sentence, both variants are possible in German. In linguistic terms: if the subject is not in sentence-initial position and is a noun phrase (here: Thomas / der Vater), then in German, unlike in Dutch, it can stand either before or after a direct or indirect object that is realised as a personal or reflexive pronoun (here: sich / ihn). At first sight there seems to be no difference in meaning between the two word order variants in German. The question is then whether the two variants are always freely interchangeable, or whether each variant nevertheless typically occurs in particular contexts. To investigate such contextual differences, linguists traditionally construct a whole range of example sentences with slightly different contexts of use, relying on the intuitions of native speakers to judge which order best fits which context. It has long been known that such intuitive judgements are not very reliable, and even less so in this case, because in no context is either variant really wrong or different in meaning.
For that reason, this study applied a methodology that is fairly new in mainstream linguistic research: 5700 instances of one of the two word order variants were extracted from a large text collection, and these instances were then examined for some 25 context-determining factors. Multivariate statistical analyses were used to determine which of these factors actually influence the order and how the different factors interact. The study shows that the (b) order from the sentences above occurs much more often in German than the (a) order (which is the normal one in Dutch). When the (a) order does occur, it does so relatively more often in spoken than in written language, and more often in the German of Germany than in Swiss or Austrian German. The (a) order also occurs more often in subordinate clauses, and more often when the subject is relatively short, has only just been mentioned in the preceding context, and refers to a human being. The statistical analysis also shows that, although the factors examined can explain part of the variation, future research will have to look at more, and above all more specific, factors in order to identify the truly typical contexts of use of the relatively rare (a) order.

Van de Velde, Freek (2005)
Exaptatie en subjectificatie in de Nederlandse adverbiale morfologie
Dutch adverbialising suffixes -gewijs and -erwijs are alleged to stand in complementary distribution in that the former attaches to nouns, whereas the latter combines with adjectives. This article claims that this historically merely allomorphic pair of suffixes acquired another important distinction, of a syntactic nature, which has thus far been overlooked in the literature: whereas -gewijs derives predicate adverbs, -erwijs derives sentence adverbs. Occasional exceptions to this distinction can easily be predicted on morphological grounds. The whole process is to be considered a typical instance of subjectification (see e.g. Stein & Wright 1995) and possibly of exaptation (Lass 1990) as well. Evidence is drawn from historical corpus inquiry.

Van de Velde, Freek (2004)
De Middelnederlandse onpersoonlijke constructie en haar grammaticale concurrenten: semantische motivering van de argumentstructuur
Middle Dutch verbs of perception and cognition can occur in the impersonal construction: the third person singular verb takes a dative argument and a genitive argument, respectively encoding the semantic roles of experiencer and source. As has not always been recognised, however, such verbs can occur in other constructions as well. This article claims on the basis of corpus evidence that the impersonal construction, which is traditionally considered something of a syntactic aberration, is part of a well-structured set of different constructional possibilities. The wide array of constructions can be semantically motivated in that the construction accurately indicates the agency of the experiencer. A consequence of this claim is that the argument structure of verbs of perception, but probably of other verbs as well, is not to be stipulated ad hoc in the lexicon, but is in part to be considered as contributing to the semantics of the sentence.

van der Horst, Joop; Van de Velde, Freek (2003)
Zo vreemd een groep
In the history of Dutch, the construction in which an adverb of degree 'zo' (Eng. 'so') modifies the adjective in an (indeterminate) nominal group occurs in at least three ways: (a) 'zo groot een man' (b) 'zo een grote man' and (c) 'een zo grote man' (i.e. with 'zo' in different positions with regard to the adjective and the indefinite article). This article shows that for the oldest period of Dutch, a fourth construction is to be assumed, viz. one without an article: 'zo grote(n) man' (x). While what little literature there is on the subject consistently stresses that (x) 'zo groten man' is an enclitic result of (a) 'zo groot een man' we reach the entirely contrary conclusion that (a) - probably a mere literary variant - is a reanalysis of (x), in which '-en' is historically a flexional ending.