Leuven Statistics Days 2012

KU Leuven, June 7 and 8, 2012

Description and Announcement: Following the tradition of celebrating statistics at KU Leuven, we are glad to announce the biennial Leuven Statistics Days 2012, to be held in Leuven on June 7 and 8, 2012. This edition’s theme is “Mixed models and modern multivariate methods in linguistics.” This international Leuven-based meeting warmly welcomes everyone interested in the methodological and applied themes. Through excellent invited speakers, a number of fine contributions, and a panel discussion, the meeting organizers aim to reach their twofold goal:  (1) provide an exchange forum for statistical methods in linguistics and (2) stimulate cross-fertilization, centered around the methodological themes, between researchers in linguistics, bioinformatics, computer science, data mining, engineering, bio-engineering, imaging, and medical statistics, to name a few.
Contributed presentations (oral and poster) are very welcome. To stimulate interaction, the entire meeting is held in a plenary format. Because of that, the number of oral slots available is limited, but the poster sessions are designed to foster interaction.

Leuven Statistics Days (LSD) are supported by LStat (Leuven Statistics Research Centre). The 2012 edition is co-organised with the linguistic research group QLVL (Quantitative Lexicology and Variational Linguistics).

Scientific and organizing committee: Dirk Speelman (chairman), Geert Molenberghs (LStat organizing chair), An Carbonez, Lilian Wassink, Dirk Geeraerts, Kris Heylen.


Date & Location

Leuven Statistics Days 2012
Thursday 7 and Friday 8 June, 2012


College De Valk

Tiensestraat 41


Auditorium Zeger Van Hee (room 91.56)


Public parking is available in the vicinity of the congress location (Ladeuze parking).

Invited Speakers

Harald Baayen (Eberhard Karls University, Tübingen; University of Alberta, Edmonton) The main themes in the research of Harald Baayen are morphological productivity, morphological processing, language variation and statistical data analysis. His book on Word Frequency Distributions (Kluwer, 2001) is widely considered to be the reference work on statistical models for word frequency distributions. Together with several of his journal articles, his book with Cambridge University Press, Analyzing Linguistic Data: A practical introduction to Statistics using R (2008), offers comprehensive coverage of mixed models in linguistics and has been instrumental in the wider adoption of this technique in linguistics.

Marco Baroni (CIMeC, Università di Trento, Rovereto, Italy) Marco Baroni (PhD UCLA, 2000) is a tenured researcher and lecturer at the Center for Mind and Brain Sciences of the University of Trento. His research focuses on using mathematical tools, in particular from linear algebra, to model the induction of linguistic meaning from naturally occurring language data. In 2011, he was awarded an ERC starting grant to work on the linguistic notion of compositionality from a vector-based perspective.

Geert Verbeke (Leuven Biostatistics and Statistical Bioinformatics Centre (L-BioStat), KU Leuven). Geert Verbeke has published extensively on various aspects of mixed models for longitudinal data analysis, about which he co-authored and co-edited several textbooks (Springer Lecture Notes 1997; Springer Series in Statistics 2000 and 2005; Chapman & Hall / CRC 2009). Recent research has focused on the use of mixed models for the joint analysis of multiple outcomes. Applications include modeling of multivariate longitudinal data, possibly of a different nature (continuous, binary, counts, …), or the use of multiple longitudinally measured markers in the prediction of a time-to-event outcome. For work in this area, he received the International Biometric Society Award for the best Biometrics paper in 2006.

Luc De Raedt (Department of Computer Science, KU Leuven). Luc De Raedt's research interests are in the areas of Artificial Intelligence, Machine Learning and Data Mining as well as their applications. He is currently working on statistical relational learning, which combines probabilistic graphical models with logical representations and machine learning, the integration of constraint programming with data mining and machine learning principles, the development of programming languages for machine learning, and the analysis of graph and network data. He is also interested in applications of these methods to chemo- and bio-informatics, to natural language processing, vision, robotics and action- and activity learning.

Special contributions from:

Antti Arppe (University of Helsinki)
Roeland van Hout (Radboud University Nijmegen)

Stefan Evert (TU Darmstadt)
Sien Moens (KU Leuven)



Preliminary programme:

Thursday 7 June, 2012

Time   Speaker Affiliation Title
9.00-9.30 Registration and coffee      
Chair: Dirk Speelman
9.30 - 9.45 Opening of the meeting Irène Gijbels (Chair LStat), Dirk Speelman (Chair Scientific and organizing committee)
9.45 - 10.30 Keynote lecture Harald Baayen Eberhard Karls University, Tübingen Mixed-effects models in linguistics and psycholinguistics: A useR's perspective. (presentation)
10.30 - 11.00 Coffee break      
11.00 - 11.30 Special contributor Antti Arppe University of Helsinki Mixed-effects logistic regression modeling and analysis for polytomous outcomes without a reference category. (presentation)
11.30 - 12.00   Bram Vandekerckhove University of Antwerp Identifying and tracing tussentaal in Flemish TV fiction between 1980 and today (presentation)
12.00 - 12.30 Poster teaser     Slides
12.30 - 13.30 Lunch/Poster session      
Chair: Geert Molenberghs
13.30 - 14.15 Keynote lecture Geert Verbeke KU Leuven Mixed models with applications to large data sets. (presentation)
14.15 - 14.45 Special contributor Roeland van Hout Radboud University Nijmegen Mixed models, mixed feelings. (abstract)
14.45 - 15.15   Job Schepens Radboud University Nijmegen The L2 Impact on Acquiring Dutch as a L3: the L2 Distance Effect (presentation)
15.15 - 15.45 Coffee break      
15.45 - 16.15   Dannielle Barth, Vsevolod Kapatsinski University of Oregon Evaluating Mixed-Models with LOOCV and Effect Size (presentation)
16.15 - 16.45   Martijn Wieling University of Groningen Lexical Differences between Tuscan Dialects and Standard Italian: A Sociolinguistic Analysis using Generalized Additive Mixed Modeling (presentation)
16.45 - 17.15 Annual General Meeting for LStat members      

Friday 8 June, 2012

Time   Speaker Affiliation Title
Chair: Kris Heylen
9.00 - 9.45 Keynote lecture Marco Baroni Università di Trento, Rovereto Compositional operations to represent phrases and sentences in distributional semantics (presentation)
9.45 - 10.15   Samuel Iddi KU Leuven A Combined Overdispersed and Marginalized Multilevel Model (presentation)
10.15 - 10.45 Coffee break      
10.45 - 11.15   Paul De Boeck KU Leuven Doubly Mixed Designs with Covariates in Three Steps (presentation)
11.15 - 11.45   Elasma Milanzi Hasselt University Quantifying expert opinion for drug discovery with high dimensional data. (presentation)
11.45 - 12.15   Rui Rothe-Neves UFMG - Brazil Formant transition as a cue to place of articulation in Brazilian Portuguese coronal fricatives (presentation)
12.15 - 13.30 Lunch/Poster session      
Chair: Irène Gijbels
13.30 - 14.15 Keynote lecture Luc De Raedt KU Leuven Analysing structured data – symbolic and probabilistic approaches. (presentation)
14.15 - 14.45 Special contributor Marie-Francine Moens KU Leuven Cross-Language Probabilistic Topic Models. (presentation)
14.45 - 15.15 Coffee break      
15.15 - 15.45 Special contributor Stefan Evert TU Darmstadt The role of dimensionality reduction in distributional semantics (presentation)
15.45 - 16.45 Panel discussion Antti Arppe, Harald Baayen, Marco Baroni, Luc De Raedt, Stefan Evert, Roeland van Hout, Geert Verbeke.
16.45 - Closure and Reception Geert Molenberghs, Kris Heylen



Poster presentations:

Presenter Affiliation Title
Ann Bertels KU Leuven The Exploration of Polysemy in a Technical Corpus: a Stepwise Multiple Regression Analysis (abstract)
Wilfried Cools KU Leuven KULAK Learning to Conjugate Verbs in a Second Language (abstract)
Dirk Goldhahn University of Leipzig Finding Language Universals: Multivariate Analysis of Language Statistics using the Leipzig Corpora Collection (abstract)
Kris Heylen KU Leuven A distributional corpus analysis of Dutch endo- and exocentric compounds. (abstract)
Damazo Kadengye KU Leuven Multiple imputation for missing binary item scores in multilevel cross-classified educational data when the Analysis and Imputation models differ (abstract)
Stijn Luca K.H.Kempen University College Detecting hypermotor seizures using extreme value statistics (abstract)
Gamze Özel Hacettepe University Special Cases of Compound Poisson Process for Word Occurrences in DNA Sequences. (abstract)
Koen Plevoets Ghent University The correspondence analysis of partitioned tables with multiple factors (abstract)
Tom Ruette KU Leuven Regional varieties and their vowels: Individual Differences Scaling on the phonetic distances between Northern American cities. (abstract)
Clara Vanderschueren Ghent University The Portuguese inflected infinitive: empirical approaches compared. (abstract)

Jelle Van Eyck KU Leuven Data mining techniques for predicting acute kidney injury after elective cardiac surgery (abstract)
Mathias Verbeke KU Leuven A Statistical Relational Learning Approach to Identify Sections in Scientific Abstracts Using Sentence and Document Structure. (abstract)
Jurgen Vercauteren KU Leuven The prevalence of multidrug resistant HIV-1 decreased during the last decade in Portugal: a mixed model accounting for multiple measures per patient. (abstract)
Kelly Wauters KU Leuven KULAK Evaluating growth modeling and change tracking for the estimation of the learner’s progress in ability level. (abstract)
Laure Wynants KU Leuven Variable selection for prediction models based on multicenter data (abstract)
Weiwei Zhang KU Leuven (Non)metonymic expressions for GOVERNMENT in Chinese: A mixed regression (abstract)




Registration fee for KU Leuven students and staff of KU Leuven and Association KU Leuven: € 20

BVS-SBS members (non-KU Leuven): € 40

Registration fee for other participants: € 50

Registration has been closed.

Payments have to be settled before 1 June 2012.

Please transfer the fee to the following account:

IBAN: BE09 4320 0000 1157        

Of: KU Leuven, Krakenstraat 3, B-3000 LEUVEN

Stating: your name and 400/0006/42865 (please use the free message area for both the name and the code)


Accommodation in the vicinity of the congress location:

Mercure Leuven Center

Theater Hotel

Hotel Binnenhof

Martin's Klooster Hotel

Hotel Professor

Hotel Malon

Slightly further away:

Park Inn by Radisson

Novotel Leuven Centrum



Gold sponsors

logo BVS-SBS


SAS logo


Silver sponsors

The Adolphe Quetelet Society


logo Oxford University press


logo Wiley Blackwell


logo CRC press