Thursday, October 10 • 4:25pm - 4:50pm
Willis et al.: Apparent-time and spatial diffusion in large social-media corpora

Apparent-time and spatial diffusion in large social-media corpora

Social media offers a historically unparalleled data source for sociolinguistics and dialectology. For research on language change, however, the difficulty with these datasets lies in identifying enough metadata to trace diffusion. Previous variationist work using Twitter data has dealt largely with geography, raising the concern that the spatial patterns it reports interact with non-geographic confounds. Here, we estimate user ages and consider the interaction between space and time.

Using a corpus of 104,657,500 tweets from 1,734,260 users in the UK and Ireland, we assign 25.6% of users an age or age category based on their mentions of birth year, age, family relationships and employment status. We evaluate how these predicted ages perform on a set of variables in British English: loss of the preposition ‘to’ with ‘go’ and certain nouns (“go __(the) pub”); paradigmatic levelling to ‘was’ (“you was”); and de-levelling of “I/he/she were” to ‘was’. Apparent-time effects emerge that add to our understanding of each variable.
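
The kind of cue-based age assignment described above can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not give the actual cues or rules, so the regular expressions, the category label "older", the plausibility bounds and the function name `estimate_age` are all assumptions made for illustration.

```python
import re
from typing import Optional, Tuple, Union

# Hypothetical cue patterns, chosen for illustration only.
# The abstract names the cue types (birth year, age, family
# relationships, employment status) but not the actual rules.
BIRTH_YEAR = re.compile(r"\bborn in ((?:19|20)\d{2})\b", re.IGNORECASE)
STATED_AGE = re.compile(r"\bi(?:'m| am) (\d{1,3}) years old\b", re.IGNORECASE)
CATEGORY_CUES = [
    # Family relationship cue (assumed to signal an older user).
    ("older", re.compile(r"\bmy grand(?:son|daughter|children|kids)\b", re.IGNORECASE)),
    # Employment status cue (assumed to signal an older user).
    ("older", re.compile(r"\bi(?:'ve| have) retired\b", re.IGNORECASE)),
]

Estimate = Tuple[str, Union[int, str]]


def plausible(age: int) -> bool:
    """Reject ages outside an assumed sanity range of 10 to 100."""
    return 10 <= age <= 100


def estimate_age(text: str, tweet_year: int) -> Optional[Estimate]:
    """Return ("age", n), ("category", label), or None if no cue matches."""
    # 1. Birth-year mention: age is the tweet year minus the birth year.
    m = BIRTH_YEAR.search(text)
    if m:
        age = tweet_year - int(m.group(1))
        if plausible(age):
            return ("age", age)
    # 2. Explicit statement of age.
    m = STATED_AGE.search(text)
    if m and plausible(int(m.group(1))):
        return ("age", int(m.group(1)))
    # 3. Coarser cues that only support an age category.
    for label, pattern in CATEGORY_CUES:
        if pattern.search(text):
            return ("category", label)
    return None
```

Under these assumptions, `estimate_age("I was born in 1985", 2019)` yields `("age", 34)`, while a tweet with no recognisable cue yields `None`. That a user can go unassigned in this way is in keeping with the abstract's report that only a minority of users receive an age or age category.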


David Willis

University of Cambridge

Adrian Leemann

University of Bern

Deepthi Gopal

University of Cambridge

Tam Blaxter

Gonville & Caius College, University of Cambridge

Thursday October 10, 2019 4:25pm - 4:50pm PDT
EMU Cedar & Spruce
  T: Computational sociolinguistics
  • Session type: Talk
  • Chair: Jack Grieve