Week 10 practical

reading trial lists from CSVs again, PHP scripts for iteration

The plan for week 10 practical

This week we are going to look at code for an iterated learning experiment, a simplified version of the experiment described in Beckner et al. (2017). There’s no new material to look at in the Online Experiments with jsPsych tutorial.

In terms of the trial types we need to present to participants, this experiment is actually very simple, and uses elements of the code we developed in the practicals on word learning, perceptual learning and dyadic interaction.

Participants go through an initial observation phase where they are exposed to objects paired with labels/descriptions. Again, this is basically identical to the observation phase of the word learning experiment in word_learning.js.
In the final test, where participants try to reproduce the language they are trained on, on each trial participants are presented with an object and asked to produce a label/description for it. Beckner et al. used a free-typing production method, where people type stuff in. In some recent work with online iterated learning we (my RA Clem Ashton and I) switched to a more constrained production model: participants are provided with a set of syllable options, and build complex labels by clicking on those syllable buttons. This reduces or removes the problem of participants typing English (e.g. “don’t know”, “no idea”) or near-English versions of random labels (e.g. “vukano” -> “volcano”), which we were getting a lot of with free-typed responses on MTurk. Anyway, the upshot is that I have made a new plugin, image-button-buildlabel-response which is like the standard image-button-response plugin but allows you to click multiple buttons and build up a label. We use that in the production trials.

The complication this week is that rather than pre-specifying the language participants have to learn, we are running an iterated learning design: the language produced by one participant in the production phase becomes the input language to another participant in the observation phase, allowing us to pass the language from person to person and watch it evolve. Participants are organised in chains, where the participant at generation n in a particular chain learns from the language produced by the generation n-1 participant in that chain.

There are a number of ways you could do iterated learning in an online experiment. You could write a python server, a bit like the one we used for dyadic interaction, that keeps track of which chains are running, which participants are in which chains, and then passes over the appropriate training data when a new participant starts the experiment. Or you could run a database on the server (in another language, SQL, designed for managing databases), that does the same kind of thing, keeping track of which chains are open, which participants are in which chain, and so on.

Here we are going to go for a low-tech approach, using CSV files on the server to store the languages participants produce, and then using PHP scripts (just like the ones we use for saving data) to write to those files, read from those files, and move files around to different folders. This hopefully means that we can get an iterated learning experiment up and running without any extra fancy bells and whistles. We already looked at reading CSV files from the server to build a trial list (in the confederate priming code), so some of the principles involved here are the same (e.g. using asynchronous functions to make javascript wait while PHP is off reading a file from the server). Again, like last week, I won’t bother you too much with the technical infrastructure (the contents of the PHP scripts), and will instead talk you through the code at a conceptual level, focussing on the jsPsych end of things.

Remember, as usual the idea is that you work through as much of this as you can on your own (might be none of it, might be all of it) and then come to the practical drop-in sessions or use the chat on Teams to get help with stuff you need help with.

Acknowledgments

The object stimuli for this week’s experiment were provided by my colleague Dr Jennifer Culbertson, who uses slightly different variants of these images in several of her excellent papers on word order biases in noun phrase learning (e.g. this paper in Cognition).

An iterated learning experiment

Getting started

Important note: This experiment requires a bit of careful set-up in your server_data folder on the jspsychlearning server, so don’t just download it and start running it - read below for instructions on how to set everything up, otherwise the code will behave strangely and you’ll be confused!

You need a bunch of files for this experiment - an html file, a couple of js files, some images, and a bunch of php files. Download the following zip file and then uncompress it into your usual jspsych folder:

Download iterated_learning.zip

This code won’t work on your local computer, it needs to be on the jspsychlearning server - so once you have extracted the zip file, you need to upload the whole iterated_learning folder to your public_html folder on the jspsychlearning server, alongside your various other experiment folders and your jspsych-6.1.0 folder. Here’s what my public_html folder looks like on cyberduck.

my public_html folder

You also need to tweak the iterated_learning.js code so it saves data to your server_data folder rather than mine. I have tried to make this more straightforward this week, so rather than messing with any of the PHP files, you can edit this in one place and it will work nicely everywhere. Open iterated_learning.js in an editor and find the line that says

var myUUN = 'ksmith7'

In the version of the code I am looking at, that’s around line 50. Then just swap my UUN (ksmith7) for yours - e.g. if your account name on the server is s1234567, change that line so it reads:

var myUUN = 's1234567'

Finally, we need to set up some stuff in your server_data folder. Managing this iterated learning experiment means we need to keep track of several things. First, we want to record participant data trial-by-trial as it comes in, just like we always do. But we also need to keep track of which chains are available to iterate, which chains are currently being worked on by a participant, and which generations of which chains are completed and don’t need to be messed with any more. We are going to manage that stuff by moving files from folder to folder in server_data, so we need to set up those directories, and also drop in some starting languages to initialise our chains.

To do that, navigate into your server_data folder on cyberduck. You need to make sure that the folders you are creating inherit their access permissions etc from the main server_data folder, which you do by getting right into that folder on cyberduck before creating any new folders. Double-click the server_data folder so your navigation bar in cyberduck looks something like this (but with your UUN rather than mine, obviously).

cyberduck in server_data

Once you are there, create a new folder (Action … New folder in cyberduck) and call that folder il (short for iterated learning). Then double-click to enter the il folder, and create four new folders in there, called ready_to_iterate, undergoing_iteration, completed_iteration and participant_data. Here’s what my server_data folder looks like after that step - you can see the il directory with the 4 sub-directories. Note that you have to get the folder names exactly right, otherwise the code won’t be able to find the stuff it needs.

il directory structure

We are going to use the participant_data folder to save our trial-by-trial data like we usually do; the other 3 folders will be used to keep track of the state of each iterated learning chain.

Finally, we need to make some initial (generation 0) languages available. If you look in the iterated_learning directory you got from the zip you downloaded, there’s a sub-folder called initial_languages_for_server_data, containing two CSV files called chain1_g0.csv and chain2_g0.csv. Grab those and put them in the ready_to_iterate folder you just created in server_data/il - these are random languages that will serve as the starting point for two iterated learning chains.

Once you’ve done that, your server_data folder looks like this, and you are ready to go!

il directory structure with initial languages

Managing an iterated learning experiment via PHP scripts

In an iterated learning experiment, one participant’s output becomes the input for another participant. Participants are organised in chains, and you’ll typically have several chains open at once (“open” means that you need to add more participants to those chains to get them to the desired number of generations). There are three main kinds of events you have to handle:

1. When a new participant starts the experiment, you have to allocate them to an open chain (or deal with them some other way if there are no chains open), and avoid allocating any other new participants to the same chain until they are finished (i.e. there’s no point in having two participants both competing to be generation 3 of chain 2 or whatever).
2. When a participant completes the experiment, you need to make their output language available as the input language for the next participant in their chain.
3. If a participant drops out (which happens a lot online) you need to recycle the chain that was allocated to them, making it available to another participant.

As I mentioned above, there are a bunch of ways you could do this, but here I’ve gone for a relatively simple solution. We will store input languages as CSV files on the jspsychlearning server. Those input language files will contain a list of object-label pairs which we can easily read in to create training for a participant, or write out based on what a participant does during production testing. The file name will give the chain number and generation number - so for instance, the file chain1_g0.csv is the language of generation 0 (i.e. the initial language) for chain 1, and the top of that file looks like this:

object,label
images/o1_cB_n1.png,visivu
images/o1_cB_n2.png,kotisu
images/o1_cB_n3.png,vovaso
images/o2_cB_n1.png,kukati

You can see that this is a CSV (comma-separated) file with two columns; the object column is just the name of one of our stimulus images and the label column is a label for that object. Note that the image files have structured names too - the code doesn’t care about that, but each image file specifies an object shape (o1, o2 or o3), a colour (cB, cG or cO for blue, green and orange respectively), and a number (n1, n2 or n3, for 1, 2 or 3 objects in the image).
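The experiment code doesn’t need to unpack the image filenames, but if you wanted to analyse the languages afterwards (e.g. to check whether labels are structured by shape, colour or number) you could do something like the sketch below. parse_object_filename is a hypothetical helper I’m making up for illustration, it is not part of iterated_learning.js, and it assumes the filenames follow exactly the naming scheme described above.

```javascript
// Hypothetical helper (not in iterated_learning.js): unpack a structured image
// filename into its shape, colour and number.
function parse_object_filename(object_filename) {
  // e.g. 'images/o1_cB_n1.png' -> shape code 'o1', colour code 'cB', number code 'n1'
  var match = object_filename.match(/(o[123])_(c[BGO])_(n[123])\.png$/);
  if (match === null) {
    return null; // filename doesn't follow the naming scheme
  }
  var colour_names = {cB: 'blue', cG: 'green', cO: 'orange'};
  return {shape: match[1],
          colour: colour_names[match[2]],
          number: parseInt(match[3].substring(1))}; // lop off the 'n' prefix
}

console.log(parse_object_filename('images/o1_cB_n1.png'));
```

Returning null for filenames that don’t match is just one design choice; it makes it easy to spot stray files rather than silently mis-parsing them.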

Reading from and writing to these CSV files provides a simple way to pass a language from participant to participant - we read in the language from a CSV file to create training data, then when the participant completes the production phase we can write a new language file capturing their language, which can be read in by the next participant in the chain. You should already be familiar with the idea that we can write to CSV files using PHP - that’s what we have been doing every time we save a participant’s trial data. You have also seen one example of reading in a CSV and creating a trial list, in the confederate priming practical.

We also need a way to keep track of which chains are open, which are in progress etc. We’ll do this by moving files among the various directories you created in server_data/il.

Any files in the ready_to_iterate folder indicate chains that are ready to iterate - these are available to be allocated to a new participant who loads the experiment. Once we allocate a given generation of a given chain to a new participant (event 1 of the 3 events above), we move the input language CSV file to the undergoing_iteration folder - that stops it being allocated to anyone else while our participant is working on it.

If the participant completes the experiment (event 2 above) then we take their production output and write it as a new CSV file in ready_to_iterate, making it available for the next participant in the chain (and updating the generation number - e.g. if we train someone on the language in chain1_g0.csv, we write the language they produce to chain1_g1.csv). We also move their input language file out of undergoing_iteration and into completed_iteration. That’s mainly to keep it nice and clear which generations of which chains are currently being worked on and which are complete - you can look at the various folders on server_data/il/ and immediately see which chains are running, which are waiting for new participants, and which are done.

Finally, if our participant drops out (event 3 above), we just move the input language file they were working on back from undergoing_iteration to the ready_to_iterate folder - that makes it available again, and prevents drop-out participants clogging up our chains. Conveniently jsPsych provides a way of handling participant dropouts, allowing you to run a function to do stuff when the participant closes their browser window before completing the experiment.
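If it helps to see the three events as operations on the folders, here is a toy version that runs entirely in memory. This is only a sketch of the bookkeeping: in the real experiment the folders are directories on the server and the moves are done by PHP, and the function names here (move_file, allocate, complete, drop_out) are ones I’ve invented for the illustration.

```javascript
// In-memory stand-in for the three state folders in server_data/il.
var folders = {
  ready_to_iterate: ['chain1_g0.csv', 'chain2_g0.csv'],
  undergoing_iteration: [],
  completed_iteration: []
};

// Move a file between folders; returns false if it isn't in the source folder.
function move_file(filename, from_folder, to_folder) {
  var index = folders[from_folder].indexOf(filename);
  if (index === -1) {
    return false;
  }
  folders[from_folder].splice(index, 1);
  folders[to_folder].push(filename);
  return true;
}

// Event 1: a participant starts and is allocated an input language.
function allocate(filename) {
  return move_file(filename, 'ready_to_iterate', 'undergoing_iteration');
}

// Event 2: the participant finishes; their output becomes the next input.
function complete(input_filename, output_filename) {
  folders.ready_to_iterate.push(output_filename);
  return move_file(input_filename, 'undergoing_iteration', 'completed_iteration');
}

// Event 3: the participant drops out; recycle their input language.
function drop_out(input_filename) {
  return move_file(input_filename, 'undergoing_iteration', 'ready_to_iterate');
}

allocate('chain1_g0.csv');                  // participant A starts chain 1
allocate('chain2_g0.csv');                  // participant B starts chain 2
complete('chain1_g0.csv', 'chain1_g1.csv'); // A finishes
drop_out('chain2_g0.csv');                  // B closes the window early
console.log(folders);
```

At the end, ready_to_iterate holds chain1_g1.csv (waiting for a generation 2 participant) and chain2_g0.csv (recycled after the dropout), undergoing_iteration is empty, and completed_iteration holds chain1_g0.csv.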

All of these various actions are carried out by PHP scripts, which we can call from our jsPsych experiment. list_input_languages.php returns a list of files in the ready_to_iterate folder back to the jsPsych experiment, which we can use to figure out which chains are open and then pick a random chain for our new participant. load_input_language.php reads in a specific language file and sends it back to jsPsych in a usable format (note that this is slightly different from how we were reading in CSV files in the confederate priming code: there the reading was done in the javascript, whereas now it is done in PHP). We can use save_data.php to write participants’ output languages (as well as saving participants’ trial-by-trial data to the participant_data folder). And move_input_language.php handles the process of shuffling CSV files back and forth between our various directories.

Digging in to the code: updating our save_data function

Now you (hopefully) get the general idea, we can have a look at some more detailed aspects of the code. The first thing to flag up is that I have changed the save_data.php script a bit, and also changed the save_data function in the javascript code, to make it a bit more general. In the old version of our save_data.php it was hard-wired to write to a specific sub-folder in my server_data folder, which was bad for two reasons: one, if you forgot to edit the PHP script to point to your server_data folder, all your data, weird voice recordings etc appear in my server_data folder and frighten/confuse me; two, for the new experiment we want to save stuff to two different sub-folders of server_data/il (saving participant data to server_data/il/participant_data and output languages to server_data/il/ready_to_iterate), and we really don’t want to have to write two different PHP scripts which are trivially different from each other just to handle that.

The solution is to make the save_data.php and save_data javascript functions a bit more general - we pass in information about which user’s server_data directory to use (ksmith7 for me, s… for you), and also which directory to save the data in (which avoids us having to create different PHP scripts for saving in slightly different directories). The new more general code looks like this:

var myUUN = 'ksmith7'

function save_data(directory,filename,data){
  var url = 'save_data.php';
  var data_to_send = {user: myUUN, directory: directory, filename: filename, filedata: data};
  fetch(url, {
      method: 'POST',
      body: JSON.stringify(data_to_send),
      headers: new Headers({
              'Content-Type': 'application/json'
      })
  });
}

In particular, we now have to specify myUUN (to point the PHP script to the correct user’s server_data directory) and pass more complex information over to the PHP script (in data_to_send) - not just the filename and data, but also the user info and the directory to save in. The other implementational stuff (the fetch command, the JSON stuff, etc) is the same as in the old version, so you don’t have to worry about those details.

Our save_iterated_learning_data function (which is the next bit in the code) then uses this new save_data function to save participant trial-by-trial data to the participant_data folder. But we’ll also use the same function to write final output languages to the ready_to_iterate folder.
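To see what actually gets sent to save_data.php in the two cases, here is a small sketch that builds the same kind of payload as save_data but just prints it rather than contacting the server. build_save_payload is a hypothetical helper for illustration only, and the filenames and file contents are made up.

```javascript
var myUUN = 'ksmith7';

// Builds the same JSON body that save_data sends, split out into its own
// function so we can inspect it without a server. (Hypothetical helper.)
function build_save_payload(directory, filename, data) {
  return JSON.stringify({user: myUUN, directory: directory,
                         filename: filename, filedata: data});
}

// Made-up filenames and contents, purely for illustration.
var trial_data_payload = build_save_payload('participant_data',
                                            'example_participant.csv',
                                            'some,trial,data\n');
var language_payload = build_save_payload('ready_to_iterate',
                                          'chain1_g1.csv',
                                          'object,label\n');
console.log(trial_data_payload);
console.log(language_payload);
```

The only thing that differs between the two calls is the directory (plus the filename and data, obviously), which is why one PHP script can handle both jobs.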

Calling PHP scripts to do various things with input language files

The next chunk of code sets out 4 functions which do all our manipulation of language CSV files for us. The first two functions, list_input_languages and read_input_language, have the same structure - they use fetch to run a PHP script on the server, and then receive back a response from the PHP script, which they do a little formatting on (to turn the data into something we can work with). Because these functions interact with a PHP script which is reading data files on the server, we have to set them up as async (asynchronous) functions, and tell them to await the response from the PHP server before doing anything. I mentioned this async/await stuff briefly at the end of the confederate priming practical, but to recap: fetching data from the server via PHP takes some time - only a fraction of a second, so it appears instantaneous to us, but for the computer this is very slow. Rather than wait for the fetch command to finish, your browser tries to press on and run the rest of the code - if you are used to ‘normal’ programming languages like python, that run things one step at a time, this is a very weird behaviour that takes some time to get used to! In this particular case, trying to carry on while the fetch function goes off and does its job is a bad idea, since we actually need to get the response back from the PHP script before we can continue - running off ahead before the fetch returns the data we need will cause our code to break, because until the fetch command returns its data we can’t actually process it!

There are various solutions to this problem, but I think the simplest one is to use the async and await keywords (which are part of newer versions of javascript). This allows us to declare some functions as async (i.e. asynchronous: the function contains steps that take a while to complete, and the rest of the program doesn’t have to stop while they run), and then use await inside those functions to tell the browser to wait for a certain operation to complete before moving on to the next line. This means we can wait until the fetch command has done its job and got the data we need.
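If you want to see the difference that await makes without involving a server at all, here’s a minimal sketch. slow_server_request is a made-up stand-in for a fetch call: it just waits 50 milliseconds and then hands back a list of filenames.

```javascript
// Stand-in for a slow server request: resolves with a list of filenames after
// a short delay. (Hypothetical; the real code uses fetch to call a PHP script.)
function slow_server_request() {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve(['chain1_g0.csv', 'chain2_g0.csv']);
    }, 50);
  });
}

// Without await we get a Promise back straight away, not the list itself.
var not_waited = slow_server_request();
console.log(Array.isArray(not_waited)); // false: the data hasn't arrived yet

// With await (only allowed inside an async function) this function pauses
// until the Promise resolves, so language_list is the actual list.
async function count_available_languages() {
  var language_list = await slow_server_request();
  return language_list.length;
}

// An async function itself returns a Promise, so at the top level we use
// .then to do something with the result once it's ready.
count_available_languages().then(function (n) {
  console.log(n + ' languages available');
});
```

Note that the "false" is printed first, even though the "languages available" line depends on a request that was started only a moment later: code outside the async function presses on without waiting.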

Here are the two functions that have this await fetch structure (with some irrelevant details given as … to save your eyes):

async function list_input_languages(){
  var data_to_send = {user: myUUN}; //we need to send over myUUN so the PHP looks in the correct user directory
  var response = await fetch('list_input_languages.php', {
      ...});
  // various steps to convert the string returned by the PHP script into
  // something we can work with
  var text_response = await response.text()
  var object_response = JSON.parse(text_response)
  var language_file_list = Object.values(object_response)
  return language_file_list
}


async function read_input_language(input_language_filename){
  var data_to_send = {user:myUUN,filename: input_language_filename}; //we need to send myUUN and input_language_filename to the PHP script
  var response = await fetch('load_input_language.php', {
      ...});
  var input_language_as_text = await response.text()
  var input_language = JSON.parse(input_language_as_text)
  return input_language
}

You can see that in both cases we put together the data we need to give the PHP script (in the data_to_send variable) - this is the username and maybe the language filename. Then we call the relevant PHP script via fetch, and await the result. When the PHP script returns some data (which we store in response) we then do some additional processing, which involves digging out the bit of the response we need (we can access it in response.text()) and then doing some additional formatting stuff until we can eventually return the information we were after. In the case of list_input_languages what eventually gets returned is a list of the CSV files in the server_data/il/ready_to_iterate folder; in the case of read_input_language, we tell it specifically which CSV file to read from the ready_to_iterate folder and it eventually gives us back a nice javascript representation of the contents of the file, in the form of a list of javascript objects - so the first few rows of chain1_g0.csv I showed you above would be read in as:

[
{object:'images/o1_cB_n1.png',label:'visivu'},
{object:'images/o1_cB_n2.png',label:'kotisu'},
{object:'images/o1_cB_n3.png',label:'vovaso'},
{object:'images/o2_cB_n1.png',label:'kukati'}
]

We can then use this list of object-label pairs to build training and testing timelines.
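In our experiment the conversion from CSV text to a list of objects happens on the server, in load_input_language.php, which I’m not showing you. But to give you a feel for what that conversion involves, here is a rough javascript equivalent. This is a sketch I’ve written for illustration, not a translation of the PHP script, and csv_text_to_language is a made-up name. It splits naively on commas, which is only safe because our labels are built from syllables and never contain commas.

```javascript
// Hypothetical javascript equivalent of the CSV-to-objects conversion (the
// real work is done server-side by load_input_language.php).
function csv_text_to_language(csv_text) {
  var lines = csv_text.trim().split('\n');
  var column_names = lines[0].split(','); // header row: object,label
  var language = [];
  for (var line of lines.slice(1)) {
    var values = line.split(',');
    var row = {};
    // pair each column name with the value in the same position
    for (var i = 0; i < column_names.length; i++) {
      row[column_names[i]] = values[i];
    }
    language.push(row);
  }
  return language;
}

var example_csv = 'object,label\n' +
                  'images/o1_cB_n1.png,visivu\n' +
                  'images/o1_cB_n2.png,kotisu\n';
console.log(csv_text_to_language(example_csv));
```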

The third function in this section, move_input_language, basically follows the same idea - we bundle up some info and ask a PHP script to do a job for us - but in this case we don’t need to wait for any information back from the PHP script (we just assume it moved the file from from_folder to to_folder for us) so we can immediately move on, without any need for the async and await stuff. Here’s the function:

function move_input_language(input_language_filename,from_folder,to_folder){
  var data_to_send = {user:myUUN,filename: input_language_filename, source:from_folder,destination:to_folder};
  fetch('move_input_language.php', {
      ...});
}

Finally, save_output_language is called when a participant completes the production phase, and we use it to save their set of object-label pairs (which we build during the production phase) to a new file in server_data/il/ready_to_iterate.

function save_output_language(object_label_list) {
  var output_string = "object,label\n" //column headers plus a new line
  for (var object_label_pair of object_label_list) {//for each object_label_pair
    output_string = output_string + object_label_pair.object + "," + object_label_pair.label + "\n"
  }
  var output_file_name = 'chain' + chain + '_g' + generation + '.csv'
  save_data('ready_to_iterate',output_file_name, output_string)
}

object_label_list is in the same format we end up with when we read in a language from an input language file, i.e. something like this:

[
{object:'images/o1_cB_n1.png',label:'visivu'},
{object:'images/o1_cB_n2.png',label:'kotisu'},
{object:'images/o1_cB_n3.png',label:'vovaso'},
{object:'images/o2_cB_n1.png',label:'kukati'}
]

We simply work through that list, building a CSV-formatted string (commas between object and label, newline "\n" after each label), and then write it to a file using save_data. The output file name follows our usual format, i.e. chainX_gY.csv, where this participant’s chain and generation number are stored in variables chain and generation that we create below.

Observation and production trials

As per usual, we need to define a function to build observation trials (where the participant sees a picture plus label) and production trials (the participant sees an image, produces a label). Observation trials are based on our usual code from the practical on word learning, so I won’t go over it in detail here. We will take a look at production trials though, because there are a couple of things to note.

First, since we are providing participants with a limited set of syllables to build labels from, we need to list the syllables they can use.

var available_syllables = jsPsych.randomization.shuffle(["ti","ta","to","tu",
                                                         "ki","ka","ko","ku",
                                                         "si","sa","so","su",
                                                         "vi","va","vo","vu"])

Notice that I am shuffling the list of syllables once: this means that in production trials the syllables will appear on-screen in random order, but that order will be consistent throughout the experiment. Shuffling the syllables every trial would be an option, but that makes it very hard work for the participant, who has to hunt for the syllable they want in a different place on every trial.

Second, we need to keep track of the labels the participant produces as they work through the production phase, so that if/when they complete the production phase, we can save their final language to use as input for another participant. We will store their building list of productions to participant_final_label_set, which is initially just an empty list.

var participant_final_label_set = []

Finally, we define a function, make_production_trial, which takes an object_filename (e.g. images/o1_cG_n2.png) and creates a production trial for that object. The production trial uses a new plugin I created, which is based on the standard image-button-response plugin but allows participants to click repeatedly to build a complex label. Participants’ choices are the available_syllables we created above, plus buttons labelled DELETE (to delete the last-selected syllable) and DONE (to move to the next trial). The DONE button only works if the participant has built a label with at least one syllable, to prevent null responses that would cause problems in iteration (alternatively we could allow null responses, but then we’d have to be on the lookout for them when creating training data from an input language file).

Here’s the function:

function make_production_trial(object_filename) {
  var trial = {type:'image-button-buildlabel-response',
               stimulus:object_filename,
               stimulus_height:150,
               choices:[].concat(available_syllables,["DELETE","DONE"]), //add the DELETE and DONE buttons
               data:{block:'production'},
               on_finish: function(data) {
                 participant_final_label_set.push({object:object_filename,label:data.label})
                 save_iterated_learning_data(data)}}
  return trial
}

Note that at the end of the trial (on_finish) we do two things: we add a representation of the object-label pair the participant produced to participant_final_label_set:

participant_final_label_set.push({object:object_filename,label:data.label})

data.label is created by the image-button-buildlabel-response plugin and is the final label the participant produced (i.e. their label when they clicked DONE). We also save the trial data to the server as usual, using save_iterated_learning_data, to keep a more detailed record of the participant’s response (including e.g. total time to complete the trial).
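If you are wondering how a sequence of button clicks ends up as a single label, the logic is roughly as in the sketch below. To be clear, this is not the plugin’s actual code (which also has to handle drawing the buttons, timing and so on), and build_label_from_clicks is a made-up function; it just implements the behaviour described above for syllable buttons, DELETE and DONE.

```javascript
// Hypothetical sketch of the label-building logic; NOT the plugin's own code.
// Takes the sequence of buttons clicked and returns the label at the point
// DONE is accepted, or null if DONE is never accepted.
function build_label_from_clicks(clicks) {
  var syllables = [];
  for (var click of clicks) {
    if (click === 'DELETE') {
      syllables.pop(); // remove the last-selected syllable (no effect if empty)
    } else if (click === 'DONE') {
      if (syllables.length > 0) {
        return syllables.join(''); // DONE only works with at least one syllable
      }
      // otherwise ignore the DONE click and keep going
    } else {
      syllables.push(click); // an ordinary syllable button
    }
  }
  return null;
}

console.log(build_label_from_clicks(['vi', 'ko', 'DELETE', 'si', 'vu', 'DONE'])); // visivu
```

In that example the participant clicks vi, then ko, changes their mind and deletes ko, then adds si and vu before clicking DONE.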

We can then use this make_production_trial function (and the equivalent make_observation_trial function) to create a trial list of observation and production trials. That’s what the next block of code does: build_training_timeline and build_testing_timeline both take an input language specified as a list of {object:object_filename,label:a_label} object-label pairs, and build a training or testing timeline.

build_training_timeline takes a list of object-label pairs and builds a training timeline consisting of n_repetitions blocks (n_repetitions is set to 1 below, so only one repetition of each training item) - each block contains one observation trial for each object-label pair in object_label_pairs.

function build_training_timeline(object_label_pairs,n_repetitions) {
  var training_timeline = [] //build up our training timeline here
  //this for-loop works through the n_repetitions blocks
  for (var i=0;i<n_repetitions;i++) {
    //randomise order of presentation in each block
    var shuffled_object_label_pairs = jsPsych.randomization.shuffle(object_label_pairs)
    //in each block, present each object-label pair once
    for (var object_label_pair of shuffled_object_label_pairs) {
      var trial = make_observation_trial(object_label_pair.object,object_label_pair.label)
      training_timeline.push(trial)
    }
  }
  return training_timeline
}

build_testing_timeline takes a list of object-label pairs and builds a testing timeline with one production trial for each object in object_label_pairs, in random order. Note that the labels are simply discarded here - to create a production trial we just need the object filename.

function build_testing_timeline(object_label_pairs) {
  var testing_timeline = []
  var shuffled_object_label_pairs = jsPsych.randomization.shuffle(object_label_pairs)
  for (var object_label_pair of shuffled_object_label_pairs) {
    var trial = make_production_trial(object_label_pair.object)
    testing_timeline.push(trial)
  }
  return testing_timeline
}

Putting it all together

Finally we are in a position to put all of these functions together. The function run_experiment(), code below, runs through a 9-step process of looking for open chains, selecting an input filename to iterate from, building timelines and running the experiment, then saving an output language for the next generation to iterate from. The 9 steps are:

We see if there are any input languages available for iteration. If not, we just tell the participant to come back later. If there is at least one input language available, we proceed.
We select a random input language to use. The name of this file tells us what chain and generation we are running (e.g. if the filename is chain10_g7.csv we know we are running generation 7 of chain 10), so we can extract that info from the filename (extracting this info from the filename is a little bit fiddly).
We read in the input language from the appropriate file.
We use that input language to generate training trials for this participant. We impose a bottleneck on transmission by taking a subset of the language of the previous generation (here, 14 randomly-selected object-label pairs) and using that to build the training timeline (here, repeating each of those object-label pairs once).
We also use that input language to build a testing timeline, requiring the participant to do a production trial for all possible objects (i.e. not just the 14 we selected for training - they have to generalise).
We build the full experiment timeline, combining the training and testing timelines with the various information screens which we defined earlier (I skipped the information trials in the code walkthrough, they are just the usual html-keyboard-response trials).
We move the input language file we are using from server_data/il/ready_to_iterate to server_data/il/undergoing_iteration, so that another participant doesn’t also start working on this input language.
We run the timeline
a. If the participant completes the experiment (i.e. gets to the end of the production phase), we save the language they produced during production as a new input language in server_data/il/ready_to_iterate and also move the input language they were trained on to server_data/il/completed_iteration, so that we know it’s been done.
b. If the participant abandons the experiment we need to recycle their input language - they haven’t completed the experiment, so we need someone else to run this generation of this chain. We simply move the input language file they were working on back to the server_data/il/ready_to_iterate folder. Note that you can capture this kind of exit event in jsPsych using the on_close parameter of jsPsych.init. NB Some of you have reported that this method for recycling input languages is a bit unreliable - sometimes it works, sometimes it doesn’t. This might be an issue with on_close (to be honest I’m a bit surprised it’s possible to run functions in a browser window the user is trying to close!), or maybe my implementation of the file move function. Anyway, if you notice when testing that files are getting stuck in undergoing_iteration then you can manually move them back to ready_to_iterate using cyberduck. If you are planning on using this code to run a real iterated learning experiment with real participants, and can’t find a way to make this 100% reliable, you’d need to have some kind of clean-up procedure where you check for files that are stuck in undergoing_iteration. That could be automated, e.g. in a PHP script that looks in undergoing_iteration for files that have been there for more than a set time limit and moves those old files back to ready_to_iterate; you’d then run that script every time you ran the code, before the step of looking in ready_to_iterate to see if there’s anything available. But that requires yet another PHP script, so for our purposes here just keep an eye on it and use the manual fix if it becomes a problem. And if you figure out why it doesn’t work (i.e. under what conditions on_close fails) do let me know!
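Two of the steps above are fiddly enough to be worth trying out on their own: getting the chain and generation out of the filename, and selecting the random subset of the input language that forms the bottleneck. The sketch below shows one way of doing each. These are stand-alone illustrations rather than the functions used in iterated_learning.js: parse_language_filename and select_bottleneck_subset are names I’ve made up, the regular expression is an alternative to splitting the filename, the shuffle is written out by hand so the example runs without jsPsych, and the 27-item language is just made-up placeholder data.

```javascript
// Pull chain and generation numbers out of a filename like chain10_g7.csv.
// Returns null if the filename doesn't match the expected pattern.
function parse_language_filename(filename) {
  var match = filename.match(/^chain(\d+)_g(\d+)\.csv$/);
  if (match === null) {
    return null;
  }
  return {chain: parseInt(match[1]), input_generation: parseInt(match[2])};
}

// Impose a bottleneck by picking n_items pairs at random, without replacement.
// Uses a Fisher-Yates shuffle on a copy, so the input list is left untouched.
function select_bottleneck_subset(object_label_pairs, n_items) {
  var copy = object_label_pairs.slice();
  for (var i = copy.length - 1; i > 0; i--) {
    var j = Math.floor(Math.random() * (i + 1));
    var temp = copy[i];
    copy[i] = copy[j];
    copy[j] = temp;
  }
  return copy.slice(0, n_items);
}

var info = parse_language_filename('chain10_g7.csv');
// this participant's generation is the input generation + 1
console.log(info.chain, info.input_generation + 1); // 10 8

// A made-up language with placeholder object names and labels.
var full_language = [];
for (var k = 0; k < 27; k++) {
  full_language.push({object: 'object' + k, label: 'label' + k});
}
var training_subset = select_bottleneck_subset(full_language, 14);
console.log(training_subset.length); // 14
```

Because the subset is taken from a shuffled copy, the full language is still intact afterwards, which matters because the testing timeline needs all the objects and not just the ones selected for training.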

Here’s the code with those 9 steps marked up in the comments. In various places we need to know what chain and generation we are running (e.g. for saving the participant’s final language to a file with the correct name) - we store this info in two variables, chain and generation, which we update once we allocate the participant to a specific chain. Note that in some places we also need to await the response from a PHP script that’s retrieving some info from the server for us.

var chain
var generation

async function run_experiment() {

  //1. We see if there are any input languages available for iteration
  var available_input_languages = await list_input_languages()

  //...If not, we just tell the participant to come back later (using the cannot_iterate_info html-keyboard-response trial created above)
  if (available_input_languages.length == 0) {
    jsPsych.init({timeline: [cannot_iterate_info]})
  }

  //...If there are, we proceed.
  else {

    //2. We select a random input language to use.
    var input_language_filename = jsPsych.randomization.shuffle(available_input_languages)[0]
    //...The name of this file tells us what chain and generation we are running
    //To retrieve generation and chain info from filename, split the filename at _ and .
    var split_filename = input_language_filename.split(/_|\./)
    //chainX will be the first item in split_filename, just need to lop off the 'chain' prefix and convert to integer
    chain = parseInt(split_filename[0].substring(5))
    //gY will be the second item in split_filename, just need to lop off the 'g' prefix and convert to integer
    var input_generation = parseInt(split_filename[1].substring(1))
    //*This* generation will be the input language generation + 1
    generation = input_generation + 1

    // 3. We read in the input language from the appropriate file.
    var input_language = await read_input_language(input_language_filename)

    // 4. We use that input language to generate training trials for this participant.
    // We impose a bottleneck on transmission by taking a subset of the language
    // of the previous generation (here, 14 randomly-selected object-label pairs)
    // and using that to build the training timeline (here, repeating each of those
    // object-label pairs once)
    var training_object_label_pairs = jsPsych.randomization.sampleWithoutReplacement(input_language,14)
    // Note just one repetition of each label in training, just to keep the experiment duration down for you!
    var training_timeline = build_training_timeline(training_object_label_pairs,1)

    // 5. We use that input language to build a testing timeline, requiring the participant
    // to do a production trial for each object.
    var testing_timeline = build_testing_timeline(input_language)

    // NB I am creating a tidy-up trial, to run when the participant completes the production
    // phase, at this point, so it looks out of sequence! I could have done this in the
    // on_finish of the last production trial, but it seemed simpler to do it as a
    // stand-alone event in the timeline, using the call-function trial type.
    // 9a. If the participant completes the experiment (i.e. gets to the end of the production
    // phase), we save the language they produced during production as a new input language
    // in server_data/il/ready_to_iterate and also move the input language they were trained on to
    // server_data/il/completed_iteration, so that we know it's been done.
    var tidy_up_trial = {type:'call-function',
                         func: function() {
                            save_output_language(participant_final_label_set)
                            move_input_language(input_language_filename,'undergoing_iteration','completed_iteration')}}

    // 6. We build the full experiment timeline, combining the training and testing timelines
    // with the various information screens.
    var full_timeline = [].concat(consent_screen,
                                  instruction_screen_observation,
                                  training_timeline,
                                  instruction_screen_testing,
                                  testing_timeline,
                                  tidy_up_trial,
                                  final_screen
                                )
    // 7. We move the input language file we are using from server_data/il/ready_to_iterate to
    // server_data/il/undergoing_iteration, so that another participant doesn't
    // also start working on this input language.
    move_input_language(input_language_filename,'ready_to_iterate','undergoing_iteration')

    // 8. We run the timeline
    jsPsych.init({
        timeline: full_timeline,
        // 9b. If the participant abandons the experiment we need to recycle their input language -
        // they haven't completed the experiment, so we need someone else to run this generation
        // of this chain. We simply move the input language file they were working on back to the
        // server_data/il/ready_to_iterate folder. Note that you can capture this kind of exit event
        // in jsPsych using the on_close parameter of jsPsych.init.
        on_close: function() {move_input_language(input_language_filename,'undergoing_iteration','ready_to_iterate')},
        on_finish: function(){jsPsych.data.displayData('csv')}
    });
  }
}
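To make the filename parsing in step 2 concrete, here is the same logic pulled out into a stand-alone function that you can try in the console. The function name parse_input_language_filename and the example filename chain3_g7.csv are my own choices for illustration; the chainX_gY pattern is the one described in the comments above, and I'm assuming a .csv extension.

```javascript
// Stand-alone version of the filename parsing in step 2 (illustrative only).
// Assumes filenames look like chainX_gY.csv, e.g. chain3_g7.csv
function parse_input_language_filename(filename) {
  // Split at _ and . so 'chain3_g7.csv' becomes ['chain3', 'g7', 'csv']
  var split_filename = filename.split(/_|\./);
  // Lop off the 5-character 'chain' prefix and convert to integer
  var chain = parseInt(split_filename[0].substring(5), 10);
  // Lop off the 1-character 'g' prefix and convert to integer
  var input_generation = parseInt(split_filename[1].substring(1), 10);
  // The participant learning from this file is at the next generation
  return {chain: chain,
          input_generation: input_generation,
          generation: input_generation + 1};
}

var parsed = parse_input_language_filename('chain3_g7.csv');
console.log(parsed); // {chain: 3, input_generation: 7, generation: 8}
```

Note that this relies on the filename following the pattern exactly: a file with a different prefix or extra underscores would give NaN or the wrong numbers, so it's worth keeping the naming scheme strict.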

An additional thing to note about this code: I have not implemented the deduplication filter - I figured the code was complicated enough! If you want to implement this (it’s an optional and challenging exercise this week) you will need two extra steps:

Before carrying out step 9a, saving the participant’s produced language to the ready_to_iterate folder, you need to check it is usable, i.e. contains enough distinct labels. If so, you proceed as normal; if not, you recycle their input language (in the same way as if they had abandoned), so that another participant can attempt this generation of the chain.
On step 4, selecting object-label pairs to use for training, you would need to select in a way that avoids duplicate labels, rather than selecting randomly.
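If you want a starting point, the two extra steps could look something like the sketch below. This is a rough sketch under assumptions, not Beckner et al.'s procedure and not part of the practical code: I'm assuming a language is an array of objects with object and label properties (the real format in the practical code may differ), and the function names and the min_distinct threshold are hypothetical.

```javascript
// Hypothetical sketch of the two deduplication steps.
// Assumes a language is an array like [{object: 'o1', label: 'vuka'}, ...]

// Extra step before 9a: does the produced language contain enough distinct labels?
function count_distinct_labels(language) {
  var labels = new Set(language.map(function (pair) { return pair.label; }));
  return labels.size;
}

function is_usable_language(language, min_distinct) {
  return count_distinct_labels(language) >= min_distinct;
}

// Extra step at 4: walk through an (already shuffled) language and keep at most
// n pairs, skipping any pair whose label we have already selected
function select_without_duplicate_labels(shuffled_language, n) {
  var seen = new Set();
  var selected = [];
  for (var i = 0; i < shuffled_language.length && selected.length < n; i++) {
    var pair = shuffled_language[i];
    if (!seen.has(pair.label)) {
      seen.add(pair.label);
      selected.push(pair);
    }
  }
  return selected;
}

// Made-up example: two objects share the label 'vuka'
var example_language = [
  {object: 'o1', label: 'vuka'},
  {object: 'o2', label: 'vuka'},
  {object: 'o3', label: 'nopi'},
  {object: 'o4', label: 'tama'}
];
var example_selection = select_without_duplicate_labels(example_language, 3);
console.log(count_distinct_labels(example_language)); // 3
console.log(example_selection.map(function (pair) { return pair.object; }));
```

In the real experiment you would shuffle the input language first (e.g. with jsPsych.randomization.shuffle) before passing it to the selection function, otherwise you would always keep the first object for each duplicated label.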

Also note that there is no maximum generation number in this code - chains will run forever! If you want to stop at e.g. 10 generations, this could also be implemented in step 9a - check this participant’s generation number, if they are at generation 10 then don’t save their lexicon to the ready_to_iterate folder.
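The check itself is tiny. Here is a minimal sketch, assuming a hypothetical max_generation setting and helper function (neither exists in the practical code):

```javascript
// Hypothetical sketch of a maximum-generation check for step 9a.
// max_generation is an assumed setting, not something in the practical code.
var max_generation = 10;

// Returns true if this participant's language should be saved as input
// for a further generation, false if the chain stops here
function chain_should_continue(generation, max_generation) {
  return generation < max_generation;
}

console.log(chain_should_continue(9, max_generation));  // true
console.log(chain_should_continue(10, max_generation)); // false
```

You would wrap the call to save_output_language in the tidy-up trial in this check, using the global generation variable. The input language should presumably still be moved to completed_iteration either way, since the generation 10 participant did complete the experiment.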

The final line of the code simply runs this run_experiment() function, starting the whole 9-step process described above.

Exercises with the iterated learning experiment code

After changing the myUUN variable and setting up the various folders in server_data, run the experiment and use cyberduck to watch the CSV files appearing and moving around in server_data/il. Experiment with abandoning the experiment part-way through (i.e. closing the browser window) and see what happens. Look at the CSV data files that get created in various places, and check that the contents of the data files make sense and how they relate to what you see as a participant. Try to run a few generations of at least one chain and check that the iteration process works as you expect.
How would you increase the number of training trials in the observation phase of the experiment to provide e.g. 6 passes through the training set? How would you increase or decrease the size of the transmission bottleneck?
How would you randomise the order of the syllables on production trials separately for every production trial? Do you think that is better or worse? How about if you don’t randomise them at all? Have a think about the possible consequences of these various randomisation choices.
How could you insert a small number of test trials after each block of training trials, to keep the participant paying attention?
[Harder, optional] Can you add a maximum generation number, so no chain goes beyond e.g. 10 generations?
[Harder, optional] Could you add a manipulation of production effort to this experiment, borrowing code from the dyadic interaction practical? There’s actually already a bit of production effort involved here, in that participants have to click more times to build a longer label, but could you add an additional “click repeatedly to finish” trial to really ramp up a preference for shorter labels? What effect do you think this would have over iteration?
[Very hard, very optional] Can you implement a deduplication filter like that used by Beckner et al., to avoid presenting participants with ambiguous duplicate labels (where two distinct visual stims map to the same label)?

References

Beckner, C., Pierrehumbert, J., & Hay, J. (2017). The emergence of linguistic structure in an online iterated learning task. Journal of Language Evolution, 2, 160–176.

Re-use

All aspects of this work are licensed under a Creative Commons Attribution 4.0 International License.
