Distractor Intrusions

Pool sizes 10, 15, 30 from Experiment 3

The data/raw directory is not included in the GitHub repo because of its size, but it is available on osf.io.
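Since the raw files have to be fetched separately, a quick existence check before running the rest of the script can save a confusing file-not-found error later. The sketch below is only an illustration; `has_raw_data` is a hypothetical helper name, not something defined in the repo.

```r
# Hypothetical helper (not part of the original script):
# returns TRUE if the raw data directory is present.
has_raw_data <- function(path = "data/raw") {
  dir.exists(path)
}

if (!has_raw_data()) {
  message("data/raw not found; download it from osf.io before running this script")
}
```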

library(plyr)  # provides ddply(), used below

# getUniqueWords() is a project helper that is not defined in this script.
# Build one distractor word list per task folder (1 to 6).
dist_pool = lapply(1:6, function(i)
  getUniqueWords(sprintf("data/raw/E9poolb/task/%d/genwordsout.txt", i)))

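# Keep only the pool-size groups and tasks analysed here
groups = subset(read.csv('data/groups.csv'), poolsize %in% c(10, 15, 30))
dat = subset(read.csv('data/1_scored_trial.csv'), task %in% paste0("R.pool", c(10, 15, 30)))
dat = merge(dat, groups, by.x="Subject", by.y="subid")

# Flag a response as a distractor intrusion if it appears in the distractor
# pool for that subject's condition (the [[ ]] lookup needs a single cond value per subject)
scored_dat = ddply(dat, .(Subject), transform, disterr = resp %in% dist_pool[[unique(cond)]])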
dim(scored_dat)

## [1] 3679   25
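To make the scoring rule concrete, here is a minimal base R sketch of the same idea on toy data: a response counts as a distractor intrusion when it appears in the distractor pool for that row's condition. The words, subjects, and `toy_*` names are invented for illustration and are not taken from the experiment.

```r
# Toy distractor pools, one per condition (invented values)
toy_pool <- list(c("coal", "tend"), c("sun", "book"))

# Toy responses (invented values)
toy <- data.frame(
  Subject = c(1, 1, 2, 2),
  cond    = c(1, 1, 2, 2),
  resp    = c("coal", "leaf", "coal", "sun"),
  stringsAsFactors = FALSE
)

# Check each response against the pool for its own condition
toy$disterr <- mapply(function(r, k) r %in% toy_pool[[k]],
                      toy$resp, toy$cond, USE.NAMES = FALSE)
toy
```

Note that "coal" is an intrusion for subject 1 but not for subject 2, because it only belongs to the pool for condition 1.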

Intrusion errors are more frequent on dissimilar (D) trials

err_tab = ddply(scored_dat, .(poolsize), with, tapply(disterr, trialtype, sum))
err_tab

##   poolsize  D S
## 1       10 20 2
## 2       15 19 2
## 3       30 19 2

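What if intrusions were equally likely to be labeled D or S? Under that assumption, the probability of more than 19 D labels out of 21 intrusions (pool sizes 15 and 30), and of more than 20 out of 22 (pool size 10), would be: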

1 - pbinom(19, 21, .5)

## [1] 1.049042e-05

1 - pbinom(20, 22, .5)

## [1] 5.483627e-06
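Note that `1 - pbinom(k, n, p)` gives the probability of strictly more than `k` successes. If the quantity of interest is the probability of a count at least as large as the one observed (19 of 21), the cutoff shifts by one. The sketch below compares the two and cross-checks against `binom.test`; the variable names are illustrative only.

```r
# P(X > 19) for X ~ Binomial(21, 0.5), as computed above
p_more_than_19 <- 1 - pbinom(19, 21, 0.5)

# P(X >= 19): include the observed count itself in the tail
p_at_least_19 <- pbinom(18, 21, 0.5, lower.tail = FALSE)

# One-sided exact binomial test reports the same inclusive tail
bt <- binom.test(19, 21, p = 0.5, alternative = "greater")

c(strict = p_more_than_19, inclusive = p_at_least_19, binom_test = bt$p.value)
```

Either way the probability is very small, so the conclusion that D trials attract more intrusions does not hinge on which tail is used.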

Are distractor intrusions (DIs) made by a few subjects or by many?

There are 21 subjects with DIs. As an example, all DIs from the pool size 10 condition are listed below.

subset(scored_dat, poolsize==10 & disterr)[c('Subject', 'TBR', 'resp', 'trialnum', 'trialtype')]

##      Subject    TBR   resp trialnum trialtype
## 3         17   leaf   coal        4         D
## 4         17  train   tend        4         D
## 27        17    hat   coal        9         D
## 78        17   bone   cook       21         S
## 86        17   deck   book       23         D
## 87        17  slice    ate       23         D
## 89        17   mare   coal       24         D
## 94        17   tree stress       25         D
## 689       27  beach    sun        6         D
## 1139      31    hat    ate        4         D
## 1200      31   lane   tend       18         D
## 1224      31   bell   book       23         D
## 1246      34   town   coal        4         D
## 1248      34 street stress        4         D
## 1251      34  trash stress        5         D
## 1755      39    ear    sun       16         S
## 3157      58 square stress        8         D
## 3226      58  brain    ate       23         D
## 3238      58  slice stress       26         D
## 3416      60   hair    sun       17         D
## 3434      60          side       21         D
## 3458      60 ground   hard       26         D
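A per-subject tally makes the spread easier to see than the raw listing. The sketch below re-enters the Subject column from the pool size 10 listing above by hand so that it runs on its own; with the real data, something like `table(subset(scored_dat, poolsize == 10 & disterr)$Subject)` should give the same counts.

```r
# Subject IDs copied from the pool size 10 listing above (one entry per DI)
di_subjects <- c(rep(17, 8), 27, rep(31, 3), rep(34, 3), 39, rep(58, 3), rep(60, 3))

# Number of DIs per subject
per_subject <- table(di_subjects)
per_subject

length(per_subject)  # number of distinct subjects with at least one DI: 7
max(per_subject)     # most DIs from a single subject: 8 (subject 17)
```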

Experiment 4 (names)

dnames = subset(read.csv('data/1_scored_trial.csv'), task %in% paste0('Rspan.names.', c('long', 'short')))
dist_names = c(getUniqueWords("data/raw/E11names/task/1/genwordsout.txt"), 
               getUniqueWords("data/raw/E11names/task/2/genwordsout.txt"))
dist_names

##  [1] "benjamin"    "timothy"     "jessica"     "christopher" "kimberly"   
##  [6] "rebecca"     "brianna"     "abigail"     "joshua"      "nicholas"   
## [11] "jess"        "brianne"     "abby"        "becca"       "chris"      
## [16] "nick"        "kim"         "ben"         "josh"        "tim"

scored_names = ddply(dnames, .(Subject), transform, disterr = resp %in% dist_names)

No intrusion errors

ddply(scored_names, .(task), with, tapply(disterr, trialtype, sum))

##                task D S
## 1  Rspan.names.long 0 0
## 2 Rspan.names.short 0 0
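One thing that may be worth ruling out with a zero count: `%in%` matches strings exactly, and the distractor names above are all lowercase. If responses were recorded with capital letters or stray whitespace (an assumption; the response format is not shown here), real intrusions would be missed. A minimal sketch of a normalized comparison, using invented toy responses and a hypothetical `normalize` helper:

```r
# Toy values for illustration only
dist_names_toy <- c("benjamin", "ben")
resp_toy <- c("Ben", " benjamin", "tom")

# Exact matching misses responses that differ only in case or whitespace
exact_match <- resp_toy %in% dist_names_toy

# Hypothetical helper: lowercase and trim before comparing
normalize <- function(x) tolower(trimws(x))
normalized_match <- normalize(resp_toy) %in% dist_names_toy

data.frame(resp = resp_toy, exact_match, normalized_match)
```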

title: "2_distractor_intrusions.R" author: "machow" date: "Wed Jan 13 12:42:01 2016"