Distractor Intrusions
Pool sizes 10, 15, and 30 from Experiment 3
data/raw is not included in the GitHub repo because of its size, but it is available on osf.io
# One distractor pool per task directory (1 through 6), in order
dist_pool = lapply(1:6, function(i)
  getUniqueWords(sprintf("data/raw/E9poolb/task/%d/genwordsout.txt", i)))
groups = subset(read.csv('data/groups.csv'), poolsize %in% c(10, 15, 30))
dat = subset(read.csv('data/1_scored_trial.csv'), task %in% paste0("R.pool", c(10, 15, 30)))
dat = merge(dat, groups, by.x="Subject", by.y="subid")
scored_dat = ddply(dat, .(Subject), transform, disterr = resp %in% dist_pool[[unique(cond)]])
dim(scored_dat)
## [1] 3679 25
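The scoring step above marks a response as a distractor intrusion when it appears in the distractor pool for that subject's condition; note that `dist_pool[[unique(cond)]]` relies on each subject having a single `cond` value. A minimal sketch of the same lookup on toy data, using base R only (the pools, words, and the `toy` names are made up for illustration and are not from the experiment):

```r
# Hypothetical distractor pools, one per condition.
toy_pool <- list(c("coal", "tend"), c("sun", "book"))

# Hypothetical responses; `cond` indexes into toy_pool.
toy <- data.frame(
  Subject = c(1, 1, 2, 2),
  cond    = c(1, 1, 2, 2),
  resp    = c("coal", "leaf", "coal", "sun"),
  stringsAsFactors = FALSE
)

# A response counts as an intrusion only if it is in the pool
# for that row's own condition.
toy$disterr <- mapply(function(r, k) r %in% toy_pool[[k]],
                      toy$resp, toy$cond, USE.NAMES = FALSE)
toy$disterr
```

In this sketch "coal" is an intrusion for subject 1 but not for subject 2, because it belongs only to the first pool.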
More intrusion errors on dissimilar (D) trials than on similar (S) trials
err_tab = ddply(scored_dat, .(poolsize), with, tapply(disterr, trialtype, sum))
err_tab
##   poolsize  D S
## 1       10 20 2
## 2       15 19 2
## 3       30 19 2
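The table above is a count of intrusions cross-classified by pool size and trial type. As a rough sketch of the same tabulation without plyr, `tapply` can take both grouping variables at once (the `toy` data frame here is invented for illustration):

```r
# Hypothetical scored trials.
toy <- data.frame(
  poolsize  = c(10, 10, 10, 15, 15),
  trialtype = c("D", "D", "S", "D", "S"),
  disterr   = c(TRUE, TRUE, FALSE, TRUE, TRUE)
)

# Sum the logical intrusion flag within each poolsize x trialtype cell.
tab <- with(toy, tapply(disterr, list(poolsize, trialtype), sum))
tab
```

Cells with no trials at all would come back as `NA` rather than 0, so this form assumes every pool size has trials of both types.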
What if each intrusion were equally likely to occur on a D or an S trial?
1 - pbinom(19, 21, .5)
## [1] 1.049042e-05
1 - pbinom(20, 22, .5)
## [1] 5.483627e-06
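One detail worth keeping in mind when reading these values: `pbinom(q, n, p)` returns P(X <= q), so `1 - pbinom(19, 21, .5)` is the probability of 20 or more D intrusions out of 21, not 19 or more. If the quantity of interest is the probability of a split at least as extreme as the observed 19 of 21, the cutoff shifts by one. A short sketch comparing the two tails (variable names are illustrative):

```r
n <- 21  # total intrusions for this pool size
k <- 19  # intrusions observed on D trials

# Strict tail, as in the calls above: P(X > 19) = P(X >= 20).
p_strict <- 1 - pbinom(k, n, 0.5)

# Inclusive tail: P(X >= 19), i.e. at least as many as observed.
p_inclusive <- pbinom(k - 1, n, 0.5, lower.tail = FALSE)

# Cross-check the inclusive tail by summing the binomial density.
p_check <- sum(dbinom(k:n, n, 0.5))

c(strict = p_strict, inclusive = p_inclusive, check = p_check)
```

Either way the probability is very small under the equal-probability assumption; the inclusive tail is simply the larger of the two.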
Are there few subjects making distractor intrusions (DIs), or many?
There are 21 subjects with DIs. As an example, below are all the DIs for the pool size of 10.
subset(scored_dat, poolsize==10 & disterr)[c('Subject', 'TBR', 'resp', 'trialnum', 'trialtype')]
## Subject TBR resp trialnum trialtype
## 3 17 leaf coal 4 D
## 4 17 train tend 4 D
## 27 17 hat coal 9 D
## 78 17 bone cook 21 S
## 86 17 deck book 23 D
## 87 17 slice ate 23 D
## 89 17 mare coal 24 D
## 94 17 tree stress 25 D
## 689 27 beach sun 6 D
## 1139 31 hat ate 4 D
## 1200 31 lane tend 18 D
## 1224 31 bell book 23 D
## 1246 34 town coal 4 D
## 1248 34 street stress 4 D
## 1251 34 trash stress 5 D
## 1755 39 ear sun 16 S
## 3157 58 square stress 8 D
## 3226 58 brain ate 23 D
## 3238 58 slice stress 26 D
## 3416 60 hair sun 17 D
## 3434 60 side 21 D
## 3458 60 ground hard 26 D
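To see how the intrusions in the listing above are spread across subjects, they can be tallied per subject. The sketch below transcribes the subject IDs and row counts from the pool size 10 listing; the variable names are illustrative:

```r
# Subject IDs and number of DI rows per subject, read off the listing.
subject_ids <- c(17, 27, 31, 34, 39, 58, 60)
n_dis       <- c(8, 1, 3, 3, 1, 3, 3)

# One entry per intrusion, then count intrusions per subject.
di_subjects <- rep(subject_ids, times = n_dis)
di_counts   <- sort(table(di_subjects), decreasing = TRUE)
di_counts

# Share of all pool size 10 intrusions made by the most frequent subject.
max(di_counts) / sum(di_counts)
```

With the full data loaded, the same tally could presumably be obtained from `table(subset(scored_dat, poolsize == 10 & disterr)$Subject)`.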
Experiment 4 (names)
dnames = subset(read.csv('data/1_scored_trial.csv'), task %in% paste0('Rspan.names.', c('long', 'short')))
dist_names = c(getUniqueWords("data/raw/E11names/task/1/genwordsout.txt"),
getUniqueWords("data/raw/E11names/task/2/genwordsout.txt"))
dist_names
## [1] "benjamin" "timothy" "jessica" "christopher" "kimberly"
## [6] "rebecca" "brianna" "abigail" "joshua" "nicholas"
## [11] "jess" "brianne" "abby" "becca" "chris"
## [16] "nick" "kim" "ben" "josh" "tim"
scored_names = ddply(dnames, .(Subject), transform, disterr = resp %in% dist_names)
No intrusion errors
ddply(scored_names, .(task), with, tapply(disterr, trialtype, sum))
##                task D S
## 1  Rspan.names.long 0 0
## 2 Rspan.names.short 0 0
title: "2_distractor_intrusions.R" author: "machow" date: "Wed Jan 13 12:42:01 2016"