Title: | Tools for Preprocessing Visual World Data |
---|---|
Description: | Gaze data from the Visual World Paradigm requires significant preprocessing prior to plotting and analyzing the data. This package provides functions for preparing visual world eye-tracking data for statistical analysis and plotting. It can prepare data for linear analyses (e.g., ANOVA, Gaussian-family LMER, Gaussian-family GAMM) as well as logistic analyses (e.g., binomial-family LMER and binomial-family GAMM). Additionally, it contains various plotting functions for creating grand average and conditional average plots. See the vignette for samples of the functionality. Currently, the functions in this package are designed for handling data collected with SR Research Eyelink eye trackers using Sample Reports created in SR Research Data Viewer. While we would like to add functionality for data collected with other systems in the future, the current package is considered to be feature-complete; further updates will mainly entail maintenance and the addition of minor functionality. |
Authors: | Vincent Porretta [aut, cre], Aki-Juhani Kyröläinen [aut], Jacolien van Rij [ctb], Juhani Järvikivi [ctb] |
Maintainer: | Vincent Porretta <[email protected]> |
License: | GPL-3 |
Version: | 1.2.4 |
Built: | 2024-11-15 03:51:48 UTC |
Source: | https://github.com/cran/VWPre |
Internal helper function, not intended to be called externally.
.check_for_PupilPre(type, suggest)
.check_for_PupilPre(type, suggest)
type |
A string, "NotAvailable" of "UseOther". |
suggest |
A string, PupilPre function name. |
Text feedback and instruction.
align_msg
examines the data from each recording event and locates the
first instance of the specified message in the column SAMPLE_MESSAGE.
The function creates a new column containing the aligned series which can be
utilized by subsequent functions for checking and creating the time series
column.
align_msg(data, Msg = NULL)
align_msg(data, Msg = NULL)
data |
A data table object output from |
Msg |
An obligatory string containing the message to be found in the column SAMPLE_MESSAGE or a regular expression for locating the appropriate message. |
A data table object.
## Not run: # To align the samples to a specific message... library(VWPre) df <- align_msg(data = dat, Msg = "ExperimentDisplay") # For a more complete tutorial on VWPre plotting functions: vignette("SR_Message_Alignment", package="VWPre") ## End(Not run)
## Not run: # To align the samples to a specific message... library(VWPre) df <- align_msg(data = dat, Msg = "ExperimentDisplay") # For a more complete tutorial on VWPre plotting functions: vignette("SR_Message_Alignment", package="VWPre") ## End(Not run)
bin_prop
calculates the proportion of looks (samples) to each
interest area in a particular window of time (bin size). This function first
checks to see if the procedure is possible given the sampling rate and
desired bin size. It then performs the calculation and downsampling, returning
new columns corresponding to each interest area ID (e.g., 'IA_1_C', 'IA_1_P').
The extention '_C' indicates the count of samples in the bin and the
extension '_P' indicates the proportion. N.B.: This function will work for
data with a maximum of 8 interest areas.
bin_prop(data, NoIA = NULL, BinSize = NULL, SamplingRate = NULL)
bin_prop(data, NoIA = NULL, BinSize = NULL, SamplingRate = NULL)
data |
A data table object output by |
NoIA |
A positive integer indicating the number of interest areas defined when creating the study. |
BinSize |
A positive integer indicating the size of the binning window (in milliseconds). |
SamplingRate |
A positive integer indicating the sampling rate (in Hertz)
used to record the gaze data, which can be determined with the function
|
A data table with additional columns (the number of which depends on
the number of interest areas specified) added to data
.
## Not run: library(VWPre) # Bin samples and calculation proportions... df <- bin_prop(dat, NoIA = 4, BinSize = 20, SamplingRate = 1000) ## End(Not run)
## Not run: library(VWPre) # Bin samples and calculation proportions... df <- bin_prop(dat, NoIA = 4, BinSize = 20, SamplingRate = 1000) ## End(Not run)
check_all_msgs
outputs all sample messages present in the data.
Optionally, the result can be output to a dataframe containing all event-level
information.
check_all_msgs(data, ReturnData = FALSE)
check_all_msgs(data, ReturnData = FALSE)
data |
A data table object output by |
ReturnData |
A logical indicating whether to return a data table containing Message Time information for each event. |
All Sample Messages found in the data.
## Not run: library(VWPre) # Check all messages in the data... check_all_msgs(data = dat) ## End(Not run)
## Not run: library(VWPre) # Check all messages in the data... check_all_msgs(data = dat) ## End(Not run)
check_eye_recording
quickly checks which eyes contain gaze data
either using the EYE_TRACKED column (if available) or the Right and
Left interest area columns. It prints a summary and
suggests which setting to use for the Recording
parameter in the
function select_recorded_eye
.
check_eye_recording(data)
check_eye_recording(data)
data |
A data table object output by |
Text feedback and instruction.
## Not run: library(VWPre) # Create a unified columns for the gaze data... check_eye_recording(dat) ## End(Not run)
## Not run: library(VWPre) # Create a unified columns for the gaze data... check_eye_recording(dat) ## End(Not run)
check_ia
examines both the interest area IDs and interest area labels
(and their mapping) for both eyes. It returns a summary of the information
and will stop if the data are not consistent with the requirements for
further processing.
check_ia(data)
check_ia(data)
data |
A data table object output by |
The value(s) and label(s) of interest areas and how they map for each eye.
## Not run: library(VWPre) # Check the interest area information... check_ia(dat) ## End(Not run)
## Not run: library(VWPre) # Check the interest area information... check_ia(dat) ## End(Not run)
check_msg_time
examines the time point of a specific Sample Message
for each event. Depending on the format of the data, it will use one of three
columns: TIMESTAMP, Align, or Time. If times at which the message occurs
are equal, it will return a single value. Optionally, the result can be
output to a dataframe containing all event-level information.
check_msg_time(data, Msg = NULL, ReturnData = FALSE)
check_msg_time(data, Msg = NULL, ReturnData = FALSE)
data |
A data table object output by |
Msg |
A character string containing the exact message to be found in the column SAMPLE_MESSAGE or a regular expression for locating the appropriate message. |
ReturnData |
A logical indicating whether to return a data table containing Message Time information for each event. |
The value(s) of Time (in milliseconds) at which the Sample Message is found.
## Not run: library(VWPre) # Check the Sample Message time... check_msg_time(data = dat) ## End(Not run)
## Not run: library(VWPre) # Check the Sample Message time... check_msg_time(data = dat) ## End(Not run)
check_samples_per_bin
determines the number of samples in each
bin produced by bin_prop
.
This function may be helpful for determining the obligatory parameter
'ObsPerBin' which is input to transform_to_elogit
.
check_samples_per_bin(data)
check_samples_per_bin(data)
data |
A data table object output by |
A printed summary of the number of samples in each bin.
## Not run: library(VWPre) # Determine the number of samples per bin... check_samples_per_bin(dat) ## End(Not run)
## Not run: library(VWPre) # Determine the number of samples per bin... check_samples_per_bin(dat) ## End(Not run)
check_samplingrate
determines the sampling rate in the data.
This function is helpful for determining the obligatory parameter input to
bin_prop
. If different sampling rates were used, the
function adds a sampling rate column, which can be used to subset the
data for further processing.
check_samplingrate(data, ReturnData = FALSE)
check_samplingrate(data, ReturnData = FALSE)
data |
A data table object output by |
ReturnData |
A logical indicating whether to return a data table containing a new column called SamplingRate |
A printed summary and/or a data table object
## Not run: library(VWPre) # Determine the sampling rate... check_samplingrate(dat) ## End(Not run)
## Not run: library(VWPre) # Determine the sampling rate... check_samplingrate(dat) ## End(Not run)
check_time_series
examines the first value in the Time column
for each event. If they are equal, it will return a single value. The returned
value(s) will vary depending on the interest period (if defined), message
alignment (if completed), and the Adjustment parameter ('Adj') supplied to
create_time_series
. Optionally, the result can be output to a
dataframe containing all event-level information.
check_time_series(data, ReturnData = FALSE)
check_time_series(data, ReturnData = FALSE)
data |
A data table object output by |
ReturnData |
A logical indicating whether to return a data table containing Start Time information for each event. |
The value(s) of Time (in milliseconds) at which events begin relative to the onset of the auditory stimulus.
## Not run: library(VWPre) # Check the starting Time column... check_time_series(data = dat) ## End(Not run)
## Not run: library(VWPre) # Check the starting Time column... check_time_series(data = dat) ## End(Not run)
create_binomial
uses interest area count columns to create
a success/failure column for each IA which is suitable for logistic regression.
N.B.: This function will work for data with a maximum of 8 interest areas.
create_binomial( data, NoIA = NULL, ObsPerBin = NULL, ObsOverride = FALSE, CustomBinom = NULL )
create_binomial( data, NoIA = NULL, ObsPerBin = NULL, ObsOverride = FALSE, CustomBinom = NULL )
data |
A data table object output by either |
NoIA |
A positive integer indicating the number of interest areas defined when creating the study. |
ObsPerBin |
A positive integer indicating the number of observations to
use in the calculation. Typically, this will be the number of samples per
bin, which can be determined with |
ObsOverride |
A logical value controlling restrictions on the value provided to ObsPerBin. Default value is FALSE. |
CustomBinom |
An optional parameter specifying a vector containing two integers corresponding to the interest area IDs to be combined. |
A data table with additional columns (the number of which depends on
the number of interest areas specified) added to data
.
## Not run: library(VWPre) # Create binomial (success/failure) column... df <- create_binomial(data = dat, NoIA = 4, ObsPerBin = 20) ## End(Not run)
## Not run: library(VWPre) # Create binomial (success/failure) column... df <- create_binomial(data = dat, NoIA = 4, ObsPerBin = 20) ## End(Not run)
create_time_series
standardizes the starting point for each
event, creates a time series for each event including the offset for the
amount of time prior to (or after) the zero point. The time series
is indicated in a new column called Time.
create_time_series(data, Adjust = 0)
create_time_series(data, Adjust = 0)
data |
A data table object output by |
Adjust |
Optionally an integer value or a text string. If an integer (positive or negative), this will indicate an amount of time in milliseconds. The value is subtracted from the time points: positive values shift the zero forward; negative values shift the zero backward. If a text string, this will be the name of a column in the data set which contains values indicating when the critical stimulus was presented relative to the zero point. |
A data table object.
## Not run: library(VWPre) # To create the Time column... df <- create_time_series(data = dat, Adjust = "SoundOnsetColumn") # or df <- create_time_series(data = dat, Adjust = -100) # or df <- create_time_series(data = dat, Adjust = 100) ## End(Not run)
## Not run: library(VWPre) # To create the Time column... df <- create_time_series(data = dat, Adjust = "SoundOnsetColumn") # or df <- create_time_series(data = dat, Adjust = -100) # or df <- create_time_series(data = dat, Adjust = 100) ## End(Not run)
custom_ia
uses a lookup data frame to map Left and Right gaze data
to newly defined/supplied interest areas for each recording event. The lookup data
should contain columns Event, IA_LABEL, IA_ID, Top, Bottom, Left, Right, which
specify the Interest area label, its corresponding ID, and the boundaries (in
pixel values) for each recording event. The function will overwrite existing
columns RIGHT_INTEREST_AREA_LABEL, RIGHT_INTEREST_AREA_ID,
LEFT_INTEREST_AREA_LABEL, and LEFT_INTEREST_AREA_ID.
custom_ia(data, iaLookup = NULL)
custom_ia(data, iaLookup = NULL)
data |
A data table object output by |
iaLookup |
A data frame object containing by-event mapping information. |
A data table object.
## Not run: library(VWPre) # Map gaze data to newly defined interest areas... df <- custom_ia(data = dat, iaLookup = LookUpDF) # For a more complete tutorial on VWPre plotting functions: vignette("SR_Interest_Areas", package="VWPre") ## End(Not run)
## Not run: library(VWPre) # Map gaze data to newly defined interest areas... df <- custom_ia(data = dat, iaLookup = LookUpDF) # For a more complete tutorial on VWPre plotting functions: vignette("SR_Interest_Areas", package="VWPre") ## End(Not run)
ds_options
determines the possible rates to which
the current sampling rate can downsampled. It then prints the
options in both bin size (milliseconds) and corresponding
sampling rate (Hertz).
ds_options(SamplingRate, OutputRates = "Suggested")
ds_options(SamplingRate, OutputRates = "Suggested")
SamplingRate |
A positive integer indicating the sampling rate (in Hertz)
used to record the gaze data, which can be determined with the function
|
OutputRates |
A string ("Suggested" or "All") controlling if all rates are output, or if only whole rates (default) are output. |
A printed summary of options (bin size and rate) for downsampling.
## Not run: library(VWPre) # Determine downsampling options... ds_options(SamplingRate = 1000) ## End(Not run)
## Not run: library(VWPre) # Determine downsampling options... ds_options(SamplingRate = 1000) ## End(Not run)
fasttrack
is a meta-function for advanced users who are
already familiar with the package functions and do not need to take remedial
actions such as recoding interest areas, remapping gaze data, or performing
message alignment.It takes all necessary arguments for the component functions
to produce proportion looks and can output either empirical logits or
binomial data. The function returns a dataframe containing the result of
the series of subroutines.
fasttrack( data = data, Subject = NULL, Item = NA, EventColumns = c("Subject", "TRIAL_INDEX"), NoIA = NoIA, Adjust = 0, Recording = NULL, WhenLandR = NA, BinSize = NULL, SamplingRate = NULL, ObsPerBin = NULL, ObsOverride = FALSE, Constant = 0.5, CustomBinom = NULL, Output = NULL )
fasttrack( data = data, Subject = NULL, Item = NA, EventColumns = c("Subject", "TRIAL_INDEX"), NoIA = NoIA, Adjust = 0, Recording = NULL, WhenLandR = NA, BinSize = NULL, SamplingRate = NULL, ObsPerBin = NULL, ObsOverride = FALSE, Constant = 0.5, CustomBinom = NULL, Output = NULL )
data |
A data frame object created from an Eyelink Sample Report. |
Subject |
An obligatory string containing the column name corresponding to the subject identifier. |
Item |
An optional string containing the column name corresponding to the item identifier; by default, NA. |
EventColumns |
A vector specifying the columns which will be used for creating the Event variable; by default, Subject and TRIAL_INDEX. |
NoIA |
A positive integer indicating the number of interest areas defined when creating the study. |
Adjust |
An integer indicating amount of time in milliseconds by which to offset the time series. |
Recording |
A string indicating which eyes were used for recording gaze data. |
WhenLandR |
A string indicating which eye ("Right" or "Left) to use if gaze data is available for both eyes (i.e., Recording = "LandR"). |
BinSize |
A positive integer indicating the size of the binning window (in milliseconds). |
SamplingRate |
A positive integer indicating the sampling rate (in Hertz) used to record the gaze data. |
ObsPerBin |
A positive integer indicating the desired number of observations to be used in the calculations. |
ObsOverride |
A logical value controlling restrictions on the value provided to ObsPerBin. Default value is FALSE. |
Constant |
A positive number used for the empirical logit and weights calculation; by default, 0.5 as in Barr (2008). |
CustomBinom |
An optional parameter specifying a vector containing two integers corresponding to the interest area IDs to be combined. |
Output |
An obligatory string containing either "ELogit" or "Binomial". |
A data table containing formatting and calculations.
## Not run: library(VWPre) # Perform meta-function on data df <- fasttrack(data = dat, Subject = "RECORDING_SESSION_LABEL", Item = "itemid", EventColumns = c("Subject", "TRIAL_INDEX"), NoIA = 4, Adjust = 100, Recording = "LandR", WhenLandR = "Right", BinSize = 20, SamplingRate = 1000, ObsPerBin = 20, Constant = 0.5, Output = "ELogit") ## End(Not run)
## Not run: library(VWPre) # Perform meta-function on data df <- fasttrack(data = dat, Subject = "RECORDING_SESSION_LABEL", Item = "itemid", EventColumns = c("Subject", "TRIAL_INDEX"), NoIA = 4, Adjust = 100, Recording = "LandR", WhenLandR = "Right", BinSize = 20, SamplingRate = 1000, ObsPerBin = 20, Constant = 0.5, Output = "ELogit") ## End(Not run)
make_pelogit_fnc
creates a function that can transform empirical logit
values back to probability scale using the number of samples and constant
that were used in the original transformation. This function can then be use
to backtransform value predicted by a statistical model.
make_pelogit_fnc(ObsPerBin = NULL, Constant = NULL)
make_pelogit_fnc(ObsPerBin = NULL, Constant = NULL)
ObsPerBin |
A positive integer indicating the number of observations used in the original transformation calculation. |
Constant |
A positive number used in the original transformation calculation. |
A function.
## Not run: library(VWPre) # Make backtransformation function pelogit <- make_pelogit_fnc(20, 0.5) ## End(Not run)
## Not run: library(VWPre) # Make backtransformation function pelogit <- make_pelogit_fnc(20, 0.5) ## End(Not run)
mark_trackloss
marks data points related to trackloss for those in
blink, off-screen, or both.
mark_trackloss(data, Type = NULL, ScreenSize = NULL)
mark_trackloss(data, Type = NULL, ScreenSize = NULL)
data |
A data table object output by |
Type |
A string indicating "Blink", "OffScreen", or "Both". |
ScreenSize |
A numeric vector specifying (in pixels) the dimensions of the x and y of the screen used during the experiment. |
An object of type data table as described in tibble.
## Not run: library(VWPre) # Mark trackloss... df <- mark_trackloss(data = dat, Type = "Both", ScreenSize = c(1920, 1080)) ## End(Not run)
## Not run: library(VWPre) # Mark trackloss... df <- mark_trackloss(data = dat, Type = "Both", ScreenSize = c(1920, 1080)) ## End(Not run)
plot_avg
calculates the grand or conditional averages of
looks to each interest area along with standard error. It then plots the results.
N.B.: This function will work for data with a maximum of 8 interest areas
and 2 conditions.
plot_avg( data, type = NULL, xlim = NA, IAColumns = NULL, Averaging = "Event", Condition1 = NULL, Condition2 = NULL, Cond1Labels = NA, Cond2Labels = NA, ErrorBar = TRUE, VWPreTheme = TRUE, ConfLev = 95, CItype = "simultaneous", ErrorBand = FALSE, ErrorType = "SE" )
plot_avg( data, type = NULL, xlim = NA, IAColumns = NULL, Averaging = "Event", Condition1 = NULL, Condition2 = NULL, Cond1Labels = NA, Cond2Labels = NA, ErrorBar = TRUE, VWPreTheme = TRUE, ConfLev = 95, CItype = "simultaneous", ErrorBand = FALSE, ErrorType = "SE" )
data |
A data table object output by either |
type |
A character string indicating "proportion" or "elogit" which influences how standard error and confidence intervals are calculated. |
xlim |
A vector of two integers specifying the limits of the x-axis. |
IAColumns |
A named character vector specifying the desired interest area columns with custom strings for the legend. |
Averaging |
A character string indicating how the averaging should be done. "Event" (default) will produce the overall mean in the data, while "Subject" or "Item" (or, in principle, any other column name) will calculate the grand mean by that factor. |
Condition1 |
A string containing the column name corresponding to the first condition, if available. |
Condition2 |
A string containing the column name corresponding to the second condition, if available. |
Cond1Labels |
A named character vector specifying the desired custom labels of the levels of the first condition. |
Cond2Labels |
A named character vector specifying the desired custom labels of the levels of the second condition. |
ErrorBar |
A logical indicating whether error bars should be included in the plot. |
VWPreTheme |
A logical indicating whether the theme included with the function should be applied, or ggplot2's base theme (to which any other custom theme could be added). |
ConfLev |
A number indicating the confidence level of the CI. |
CItype |
A string indicating "simultaneous" or "pointwise". Simultaneous performs a Bonferroni correction for the interval. |
ErrorBand |
A logical indicating whether error bands should be included in the plot. |
ErrorType |
A string indicating "SE" or "CI". For SE, the calculation varies for empirical logits and proportions. Further, for CI, the calculation on proportions uses the Wald method. |
## Not run: library(VWPre) # For plotting the grand average with the included theme and SE bars plot_avg(data = dat, type = "elogit", xlim = c(0, 1000), IAColumns = c(IA_1_ELogit = "Target", IA_2_ELogit = "Rhyme", IA_3_ELogit = "OnsetComp", IA_4_ELogit = "Distractor"), Averaging = "Event", Condition1 = NA, Condition2 = NA, Cond1Labels = NA, Cond2Labels = NA, ErrorBar = TRUE, VWPreTheme = TRUE, ErrorType = "SE", ErrorBand = FALSE) # For plotting conditional averages (one condition) with the included theme # and 95% simultaneous CI bars. # This produces plots arranged horizontally plot_avg(data = dat, type = "elogit", xlim = c(0, 1000), IAColumns = c(IA_1_ELogit = "Target", IA_2_ELogit = "Rhyme", IA_3_ELogit = "OnsetComp", IA_4_ELogit = "Distractor"), Averaging = "Event", Condition1 = NA, Condition2 = "talker", Cond1Labels = NA, Cond2Labels = c(CH1 = "Chinese 1", CH10 = "Chinese 3", CH9 = "Chinese 2", EN3 = "English 1"), ErrorBar = TRUE, VWPreTheme = TRUE, ErrorBand = FALSE, ErrorType = "CI", ConfLev = 95, CItype = "simultaneous") # For plotting conditional averages (two conditions) for one interest area with the included theme and 95% simultaneous CI bands. # This produces plots arranged in grid format. plot_avg(data = dat, type = "elogit", xlim = c(0, 1000), IAColumns = c(IA_1_ELogit = "Target"), Averaging = "Event", Condition1 = "talker", Condition2 = "Exp", Cond1Labels = c(CH1 = "Chinese 1", CH10 = "Chinese 3", CH9 = "Chinese 2", EN3 = "English 1"), Cond2Labels = c(High = "H Exp", Low = "L Exp"), ErrorBar = FALSE, VWPreTheme = TRUE, ErrorBand = TRUE, ErrorType = "CI", ConfLev = 95, CItype = "simultaneous") #' # For a more complete tutorial on VWPre plotting functions: vignette("SR_Plotting", package="VWPre") ## End(Not run)
## Not run: library(VWPre) # For plotting the grand average with the included theme and SE bars plot_avg(data = dat, type = "elogit", xlim = c(0, 1000), IAColumns = c(IA_1_ELogit = "Target", IA_2_ELogit = "Rhyme", IA_3_ELogit = "OnsetComp", IA_4_ELogit = "Distractor"), Averaging = "Event", Condition1 = NA, Condition2 = NA, Cond1Labels = NA, Cond2Labels = NA, ErrorBar = TRUE, VWPreTheme = TRUE, ErrorType = "SE", ErrorBand = FALSE) # For plotting conditional averages (one condition) with the included theme # and 95% simultaneous CI bars. # This produces plots arranged horizontally plot_avg(data = dat, type = "elogit", xlim = c(0, 1000), IAColumns = c(IA_1_ELogit = "Target", IA_2_ELogit = "Rhyme", IA_3_ELogit = "OnsetComp", IA_4_ELogit = "Distractor"), Averaging = "Event", Condition1 = NA, Condition2 = "talker", Cond1Labels = NA, Cond2Labels = c(CH1 = "Chinese 1", CH10 = "Chinese 3", CH9 = "Chinese 2", EN3 = "English 1"), ErrorBar = TRUE, VWPreTheme = TRUE, ErrorBand = FALSE, ErrorType = "CI", ConfLev = 95, CItype = "simultaneous") # For plotting conditional averages (two conditions) for one interest area with the included theme and 95% simultaneous CI bands. # This produces plots arranged in grid format. plot_avg(data = dat, type = "elogit", xlim = c(0, 1000), IAColumns = c(IA_1_ELogit = "Target"), Averaging = "Event", Condition1 = "talker", Condition2 = "Exp", Cond1Labels = c(CH1 = "Chinese 1", CH10 = "Chinese 3", CH9 = "Chinese 2", EN3 = "English 1"), Cond2Labels = c(High = "H Exp", Low = "L Exp"), ErrorBar = FALSE, VWPreTheme = TRUE, ErrorBand = TRUE, ErrorType = "CI", ConfLev = 95, CItype = "simultaneous") #' # For a more complete tutorial on VWPre plotting functions: vignette("SR_Plotting", package="VWPre") ## End(Not run)
plot_avg_cdiff
calculates the average of differences between
two specified conditions along with standard error and then plots the
results.
plot_avg_cdiff( data, IAColumn = NULL, xlim = NA, type = NULL, Averaging = "Subject", Condition = NULL, CondLabels = NA, ErrorBar = TRUE, VWPreTheme = TRUE, ConfLev = 95, CItype = "simultaneous", ErrorBand = FALSE, ErrorType = "SE" )
plot_avg_cdiff( data, IAColumn = NULL, xlim = NA, type = NULL, Averaging = "Subject", Condition = NULL, CondLabels = NA, ErrorBar = TRUE, VWPreTheme = TRUE, ConfLev = 95, CItype = "simultaneous", ErrorBand = FALSE, ErrorType = "SE" )
data |
A data table object output by either |
IAColumn |
A character vector specifying the desired column corresponding to the interest area. |
xlim |
A vector of two integers specifying the limits of the x-axis. |
type |
A character string indicating "proportion" or "elogit", which influences how standard error and confidence intervals are calculated. |
Averaging |
A character string indicating how the averaging should be done. "Subject" (default) will produce the grand mean in the data, while "Item" (or, in principle, any other column name) will calculate the grand mean by that factor. |
Condition |
A list containing the column name corresponding to the condition and factor levels to be used for calculating the difference. |
CondLabels |
A named character vector specifying the desired labels of the levels of the condition. |
ErrorBar |
A logical indicating whether error bars should be included in the plot. |
VWPreTheme |
A logical indicating whether the theme included with the function should be applied, or ggplot2's base theme (to which any other custom theme could be added). |
ConfLev |
A number indicating the confidence level of the CI. |
CItype |
A string indicating "simultaneous" or "pointwise". Simultaneous performs a Bonferroni correction for the interval. |
ErrorBand |
A logical indicating whether error bands should be included in the plot. |
ErrorType |
A string indicating "SE" or "CI". |
## Not run: library(VWPre) # For plotting average difference between conditions... plot_avg_cdiff(data = dat, xlim = c(0, 1000), type = "proportion", IAColumn = "IA_1_P", Condition = list(talker = c("EN3", "CH1")), CondLabels = NA, ErrorBar = TRUE, VWPreTheme = TRUE, ErrorBand = FALSE, ErrorType = "SE") # For a more complete tutorial on VWPre plotting functions: vignette("SR_Plotting", package="VWPre") ## End(Not run)
## Not run: library(VWPre) # For plotting average difference between conditions... plot_avg_cdiff(data = dat, xlim = c(0, 1000), type = "proportion", IAColumn = "IA_1_P", Condition = list(talker = c("EN3", "CH1")), CondLabels = NA, ErrorBar = TRUE, VWPreTheme = TRUE, ErrorBand = FALSE, ErrorType = "SE") # For a more complete tutorial on VWPre plotting functions: vignette("SR_Plotting", package="VWPre") ## End(Not run)
plot_avg_contour
calculates the conditional average of proportions or
empirical logit looks to a given interest area by Time and a specified
continuous variable. It then applies a 3D smooth
(derived using gam
) over
the surface and plots the results as a contour plot.
plot_avg_contour( data, IA = NULL, type = NULL, Var = NULL, Averaging = "Event", VarLabel = NULL, xlim = NA, VWPreTheme = TRUE, Colors = c("gray20", "gray90") )
plot_avg_contour( data, IA = NULL, type = NULL, Var = NULL, Averaging = "Event", VarLabel = NULL, xlim = NA, VWPreTheme = TRUE, Colors = c("gray20", "gray90") )
data |
A data table object output by either |
IA |
A string specifying the column name of the IA to use. |
type |
A character string indicating "proportion" or "elogit". |
Var |
A string containing the column name corresponding to the continuous variable. |
Averaging |
A character string indicating how the averaging should be done. "Event" (default) will produce the overall mean in the data, while "Subject" or "Item" (or, in principle, any other column name) will calculate the grand mean by that factor. |
VarLabel |
A string specifying the axis label to use for |
xlim |
A vector of two integers specifying the limits of the x-axis. |
VWPreTheme |
A logical indicating whether the theme included with the function, or ggplot2's base theme (which any other custom theme could be added). |
Colors |
A vector of two strings specifying the colrs of the contour shading - The default values represent grayscale. |
## Not run: library(VWPre) # For plotting a conditional contour surface... plot_avg_contour(data = dat, IA = "IA_1_ELogit", type = "elogit", Var = "Rating", VarLabel = "Accent Rating", xlim = c(0,1000), VWPreTheme = FALSE, Colors = c("red", "white")) # For a more complete tutorial on VWPre plotting functions: vignette("SR_Plotting", package="VWPre") ## End(Not run)
## Not run: library(VWPre) # For plotting a conditional contour surface... plot_avg_contour(data = dat, IA = "IA_1_ELogit", type = "elogit", Var = "Rating", VarLabel = "Accent Rating", xlim = c(0,1000), VWPreTheme = FALSE, Colors = c("red", "white")) # For a more complete tutorial on VWPre plotting functions: vignette("SR_Plotting", package="VWPre") ## End(Not run)
plot_avg_diff
calculates the grand or conditional averages of
differences between looks to two interest area along with standard error.
It then plots the results.
plot_avg_diff( data, DiffCols = NULL, xlim = NA, type = NULL, Averaging = "Event", Condition1 = NULL, Condition2 = NULL, Cond1Labels = NA, Cond2Labels = NA, ErrorBar = TRUE, VWPreTheme = TRUE, ConfLev = 95, CItype = "simultaneous", ErrorBand = FALSE, ErrorType = "SE" )
plot_avg_diff( data, DiffCols = NULL, xlim = NA, type = NULL, Averaging = "Event", Condition1 = NULL, Condition2 = NULL, Cond1Labels = NA, Cond2Labels = NA, ErrorBar = TRUE, VWPreTheme = TRUE, ConfLev = 95, CItype = "simultaneous", ErrorBand = FALSE, ErrorType = "SE" )
data |
A data table object output by either |
DiffCols |
A named character vector specifying the desired columns corresponding to the interest areas. |
xlim |
A vector of two integers specifying the limits of the x-axis. |
type |
A character string indicating "proportion" or "elogit" which influences how standard error and confidence intervals are calculated. |
Averaging |
A character string indicating how the averaging should be done. "Event" (default) will produce the overall mean in the data, while "Subject" or "Item" (or, in principle, any other column name) will calculate the grand mean by that factor. |
Condition1 |
A string containing the column name corresponding to the first condition, if available. |
Condition2 |
A string containing the column name corresponding to the second condition, if available. |
Cond1Labels |
A named character vector specifying the desired labels of the levels of the first condition. |
Cond2Labels |
A named character vector specifying the desired labels of the levels of the second condition. |
ErrorBar |
A logical indicating whether error bars should be included in the plot. |
VWPreTheme |
A logical indicating whether the theme included with the function should be applied, or ggplot2's base theme (to which any other custom theme could be added). |
ConfLev |
A number indicating the confidence level of the CI. |
CItype |
A string indicating "simultaneous" or "pointwise". Simultaneous performs a Bonferroni correction for the interval. |
ErrorBand |
A logical indicating whether error bands should be included in the plot. |
ErrorType |
A string indicating "SE" or "CI". |
## Not run: library(VWPre) # For plotting average differences with SE bars... plot_avg_diff(data = dat, xlim = c(0, 1000), type = "proportion", DiffCols = c(IA_1_P = "Target", IA_2_P = "Rhyme"), Condition1 = NA, Condition2 = NA, Cond1Labels = NA, Cond2Labels = NA, ErrorBar = TRUE, VWPreTheme = TRUE, ErrorBand = FALSE, ErrorType = "SE") # For plotting conditional average differences (one condition) with the # included theme and 95% pointwise CI bars. plot_avg_diff(data = dat, xlim = c(0, 1000), , type = "proportion", DiffCols = c(IA_1_P = "Target", IA_2_P = "Rhyme"), Condition1 = "talker", Condition2 = NA, Cond1Labels = c(CH1 = "Chinese 1", CH10 = "Chinese 3", CH9 = "Chinese 2", EN3 = "English 1"), Cond2Labels = NA, ErrorBar = TRUE, VWPreTheme = TRUE, ErrorBand = FALSE, ErrorType = "CI", ConfLev = 95, CItype = "pointwise") # For plotting conditional average differences (two conditions) with the # included theme and 95% simultaneous CI bands. plot_avg_diff(data = dat, xlim = c(0, 1000), , type = "proportion", DiffCols = c(IA_1_P = "Target", IA_2_P = "Rhyme"), Condition1 = "talker", Condition2 = "Exp", Cond1Labels = c(CH1 = "Chinese 1", CH10 = "Chinese 3", CH9 = "Chinese 2", EN3 = "English 1"), Cond2Labels = c(High = "H Exp", Low = "L Exp"), ErrorBar = FALSE, VWPreTheme = TRUE, ErrorBand = TRUE, ErrorType = "CI", ConfLev = 95, CItype = "simultaneous") # For a more complete tutorial on VWPre plotting functions: vignette("SR_Plotting", package="VWPre") ## End(Not run)
## Not run: library(VWPre) # For plotting average differences with SE bars... plot_avg_diff(data = dat, xlim = c(0, 1000), type = "proportion", DiffCols = c(IA_1_P = "Target", IA_2_P = "Rhyme"), Condition1 = NA, Condition2 = NA, Cond1Labels = NA, Cond2Labels = NA, ErrorBar = TRUE, VWPreTheme = TRUE, ErrorBand = FALSE, ErrorType = "SE") # For plotting conditional average differences (one condition) with the # included theme and 95% pointwise CI bars. plot_avg_diff(data = dat, xlim = c(0, 1000), , type = "proportion", DiffCols = c(IA_1_P = "Target", IA_2_P = "Rhyme"), Condition1 = "talker", Condition2 = NA, Cond1Labels = c(CH1 = "Chinese 1", CH10 = "Chinese 3", CH9 = "Chinese 2", EN3 = "English 1"), Cond2Labels = NA, ErrorBar = TRUE, VWPreTheme = TRUE, ErrorBand = FALSE, ErrorType = "CI", ConfLev = 95, CItype = "pointwise") # For plotting conditional average differences (two conditions) with the # included theme and 95% simultaneous CI bands. plot_avg_diff(data = dat, xlim = c(0, 1000), , type = "proportion", DiffCols = c(IA_1_P = "Target", IA_2_P = "Rhyme"), Condition1 = "talker", Condition2 = "Exp", Cond1Labels = c(CH1 = "Chinese 1", CH10 = "Chinese 3", CH9 = "Chinese 2", EN3 = "English 1"), Cond2Labels = c(High = "H Exp", Low = "L Exp"), ErrorBar = FALSE, VWPreTheme = TRUE, ErrorBand = TRUE, ErrorType = "CI", ConfLev = 95, CItype = "simultaneous") # For a more complete tutorial on VWPre plotting functions: vignette("SR_Plotting", package="VWPre") ## End(Not run)
plot_indiv_app
calculates and plots interest area averages for a
given subject/item.
plot_indiv_app(data)
plot_indiv_app(data)
data |
A data table object output by either |
## Not run: library(VWPre) # For plotting subject/item averages plot_indiv_app(data = dat) ## End(Not run)
## Not run: library(VWPre) # For plotting subject/item averages plot_indiv_app(data = dat) ## End(Not run)
plot_transformation_app
plots the empirical logit values for a
given number of observations and constant against proportions, in order
to examine the effect of these variables on the resulting transformation.
plot_transformation_app()
plot_transformation_app()
## Not run: library(VWPre) # For plotting the empirical logit transformation plot_transformation_app() ## End(Not run)
## Not run: library(VWPre) # For plotting the empirical logit transformation plot_transformation_app() ## End(Not run)
plot_var_app
calculates and plots within-subject/item standard deviation,
along with standardized by-subject/item means for a given interest area, within
a given time window.
plot_var_app(data)
plot_var_app(data)
data |
A data table object output by either |
## Not run: library(VWPre) # For plotting variability in the data plot_var_app(data = dat) ## End(Not run)
## Not run: library(VWPre) # For plotting variability in the data plot_var_app(data = dat) ## End(Not run)
prep_data
converts the data frame to a data table and examines the
required columns (RECORDING_SESSION_LABEL, LEFT_INTEREST_AREA_ID,
RIGHT_INTEREST_AREA_ID, LEFT_INTEREST_AREA_LABEL, RIGHT_INTEREST_AREA_LABEL,
TIMESTAMP, and TRIAL_INDEX) and optional columns (SAMPLE_MESSAGE, LEFT_GAZE_X,
LEFT_GAZE_Y, RIGHT_GAZE_X, and RIGHT_GAZE_Y). It renames the subject and item
columns, ensures required/optional columns are of the appropriate data class,
and creates a new column called Event which indexes each unique
series of samples corresponding to the combination of Subject and
TRIAL_INDEX (can be changed), necessary for performing subsequent operations.
prep_data( data, Subject = NULL, Item = NA, EventColumns = c("Subject", "TRIAL_INDEX") )
prep_data( data, Subject = NULL, Item = NA, EventColumns = c("Subject", "TRIAL_INDEX") )
data |
A data frame object created from an Eyelink Sample Report. |
Subject |
An obligatory string containing the column name corresponding to the subject identifier. |
Item |
An optional string containing the column name corresponding to the item identifier; by default, NA. |
EventColumns |
A vector specifying the columns which will be used for creating the Event variable; by default, Subject and TRIAL_INDEX. |
An object of type data table as described in tibble.
## Not run: # Typical DataViewer output contains a column called "RECORDING_SESSION_LABEL" # corresponding to the subject. # To prepare the data... library(VWPre) df <- prep_data(data = dat, Subject = "RECORDING_SESSION_LABEL", Item = "ItemCol") ## End(Not run)
## Not run: # Typical DataViewer output contains a column called "RECORDING_SESSION_LABEL" # corresponding to the subject. # To prepare the data... library(VWPre) df <- prep_data(data = dat, Subject = "RECORDING_SESSION_LABEL", Item = "ItemCol") ## End(Not run)
recode_ia
replaces existing interest area IDs and/or labels for both
eyes. For subsequent data processing, it is important that the ID values range
between 0 and 8 (with 0 representing Outside all predefined interest areas).
recode_ia(data, IDs = NULL, Labels = NULL)
recode_ia(data, IDs = NULL, Labels = NULL)
data |
A data table object output by |
IDs |
A named character vector specifying the desired interest area IDs and the corresponding existing IDs where the first element is the old value and the second element is the new value. |
Labels |
A named character vector specifying the desired interest area labels and the corresponding existing labels where the first element is the old value and the second element is the new value. |
A data table with the same dimensions as data
.
## Not run: library(VWPre) # To recode both IDs and Labels... df <- recode_ia(data=dat, IDs=c("234"="2", "0"="0", "35"="3", "11"="1", "4"="6666"), Labels=c(Outside="Outside", Target="NewTargName", Dist2="NewDist2Name", Comp="NewCompName", Dist1="NewDist1Name")) # For a more complete tutorial on VWPre plotting functions: vignette("SR_Interest_Areas", package="VWPre") ## End(Not run)
## Not run: library(VWPre) # To recode both IDs and Labels... df <- recode_ia(data=dat, IDs=c("234"="2", "0"="0", "35"="3", "11"="1", "4"="6666"), Labels=c(Outside="Outside", Target="NewTargName", Dist2="NewDist2Name", Comp="NewCompName", Dist1="NewDist1Name")) # For a more complete tutorial on VWPre plotting functions: vignette("SR_Interest_Areas", package="VWPre") ## End(Not run)
relabel_na
examines interest area columns (LEFT_INTEREST_AREA_ID,
RIGHT_INTEREST_AREA_ID, LEFT_INTEREST_AREA_LABEL, and RIGHT_INTEREST_AREA_LABEL)
for cells containing NAs. If NA, the missing values in the ID columns are
relabeled as 0 and missing values in the LABEL columns are relabeled as 'Outside'.
relabel_na(data, NoIA = NULL)
relabel_na(data, NoIA = NULL)
data |
A data table object output by |
NoIA |
A positive integer indicating the number of interest areas defined when creating the study. |
A data table with the same dimensions as data
.
## Not run: library(VWPre) # To relabel the NAs... df <- relabel_na(data = dat, NoIA = 4) ## End(Not run)
## Not run: library(VWPre) # To relabel the NAs... df <- relabel_na(data = dat, NoIA = 4) ## End(Not run)
rename_columns
will replace the default numerical coding of the
interest area columns with more meaningful user-specified names. For example,
IA_1_C and IA_1_P could be converted to IA_Target_C and IA_Target_P. Again,
this will work for upto 8 interest areas.
rename_columns(data, Labels = NULL)
rename_columns(data, Labels = NULL)
data |
A data table object output by either |
Labels |
A named character vector specifying the interest areas and the desired names to be inserted in place of the numerical labelling. |
A data table object with renamed columns.
## Not run: library(VWPre) # For renaming default interest area columns dat2 <- rename_columns(dat, Labels = c(IA1="Target", IA2="Rhyme", IA3="OnsetComp", IA4="Distractor")) ## End(Not run)
## Not run: library(VWPre) # For renaming default interest area columns dat2 <- rename_columns(dat, Labels = c(IA1="Target", IA2="Rhyme", IA3="OnsetComp", IA4="Distractor")) ## End(Not run)
rm_extra_DVcols
checks for unnecessary DataViewer output columns and
removes them, unless specified.
rm_extra_DVcols(data, Keep = NULL)
rm_extra_DVcols(data, Keep = NULL)
data |
A data frame object created from an Eyelink Sample Report. |
Keep |
An optional string or character vector containing the column names of SR sample report columns the user would like to keep in the data set. |
An object of type data table as described in tibble.
## Not run: library(VWPre) df <- rm_extra_DVcols(data = dat, Keep = NULL) ## End(Not run)
## Not run: library(VWPre) df <- rm_extra_DVcols(data = dat, Keep = NULL) ## End(Not run)
rm_trackloss_events
removes events with less data than the specified
amount.
rm_trackloss_events(data = data, RequiredData = NULL)
rm_trackloss_events(data = data, RequiredData = NULL)
data |
A data table object output by |
RequiredData |
A number indicating the percentage of data required to be included (i.e., removes events with less than this amount of data). |
An object of type data table as described in tibble.
## Not run: library(VWPre) # Remove events... df <- rm_trackloss_events(data = dat, RequiredData = 50) ## End(Not run)
## Not run: library(VWPre) # Remove events... df <- rm_trackloss_events(data = dat, RequiredData = 50) ## End(Not run)
select_recorded_eye
examines each event and determines which eye contains
interest area information, based on the Recording
parameter (which
can be determined using check_eye_recording
).
This function then selects the data from the recorded eye and copies it to
new columns (IA_ID, IA_LABEL, IA_Data). The function prints a summary of the output.
select_recorded_eye(data, Recording = NULL, WhenLandR = NA)
select_recorded_eye(data, Recording = NULL, WhenLandR = NA)
data |
A data table object output by |
Recording |
A string indicating which eyes were used for recording gaze data ("R" when only right eye recording is present, "L" when only left eye recording is present, "LorR" when either the left or the right eye was recorded, "LandR" when both the left and the right eyes were recorded). |
WhenLandR |
A string indicating which eye ("Right" or "Left) to use if gaze data is available for both eyes (i.e., Recording = "LandR"). |
A data table with four additional columns ('EyeRecorded', 'EyeSelected',
'IA_ID', 'IA_LABEL', 'IA_Data') added to data
.
## Not run: library(VWPre) # Create a unified columns for the gaze data... df <- select_recorded_eye(data = dat, Recording = "LandR", WhenLandR = "Right") ## End(Not run)
## Not run: library(VWPre) # Create a unified columns for the gaze data... df <- select_recorded_eye(data = dat, Recording = "LandR", WhenLandR = "Right") ## End(Not run)
transform_to_elogit
transforms the proportion of looks for
each interest area to empirical logits. Proportions are inherently bound
between 0 and 1 and are therefore not suitable for some types of analysis.
Logits provide an unbounded measure, though range from negative infinity to
infinity, so it is important to know that this logit function adds a constant
(hence, empirical logit). Additionally this calculates weights which estimate
the variance in each bin (because the variance of the logit depends on the
mean). This is important for regression analyses. N.B.: This function will
work for data with a maximum of 8 interest areas.
transform_to_elogit( data, NoIA = NULL, ObsPerBin = NULL, Constant = 0.5, ObsOverride = FALSE )
transform_to_elogit( data, NoIA = NULL, ObsPerBin = NULL, Constant = 0.5, ObsOverride = FALSE )
data |
A data table object output by |
NoIA |
A positive integer indicating the number of interest areas defined when creating the study. |
ObsPerBin |
A positive integer indicating the number of observations to
use in the calculation. Typically, this will be the number of samples per
bin, which can be determined with |
Constant |
A positive number used for the empirical logit and weights calculation; by default, 0.5 as in Barr (2008). |
ObsOverride |
A logical value controlling restrictions on the value provided to ObsPerBin. Default value is FALSE. |
These calculations were adapted from: Barr, D. J., (2008) Analyzing 'visual world' eyetracking data using multilevel logistic regression, Journal of Memory and Language, 59(4), 457–474.
A data table with additional columns (the number of which depends on
the number of interest areas specified) added to data
.
## Not run: library(VWPre) # Convert proportions to empirical logits and calculate weights... df <- transform_to_elogit(dat, NoIA = 4, ObsPerBin = 20, Constant = 0.5) ## End(Not run)
## Not run: library(VWPre) # Convert proportions to empirical logits and calculate weights... df <- transform_to_elogit(dat, NoIA = 4, ObsPerBin = 20, Constant = 0.5) ## End(Not run)
This is a sample eye-tracking dataset included in the package
Vincent Porretta
The VWPre package provides a set of functions for preparing Visual World data collected with SR Research Eyelink eye trackers.
The function create_time_series
returns a time columns
in milliseconds.
The function prep_data
returns a data table with
correctly assigned classes for important columns.
The function relabel_na
returns a data table with
samples containing 'NA' relabeled as outside any interest area.
The function recode_ia
returns a data table containing
recoded interest area IDs and/or interest area labels.
The function select_recorded_eye
returns a data table
with data from the the recorded eye in new columns (IA_ID and IA_LABEL).
The function custom_ia
returns a data table
with gaze data remapped to new interest areas.
The function align_msg
returns a data table
with newly aligned sample data in a new column (Align).
The function rm_extra_DVcols
removed DataViewer coumns
that are not necessary for preprocessing with this package.
The function bin_prop
returns a downsampled data table
containing proportion of looks (samples) to each interest area in a
particular window of time (bin size).
The function transform_to_elogit
returns a data table
with proportion looks transformed to empirical logits with weights.
The function create_binomial
returns a data table with
a new success/failure column for each IA which is suitable for logistic
regression.
The function mark_trackloss
returns a data table
with data information regarding trackloss of the sample.
The function rm_trackloss_events
returns a data table
from which events without the minimum amount of quality data have been
removed.
The function fasttrack
a meta-function that returns a
data table of processed data containing the result of the series of
necessary subroutines. This is intended for experienced users doing basic
preprocessing.
The function check_eye_recording
returns a summary
of whether or not the dataset contains gaze data in both the Right and
Left interest area columns.
The function check_time_series
returns the first value
in the Time column for each event.
The function check_samples_per_bin
returns the number
of samples in each bin.
The function check_samplingrate
returns the value
corresponding to the sampling rate in the data.
The function ds_options
returns the binning
(downsampling) options possible for the given sampling rate.
The function check_ia
returns a summary of the
interest area IDs and Labels present in the data.
The function check_msg_time
returns a summary of the
the time value at a given sample message for each recording event.
The function check_all_msgs
returns all messages
in the data and their time stamp.
The function plot_avg
returns a plot of the grand
or conditional averages of proportion (or empirical logit) looks to each
interest area along with error bars.
The function plot_avg_contour
returns a contour plot of
the conditional average of proportion (or empirical logit) looks to a
given interest area over Time and a specified continuous variable.
The function plot_avg_diff
returns a plot of the grand
or conditional averages of the difference between looks to two interest
areas (proportions or empirical logits) with error bars.
The function plot_avg_cdiff
returns a plot of the
average difference between two conditions for looks to a given interest
area (proportions or empirical logits) with error bars.
The function make_pelogit_fnc
returns a function
that can backtransform predicted empirical logit to probability scale,
particularly (though not exclusively) useful for plotting purposes.
The function plot_transformation_app
opens a Shiny app
for visualizing the effect of both number of observations and constant on
the results of the empirical logit transformation and weight calculations.
The function plot_indiv_app
opens a Shiny app for
inspecting by-subject or by-item averages for all interest areas, alongside
the grand average (for proportion or empirical logit looks) within a
specified time window.
The function plot_var_app
opens a Shiny app for
inspecting by-subject or by-item Z-scores with respect to the overall mean
for a given interest area within a specified time window.
The vignettes are available via browseVignettes(package = "VWPre")
.
A list of all available functions is provided in
help(package = "VWPre")
.
This package can be cited using the information obtained from
citation("VWPre") or print(citation("VWPre"), bibtex = TRUE)
Vincent Porretta, Aki-Juhani Kyröläinen, Jacolien van Rij, Juhani Järvikivi
Maintainer: Vincent Porretta ([email protected])
University of Windsor, Canada