Pre-analysis

Pre-analysis is the creation of neural activity ndarrays (Y_mat, with dimensions n_chans x n_timepoints x n_trials) and lists of stimulus conditions.

After finding stimulus onset times and putting subject metadata into intonation_subject_data.py, the pre-analysis pipeline can be run with save_Y_mat_sns_sts_sps_for_subject_number.

  1. For each block, find stimulus onsets using cross-correlation between the stimulus waveform and the recorded speaker/microphone channel (this uses Liberty’s event detection code). These are then saved to the .mat file for that block in the variable times.

  2. For each subject, get metadata from intonation_subject_data.py about what the block numbers were and which stimulus set was played during that block.

  3. For each subject, load .mat data (hg, times, bcs, badTimeSegments) for each of their blocks along with stimulus set information.

  4. To get Y_mat for each subject:
    • neural activity time-locked to stimulus onset is extracted from the hg time series.
    • a moving average is used to reduce the number of samples in time and reduce timepoint by timepoint variability.
    • trials that overlap with bad time segments (marked from preprocessing) are excluded.
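
The trial-exclusion step in item 4 can be sketched as follows. This is only an illustration under stated assumptions: the function name find_good_trials, the representation of bad time segments as (start, stop) pairs in seconds, and the epoch limits pre and post are all hypothetical and are not taken from the package.

```python
def find_good_trials(onsets, bad_segments, pre=0.25, post=2.75):
    """Return indices of trials whose epoch overlaps no bad time segment.

    onsets: stimulus onset times in seconds.
    bad_segments: (start, stop) pairs in seconds (assumed format).
    pre, post: epoch extent around onset in seconds (illustrative values).
    """
    good = []
    for i, onset in enumerate(onsets):
        start, stop = onset - pre, onset + post
        # Two intervals overlap when each one starts before the other ends.
        overlaps = any(start < seg_stop and seg_start < stop
                       for seg_start, seg_stop in bad_segments)
        if not overlaps:
            good.append(i)
    return good


onsets = [1.0, 5.0, 9.0, 13.0]
bad_segments = [(4.0, 4.9), (20.0, 21.0)]
good = find_good_trials(onsets, bad_segments)
```

Here only the second trial (index 1) is dropped, because its epoch begins at 4.75 s, before the first bad segment ends.
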

At the end of pre-analysis, a Y_mat.mat file is created that contains five variables:

  1. Y_mat (ndarray): neural activity that is time-averaged in a moving window. Default settings use a window size of 60ms moving in 30ms steps. The first window is centered 150ms before stimulus onset (window from -180ms to -120ms) and the last window is centered 2850ms after stimulus onset (650ms after stimulus offset).

  2. sns (list): list of sentence conditions, which are integers 1, 2, 3, or 4 (or 1, 2, 3, 4, or 5 for the non-speech control).

  3. sts (list): list of intonation conditions (sts is short for “sentence types”) which are integers 1, 2, 3, or 4 for both speech and non-speech controls.

  4. sps (list): list of speaker conditions which are integers 1, 2, 3 for speech and 1, 2 for non-speech control.

  5. Y_mat_plotter (ndarray): neural activity ndarray used by the Plotter. With default settings, this contains the high-gamma from 250ms before stimulus onset to 550ms after stimulus offset, for a total window duration of 3s (stimulus duration is 2.2s). Dimensions are n_chans x n_timepoints x n_trials.

save_Y_mat_sns_sts_sps_for_subject_number has two toggleable parameters that control how the Y_mat.mat file is generated:

  1. control_stim (bool): whether to generate the Y_mat.mat file for this subject’s non-speech control blocks.

  2. zscore_to_silence (bool): whether to z-score neural activity to a silent baseline (otherwise z-score to the entire block). The silent baseline consists of silent periods within the intertrial interval that exclude the first 500ms after stimulus offset.
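
Z-scoring to a silent baseline can be sketched as below. The helper names silence_mask and zscore_to_baseline are hypothetical, and the sketch makes a simplifying assumption that every sample outside a stimulus and the 500ms following it counts as silence. The package may define the baseline differently.

```python
import numpy as np


def silence_mask(n_timepoints, onsets, hz=100, stim_dur=2.2, exclude_after_offset=0.5):
    """Boolean mask that is True for samples treated as silent baseline."""
    mask = np.ones(n_timepoints, dtype=bool)
    for onset in onsets:
        start = int(round(onset * hz))
        stop = int(round((onset + stim_dur + exclude_after_offset) * hz))
        mask[start:stop] = False  # stimulus plus the excluded post-offset period
    return mask


def zscore_to_baseline(hg, mask):
    """Z-score each channel of hg (n_chans x n_timepoints) using baseline samples only."""
    baseline = hg[:, mask]
    mean = np.nanmean(baseline, axis=1, keepdims=True)
    std = np.nanstd(baseline, axis=1, keepdims=True)
    return (hg - mean) / std


rng = np.random.default_rng(0)
hg = rng.normal(5.0, 2.0, size=(3, 1000))  # synthetic data, 10 s at 100 Hz
mask = silence_mask(hg.shape[1], onsets=[1.0, 5.0])
z = zscore_to_baseline(hg, mask)
```

After this transformation the baseline samples of every channel have zero mean and unit standard deviation, while the stimulus periods are expressed relative to that baseline.
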

intonatang.intonation_preanalysis.get_all_good_trials(good_trials, control_stim=False, missing_f0_stim=False)

From the lists of good trials for each block, return one list of all good trials.

The numbers in all_good_trials are used to index into data that is concatenated from all blocks. For example, for two blocks with 96 trials each in which only the first two trials of each block are good, all_good_trials would be [0, 1, 96, 97].

Parameters:
  • good_trials (list of lists) – list of good trials for each block, returned from get_times_hg_for_subject_number
  • control_stim (bool) – whether using non-speech control stimuli
Returns:

  • all_good_trials: one list of all good trials

Return type:

(list)
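
The index arithmetic in the example above can be sketched as follows. This is not the package's code: concatenate_good_trials is a hypothetical name, and the sketch assumes every block has the same number of trials (96, as in the example).

```python
def concatenate_good_trials(good_trials, n_trials_per_block=96):
    """Shift each block's good-trial indices by the block's offset in the concatenated data."""
    all_good = []
    for block_index, block_good in enumerate(good_trials):
        offset = block_index * n_trials_per_block
        all_good.extend(offset + trial for trial in block_good)
    return all_good


all_good_trials = concatenate_good_trials([[0, 1], [0, 1]])
print(all_good_trials)  # [0, 1, 96, 97]
```
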

intonatang.intonation_preanalysis.get_bcs(subject_number)

Returns a list of all bad channels for a subject.

Bad channels are channels that were bad in at least one block.

Parameters:
  • subject_number (int) – xx in ECxx
Returns:

  • bcs: list of bad channels

Return type:

(list of ints)
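
Since a channel counts as bad if it was bad in at least one block, the result is the union of the per-block bad channel lists. A minimal sketch, with the hypothetical name union_of_bad_channels and made-up channel numbers:

```python
def union_of_bad_channels(bcs_per_block):
    """Channels that were bad in at least one block, sorted."""
    return sorted(set().union(*bcs_per_block))


bcs = union_of_bad_channels([[3, 17], [17, 42], []])
print(bcs)  # [3, 17, 42]
```
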

intonatang.intonation_preanalysis.get_centers(back=15, forward=285, window=6)

Returns a numpy array of center indexes (no parameters needed for default)

The returned list of centers starts at -1 * back and steps every half window. centers will end at forward if forward can be reached from -1 * back in half window steps.

The default arguments (back=15, forward=285, and window=6) return a length 101 ndarray.
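
The description above is consistent with the following sketch, which is an assumption about the behavior rather than the actual implementation. centers_sketch is a hypothetical name, and the conversion to milliseconds assumes the 100 Hz sampling rate used as the default hz elsewhere in this module.

```python
import numpy as np


def centers_sketch(back=15, forward=285, window=6):
    """Window centers from -back to forward (inclusive when reachable) in half-window steps."""
    step = window // 2
    return np.arange(-back, forward + 1, step)


centers = centers_sketch()
hz = 100  # assumed sampling rate in samples per second
centers_ms = centers * 1000 // hz
print(len(centers), centers_ms[0], centers_ms[-1])  # 101 -150 2850
```

With the defaults, the centers run from 150ms before stimulus onset to 2850ms after it in 30ms steps.
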

intonatang.intonation_preanalysis.get_concatenated_data(times_list, hg_list, hz=100, back=0, forward=250, zscore=True, zscore_to_silence=True)

Use to get time-locked activity for multiple blocks.

Parameters:
  • times_list – list of individual times variables, where times[0] contains the trial start times
  • hg_list – list of hg for blocks in the same order as times_list

intonatang.intonation_preanalysis.get_full_data_path_for_subject_number_and_block(subject_number, block, data_path_different=None)

Returns path to the .mat data for each subject and block combination.

Adds “ECxx/ECxx_Bxxx/ECxx_Bxxx.mat” to the data_path, where xx is the subject_number and xxx is the block. data_path is defined at the top of this file as a global constant. A different data path can also be passed in through the parameter data_path_different.

Parameters:
  • subject_number (int) – xx in ECxx
  • block (int) – block number
  • data_path_different (str) – optional if path to data is not the globally defined data_path
Returns:

  • full_data_path: data_path (or data_path_different when passed in) + “ECxx/ECxx_Bxxx/ECxx_Bxxx.mat”

Return type:

(str)
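
The path construction can be sketched as below. The function name build_full_data_path and the "/data/" directory are hypothetical, and the sketch assumes subject and block numbers are inserted without zero padding, which the description does not specify.

```python
def build_full_data_path(subject_number, block, data_path="/data/"):
    """Append ECxx/ECxx_Bxxx/ECxx_Bxxx.mat to data_path (no zero padding assumed)."""
    subject = f"EC{subject_number}"
    block_name = f"{subject}_B{block}"
    return data_path + f"{subject}/{block_name}/{block_name}.mat"


full_data_path = build_full_data_path(12, 3)
print(full_data_path)  # /data/EC12/EC12_B3/EC12_B3.mat
```
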

intonatang.intonation_preanalysis.get_gcs(subject_number)

Returns a list of all good channels for a subject.

Good channels are channels that were good in all blocks (said another way, they are channels which were never bad).

Parameters:
  • subject_number (int) – xx in ECxx
Returns:

  • gcs: list of good channels

Return type:

(list of ints)
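
Good channels are the complement of the bad channels. In the sketch below, the name good_channels_sketch, the total channel count n_chans, and the 0-based channel numbering are all assumptions made for illustration.

```python
def good_channels_sketch(bad_channels, n_chans=256):
    """Channels in range(n_chans) that never appear in bad_channels."""
    bad = set(bad_channels)
    return [ch for ch in range(n_chans) if ch not in bad]


gcs = good_channels_sketch([1, 4], n_chans=6)
print(gcs)  # [0, 2, 3, 5]
```
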

intonatang.intonation_preanalysis.get_sentence_numbers_sentence_types_speakers_for_subject_number(subject_number, good_trials=None, control_stim=False, missing_f0_stim=False)

Returns lists of sentence, intonation, and speaker conditions for a given subject, with bad trials removed.

Each list of conditions contains integers from 1 to 4, 1 to 4, and 1 to 3, respectively.

Parameters:
  • subject_number (int) – xx in ECxx
  • good_trials (list of lists) – list of good trials in each block returned from get_times_hg_for_subject_number
  • control_stim (bool) – set to True for non-speech control_stim
Returns:

  • sns (list): sentence conditions (sentence numbers, 1 to 4)
  • sts (list): intonation conditions (sentence types, 1 to 4)
  • sps (list): speaker conditions (speakers, 1 to 3)

Return type:

(tuple)
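
Removing bad trials from the condition lists amounts to indexing each list with the good-trial indices. The sketch below uses the hypothetical helper select_good_trials and made-up condition values. It assumes the indices refer to trials concatenated across blocks.

```python
def select_good_trials(conditions, all_good_trials):
    """Keep only the condition labels of good trials."""
    return [conditions[i] for i in all_good_trials]


# Made-up condition labels for six trials.
sns_all = [1, 2, 3, 4, 1, 2]
sts_all = [1, 1, 2, 2, 3, 3]
sps_all = [1, 2, 3, 1, 2, 3]
all_good_trials = [0, 1, 4]

sns, sts, sps = (select_good_trials(c, all_good_trials)
                 for c in (sns_all, sts_all, sps_all))
```

Because the same indices are applied to all three lists, sns, sts, and sps stay aligned trial by trial.
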

intonatang.intonation_preanalysis.get_stg(subject_number, path='Imaging/elecs/', filename='TDT_elecs_all.mat')

Returns list of electrode numbers that are on STG.

Loads the anatomy files from the data_path + ‘EC’ + subject_number + path + filename.

intonatang.intonation_preanalysis.get_time_averaged_data(times_list, hg_list, window=6, hz=100, back=15, forward=285, zscore=True, zscore_to_silence=True)

Averages data in a moving window (each step is half a window) to smooth high-gamma for encoding analyses.

Parameters:
  • window – number of samples in each window. Use an even number, so steps are an integer number of samples.
  • back – number of samples back in time, center of the first window
  • forward – center of the last window.
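
A moving-window average with half-window steps can be sketched as follows. This is an illustration, not the package's implementation: moving_average and onset_index are hypothetical names, the input is assumed to be already epoched data (n_chans x n_timepoints x n_trials), and a window of 6 samples centered at c is assumed to cover samples c - 3 through c + 2.

```python
import numpy as np


def moving_average(epochs, onset_index, back=15, forward=285, window=6):
    """Average epochs over time in windows stepped by half a window.

    epochs: n_chans x n_timepoints x n_trials array.
    onset_index: sample index of stimulus onset within each epoch (assumed).
    """
    half = window // 2
    centers = np.arange(-back, forward + 1, half)
    out = np.empty((epochs.shape[0], centers.size, epochs.shape[2]))
    for i, center in enumerate(centers):
        start = onset_index + center - half
        out[:, i, :] = epochs[:, start:start + window, :].mean(axis=1)
    return out


# Synthetic epochs whose value equals the sample index, so window means are easy to check.
epochs = np.tile(np.arange(306.0)[None, :, None], (2, 1, 4))
Y_mat = moving_average(epochs, onset_index=18)
print(Y_mat.shape)  # (2, 101, 4)
```

The epochs need at least back + window / 2 samples before onset and forward + window / 2 samples after it, or the first and last windows would run past the array.
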

intonatang.intonation_preanalysis.get_timelocked_activity(times, hg, zscore=True, hz=100, back=0, forward=250, zscore_to_silence=True)

Returns an n_chans x n_timepoints x n_trials matrix of high-gamma activity.

Parameters:
  • times – times[0] contains trial start times in seconds.
  • hg – full time-series of high-gamma for multiple electrodes (n_chans x nt)
  • back – number of time samples to take preceding trial start
  • forward – number of time samples to take following trial start.
  • zscore_to_silence – boolean, whether z-scoring should be done to a prestimulus baseline
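
The core epoching step can be sketched as below, leaving out z-scoring. timelock is a hypothetical name, and the sketch assumes onsets are converted to sample indices by rounding and that every epoch fits inside the recording (there is no bounds checking).

```python
import numpy as np


def timelock(times, hg, hz=100, back=0, forward=250):
    """Cut hg (n_chans x nt) into epochs around each onset in times[0] (seconds)."""
    onsets = np.round(np.asarray(times[0]) * hz).astype(int)
    Y = np.empty((hg.shape[0], back + forward, onsets.size))
    for k, onset in enumerate(onsets):
        Y[:, :, k] = hg[:, onset - back:onset + forward]
    return Y


hg = np.arange(2000.0).reshape(2, 1000)  # synthetic data: 2 channels, 10 s at 100 Hz
times = np.array([[1.0, 5.0]])           # 1 x n_trials onset times in seconds
Y = timelock(times, hg, back=10, forward=20)
print(Y.shape)  # (2, 30, 2)
```
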

intonatang.intonation_preanalysis.get_times_hg_for_subject_number(subject_number, only_good_trials=False, control_stim=False, missing_f0_stim=False, use_log_hg=False)

Used to process .mat data files, called by save_Y_mat_sns_sts_sps_for_subject_number

For each subject, all block data (hg, times, bcs, and badTimeSegments) is loaded and neural data is processed (bad time segments are removed, i.e. set as NaN, and each channel is z-scored across time).

Parameters:
  • subject_number (int) – xx in ECxx
  • only_good_trials (bool) – return only good_trials
  • control_stim (bool) – set as True to process non-speech control data
  • use_log_hg (bool) – return log hg instead of hg (log is taken for each high-gamma band and then averaged during preprocessing)
Returns:

  • times (list of ndarrays): list of stimulus onsets (seconds) for each block. Each item in times has dimensions 1 x number of trials
  • good_trials (list of lists): list of good trials for each block
  • hgs_toreturn (list of ndarrays): list of processed hg data for each block
  • gcs (list): good channels (channels that were good in every block)

Return type:

(tuple)
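
The processing described above (bad time segments set to NaN, then each channel z-scored across time) can be sketched as follows. mask_and_zscore is a hypothetical name, and bad segments are assumed to be (start, stop) pairs in seconds. The format of badTimeSegments in the .mat files may differ.

```python
import numpy as np


def mask_and_zscore(hg, bad_segments, hz=100):
    """Set bad segments to NaN, then z-score each channel ignoring NaNs."""
    hg = np.array(hg, dtype=float)  # copy so the input is not modified
    for start_s, stop_s in bad_segments:
        hg[:, int(round(start_s * hz)):int(round(stop_s * hz))] = np.nan
    mean = np.nanmean(hg, axis=1, keepdims=True)
    std = np.nanstd(hg, axis=1, keepdims=True)
    return (hg - mean) / std


rng = np.random.default_rng(1)
raw = rng.normal(10.0, 3.0, size=(2, 500))  # synthetic data: 2 channels, 5 s at 100 Hz
processed = mask_and_zscore(raw, bad_segments=[(1.0, 2.0)])
```

Using the NaN-aware mean and standard deviation keeps the bad segments from influencing the normalization of the remaining samples.
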

intonatang.intonation_preanalysis.load_Y_mat_sns_sts_sps_for_subject_number(subject_number, control_stim=False, missing_f0_stim=False, zscore_to_silence=True)

Loads data for analysis from file saved by save_Y_mat_sns_sts_sps_for_subject_number

Returns:
  • Y_mat: time averaged neural data for encoding analysis
  • sentence_numbers: list of sentence numbers (1, 2, 3, or 4)
  • sentence_types: list of sentence types or intonation conditions (1, 2, 3, 4)
  • speaker: list of speakers (1, 2, 3)
  • Y_mat_plotter: neural data for visualization (not time averaged).
Return type:

(tuple)

intonatang.intonation_preanalysis.save_Y_mat_sns_sts_sps_for_subject_number(subject_number, control_stim=False, missing_f0_stim=False, return_raw_data=False, zscore_to_silence=True)

Pre-analysis processing pipeline. Creates matrix of hg activity and lists of stimulus information.

This function saves a .mat file called ECXXX_Y_mat.mat containing the variables: Y_mat, sentence_numbers, sentence_types, speakers, and Y_mat_plotter

Y_mat is time-averaged data for encoding analysis. Y_mat_plotter is full time series data for visualization and use with the Plotter.

To load the data that is saved, use load_Y_mat_sns_sts_sps_for_subject_number

Parameters:
  • subject_number (int) – xx in ECxx
  • control_stim (bool) – set to True for non-speech control_stim
  • return_raw_data (bool) – set to True to return (times, good_trials, hg, gcs) like get_times_hg_for_subject_number
  • zscore_to_silence (bool) – normalize neural data to pre-stimulus baseline rather than entire block