Data Organisation

Data locations

| Data type | Local path | Group drive path |
| --- | --- | --- |
| Raw data (unprocessed) | C:\MatlabRoot\FreeWalkOptomotor\data (acquisition rig) | \\prfs.hhmi.org\reiserlab\oaky-cokey\data\0_unprocessed |
| Raw data (tracked) | C:\Users\labadmin\Documents\freely-walking-optomotor\DATA\01_tracked (processing PC) | \\prfs.hhmi.org\reiserlab\oaky-cokey\data\1_tracked |
| Processed data | C:\Users\labadmin\Documents\freely-walking-optomotor\DATA\02_processed (processing PC) | \\prfs.hhmi.org\reiserlab\oaky-cokey\data\2_processed |
| Results (.mat per cohort) | C:\Users\labadmin\Documents\freely-walking-optomotor\results\protocol_XX\ (processing PC) | \\prfs.hhmi.org\reiserlab\oaky-cokey\exp_results\protocol_XX\ |
| Plots per cohort (overview figs) | C:\Users\labadmin\Documents\freely-walking-optomotor\figures\overview_figs\ (processing PC) | \\prfs.hhmi.org\reiserlab\oaky-cokey\exp_figures\overview_figs\ |
| Plots per strain | C:\Users\labadmin\Documents\freely-walking-optomotor\figures\strain_figs\ (processing PC) | \\prfs.hhmi.org\reiserlab\oaky-cokey\exp_figures\XCond_per_strain_figures\ |

See Configuration & Paths for the full three-computer setup and how these paths are derived from project_root / PROJECT_ROOT.


Raw data generated by running an experiment:

Tip

Each successful run of a protocol generates:

  • One Video: UFMF format via SimpleBiasCameraInterface (compressed — stores frame differences, not full frames)
  • One LOG_….mat file: Trial timing, conditions, temperature, metadata
  • One *stamp_log_cam0.txt file: Timestamps for each frame. This file is not currently used in the processing steps.

The data is stored within the folder C:\MatlabRoot\FreeWalkOptomotor\data (on the acquisition rig, under cfg.rig_data_folder — see Configuration & Paths).

Raw data folder structure:

rig_data_folder/
  YYYY_MM_DD/
    protocol_name/
      strain/
        sex/
          HH_MM_SS/
            LOG_YYYYMMDD_HHMMSS.mat - - - LOG file with metadata and trial timing.
            stamp_log_cam0.txt - - - Timestamp log for each video frame (not currently used).
            REC_.....ufmf - - - - UFMF video file of the experiment recording.
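
The folder layout above can be assembled with fullfile. The following is a minimal sketch, not code from the repository: every value is a hypothetical placeholder, and it assumes the LOG file name reuses the date and time of the enclosing folders, which the listing suggests but does not state.

```matlab
% Sketch: build the expected session folder and LOG path for one run.
% All values are hypothetical placeholders, not real recordings.
rig_data_folder = 'rig_data_folder';        % stands in for cfg.rig_data_folder
run_date = '2025_01_15';                    % YYYY_MM_DD
protocol = 'protocol_27';                   % protocol_name
strain   = 'jfrc100_es_shibire_kir';
sex      = 'F';
run_time = '14_30_05';                      % HH_MM_SS

% Folder levels in the order shown in the listing.
session_dir = fullfile(rig_data_folder, run_date, protocol, strain, sex, run_time);

% Assumption: the LOG name is the folder date and time without underscores.
log_name = sprintf('LOG_%s_%s.mat', strrep(run_date, '_', ''), strrep(run_time, '_', ''));
log_path = fullfile(session_dir, log_name);
disp(log_path);
```

Using fullfile rather than string concatenation keeps the separator correct on both the Windows rigs and any other machine that mounts the group drive.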

In addition to the LOG file created in the experiment folder, metadata and timing information are automatically logged to a Google Sheet after each successful experiment.

Tip: Excluding protocol runs from the automatic logging pipeline

During protocol development and troubleshooting, you may want to exclude certain runs from being logged to the Google Sheet. To do this, set the Strain in the get_input_parameters drop-down box to test when starting the experiment.

Here is the documentation on how the automatic logging system was set up.

Tracking output

After recording, FlyTracker processes the .ufmf video and generates two files saved within a subdirectory:

| File | Description |
| --- | --- |
| trx.mat | Trajectory data: a MATLAB table with one row per tracked fly. Each row contains arrays of x position, y position, heading, and timestamps across all video frames. |
| *-feat.mat | Behavioural features computed by FlyTracker, including distance from arena edge, wing angles, body length, and other morphological/behavioural measurements per frame. |

Both files have arrays of the same length corresponding to the total number of frames in the video. The trx data provides the spatial coordinates and heading used for computing velocities, while feat provides arena-relative measurements like distance from the wall.
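
As an illustration of how per-frame positions turn into a velocity, here is a minimal sketch on synthetic data. It is not the pipeline's own velocity computation: the variable names x and y, the arbitrary position units, and the 30 fps frame rate are assumptions made for the example.

```matlab
% Sketch: per-frame speed of one fly from its x/y position arrays.
% Synthetic positions (not real tracking output), in arbitrary units.
fps = 30;                    % assumed frame rate
x = [0 3 6 9 12];            % x position per frame
y = [0 4 8 12 16];           % y position per frame

% Distance moved between consecutive frames, scaled to units per second.
step  = hypot(diff(x), diff(y));   % [1 x (n_frames - 1)]
speed = step * fps;

% Pad the first frame with NaN so the result keeps one value per frame,
% matching the per-frame length of the trx and feat arrays.
speed = [NaN, speed];
disp(speed);
```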

LOG_….mat structure

The LOG file records the timing and metadata for every part of the experiment. It is structured as follows:

| Field | Contents |
| --- | --- |
| LOG.meta | Experiment metadata: fly strain, sex, date, time, protocol name |
| LOG.acclim_off1 | Pre-stimulus dark acclimation (typically 300 s) |
| LOG.acclim_patt | Pattern flash acclimation (calibration flash) |
| LOG.log_1 ... LOG.log_N | Per-condition stimulus timing (one field per condition per repetition) |
| LOG.acclim_off2 | Post-stimulus dark acclimation (typically 30 s) |
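
Because the number of stimulus entries varies with the protocol, it is often convenient to discover the log_ fields at run time instead of hard-coding them. The sketch below uses a synthetic LOG struct with placeholder values; the sub-fields of LOG.meta shown here are hypothetical.

```matlab
% Sketch: find every stimulus entry in a LOG struct by its field name prefix.
% Synthetic LOG with the top-level fields listed above (placeholder values).
LOG = struct();
LOG.meta        = struct('strain', 'test', 'sex', 'F');   % hypothetical sub-fields
LOG.acclim_off1 = struct();
LOG.acclim_patt = struct();
LOG.log_1       = struct('condition', 1);
LOG.log_2       = struct('condition', 2);
LOG.log_3       = struct('condition', 1);
LOG.acclim_off2 = struct();

% Keep only the fields whose names start with 'log_'.
names      = fieldnames(LOG);
stim_names = names(strncmp(names, 'log_', 4));

% Read the condition number of each stimulus entry via dynamic field names.
conditions = zeros(1, numel(stim_names));
for k = 1:numel(stim_names)
    conditions(k) = LOG.(stim_names{k}).condition;
end
disp(conditions);
```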

Each stimulus log entry (e.g. LOG.log_1) contains:

| Field | Description |
| --- | --- |
| start_t / stop_t | Start and stop timestamps (seconds) |
| start_f / stop_f | Start and stop frame numbers (at 30 fps) |
| optomotor_pattern | Pattern ID used for this condition |
| optomotor_speed | Speed value (0–127 scale) |
| condition | Condition number |
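
The frame numbers in a stimulus entry can be used to cut the matching columns out of a per-frame data array. The following sketch uses synthetic values and assumes that start_f and stop_f are 1-based and inclusive, which the source does not state; check this against a real LOG file before relying on it.

```matlab
% Sketch: extract the frames of one stimulus entry from a per-frame array.
% Synthetic entry and data (placeholder values, not from a real experiment).
fps   = 30;
entry = struct('start_t', 10.0, 'stop_t', 12.0, ...
               'start_f', 301, 'stop_f', 360, 'condition', 1);
fv_data = repmat(1:400, 3, 1);          % [n_flies x n_frames], 3 flies

% Sanity check: frame count and duration should agree at the stated frame rate.
n_from_frames = entry.stop_f - entry.start_f + 1;          % assumes inclusive range
n_from_time   = round((entry.stop_t - entry.start_t) * fps);

% Columns are frames, so index the second dimension.
segment = fv_data(:, entry.start_f:entry.stop_f);
fprintf('Condition %d: %d frames (%d expected from timestamps)\n', ...
        entry.condition, size(segment, 2), n_from_time);
```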

Processed data

After running the processing pipeline (process_freely_walking_data), results are saved as *_data.mat files within the results directory C:\Users\labadmin\Documents\freely-walking-optomotor\results (on the processing PC, under cfg.results_folder; see Configuration & Paths).

Each file contains:

| Variable | Description |
| --- | --- |
| LOG | The original LOG structure |
| feat | FlyTracker features with poorly tracked flies removed |
| trx | FlyTracker trajectories with poorly tracked flies removed |
| comb_data | Combined behavioural metrics: struct with fields for each metric (e.g. fv_data), each containing a [n_flies x n_frames] array |
| n_fly_data | [3 x 1] array: [n_flies_in_arena, n_flies_tracked, n_flies_removed] |
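
Since every metric in comb_data is described as a [n_flies x n_frames] array, a quick consistency check after loading a file can catch truncated or mismatched data. This is a minimal sketch on a synthetic struct; av_data is a hypothetical second metric name used only for illustration.

```matlab
% Sketch: confirm that all metrics in comb_data share the same dimensions.
% Synthetic comb_data (placeholder arrays, not loaded from a results file).
comb_data = struct();
comb_data.fv_data = zeros(4, 100);      % metric named in the table above
comb_data.av_data = zeros(4, 100);      % hypothetical second metric

metrics   = fieldnames(comb_data);
ref_size  = size(comb_data.(metrics{1}));
same_size = true;
for k = 1:numel(metrics)
    same_size = same_size && isequal(size(comb_data.(metrics{k})), ref_size);
end

n_flies  = ref_size(1);
n_frames = ref_size(2);
fprintf('%d metrics, %d flies, %d frames, consistent: %d\n', ...
        numel(metrics), n_flies, n_frames, same_size);
```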

The DATA struct

The final DATA struct organises all experimental data in a nested hierarchy and can be made using either comb_data_one_cohort_cond or comb_data_across_cohorts_cond.

DATA.(strain).(sex)(cohort_idx).(condition).(data_type)

For example: DATA.jfrc100_es_shibire_kir.F(1).R1_condition_1.fv_data returns a [n_flies x n_frames] array. Each condition also contains stimulus metadata (trial_len, interval_dur, optomotor_pattern, optomotor_speed, etc.).

% Load combined data for protocol_27
DATA = comb_data_across_cohorts_cond('/path/to/results/protocol_27');

% List all strains
strain_names = fieldnames(DATA);

% Number of cohorts for a strain/sex
n_cohorts = length(DATA.jfrc100_es_shibire_kir.F);

% Mean forward velocity across all flies in one condition
fv = DATA.jfrc100_es_shibire_kir.F(1).R1_condition_1.fv_data;
mean_fv = mean(fv, 1, 'omitnan');  % [1 x n_frames]

This struct is central to many of the analysis scripts in the repository. These scripts often start with the following lines to generate the struct from the .mat results files:

  if ~exist('DATA', 'var')
      cfg = get_config();
      protocol_dir = fullfile(cfg.results, 'protocol_27');
      DATA = comb_data_across_cohorts_cond(protocol_dir);
      fprintf('Loaded DATA from %s\n', protocol_dir);
  end

The DATA structure uses the following condition names:

  • acclim_off1: pre-stimulus dark acclimation
  • acclim_patt: calibration flash acclimation
  • R1_condition_1 ... R1_condition_N: first repetition, conditions 1 through N
  • R2_condition_1 ... R2_condition_N: second repetition, conditions 1 through N
  • acclim_off2: post-stimulus dark acclimation
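
Because the condition names follow a fixed pattern, they can be generated with sprintf and used as dynamic field names. The sketch below pools both repetitions of each condition for one synthetic cohort entry. It assumes both repetitions have the same number of frames and that stacking them as extra rows is an acceptable way to pool; neither is confirmed by the source.

```matlab
% Sketch: average a metric over both repetitions of each condition.
% Synthetic stand-in for one DATA.(strain).(sex)(cohort_idx) entry.
cohort = struct();
cohort.R1_condition_1.fv_data = ones(2, 5);
cohort.R2_condition_1.fv_data = 3 * ones(2, 5);
cohort.R1_condition_2.fv_data = 2 * ones(2, 5);
cohort.R2_condition_2.fv_data = 4 * ones(2, 5);

n_conditions = 2;
mean_fv = cell(1, n_conditions);
for c = 1:n_conditions
    r1 = cohort.(sprintf('R1_condition_%d', c)).fv_data;
    r2 = cohort.(sprintf('R2_condition_%d', c)).fv_data;
    pooled = [r1; r2];              % stack repetitions as extra rows
    % The synthetic data has no NaNs; with real data, pass 'omitnan' to mean.
    mean_fv{c} = mean(pooled, 1);   % [1 x n_frames]
end
disp(mean_fv{1});
```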

Where to locate the data

The results from the processing pipeline are saved in the directory C:\Users\labadmin\Documents\freely-walking-optomotor\results (on the processing PC, under cfg.results_folder; see Configuration & Paths).

results/
  protocol_name/
    strain/
      sex/
        YYYYMMDD_HHMMSS_strain_protocol_sex_data.mat
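
The file name encodes the same metadata as the folder path, so it can be parsed when files are handled outside their folders. The following is a sketch only: the file name is hypothetical, and the pattern assumes the protocol token has the form protocol_NN and that the sex token contains no underscores. Strain names may contain underscores, which is why a plain split on '_' is not enough.

```matlab
% Sketch: recover metadata from a results file name with a regular expression.
% Hypothetical file name following the pattern shown above.
fname = '20250115_143005_jfrc100_es_shibire_kir_protocol_27_F_data.mat';

% date _ time _ strain _ protocol_NN _ sex _data.mat
pattern = '^(\d{8})_(\d{6})_(.+)_(protocol_\d+)_([^_]+)_data\.mat$';
tok = regexp(fname, pattern, 'tokens', 'once');
if isempty(tok)
    error('File name does not match the expected pattern: %s', fname);
end

info = struct('date', tok{1}, 'time', tok{2}, 'strain', tok{3}, ...
              'protocol', tok{4}, 'sex', tok{5});
disp(info);
```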