Analysis Overview
Experiment workflow
The data pipeline has five stages:
1. Run the experiment: a MATLAB protocol script (e.g. `protocol_27.m`) controls the LED arena and camera, producing a UFMF video file and a `LOG.mat` file with stimulus timing and metadata.
2. Track the flies: FlyTracker processes the video offline, extracting each fly's position, heading, and basic features into `trx.mat` (trajectories) and `feat.mat` (features).
3. Compute behavioural metrics: `combine_data_one_cohort(feat, trx)` filters bad tracking, interpolates gaps, and computes 12 behavioural metrics (forward velocity, angular velocity, turning rate, distance from centre, etc.).
4. Split by condition: `comb_data_one_cohort_cond(LOG, comb_data)` uses LOG frame indices to slice the continuous data into per-condition segments.
5. Merge experiments: `comb_data_across_cohorts_cond(protocol_dir)` combines all sessions into a single hierarchical `DATA` struct for group analysis.
Steps 2–3 can be fully automated by the processing pipeline.
Overview
The data acquired from freely-walking optomotor experiments, especially during the screen using protocol 27, is analysed in two main steps.
The first step (process_freely_walking_data) is done per cohort (each vial of flies that was run). This creates several “overview” level plots for the individual cohort.
The second step (process_screen_data) combines data across cohorts and splits it by condition. This creates plots that compare the behaviour of each strain against the empty-split control flies.
A third step (make_summary_heat_maps_p27) performs statistical testing across all strains and conditions.
Requirements for analysing the data from the MIC screen
In order for the processing pipeline to run, within each experiment folder there should be:
- a `.ufmf` video of the entire experiment
- a `.mat` LOG file
- a subdirectory that contains `trx.mat` and the `-feat.mat` file outputted by FlyTracker.
The .ufmf video is a compressed video format generated by BIAS. The difference between frames is stored, not the entire frame data. The LOG file contains metadata about the experiment (fly strain, date and time, which pattern was used for which condition) and the frame numbers at which each condition started and ended. These frame numbers are recorded during the experiment by MATLAB interfacing with BIAS.
The two FlyTracker output files serve different purposes:
- `trx.mat`: trajectory data. A MATLAB table with one row per tracked fly containing arrays of x position, y position, heading angle, and timestamps across all video frames.
- `-feat.mat`: behavioural features. Contains per-frame measurements computed by FlyTracker such as distance from the arena edge, wing angles, and body dimensions.
Both files are generated during the FlyTracker tracking pipeline — trx is produced first during tracking, then feat is computed from the trajectories in a second pass. All rows should have arrays of the same length corresponding to the total number of frames in the video.
Behavioural metrics
The following metrics are computed from the FlyTracker output by combine_data_one_cohort and stored in the comb_data structure:
| Variable | Units | Source | Computation |
|---|---|---|---|
| `fv_data` | mm/s | Computed | Forward velocity in heading direction (two-point derivative, negative values set to NaN) |
| `av_data` | deg/s | Computed | Angular velocity via least-squares line fit to heading (window = 16 frames) |
| `vel_data` | mm/s | Computed | Total velocity magnitude (three-point central difference) |
| `curv_data` | deg/mm | Computed | Turning rate: `av_data / fv_data` |
| `x_data` | mm | `trx` | X position, converted from pixels via PPM (4.1691 px/mm) |
| `y_data` | mm | `trx` | Y position, converted from pixels via PPM |
| `heading_data` | deg | `trx` | Continuous (unwrapped) heading angle |
| `heading_wrap` | deg | `trx` | Heading wrapped to −180° to 180° |
| `dist_data` | mm | `feat` | Distance from arena centre |
| `dist_data_delta` | mm | Computed | Change in distance relative to stimulus onset |
| `view_dist` | mm | Computed | Viewing distance to arena wall (ray-circle intersection) |
| `IFD_data` | mm | Computed | Distance to nearest fly |
| `IFA_data` | deg | Computed | Angle to nearest fly |

| Parameter | Value | Description |
|---|---|---|
| Max velocity threshold | 50 mm/s | Frames with velocity above this are set to NaN (assumed tracking errors) |
| Angular velocity window | 16 frames | Least-squares fitting window for heading derivative |
| Interpolation method | spline | Used for filling NaN values in position/distance data; heading uses previous-value fill |
| Frame rate | 30 fps | Camera acquisition rate |
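
As a rough illustration of how the constants in these tables enter the calculations, the sketch below converts a synthetic track from pixels to millimetres, applies a three-point central difference for speed, and fits a line to heading over a 16-frame window for angular velocity. The track is invented, and the window alignment (8 frames before, 7 after) is an assumption; the actual `vel_estimate` and `calculate_three_point_velocity` functions may differ in detail.

```matlab
% Constants taken from the tables above.
PPM = 4.1691;   % pixels per millimetre
FPS = 30;       % camera frame rate (frames per second)
WIN = 16;       % frames in the heading line-fit window

% Synthetic single-fly track (illustrative only): the fly moves
% 2 px/frame along x while turning 3 deg/frame.
n_frames = 60;
x_px = 100 + 2 * (0:n_frames-1);
y_px = 200 * ones(1, n_frames);
heading_deg = 3 * (0:n_frames-1);      % already unwrapped

% Pixels to millimetres.
x_mm = x_px / PPM;
y_mm = y_px / PPM;

% Three-point central difference: speed at frame k uses frames k-1 and k+1.
vel = nan(1, n_frames);
dx = x_mm(3:end) - x_mm(1:end-2);
dy = y_mm(3:end) - y_mm(1:end-2);
vel(2:end-1) = sqrt(dx.^2 + dy.^2) * FPS / 2;    % mm/s

% Least-squares line fit to heading in a sliding window. The slope
% (deg/frame) multiplied by FPS gives angular velocity in deg/s.
% Window placement around frame k is an assumption of this sketch.
av = nan(1, n_frames);
half = WIN / 2;
for k = (half + 1):(n_frames - half + 1)
    idx = (k - half):(k + half - 1);             % WIN consecutive frames
    coeffs = polyfit(idx, heading_deg(idx), 1);
    av(k) = coeffs(1) * FPS;
end
```

Frames near the start and end of the track, where the window or the central difference would run off the data, are left as NaN here.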
Tree structure of processing functions
Functions in red are used for processing the data. Functions in blue are used for plotting the data.
- process_freely_walking_data
- process_data_features
- combine_data_one_cohort
- make_overview
- plot_all_features_filt
- plot_all_features_acclim
- comb_data_one_cohort_cond
- plot_allcond_onecohort_tuning
- plot_errorbar_tuning_curve_diff_contrasts
- plot_errorbar_tuning_diff_speeds
- generate_circ_stim_ufmf
- create_stim_video_loop
- process_data_features
- process_screen_data
- comb_data_across_cohorts_cond
- generate_exp_data_struct
- plot_allcond_acrossgroups_tuning
Level 1 — analyse per cohort: process_freely_walking_data
Inputs
Takes a date string (format 'YYYY_MM_DD') and processes all experiments conducted on that day, regardless of protocol.
Runs the function process_data_features per cohort and experiment.
Outputs
- Exports a text file of the number of flies run per protocol and per strain.
- Results `.mat` file per vial containing: LOG, feat, trx, comb_data, n_fly_data
- Figures:
  - Acclimation timeseries
  - Full-experiment timeseries overview
  - Timeseries per behavioural metric per vial
Description of process_data_features
Processes the tracked data from FlyTracker. Loads LOG, feat, and trx from each experiment folder.
Saves in the results file *_data.mat:
- `LOG`: original experiment metadata
- `feat`: FlyTracker features with poorly tracked flies removed
- `trx`: FlyTracker trajectories with poorly tracked flies removed
- `comb_data`: combined behavioural metrics for all flies across the entire experiment
- `n_fly_data`: `[3 x 1]` array of [n_flies_in_arena, n_flies_tracked, n_flies_removed]
The function proceeds through four steps:
1. Combine the tracking data for all flies within one vial across the entire experiment
The function combine_data_one_cohort combines data from all flies within a single experiment into the comb_data struct. Each field (e.g. fv_data) contains a [n_flies x n_frames] array. The data is not parsed by condition at this stage.
Tracking quality is checked first by check_tracking_FlyTrk, which compares the frame count for each tracked object against the mode. Flies with a different frame count are removed — this catches cases where tracking was split across multiple identities or non-fly objects were tracked.
Data extracted directly from FlyTracker output:
- Distance from the arena edge (from `feat`)
- Heading angle (from `trx`)
- X and Y position (from `trx`)
Data computed from these:
- Angular velocity: least-squares line fit to heading over a 16-frame window (`vel_estimate` with `method = 'line_fit'`)
- Forward velocity: position derivative projected onto heading direction, smoothed with Gaussian convolution. Negative values and values exceeding 50 mm/s are set to NaN and filled with linear interpolation.
- Three-point velocity: total speed from central difference of position (`calculate_three_point_velocity`)
- Turning rate: `av_data / fv_data` (degrees per millimetre)
- Viewing distance: distance from the fly to the arena wall along its heading direction, computed via ray-circle intersection (`calculate_viewing_distance`). A ray is cast from the fly's position along its heading and the intersection with the arena circle (centre [126.6, 124.7] mm, radius 119.0 mm) is found by solving the resulting quadratic equation.
- Inter-fly distance and angle: distance and angle to the nearest other fly (`calculate_distance_to_nearest_fly`)
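
The ray-circle intersection behind the viewing distance can be sketched as follows. With the fly at position p, unit heading vector d, arena centre c and radius R, points on the ray are p + t·d, and substituting into the circle equation gives t² + 2(f·d)t + (f·f − R²) = 0 with f = p − c. For a fly inside the arena the constant term is negative, so there is exactly one positive root. This is a minimal sketch, not the code of `calculate_viewing_distance`; the fly position and the heading convention (0° along +x, angles counter-clockwise) are assumptions.

```matlab
% Arena geometry quoted in the text (mm).
centre = [126.6, 124.7];
radius = 119.0;

% Hypothetical fly state: 50 mm to the +x side of the centre,
% heading 0 deg (assumed to point along +x).
pos = centre + [50, 0];
heading_deg = 0;

% Unit direction vector of the ray.
d = [cosd(heading_deg), sind(heading_deg)];

% Quadratic t^2 + 2*b*t + c = 0 for the ray parameter t.
f = pos - centre;
b = dot(f, d);
c = dot(f, f) - radius^2;      % negative while the fly is inside the arena
disc = b^2 - c;                % therefore always positive inside the arena

% The positive root is the distance to the wall along the heading.
view_dist = -b + sqrt(disc);
```

For this example the fly is 50 mm from the centre and looking directly away from it, so the viewing distance is 119 − 50 = 69 mm.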
2. Create overview plots of behaviour during the entire experiment
- `make_overview`: histogram subplots of general locomotion metrics (forward velocity, angular velocity, turning rate distributions) over the full protocol.
- `plot_all_features_filt`: timeseries for all flies over the full protocol, showing forward velocity, angular velocity, turning rate, and distance from arena centre. Coloured background rectangles indicate when each stimulus condition occurred.
- `plot_all_features_acclim`: timeseries during the 5-minute dark acclimation period only, showing forward velocity, angular velocity, turning rate, and both absolute and relative distance from centre. The plotted range is defined as `acclim_end = LOG.acclim_off1.stop_f; range_of_data_to_plot = 1:acclim_end;`.
3. Parse the behavioural data based on conditions
The function comb_data_one_cohort_cond organises the combined data into the nested DATA structure, with fields for each condition (e.g. R1_condition_1, R2_condition_1) and each behavioural metric within those conditions.
4. Plot the condition-parsed data
plot_allcond_onecohort_tuning generates a [(n_conditions/2) x 2] subplot figure showing mean ± SEM timeseries during each condition for all flies in the vial.
Explanation of the different functions used to combine data
combine_data_one_cohort
`[comb_data, feat, trx] = combine_data_one_cohort(feat, trx)`

Combines data from all flies within a single experiment into the comb_data struct. Each field contains a [n_flies x n_frames] array. This function checks for bad tracking, filters high-velocity frames (> 50 mm/s) as tracking errors (setting them to NaN), and fills missing values using spline interpolation for position/distance data and previous-value interpolation for heading. The processed data is saved to the results file and used for all downstream analyses. The original data is never altered.
comb_data_one_cohort_cond
Both comb_data_one_cohort_cond and comb_data_across_cohorts_cond create the nested DATA structure based on experimental conditions. The single-cohort version is only used within process_data_features to create the DATA struct for the per-vial overview timeseries plots.
comb_data_across_cohorts_cond
Used within process_screen_data to combine data from all flies across multiple cohorts. The resulting DATA struct is organised hierarchically:
DATA.(strain).(sex)(cohort_idx).(condition).(data_type)
For example: DATA.jfrc100_es_shibire_kir.F(1).R1_condition_1.fv_data returns a [n_flies x n_frames] array. This function requires that the protocol saves condition numbers to the LOG file — older protocols that do not include this information cannot be processed with this function.
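
A short sketch of how the hierarchy can be walked with dynamic field names is given below. The toy `DATA` struct is filled with random placeholder values, and the cohort sizes and frame count are invented; the pooling loop also assumes that every cohort has the same number of frames for the chosen condition.

```matlab
% Toy DATA struct with the same nesting as described above.
% All values are random placeholders, not real measurements.
DATA = struct();
DATA.jfrc100_es_shibire_kir.F(1).R1_condition_1.fv_data = rand(15, 900);
DATA.jfrc100_es_shibire_kir.F(2).R1_condition_1.fv_data = rand(12, 900);

strain = 'jfrc100_es_shibire_kir';
sex = 'F';
cond_name = 'R1_condition_1';

% Pool flies from every cohort of this strain and sex for one condition.
cohorts = DATA.(strain).(sex);
pooled = [];
for c = 1:numel(cohorts)
    pooled = [pooled; cohorts(c).(cond_name).fv_data]; %#ok<AGROW>
end

% NaN-tolerant mean across flies at each frame.
valid = ~isnan(pooled);
pooled_zeroed = pooled;
pooled_zeroed(~valid) = 0;
mean_fv = sum(pooled_zeroed, 1) ./ sum(valid, 1);
```

Here `pooled` is a [27 x 900] array (15 + 12 flies) and `mean_fv` is the per-frame mean across all of them.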
Level 2 — analyse across cohorts: process_screen_data
This function uses the .mat results files generated by process_freely_walking_data to combine data across all cohorts.
- Runs `comb_data_across_cohorts_cond` to generate the hierarchical `DATA` struct across all strains and cohorts.
- Runs `plot_allcond_acrossgroups_tuning` to create `[(n_conditions/2) x 2]` subplot figures for each strain versus the empty-split control flies. It creates 5 figures per strain, one for each data type: `fv_data`, `av_data`, `curv_data`, `dist_data`, `dist_data_delta`.
Inputs
- String of the protocol e.g. `'protocol_27'`
- `.mat` results files from `process_data_features`
Outputs
- 5 figures per strain (timeseries per condition vs empty-split controls)
- Text file and 2 plots of the number of vials per strain and the number of flies per strain
Level 3 — Statistical analysis: make_summary_heat_maps_p27
This function generates a red-blue heatmap of p-values comparing each strain to the empty-split control across all conditions and behavioural metrics.
The statistical pipeline:
- Combines all data for `protocol_27` using `comb_data_across_cohorts_cond`.
- Computes p-values for each strain × condition × metric comparison using `make_pvalue_heatmap_across_strains` (Wilcoxon rank-sum test).
- Applies a False Discovery Rate (FDR) correction using `fdr_bh` with an alpha threshold of 0.001 and the dependent assumption (`'dep'` method).
- Plots the corrected p-values as a heatmap using `plot_pval_heatmap_strains`. Red indicates the test strain has a significantly higher value than the control; blue indicates a significantly lower value.
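
To make the correction step concrete, the sketch below applies a hand-written Benjamini-Yekutieli step-up procedure, the standard FDR correction under arbitrary dependence, to a handful of made-up p-values. It is an assumption that this matches what the `'dep'` option of `fdr_bh` does internally, so treat it as an illustration of the idea and not a replacement for that function. The raw p-values would normally come from the rank-sum comparisons.

```matlab
% Made-up raw p-values, one per strain x condition x metric comparison.
p = [0.00001, 0.0004, 0.03, 0.2, 0.00002, 0.5];
q = 0.001;                        % alpha threshold quoted above

m = numel(p);
[p_sorted, order] = sort(p);      % ascending

% Benjamini-Yekutieli factor for arbitrary dependence: sum of 1/i.
c_m = sum(1 ./ (1:m));

% Step-up critical value for each rank.
thresh = (1:m) / (m * c_m) * q;

% Find the largest rank whose p-value is at or below its critical value.
% Every comparison up to and including that rank is declared significant.
k = find(p_sorted <= thresh, 1, 'last');
significant = false(1, m);
if ~isempty(k)
    significant(order(1:k)) = true;
end
```

With these six values only the first and fifth comparisons (p = 0.00001 and p = 0.00002) survive the correction.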
Processing of other protocols
Several analysis scripts handle data from protocols other than the main screen protocol:
| Protocol | Script | Analysis |
|---|---|---|
| `protocol_30` | `p30_different_contrasts_analysis.m` | Contrast tuning curves: compares optomotor responses across different contrast levels |
| `protocol_31` | `p31_different_speeds_analysis.m` | Speed tuning curves: compares responses across 4 speeds (32, 64, 127 px/s) at two spatial frequencies |
| `protocol_25` | `p25_single_lady_analysis.m` | Single-fly analysis: tests individual fly behaviour in isolation |
| `protocol_33/34` | `analyse_p33_p34.m` | Eye-painted fly experiments |
| `protocol_35` | `analyse_p35_shiftedCoR.m` | Shifted centre of rotation experiments |
Additional analysis scripts for phototaxis (analyse_phototaxis.m) and viewing distance (analyse_viewing_distance.m) can be applied to data from any protocol.
Processing details
Tracking quality control
Before computing behavioural metrics, the function check_tracking_FlyTrk removes badly tracked flies. FlyTracker sometimes produces artifacts — a single fly split into two tracks, or a non-fly object (dust, shadow) tracked. The function detects these by comparing the frame count for each tracked object against the mode (the most common frame count, which corresponds to the true video length). Any fly whose frame count differs from the mode is removed from both trx and feat.data.
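
The mode-based check can be sketched in a few lines. The frame counts below are invented, and the variable names are hypothetical; in the real pipeline the same logical mask would be applied to the rows of `trx` and `feat.data`.

```matlab
% Toy frame counts for five tracked objects (hypothetical values).
% Objects 3 and 5 have short tracks, as if one fly had been split in two.
n_frames_per_track = [54000, 54000, 31000, 54000, 23000];

% The most common frame count is taken as the true video length.
expected = mode(n_frames_per_track);

% Keep only tracks that span the whole video.
keep = (n_frames_per_track == expected);

track_ids = 1:numel(n_frames_per_track);
kept_ids = track_ids(keep);       % tracks retained for analysis
n_removed = sum(~keep);           % count of discarded tracks
```

Note that a fly split into two identities is dropped entirely by this rule, since neither fragment reaches the full frame count.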
High-velocity filtering
For each fly, frames where the FlyTracker-reported velocity exceeds 50 mm/s are marked as tracking errors. This threshold is well above the typical maximum walking speed of Drosophila (~30 mm/s) and reliably catches tracking jumps. At these frames, position, heading, and distance data are set to NaN for interpolation.
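
A minimal sketch of this masking step, using invented values for two flies over six frames, is shown below. The variable `vel` stands in for the FlyTracker-reported velocity and is an assumption of the example.

```matlab
MAX_VEL = 50;   % mm/s threshold from the text

% Toy data for 2 flies x 6 frames (values invented for illustration).
vel          = [5 8 120 7 6 5;  10 12 11 9 75 10];
x_data       = [1 2 30 4 5 6;   1 2 3 4 40 6];
heading_data = [0 5 170 15 20 25;  90 91 92 93 10 95];

% Frames above the threshold are treated as tracking errors.
bad = vel > MAX_VEL;

% Blank the affected samples so that interpolation can fill them later.
x_data(bad) = NaN;
heading_data(bad) = NaN;
```

Because `bad` is a logical array of the same [n_flies x n_frames] shape, the same mask can be reused for every metric that needs blanking.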
Interpolation methods
Missing values from filtering are filled using method-appropriate interpolation:
| Data | Method | Reason |
|---|---|---|
| Distance to wall (`d_wall_data`) | Spline | Smooth, continuous changes in distance |
| Heading angle (`heading_data`) | Previous value | Avoids introducing artificial heading jumps |
| X and Y position | Spline | Smooth position trajectories |
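
The two fill strategies in the table can be sketched with base functions only. The example below uses `interp1` with the `'spline'` method for position and a simple carry-forward loop for heading; which functions the pipeline itself calls is not stated here, so this is an illustration of the two methods and not the pipeline's code. In recent MATLAB releases, `fillmissing(x, 'spline')` and `fillmissing(heading, 'previous')` perform equivalent fills.

```matlab
% One fly, with gaps (NaN) left by the velocity filter; values invented.
x       = [1 2 NaN 4 5 NaN NaN 8];
heading = [10 12 NaN 15 NaN NaN 20 22];

% Spline fill for position: interpolate the gaps from the valid samples.
t = 1:numel(x);
ok = ~isnan(x);
x_filled = x;
x_filled(~ok) = interp1(t(ok), x(ok), t(~ok), 'spline');

% Previous-value fill for heading: carry the last valid sample forward.
% A NaN in the very first frame has no predecessor and stays NaN.
heading_filled = heading;
for k = 2:numel(heading_filled)
    if isnan(heading_filled(k))
        heading_filled(k) = heading_filled(k - 1);
    end
end
```

Holding the previous heading avoids interpolating across the ±180° boundary, where a smooth fit between two valid samples could pass through angles the fly never faced.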
Condition splitting
The function comb_data_one_cohort_cond slices continuous data into per-condition segments using LOG frame indices. For each condition, the data slice runs from start_f(1) - 300 to stop_f(end), where the 300-frame (10-second) pre-buffer captures baseline behaviour before stimulus onset. Conditions are named R1_condition_N or R2_condition_N for repetitions 1 and 2 respectively, plus acclim_off1, acclim_patt, and acclim_off2 for the acclimation phases.
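
The slicing rule can be illustrated as follows. The `log_entry` struct, its frame numbers, and the toy `comb_data` are hypothetical; only the `start_f(1) - 300` to `stop_f(end)` rule is taken from the description above.

```matlab
PRE_BUFFER = 300;   % frames of baseline before stimulus onset (10 s at 30 fps)

% Toy continuous data: 3 flies x 5000 frames, each value equal to its
% frame number so that the slice boundaries are easy to verify.
n_flies = 3;
n_frames = 5000;
comb_data = struct();
comb_data.fv_data = repmat(1:n_frames, n_flies, 1);

% Hypothetical LOG entry for one condition with two stimulus phases,
% each with its own start and stop frame.
log_entry = struct();
log_entry.start_f = [1000, 1450];
log_entry.stop_f  = [1449, 1900];

% Slice from 300 frames before the first start to the last stop.
first_frame = log_entry.start_f(1) - PRE_BUFFER;
last_frame  = log_entry.stop_f(end);

cond_data = struct();
cond_data.R1_condition_1.fv_data = comb_data.fv_data(:, first_frame:last_frame);
```

The resulting segment runs from frame 700 to frame 1900, so stimulus onset sits at column 301 of every sliced array.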
The training guide PDF (docs/training_guide/training_guide.pdf in the freely-walking-optomotor repository) provides detailed walkthroughs of every processing function, including the mathematical derivations for angular velocity, forward velocity, viewing distance, and inter-fly distance computations.