Cheatsheet: “Statistical analysis”
Cytoarchitectural Analysis:
Generate ventricle normals:
scout cyto mesh segment_ventricles.tif voxel_size.csv mesh_ventricles.pkl -d 1 6 6 -g 2 -s 2 -v
Compute radial cell profiles:
scout cyto profiles mesh_ventricles.pkl centroids_um.npy nuclei_gating.npy cyto_profiles.npy -v -p
Setting up analysis folders for group comparison:
mkdir analysis/Arlotta_d34_vs_Lancaster_d35
cd Arlotta_d34_vs_Lancaster_d35
scout multiscale select ../../datasets/summary.csv analysis.csv Arlotta_d34 Lancaster_d35 -v
scout multiscale setup analysis.csv ../../datasets/ -v
Folder structure should look like:
datasets/
| 20190419_14_35_07_AA_org1_488LP13_561LP120_642LP60/
| 20190419_15_50_16_AA_org2_488LP13_561LP120_642LP60/
| 20190509_16_55_31_AA-orgs5.8.19_org1_488LP15_561LP140_642LP50/
| 20190531_14_31_36_AA_org2_488LP13_561LP140_642LP60/
| .,, possibly more datasets
| summary.csv
analysis/analysis_folder_name/
| analysis.csv
| Group1/
| 20190419_14_35_07_AA_org1_488LP13_561LP120_642LP60/
| | dataset -> ../../datasets/20190419_14_35_07_AA_org1_488LP13_561LP120_642LP60/
| 20190419_15_50_16_AA_org2_488LP13_561LP120_642LP60/
| | dataset -> ../../datasets/20190419_15_50_16_AA_org2_488LP13_561LP120_642LP60/
| Group2/
| 20190509_16_55_31_AA-orgs5.8.19_org1_488LP15_561LP140_642LP50/
| | dataset -> ../../datasets/20190509_16_55_31_AA-orgs5.8.19_org1_488LP15_561LP140_642LP50/
| 20190531_14_31_36_AA_org2_488LP13_561LP140_642LP60/
| | dataset -> ../../datasets/20190531_14_31_36_AA_org2_488LP13_561LP140_642LP60/
Clustering sampled profiles: Run this in each individual organoid folders:
cd analysis/analysis_folder_name/Group1/2019...(organoid_folder_name)/dataset(symlink)
scout cyto sample 5000 cyto_sample_index.npy -i cyto_profiles.npy -o cyto_profiles_sample.npy -v
cd ../..
scout cyto sample 5000 cyto_sample_index.npy -i cyto_profiles.npy -o cyto_profiles_sample.npy -v (repeat this for all organoid folders)
Next, combine all sampled profiles with this command:
scout cyto combine analysis.csv -o cyto_profiles_combined.npy -s cyto_profiles_combined_samples.npy -v
analysis/analysis_folder_name/
| analysis.csv
| cyto_profiles_combined.npy
| cyto_profiles_combined_sample.npy
| Group1/
| 20190419_14_35_07_AA_org1_488LP13_561LP120_642LP60/
| | dataset -> ../../datasets/20190419_14_35_07_AA_org1_488LP13_561LP120_642LP60/
| 20190419_15_50_16_AA_org2_488LP13_561LP120_642LP60/
| | dataset -> ../../datasets/20190419_15_50_16_AA_org2_488LP13_561LP120_642LP60/
| Group2/
| 20190509_16_55_31_AA-orgs5.8.19_org1_488LP15_561LP140_642LP50/
| | dataset -> ../../datasets/20190509_16_55_31_AA-orgs5.8.19_org1_488LP15_561LP140_642LP50/
| 20190531_14_31_36_AA_org2_488LP13_561LP140_642LP60/
| | dataset -> ../../datasets/20190531_14_31_36_AA_org2_488LP13_561LP140_642LP60/
Perform cytoarchitecture clustering and visualization using Jupyter notebook “determine cyto clusters.ipynb”.
Next, inspect the images to figure out right names for clusters.
Cytoarchitecture clusters naming:
scout cyto name name1 name2 (...) -o cyto_names.csv -v
Next Step:
scp -r cyto_names.csv /Group1/each_organoid_folder
Classifying cytoarchitectures: Use “determine cyto clusters.ipynb” to get a umap model to use the command below:
cd analysis/analysis_folder_name/Group1/2019...(organoid_folder_name)
scout cyto classify ../../cyto_profiles_combined.npy ../../cyto_labels_combined.npy dataset/cyto_profiles.npy cyto_labels.npy -v --umap ../../model_Group1_and_Group.umap
Exporting OBJ and CSV (3D rendering with Blender 2.8):
Look into the Jupyter notebook “Export mesh and points as OBJ”. Import OBJ into Blender.
Blender script:
In Blender, the following script creates a new material for each unique cytoarchitecture and assigns each face in the ventricle mesh to the corresponding material.
import bpy
import csv
# Path to face labels
labels_csv = 'face_labels.csv'
def read_csv(path):
with open(path, mode='r') as f:
line = f.readline().split('\n')[0]
return line.split(',')
# Load face labels
labels = read_csv(labels_csv)
classes = list(set(labels))
classes.sort()
n_classes = len(classes)
print(f'Read {len(labels)} face labels belonging to {n_classes} classes')
# Make materials for each class
context = bpy.context
obj = context.object
mesh = obj.data
existing_material_names = [m.name for m in mesh.materials]
class_material_names = []
class_material_index = []
for i in range(n_classes):
material_name = f'class {i} material'
class_material_names.append(material_name)
if material_name in existing_material_names:
class_material_index.append(existing_material_names.index(material_name))
else:
class_material_index.append(len(mesh.materials))
mesh.materials.append(bpy.data.materials.new(material_name))
label_to_index = dict(zip(range(n_classes), class_material_index))
# Assign faces to materials based on labels
for f, lbl in zip(mesh.polygons, labels): # iterate over faces
print(lbl)
f.material_index = label_to_index[int(lbl)]
print("face", f.index, "material_index", f.material_index)
slot = obj.material_slots[f.material_index]
mat = slot.material
if mat is not None:
print(mat.name)
print(mat.diffuse_color)
else:
print("No mat in slot", f.material_index)
Multiscale statistical analysis:
Input files needed for the command to work are: centroids_um.npy and cyto_labels.npy
“.” should specify that these files be inputted.
cd analysis/analysis_folder_name/Group1/
scout multiscale features organoid_folder_name(usually 2019...)/. -d 1 6 6 -v
If this worked you should see organoid_features.xlsx file in the organoid folder.
Combine each organoid_features.xlsx from each organoid folder into a cumulative combined_features.xlsx file using:
scout multiscale combine analysis.csv --output combined_features.xlsx -v
Statistical testing:
Use the notebook “T-tests and volcano plots.ipynb” for statistical tests on the combined features.