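SAW command parameters
Run saw --help | -h on the command line to view the available pipelines and their parameter settings, and run saw --version to check the software version.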
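The version check can also be scripted, for example at the start of a batch job. The following is a minimal Python sketch, assuming the executable is on PATH under the name saw. The helper name saw_version is hypothetical, and nothing is assumed about the text that saw --version prints.

```python
import shutil
import subprocess


def saw_version(executable="saw"):
    """Return the raw text printed by `<executable> --version`, or None if not found."""
    path = shutil.which(executable)
    if path is None:
        # The executable is not on PATH, so there is nothing to query.
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some tools print the version to stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    version = saw_version()
    print(version if version is not None else "saw was not found on PATH")
```

Returning None instead of raising lets a calling script decide whether a missing installation is fatal.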
SAW count
Converts spatial transcriptomics (Stereo-seq) sequencing data into a spatial feature expression matrix.
Usage: saw count [Parameters] --id <ID> --sn <SN> --omics <OMICS> --kit-version <TEXT> --sequencing-type <TEXT> --reference <PATH> --image <IMG> --fastqs <PATH>
saw count -h | --help
Parameter | Description |
---|---|
--id <ID> | (Optional, default to None) A unique task id ([a-zA-Z0-9_-]+) which will be displayed as the output folder name and the title of HTML report. If the parameter is absent, --sn will play the same role. |
--sn <SN> | (Required, default to None) SN (serial number) of the Stereo-seq chip. |
--omics <OMICS> | (Required, default to "transcriptomics") Omics information. "transcriptomics,proteomics" for Stereo-CITE analysis. |
--kit-version <TEXT> | (Required, default to None) The version of the product kit. More in count pipeline introduction. |
--sequencing-type <TEXT> | (Required, default to None) Sequencing type of FASTQs which is recorded in the sequencing report. |
--chip-mask <MASK> | (Required, default to None) Stereo-seq chip mask file. |
--organism <TEXT> | (Optional, default to None) Organism type of sample, usually referring to species. |
--tissue <TEXT> | (Optional, default to None) Physiological tissue of sample. |
--reference <PATH> | (Optional, default to None) Path to the reference folder, containing SAW-compatible index files and GTF/GFF, built by SAW makeRef . |
--ref-libraries <CSV> | (Optional, default to None) Path to a ref_libraries.csv which declares reference indexes, built by SAW makeRef . Not compatible with --reference . |
--fastqs <PATH> | (Required, default to None) Path(s) to folder(s), containing all needed FASTQs. If FASTQs are stored in multiple directories, use it as: --fastqs=/path/to/directory1,/path/to/directory2,... . Notice that all FASTQ files under these directories will be loaded for analysis. |
--adt-fastqs <PATH> | (Optional, default to None) Path(s) to folder(s), containing all ADT FASTQs. If FASTQs are stored in multiple directories, use it as: --adt-fastqs=/path/to/directory1,/path/to/directory2,... . Notice that all FASTQ files under these directories will be loaded for analysis. Also, use --fastqs to specify all gene expression FASTQs. |
--microorganism-detect | (Optional, default to None) Whether to perform analysis related to microorganisms. Notice that the detection only works for FFPE assay currently. |
--uniquely-mapped-only | (Optional, default to None) Only annotate on uniquely mapped reads during read annotation. |
--rRNA-remove | (Optional, default to None) Whether to remove rRNA. Before turning the switch on, make sure that the necessary rRNA information has been added to FASTA, using SAW makeRef . |
--clean-reads-fastq | (Optional, default to None) Whether to output the Clean Reads (before RNA alignment) in FASTQ format, which have undergone CID mapping, RNA filtering, and MID filtering. |
--unmapped-STAR-fastq | (Optional, default to None) Whether to output unmapped reads in FASTQ format. |
--unmapped-fastq | (Optional, default to None) Whether to output unmapped reads in FASTQ format (not including "too many loci" reads from STAR). |
--image <TIFF> | (Optional, default to None) TIFF image for QC (quality control), combined with expression matrix for analysis. Naming rule for the input TIFF: a. <SN>_<stain_type>.tif b. <SN>_<stain_type>.tiff c. <SN>_<stain_type>.TIF d. <SN>_<stain_type>.TIFF. <stain_type> is one of: a. ssDNA b. DAPI c. HE (referring to H&E) d. <IF_name1>_IF, <IF_name2>_IF, ... |
--image-tar <TAR> | (Optional, default to None) Compressed image .tar.gz file from StereoMap that has already been through QC (quality control). |
--output <PATH> | (Optional, default to None) Set a specific output directory for the run. |
--threads-num <NUM> | (Optional, default to 8) Allowed local cores to run the pipeline. |
--memory <NUM> | (Optional, default to detected) Allowed local memory to run the pipeline. |
--gpu-id <NUM> | (Optional, default to -1) Set GPU id, according to GPU resources in the computing environment. Default to -1 , which means running the pipeline using the CPU. |
-h, --help | (Optional, default to None) Print help information. |
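Several rules in the table above can be checked before a long run is submitted: the task id pattern, the incompatibility of --reference and --ref-libraries, the comma-separated form of --fastqs, and the TIFF naming rule. The sketch below assembles a saw count argument list under those rules. It is an illustration, not part of SAW: the function names, the sample SN "SN001", and all paths and kit/sequencing strings in the usage note are hypothetical placeholders.

```python
import re
from pathlib import PurePath

TASK_ID = re.compile(r"[a-zA-Z0-9_-]+")            # pattern quoted for --id
STAIN_TYPE = r"(?:ssDNA|DAPI|HE|.+_IF)"            # stain types listed for --image
TIFF_EXT = r"\.(?:tif|tiff|TIF|TIFF)"              # the four accepted extensions


def image_name_ok(sn, image_path):
    """Check a file name against the <SN>_<stain_type>.<ext> rule."""
    name = PurePath(image_path).name
    return re.fullmatch(re.escape(sn) + "_" + STAIN_TYPE + TIFF_EXT, name) is not None


def build_count_command(sn, omics, kit_version, sequencing_type, chip_mask, fastqs,
                        task_id=None, reference=None, ref_libraries=None, image=None):
    """Return an argument list for `saw count`; raise ValueError on a rule violation."""
    if task_id is not None and not TASK_ID.fullmatch(task_id):
        raise ValueError("--id must match [a-zA-Z0-9_-]+")
    if reference is not None and ref_libraries is not None:
        raise ValueError("--reference and --ref-libraries are not compatible")
    if image is not None and not image_name_ok(sn, image):
        raise ValueError("image name must look like <SN>_<stain_type>.tif")

    cmd = ["saw", "count"]
    if task_id is not None:
        cmd += ["--id", task_id]
    cmd += ["--sn", sn, "--omics", omics, "--kit-version", kit_version,
            "--sequencing-type", sequencing_type, "--chip-mask", chip_mask]
    if reference is not None:
        cmd += ["--reference", reference]
    if ref_libraries is not None:
        cmd += ["--ref-libraries", ref_libraries]
    # Multiple FASTQ directories are joined with commas, as the table describes.
    cmd.append("--fastqs=" + ",".join(fastqs))
    if image is not None:
        cmd += ["--image", image]
    return cmd
```

A call such as build_count_command(sn="SN001", omics="transcriptomics", kit_version="<kit version>", sequencing_type="<sequencing type>", chip_mask="/data/SN001.mask", fastqs=["/data/fq1", "/data/fq2"], reference="/data/ref") returns a list that can be passed to subprocess.run(cmd, check=True) or printed with shlex.join(cmd) for review.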
SAW makeRef
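Builds reference genome index files for SAW count analysis. It takes a GTF/GFF annotation file and a FASTA genome file as input; a FASTA file with rRNA information can optionally be added.
Usage: saw makeRef [Parameters] --mode <MODE> --fasta <FASTA> --gtf <GTF/GFF> --genome <PATH>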
saw makeRef -h | --help
Parameter | Description |
---|---|
--mode <MODE> | (Required, default to "STAR") Set the mode to build index files, used for the alignment. There are three modes, including STAR, Bowtie2 and Kraken2 for specific analysis scenarios. |
--fasta <FASTA> | (Optional, default to None) Path to FASTA, to build index files. If multiple FASTAs are given, they are merged in input order before the index is built. |
--rRNA-fasta <FASTA> | (Optional, default to None) Path to rRNA FASTA that will be added to --fasta file, with the elimination of redundant rRNA fragments. |
--gtf <GTF/GFF> | (Optional, default to None) Path to input GTF/GFF to build index files. |
--basename <TEXT> | (Optional, default to "host") Basename for Bowtie2 index files when --mode is set to Bowtie2. If not specified, "host" will be used, which indicates that the index is used to remove host information in the next step. |
--database <DATABASE> | (Optional, default to None) Path to Kraken2 reference database. When this parameter is used, the output index files are saved at the same directory level. |
--genome <PATH> | (Optional, default to detected) Path to the output reference genome with index information. |
--params-csv <CSV> | (Optional, default to detected) Path to CSV file, recording detailed parameters to build Bowtie2/Kraken2 index. It works when --mode is set to Bowtie2/Kraken2. More in the makeRef tutorial. |
--threads-num <INT> | (Optional, default to 8) Set the number of threads to use. |
-h, --help | (Optional, default to None) Print help information. |
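As a sketch of how the makeRef options fit together, the helper below assembles an argument list and rejects a --mode value outside the three modes named in the table. The function name and all paths are hypothetical; only flags documented above are emitted.

```python
MODES = ("STAR", "Bowtie2", "Kraken2")  # the three modes named for --mode


def build_makeref_command(mode="STAR", fasta=None, gtf=None, genome=None,
                          rrna_fasta=None, threads=None):
    """Return an argument list for `saw makeRef` using only documented flags."""
    if mode not in MODES:
        raise ValueError("mode must be one of: " + ", ".join(MODES))
    cmd = ["saw", "makeRef", "--mode", mode]
    optional = (
        ("--fasta", fasta),
        ("--rRNA-fasta", rrna_fasta),
        ("--gtf", gtf),
        ("--genome", genome),
        ("--threads-num", threads),
    )
    for flag, value in optional:
        # Flags left as None are omitted so that SAW applies its own defaults.
        if value is not None:
            cmd += [flag, str(value)]
    return cmd
```

For a STAR index this yields, for example, ["saw", "makeRef", "--mode", "STAR", "--fasta", "/ref/genome.fa", "--gtf", "/ref/genes.gtf", "--genome", "/ref/out"], where the three paths are placeholders.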
SAW checkGTF
Checks whether a GTF/GFF annotation file is in the standard format. It can also extract specific annotation information from the GTF/GFF.
Usage: saw checkGTF [Parameters] --input-gtf <GTF/GFF> --attribute <key:value> --output-gtf <GTF/GFF>
saw checkGTF -h | --help
Parameter | Description |
---|---|
--input-gtf <GTF/GFF> | (Required, default to None) Path to input GTF/GFF, for a necessary format check. |
--attribute <key:value> | (Optional, default to None) Extract specific annotation information from GTF/GFF. Input as <gene_biotype:protein_coding>. |
--output-gtf <GTF/GFF> | (Required, default to None) Path to output GTF/GFF after a necessary check, or additional filtration when performing --attribute . |
-h, --help | (Optional, default to None) Print help information. |
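To make the key:value form of --attribute concrete, the sketch below splits such a string and tests whether one GTF record carries that attribute. It only illustrates the idea of attribute-based extraction and is not how SAW implements checkGTF. It assumes GTF-style attributes (key "value";) in the ninth tab-separated column and does not handle the GFF3 key=value syntax; both function names are hypothetical.

```python
import re


def parse_attribute_filter(text):
    """Split a 'key:value' string such as 'gene_biotype:protein_coding'."""
    key, sep, value = text.partition(":")
    if not sep or not key or not value:
        raise ValueError("expected the form key:value")
    return key, value


def gtf_line_matches(line, key, value):
    """Return True if a GTF record has the attribute key set to value."""
    if line.startswith("#"):
        return False  # comment or header line
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return False  # not a well-formed 9-column record
    # Collect pairs written as: key "value";
    attributes = dict(re.findall(r'(\S+) "([^"]*)"', fields[8]))
    return attributes.get(key) == value


if __name__ == "__main__":
    key, value = parse_attribute_filter("gene_biotype:protein_coding")
    record = 'chr1\tsrc\tgene\t100\t200\t.\t+\t.\tgene_id "G1"; gene_biotype "protein_coding";'
    print(gtf_line_matches(record, key, value))
```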
SAW realign
Takes the image file produced by manual processing in StereoMap and restarts the analysis pipeline.
Usage: saw realign [Parameters] --id <ID> --sn <SN> --count-data <PATH> --realigned-image-tar <TAR>
saw realign -h | --help
Parameter | Description |
---|---|
--id <ID> | (Optional, default to None) A unique task id ([a-zA-Z0-9_-]+) which will be displayed as the output folder name and the title of HTML report. If the parameter is absent, --sn will play the same role. |
--sn <SN> | (Required, default to None) SN (serial number) of the Stereo-seq chip. |
--count-data <PATH> | (Required, default to None) Output folder of the corresponding SAW count result, which mainly contains the expression matrices and other related datasets. |
--realigned-image-tar <TAR> | (Required, default to None) Compressed image file from StereoMap, which has been manually processed, including stitching, tissue segmentation, cell segmentation, calibration and registration. |
--lasso-geojson <GEOJSON> | (Optional, default to None) Lasso GeoJSON from StereoMap is used for tissue segmentation when the analysis is without images. It is incompatible with --realigned-image-tar . |
--adjusted-distance <INT> | (Optional, default to 10) Outspread distance based on the cellular contour of the cell segmentation image, in pixels. Default to 10. If --adjusted-distance=0 , the pipeline will not expand the cell border. |
--no-matrix | (Optional, default to None) Whether to output feature expression matrices. |
--no-report | (Optional, default to None) Whether to output HTML report. |
--output <PATH> | (Optional, default to None) Set a specific output directory for the run. |
--threads-num <NUM> | (Optional, default to 8) Set the number of threads to use. |
--gpu-id <NUM> | (Optional, default to -1) Set GPU id, according to GPU resources in the computing environment. Default to -1 , which means running the pipeline using the CPU. |
-h, --help | (Optional, default to None) Print help information. |
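The table above states that --lasso-geojson is incompatible with --realigned-image-tar and that --adjusted-distance=0 disables cell border expansion. The sketch below encodes those two rules when assembling a saw realign argument list. Requiring exactly one of the two inputs is this sketch's reading of the table rather than a documented rule, and the function name, SN, and paths are hypothetical.

```python
def build_realign_command(sn, count_data, realigned_image_tar=None, lasso_geojson=None,
                          task_id=None, adjusted_distance=None):
    """Return an argument list for `saw realign` using only documented flags."""
    # Assumption: a run uses either the StereoMap image tar or, without images,
    # a lasso GeoJSON, and never both.
    if (realigned_image_tar is None) == (lasso_geojson is None):
        raise ValueError("give exactly one of realigned_image_tar or lasso_geojson")
    if adjusted_distance is not None:
        if not isinstance(adjusted_distance, int) or adjusted_distance < 0:
            raise ValueError("adjusted_distance must be a non-negative integer (pixels)")

    cmd = ["saw", "realign"]
    if task_id is not None:
        cmd += ["--id", task_id]
    cmd += ["--sn", sn, "--count-data", count_data]
    if realigned_image_tar is not None:
        cmd += ["--realigned-image-tar", realigned_image_tar]
    else:
        cmd += ["--lasso-geojson", lasso_geojson]
    if adjusted_distance is not None:
        # 0 means the cell border is not expanded.
        cmd.append("--adjusted-distance=" + str(adjusted_distance))
    return cmd
```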
SAW reanalyze
Performs data reanalysis, including clustering, matrix lasso, and differential expression analysis.
Usage: saw reanalyze [Parameters] --gef <GEF> --bin-size <INT> --marker --output <PATH>
saw reanalyze -h | --help
cluster
Parameter | Description |
---|---|
--gef <GEF> | (Optional, default to None) Input bin GEF file for analysis. |
--cellbin-gef <GEF> | (Optional, default to None) Input cellbin GEF file for analysis. |
--bin-size <INT or LIST> | (Optional, default to 200) Bin size for analysis. |
--Leiden-resolution <FLOAT> | (Optional, default to 1.0) The resolution parameter controls the coarseness of the clustering when performing Leiden. Higher values lead to more clusters. |
--marker | (Optional, default to None) Whether to perform differential expression analysis. |
--output <PATH> | (Optional, default to None) Path to the output folder, to save analysis results. |
--threads-num <NUM> | (Optional, default to 8) Set the number of threads to use. |
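A clustering run following the usage line above might be assembled as in the sketch below. It emits only the flags listed in the cluster table. Requiring at least one of --gef or --cellbin-gef is an assumption made here, since the table marks both as optional; the function name and paths are hypothetical.

```python
def build_cluster_command(output, gef=None, cellbin_gef=None, bin_size=None,
                          leiden_resolution=None, marker=False):
    """Return an argument list for a `saw reanalyze` clustering run."""
    # Assumption: clustering needs some input matrix, bin GEF or cellbin GEF.
    if gef is None and cellbin_gef is None:
        raise ValueError("provide gef or cellbin_gef")
    cmd = ["saw", "reanalyze"]
    if gef is not None:
        cmd += ["--gef", gef]
    if cellbin_gef is not None:
        cmd += ["--cellbin-gef", cellbin_gef]
    if bin_size is not None:
        cmd += ["--bin-size", str(bin_size)]
    if leiden_resolution is not None:
        # Higher values lead to more clusters, per the table.
        cmd += ["--Leiden-resolution", str(leiden_resolution)]
    if marker:
        cmd.append("--marker")  # also run differential expression analysis
    cmd += ["--output", output]
    return cmd
```

Leaving bin_size or leiden_resolution as None omits the flag, so the documented defaults (200 and 1.0) apply.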
lasso
Parameter | Description |
---|---|
--gef <GEF> | (Optional, default to None) Input bin GEF file for analysis. |
--cellbin-gef <GEF> | (Optional, default to None) Input cellbin GEF file for analysis. |
--bin-size <INT or LIST> | (Optional, default to 200) Bin size for analysis. |
--lasso-geojson <GEOJSON> | (Optional, default to None) GeoJSON from StereoMap to lasso sub expression matrices of targeted regions. |
--output <PATH> | (Optional, default to None) Path to the output folder, to save analysis results. |
diffExp
Parameter | Description |
---|---|
--count-data <PATH> | (Optional, default to None) Output folder of the corresponding SAW count result, which mainly contains the expression matrices and other related datasets. |
--diffexp-geojson <GEOJSON> | (Optional, default to None) GeoJSON from StereoMap to analyze differential expression. |
--output <PATH> | (Optional, default to None) Path to the output folder, to save analysis results. |
multiomics
Parameter | Description |
---|---|
--gef <GEF LIST> | (Optional, default to None) Input protein and gene bin GEF files for analysis, separated by comma. |
--cellbin-gef <GEF LIST> | (Optional, default to None) Input protein and gene cellbin GEF files for analysis, separated by comma. |
--bin-size <INT> | (Optional, default to 200) Bin size for analysis. Recommended to use 20 or 50. |
--protein-panel <PANEL> | (Optional, default to None) Path to a ProteinPanel.list . Not compatible with --ref-libraries . |
--ref-libraries <CSV> | (Optional, default to None) Path to a ref_libraries.csv which declares protein panel. Not compatible with --protein-panel . |
--output <PATH> | (Optional, default to None) Path to the output folder, to save analysis results. |
--gpu-id <NUM> | (Optional, default to -1) Set GPU id, according to GPU resources in the computing environment. Default to -1, which means running the pipeline using the CPU. |
--threads-num <NUM> | (Optional, default to 8) Set the number of threads to use. |
midFilter
Parameter | Description |
---|---|
--gef <GEF> | (Required, default to None) Input bin GEF file for analysis. |
--mid-json <JSON> | (Required, default to None) JSON from StereoMap to manually filter spatial expression matrices by MID range. |
--output <PATH> | (Optional, default to None) Path to the output folder, to save analysis results. |
removeBackground
Parameter | Description |
---|---|
--gef <GEF> | (Required, default to None) Input bin protein GEF file for analysis. |
--bin-size <INT> | (Required, default to None) Bin size for analysis. Recommended to use 20 or 50. |
--protein-panel <PANEL> | (Optional, default to None) Path to a ProteinPanel.list . Not compatible with --ref-libraries . |
--ref-libraries <CSV> | (Optional, default to None) Path to a ref_libraries.csv which declares protein panel. Not compatible with --protein-panel . |
--output <PATH> | (Optional, default to None) Path to the output folder, to save analysis results. |
SAW convert
Converts between data formats. The pipeline provides submodules for specific conversion needs.
Usage: saw convert gef2gem [Parameters] --gef <GEF> --bin-size <INT> --marker --gem <GEM>
saw convert -h | --help
Parameter | Description |
---|---|
--threads-num <NUM> | (Optional, default to 8) Set the number of threads to use. |
-h, --help | (Optional, default to None) Print help information. |
Matrix related
gef2gem
Parameter | Description |
---|---|
--gef <GEF> | (Required, default to None) Path to input bin GEF file. |
--bin-size <INT> | (Optional, default to 1) Bin size used during conversion. |
--cellbin-gef <GEF> | (Optional, default to None) Path to input cellbin GEF file. |
--gem <GEM> | (Optional, default to None) Path to output GEM file. |
--cellbin-gem <GEM> | (Optional, default to None) Path to output cellbin GEM file. |
gem2gef
Parameter | Description |
---|---|
--gem <GEM> | (Optional, default to None) Path to input GEM file. |
--gef <GEF> | (Optional, default to None) Path to output bin GEF file. |
--cellbin-gem <GEM> | (Optional, default to None) Path to input cellbin GEM file. |
--cellbin-gef <GEF> | (Optional, default to None) Path to output cellbin GEF file. |
bin2cell
Parameter | Description |
---|---|
--gef <GEF> | (Required, default to None) Path to input bin GEF file. |
--image <TIFF> | (Required, default to None) Path to the image of cell segmentation. |
--cellbin-gef <GEF> | (Required, default to None) Path to output cellbin GEF file. |
--cellbin-gem <GEM> | (Optional, default to None) Path to output cellbin GEM file. |
gef2h5ad
Parameter | Description |
---|---|
--gef <GEF> | (Optional, default to None) Path to input bin GEF file. |
--bin-size <INT> | (Optional, default to 20) Bin size used during conversion. |
--cellbin-gef <GEF> | (Optional, default to None) Path to input cellbin GEF file. |
--h5ad <H5AD> | (Required, default to None) Path to output AnnData H5AD file. |
gem2h5ad
Parameter | Description |
---|---|
--gem <GEM> | (Optional, default to None) Path to input GEM file. |
--bin-size <INT> | (Optional, default to 20) Bin size used during conversion. |
--cellbin-gem <GEM> | (Optional, default to None) Path to input cellbin GEM file. |
--h5ad <H5AD> | (Required, default to None) Path to output AnnData H5AD file. |
gef2img
Parameter | Description |
---|---|
--gef <GEF> | (Required, default to None) Path to input bin GEF. |
--bin-size <INT> | (Required, default to 1) Bin size used to plot expression heatmap. |
--image <TIFF> | (Required, default to None) Path to output heatmap image. |
visualization
Parameter | Description |
---|---|
--gef <GEF> | (Required, default to None) Path to input raw bin GEF file. |
--bin-size <INT> | (Required, default to 1,5,10,20,50,100,150,200) Bin sizes used during conversion. |
--visualization-gef <GEF> | (Required, default to None) Path to output visualization GEF file. |
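The default for --bin-size here is a comma-separated list rather than a single number. A small sketch of how such a value could be parsed and sanity-checked before it is passed on is shown below. The helper name is hypothetical, and the requirement that every entry is a positive integer is an assumption, not a documented rule.

```python
DEFAULT_BIN_SIZES = "1,5,10,20,50,100,150,200"  # default quoted in the table above


def parse_bin_sizes(text=DEFAULT_BIN_SIZES):
    """Turn a comma-separated bin size string into a list of positive integers."""
    sizes = []
    for part in text.split(","):
        part = part.strip()
        # Assumption: each bin size is a whole number of at least 1.
        if not part.isdecimal() or int(part) < 1:
            raise ValueError("invalid bin size: " + repr(part))
        sizes.append(int(part))
    return sizes


if __name__ == "__main__":
    print(parse_bin_sizes())
    print(parse_bin_sizes("20, 50"))
```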
Image related
tar2img
Parameter | Description |
---|---|
--image-tar <TAR> | (Required, default to None) Path to input image compressed tar file. |
--image <PATH> | (Required, default to None) Path to output folder of images. |
img2rpi
Parameter | Description |
---|---|
--image <TIFF> | (Required, default to None) Path to input images. Note that the order of the input images must correspond to the order of the --layers names. |
--layers <TEXT> | (Required, default to None) Layer names, recorded in the output RPI file, should correspond to images individually. Layer names can be set arbitrarily, but follow the format of <stain_type>/<image_type> , like DAPI/TissueMask . |
--rpi <RPI> | (Required, default to None) Path to output RPI file. |
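Because images and layer names are matched by position, a mismatch is easy to introduce. The sketch below pairs the two lists and checks each layer name against the <stain_type>/<image_type> format. The function name is hypothetical, and apart from DAPI/TissueMask, which is taken from the table, the file and layer names in the usage note are made up for illustration.

```python
def pair_images_with_layers(images, layers):
    """Pair images with layer names by position and validate the layer format."""
    if len(images) != len(layers):
        raise ValueError("each image needs exactly one layer name")
    for layer in layers:
        stain_type, sep, image_type = layer.partition("/")
        # A layer name must look like <stain_type>/<image_type>, with one slash.
        if not sep or not stain_type or not image_type or "/" in image_type:
            raise ValueError("layer must follow <stain_type>/<image_type>: " + repr(layer))
    return list(zip(images, layers))
```

For example, pair_images_with_layers(["dapi.tif", "dapi_mask.tif"], ["DAPI/Image", "DAPI/TissueMask"]) returns the two (image, layer) pairs in input order, which makes the mapping easy to review before running the conversion.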
merge
Parameter | Description |
---|---|
--image <TIFF> | (Required, default to None) Path to input images (up to 3), to be merged into one image, in the color order of R-G-B. |
--merged-image <TIFF> | (Required, default to None) Path to output multichannel image. |
overlay
Parameter | Description |
---|---|
--image <TIFF> | (Required, default to None) Path to the image used as the base layer. |
--template <TXT> | (Required, default to None) Point information of matrix template. |
--overlaid-image <TIFF> | (Required, default to None) Path to output overlaid image, with the template drawn over the base image. |
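The convert submodules above share one shape: saw convert, a submodule name, then flags. The sketch below captures the flags marked Required in each table and refuses to build a command when one is missing. It is an illustration under assumptions: the saw convert <submodule> form is generalized from the gef2gem usage line, the --marker flag shown in that usage line is left out because the gef2gem table does not describe it, and the function name and paths are hypothetical.

```python
# Flags marked "Required" in the tables above, per submodule.
REQUIRED_FLAGS = {
    "gef2gem": ["--gef"],
    "gem2gef": [],
    "bin2cell": ["--gef", "--image", "--cellbin-gef"],
    "gef2h5ad": ["--h5ad"],
    "gem2h5ad": ["--h5ad"],
    "gef2img": ["--gef", "--bin-size", "--image"],
    "visualization": ["--gef", "--bin-size", "--visualization-gef"],
    "tar2img": ["--image-tar", "--image"],
    "img2rpi": ["--image", "--layers", "--rpi"],
    "merge": ["--image", "--merged-image"],
    "overlay": ["--image", "--template", "--overlaid-image"],
}


def build_convert_command(submodule, options):
    """Return an argument list for `saw convert <submodule>`.

    `options` maps flag names (for example "--gef") to their values.
    """
    if submodule not in REQUIRED_FLAGS:
        raise ValueError("unknown submodule: " + repr(submodule))
    missing = [flag for flag in REQUIRED_FLAGS[submodule] if flag not in options]
    if missing:
        raise ValueError("missing required flags: " + ", ".join(missing))
    cmd = ["saw", "convert", submodule]
    for flag, value in options.items():
        cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    print(build_convert_command(
        "gef2gem", {"--gef": "/data/a.gef", "--bin-size": 1, "--gem": "/data/a.gem"}
    ))
```

Keeping the required flags in one dictionary means a new submodule can be supported by adding a single entry.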