Carafe Options Help

Skyline

Carafe Options

The options are divided into three sections, corresponding to the three modules of Carafe described in the Carafe manuscript: https://www.biorxiv.org/content/biorxiv/early/2024/10/18/2024.10.15.618504/F1.large.jpg

 

A: Training Data Generation Options

Search Engine

-se

search engine used to generate the identification result: dia-nn or skyline (default is dia-nn)

False Discovery Rate

-fdr

FDR cutoff applied to the identification results (default is 0.01)

PTM Site Probability

-ptm_site_prob

minimum PTM-site localization probability used to filter PSMs included for PTM model training (default is 0.75)

PTM Site Qvalue

-ptm_site_qvalue

maximum PTM-site qvalue used to filter PSMs included for PTM model training (default is 0.01)

Fragment Ion Mass Tolerance

-itol

fragment ion mass tolerance used during fragment ion matching and XIC extraction, in the units set by -itolu (default is 20.0)

Fragment Ion Mass Tolerance Units

-itolu

ppm or da (default is ppm)
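
To illustrate how the two tolerance units differ, here is a minimal Python sketch that converts a tolerance into an m/z matching window. The function name `tolerance_window` and the symmetric window are assumptions made for illustration; this is not Carafe's own code.

```python
def tolerance_window(mz: float, tol: float = 20.0, unit: str = "ppm") -> tuple[float, float]:
    """Return the (low, high) m/z bounds around a theoretical fragment m/z.

    Hypothetical helper: assumes a symmetric window around the theoretical value.
    """
    unit = unit.lower()
    if unit == "ppm":
        # ppm tolerance scales with the m/z value
        delta = mz * tol / 1e6
    elif unit == "da":
        # Da tolerance is an absolute width, independent of m/z
        delta = tol
    else:
        raise ValueError(f"unknown tolerance unit: {unit!r}")
    return mz - delta, mz + delta


low, high = tolerance_window(500.0, 20.0, "ppm")
print(f"{low:.4f} - {high:.4f}")  # 499.9900 - 500.0100
```

At 20 ppm the window half-width is 0.01 at m/z 500 and 0.02 at m/z 1000, whereas a Da tolerance keeps the same width across the whole m/z range.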

Refine Peak Detection

-rf

flag to refine peak boundaries when DIA data are used for fine-tuning (disabled by default)

Retention Time Window 

-rf_rt_win

retention time window, in minutes, used to refine peak boundaries when Refine Peak Detection is enabled (default is 3 minutes)

Minimum XIC Correlation 

-cor 

minimum correlation cutoff to consider in determining shared fragments (default is 0.75)
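
This page does not say how the correlation is computed. As a rough illustration only, the sketch below scores each fragment XIC against a reference trace with a Pearson correlation and keeps the fragments at or above the cutoff. The function names, the choice of Pearson correlation, and the reference trace are all assumptions, not Carafe's documented method.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length intensity traces."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat trace carries no co-elution information
    return cov / (sd_x * sd_y)


def fragments_passing(xics, reference, min_cor=0.75):
    """Names of fragments whose XIC correlates with the reference at >= min_cor."""
    return [name for name, trace in xics.items() if pearson(trace, reference) >= min_cor]


reference = [1, 5, 10, 5, 1]       # hypothetical reference elution profile
xics = {
    "y4": [2, 10, 20, 10, 2],      # co-elutes with the reference
    "b3": [10, 6, 1, 6, 10],       # anti-correlated, likely interference
}
print(fragments_passing(xics, reference))  # ['y4']
```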

Minimum Fragment Ion M/Z 

-min_mz

minimum fragment ion m/z to consider (default is 200.0)

 

B: Model Training

Model Type

-model

model type: general or phosphorylation (default is general)

Normalized Collision Energy 

-nce

normalized collision energy; extracted automatically from the mzML file unless specified with this parameter

MS Instrument

-ms_instrument

mass spectrometer instrument type used to generate the training data; extracted automatically from the mzML file unless specified with this parameter

Device

-device

computation device used for model training and prediction: cpu or gpu. If gpu is specified but not available, Carafe falls back to the cpu.

 

C: Library Generation

Enzyme

-enzyme

enzyme used for protein digestion, from one of the following (default is 1)

 0:No enzyme, 1:Trypsin, 2:Trypsin (no P rule), 3:Arg-C, 4:Arg-C (no P rule), 5:Arg-N, 6:Glu-C, 7:Lys-C.

Enzyme Missed Cleavages

-miss_c

maximum number of missed cleavages allowed (default is 1)
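
To show how the enzyme rule and the missed-cleavage limit interact, here is a minimal Python sketch of in silico tryptic digestion (enzyme options 1 and 2 above). The function `digest_trypsin` is hypothetical and uses the conventional trypsin rule (cleave after K or R, but not before P); it is not taken from Carafe.

```python
def digest_trypsin(protein: str, missed_cleavages: int = 1, no_p_rule: bool = False) -> list[str]:
    """Return tryptic peptides with up to `missed_cleavages` missed cleavages."""
    # Collect cut positions: start of protein, each cleavage site, end of protein.
    sites = [0]
    for i, aa in enumerate(protein[:-1]):
        if aa in "KR" and (no_p_rule or protein[i + 1] != "P"):
            sites.append(i + 1)
    sites.append(len(protein))

    peptides = []
    for i in range(len(sites) - 1):
        # Spanning one extra site corresponds to one missed cleavage.
        last = min(i + 1 + missed_cleavages, len(sites) - 1)
        for j in range(i + 1, last + 1):
            peptide = protein[sites[i]:sites[j]]
            if peptide:
                peptides.append(peptide)
    return peptides


print(digest_trypsin("AAKRPLLKMMR", missed_cleavages=0))
# ['AAK', 'RPLLK', 'MMR']
print(digest_trypsin("AAKRPLLKMMR", missed_cleavages=1))
# ['AAK', 'AAKRPLLK', 'RPLLK', 'RPLLKMMR', 'MMR']
```

In the example the R at position 4 is not cleaved because it is followed by P; passing `no_p_rule=True` would cut there as well.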

Fixed Modifications

-fixMod

fixed modifications to consider when generating a library

Variable Modifications

-varMod

variable modifications to consider when generating a library

Maximum Allowed Variable Modifications

-maxVar

maximum number of variable modifications (default is 1)

Clip N-Terminus Methionine

-clip_n_m

when digesting a protein that starts with methionine (M), whether to consider two copies of the leading peptide, one with and one without the N-terminal M (disabled by default)

Minimum Peptide Length

-minLength

minimum length of peptide to consider (default is 7)

Maximum Peptide Length

-maxLength

maximum length of peptide to consider (default is 35)

Minimum Peptide M/Z 

-min_pep_mz

minimum m/z of peptide to consider (default is 400)

Maximum Peptide M/Z

-max_pep_mz

maximum m/z of peptide to consider (default is 1000)

Minimum Peptide Charge

-min_pep_charge

minimum precursor charge to consider (default is 2)

Maximum Peptide Charge

-max_pep_charge

maximum precursor charge to consider (default is 4)
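
The peptide m/z and charge limits work together: a peptide contributes one precursor per charge state whose m/z falls inside the allowed range. The Python sketch below illustrates this for unmodified peptides using standard monoisotopic residue masses. The helper names are hypothetical, and modifications are ignored for brevity.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added for the peptide termini
PROTON = 1.007276   # mass of a proton


def precursor_mz(peptide: str, charge: int) -> float:
    """m/z of an unmodified peptide at the given charge state."""
    neutral_mass = sum(MONO[aa] for aa in peptide) + WATER
    return (neutral_mass + charge * PROTON) / charge


def allowed_precursors(peptide, min_mz=400.0, max_mz=1000.0, min_charge=2, max_charge=4):
    """(charge, m/z) pairs inside the m/z range, using the defaults from this page."""
    result = []
    for charge in range(min_charge, max_charge + 1):
        mz = precursor_mz(peptide, charge)
        if min_mz <= mz <= max_mz:
            result.append((charge, mz))
    return result


for charge, mz in allowed_precursors("PEPTIDEK"):
    print(charge, round(mz, 2))  # only charge 2 (about 464.73) is within 400-1000
```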

Minimum Fragment Ion M/Z 

-lf_frag_mz_min

minimum m/z of fragment to consider for library generation (default is 200)

Maximum Fragment Ion M/Z

-lf_frag_mz_max

maximum m/z of fragment to consider for library generation (default is 1800)

Top N Fragments

-lf_top_n_frag

maximum number of fragment ions to consider for library generation (default is 20)
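
As a rough illustration of how the fragment m/z range and the top-N limit could combine, the Python sketch below filters predicted fragments by m/z and keeps the most intense ones. The function `select_fragments` and the ordering of the two steps (range filter first, then top N by intensity) are assumptions, not a description of Carafe's implementation.

```python
def select_fragments(fragments, mz_min=200.0, mz_max=1800.0, top_n=20):
    """Keep the `top_n` most intense fragments whose m/z lies in [mz_min, mz_max].

    `fragments` is a list of (mz, intensity) tuples.
    """
    in_range = [f for f in fragments if mz_min <= f[0] <= mz_max]
    # Sort by intensity, highest first, then truncate to the top N.
    in_range.sort(key=lambda f: f[1], reverse=True)
    return in_range[:top_n]


predicted = [(150.1, 0.9), (250.2, 0.5), (900.4, 1.0), (1900.0, 0.8), (400.3, 0.1)]
print(select_fragments(predicted, top_n=2))
# [(900.4, 1.0), (250.2, 0.5)]
```

The fragments at m/z 150.1 and 1900.0 are dropped despite their high intensity because they fall outside the default 200 to 1800 range.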

Spectral Library Format

-lf_type

dia-nn, encyclopedia, skyline, or mzspeclib (default is dia-nn)

Spectral Library File Format

-lf_format

spectral library format: tsv or parquet (default is tsv)
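
For readers who script their runs, the sketch below shows one way the flags listed on this page could be collected into a command-line argument list. The helper `build_args` is hypothetical, and it assumes that -rf is a value-less switch while the other options take a value; check the Carafe documentation before relying on either assumption. No executable name is included, since this page does not give one.

```python
def build_args(options: dict) -> list[str]:
    """Turn {flag: value} into a flat argument list.

    True emits the bare flag, False/None skips it, anything else emits flag and value.
    """
    args = []
    for flag, value in options.items():
        if value is True:
            args.append(f"-{flag}")
        elif value is False or value is None:
            continue
        else:
            args.extend([f"-{flag}", str(value)])
    return args


args = build_args({
    "se": "dia-nn",        # search engine
    "fdr": 0.01,           # FDR cutoff
    "itol": 20.0,          # fragment ion mass tolerance
    "itolu": "ppm",        # tolerance units
    "rf": True,            # refine peak detection (assumed to be a bare flag)
    "rf_rt_win": 3,        # retention time window in minutes
    "lf_type": "skyline",  # spectral library format
    "lf_format": "tsv",    # spectral library file format
})
print(" ".join(args))
```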

