Brief Tutorial

Introduction

Pep2Path automates the matching of mass spectrometry-derived mass shift sequences or amino acid sequences of peptides to either NRPS biosynthetic gene clusters or RiPP prepeptides.

Preparing your input

Both NRP2Path and RiPP2Path take a comma-separated sequence of mass shifts or amino acids as input (see the Documentation PDF for more details). The genomic data consist either of a Pep2Path database (for NRP2Path) or of one or more files with nucleotide sequences (for RiPP2Path). For NRP2Path, you can either use the pre-formatted database with all NRPS gene clusters from GenBank or format your own database(s) using the supplied makedb and mergedb programs.
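
Before starting a search, it can help to check that your query really is a clean comma-separated list. The Python sketch below is not part of Pep2Path: the function name parse_query is hypothetical, and the exact notation that NRP2Path and RiPP2Path accept is defined in the Documentation PDF. It only splits a query on commas and reports whether the tokens look like numeric mass shifts or like amino acid names.

```python
import math


def _is_number(token):
    """Return True if the token parses as a finite number."""
    try:
        return math.isfinite(float(token))
    except ValueError:
        return False


def parse_query(text):
    """Split a comma-separated query and classify its tokens.

    Hypothetical helper, not part of Pep2Path. Returns
    ("mass_shifts", [float, ...]) if every token is numeric,
    ("amino_acids", [str, ...]) if no token is numeric,
    and raises ValueError for an empty or mixed query.
    """
    tokens = [t.strip() for t in text.split(",") if t.strip()]
    if not tokens:
        raise ValueError("empty query")
    numeric = [_is_number(t) for t in tokens]
    if all(numeric):
        return "mass_shifts", [float(t) for t in tokens]
    if not any(numeric):
        return "amino_acids", tokens
    raise ValueError("query mixes numeric and non-numeric tokens")


# Illustrative queries only; consult the Documentation PDF for the real format.
print(parse_query("71.04, 113.08, 57.02"))
print(parse_query("Ala, Leu, Gly"))
```

Rejecting mixed queries is a design choice of this sketch, made so that a stray number or name is caught early; it does not reflect any documented Pep2Path behaviour.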

Running Pep2Path

Pep2Path is run from the command line, with various parameters available to customize your search. A detailed overview of all options and parameters is available in the Documentation PDF.

Analyzing the output of an NRP2Path run

The output is stored in a tab-delimited TXT file, which can be opened with a spreadsheet editor such as MS Excel, Numbers or OpenOffice. To assess whether the best matches are indeed good candidates to synthesize the detected peptide, you can look at how well the predicted substrate specificities of these gene clusters match the amino acid sequence of the input peptide. For further confirmation, you can analyze the candidate gene cluster with a tool like antiSMASH to see whether any predicted tailoring enzymes match scaffold modifications that can be inferred from the MS data. It also helps to draw the putative NRPS assembly line to see whether it matches the overall structure of the peptide.
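
If you prefer a script to a spreadsheet editor, a tab-delimited file can also be read with Python's standard csv module. The sketch below is an illustration under stated assumptions rather than a description of the actual output layout: it assumes the file starts with a header row, that one column holds a numeric score, and that higher scores are better. The function names, the file name and the column name "score" are all hypothetical, so substitute the names found in your own output file.

```python
import csv


def read_hits(handle):
    """Read a tab-delimited table with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def best_hits(rows, score_column, n=5):
    """Return the n rows with the highest numeric value in score_column.

    Assumes that higher scores indicate better matches.
    """
    return sorted(rows, key=lambda r: float(r[score_column]), reverse=True)[:n]


# Usage (hypothetical file and column names):
# with open("nrp2path_output.txt", newline="") as handle:
#     for row in best_hits(read_hits(handle), "score"):
#         print(row)
```

The rows returned this way are the candidates whose predicted substrate specificities are worth comparing against the input peptide first.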

Analyzing the output of a RiPP2Path run

The output is stored in a tab-delimited TXT file, which can be opened with a spreadsheet editor such as MS Excel, Numbers or OpenOffice. For the top hit, or for all top hits if several have approximately equal scores, you can look up whether the match indeed originates from a small prepeptide-like ORF. Looking at the overall gene neighbourhood (e.g., using a genome browser) will then help establish whether the genomic context is consistent with a RiPP biosynthetic gene cluster. Tools like antiSMASH or BAGEL can also be used to check whether each match lies in a predicted RiPP biosynthetic gene cluster.
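
Deciding which hits count as having "approximately equal scores" can be made explicit with a small script. The sketch below is one possible interpretation, not a rule defined by RiPP2Path: it keeps every row whose score lies within a chosen fraction of the best score. It assumes the rows have already been loaded as dictionaries (for example with csv.DictReader and delimiter="\t"), that higher scores are better, and that the score column is called "score"; the function name, the column name and the 5% default tolerance are all hypothetical.

```python
def near_top_hits(rows, score_column, tolerance=0.05):
    """Return rows whose score is within `tolerance` (a fraction) of the best.

    Rows are dicts mapping column names to values; scores are parsed as floats.
    Assumes that higher scores indicate better matches.
    """
    scored = [(float(r[score_column]), r) for r in rows]
    if not scored:
        return []
    best = max(score for score, _ in scored)
    cutoff = best - abs(best) * tolerance
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [row for score, row in ranked if score >= cutoff]


# Usage (hypothetical file and column names):
# import csv
# with open("ripp2path_output.txt", newline="") as handle:
#     rows = list(csv.DictReader(handle, delimiter="\t"))
# for row in near_top_hits(rows, "score"):
#     print(row)
```

Each row returned is then a candidate to check for a small prepeptide-like ORF and a consistent gene neighbourhood.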

Design based on the SWT and mzMatch pages.