Commit 311c9b0e authored by BRANDT

Update README.md: clarify steps and differences between extract.sh, extract.py, extractR1R2.pbs, check.pbs. extract.py now handles both traditional sample names (xxx_R1/R2.fastq.gz) and Genoscope sample names.
parent 7e0f7a39
@@ -27,7 +27,8 @@ For Sample :
* Reads from file sample_R2F-cutadapt.tar.gz are all renamed with /1 extension instead of /2
* Files sample_R1F-cutadapt.tar.gz and sample_R2R-cutadapt.tar.gz are re-paired using [BBMAP repair](https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/repair-guide/) script in order to remove singletons and to sort.
* Files sample_R2F-cutadapt.tar.gz and sample_R1R-cutadapt.tar.gz are re-paired using [BBMAP repair](https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/repair-guide/) script in order to remove singletons.
* Reads' name are modified : extensions /1 and /2 are removed in order to have the same name in file R1 than in file R2 which is a requirement for dada2 to recognize the pairs of reads.
* Read names are modified : the /1 and /2 extensions are removed so that each read has the same name in file R1 as in file R2, which dada2 requires in order to recognize the pairs of reads.
* The final number of output files is checked to make sure that all samples have been processed, and the samples.tar archive is created for FROGS processing without dada2
Final output files are named :
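
The read-renaming step described above can be sketched in Python as follows. This is only an illustration, not the code used by extractR1R2.py: the function names `strip_pair_suffix` and `rename_reads` are hypothetical, and the sketch assumes plain 4-line FASTQ records in gzip-compressed files.

```python
import gzip


def strip_pair_suffix(header: str) -> str:
    """Drop a trailing /1 or /2 from the read identifier of a FASTQ header line.

    Hypothetical helper: anything after the first space (the optional
    description) is kept unchanged.
    """
    name, sep, rest = header.rstrip("\n").partition(" ")
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name + sep + rest


def rename_reads(src: str, dst: str) -> None:
    """Copy a gzipped FASTQ file, rewriting only the header lines.

    Assumes strict 4-line records, so every line with index 0 modulo 4
    is a header.
    """
    with gzip.open(src, "rt") as fin, gzip.open(dst, "wt") as fout:
        for index, line in enumerate(fin):
            if index % 4 == 0:
                line = strip_pair_suffix(line) + "\n"
            fout.write(line)
```

After this step, a read named `@read1/1` in the R1 file and `@read1/2` in the R2 file both become `@read1`, so the two files share identical read names.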
* sample_R1F.fastq.gz
@@ -46,11 +47,14 @@ Final output files are names :
- TRIMREADS : set to True if you want to perform all abyss preprocessing (cutadapt, bbmap) (option : True/False)
- **extract.sh** : script which will run extract.py on the configuration file **extract.ini**. Each samples (and its two "paired-end" files) will be parse separately.
- **extract.sh** : script which will run extract.py on the configuration file **extract.ini**. Each sample (and its two "paired-end" files) will be parsed separately. This script is not used when abyss-preprocessing is run within abyss-pipeline, as extract.py is launched by main.sh
- **extract.py** : python script that will read extract.ini file and launch extractR1R2.pbs calculation or each sample. The check.pbs script is run at the end to verify that all files are created at the end of the process for each sample.
- **extract.py** : python script that will read the extract.ini file and launch the extractR1R2.pbs calculation on each sample. The check.pbs script is run once at the end to verify that all files have been created and to produce the FROGS archive.
- **extractR1R2.pbs** : pbs script that will run extractR1R2.py, which will execute the steps presented above (cutadapt, bbmap repair...)
- **check.pbs** : pbs script that will run check.py, which will verify that 5 files have been created for each sample and then produce the FROGS archive in ./frogs/samples.tar
- **extractR1R2.pbs** : pbs script that will run extractR1R2.py which will execute the steps presented above (cutadapt, bbmap repair...)
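
The final check described above (five files per sample, then one archive) could look roughly like the sketch below. It is not the content of check.py: the helper names `missing_outputs` and `build_archive` are hypothetical, and the list of expected file suffixes is left as a parameter because only `sample_R1F.fastq.gz` is named in the excerpt shown here.

```python
import tarfile
from pathlib import Path


def missing_outputs(out_dir, samples, suffixes):
    """Return the expected output files that do not exist in out_dir.

    `suffixes` is supplied by the caller (hypothetical parameter), one
    entry per expected file of a sample.
    """
    out = Path(out_dir)
    return [
        f"{sample}{suffix}"
        for sample in samples
        for suffix in suffixes
        if not (out / f"{sample}{suffix}").is_file()
    ]


def build_archive(out_dir, samples, suffixes, archive):
    """Check that every sample is complete, then write an uncompressed tar."""
    missing = missing_outputs(out_dir, samples, suffixes)
    if missing:
        raise FileNotFoundError("missing output files: " + ", ".join(missing))
    archive = Path(archive)
    archive.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "w") as tar:
        for sample in samples:
            for suffix in suffixes:
                name = f"{sample}{suffix}"
                # Store files at the archive root, without directory prefixes.
                tar.add(Path(out_dir) / name, arcname=name)
    return archive
```

Raising an error before anything is written means an incomplete run never produces a partial `./frogs/samples.tar`.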
```bash
./extract.sh
......