Hi,
I am trying to filter my ~30 GB WGS dataset (currently in VCF format) using VCFtools on Cedar. I have prepared a bash script and, judging by the .out file, it ran successfully. However, I am unable to locate the actual filtered dataset.
Here is my .sh that has run successfully but won’t produce a filtered VCF:
“
#!/bin/bash
#SBATCH --time=1:00:00
#SBATCH --mem=5000M
#SBATCH --mail-user=moirc@uoguelph.ca
#SBATCH --mail-type=ALL
echo 'The job that Cam submitted is running.'
module load nixpkgs/16.09 intel/2018.3 vcftools/0.1.14
vcftools --vcf CunnerFreebayesCam.vcf --mac 2 --min-meanDP 7 --max-missing-count 1 --minQ 200 --out /projects/def-boulding/Sharedfiles/CunSNPfilter.vcf
“
Here is the slurm output file showing that it worked, even though a filtered VCF file can’t be located:
“
VCFtools - 0.1.14
(C) Adam Auton and Anthony Marcketta 2009
Parameters as interpreted:
--vcf CunnerFreebayesCam.vcf
--mac 2
--max-missing-count 1
--min-meanDP 7
--minQ 200
--out CunSNPfilter.vcf
After filtering, kept 15 out of 15 Individuals
After filtering, kept 2941047 out of a possible 25402547 Sites
Run Time = 238.00 seconds
“
Any ideas would be greatly appreciated!
Best,
Camden Moir
University of Guelph
moirc@uoguelph.ca
Hi Cam,
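Try adding the --recode flag to produce a new VCF file. Without it, VCFtools only reports how many individuals and sites pass the filters and does not write a VCF. With --recode, the filtered file is named after your --out prefix with .recode.vcf appended. Let me know if that works.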
See here: http://vcftools.sourceforge.net/man_latest.html
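As a rough sketch, the filtering step with --recode added might look like the following. The filters and paths are copied from your script. Dropping the .vcf ending from the --out prefix and adding --recode-INFO-all (which keeps the INFO fields in the output) are my own suggestions, not requirements. The variable names and the check that vcftools and the input file are available are only there for illustration.

```shell
#!/bin/bash
# Sketch only: same filters as the original script, plus --recode.
# Variable names are illustrative; paths are taken from the original post.
in_vcf="CunnerFreebayesCam.vcf"

# Assumption: use a prefix without ".vcf", because VCFtools appends
# ".recode.vcf" to whatever is passed to --out.
out_prefix="/projects/def-boulding/Sharedfiles/CunSNPfilter"
expected_out="${out_prefix}.recode.vcf"

# Run the filter only if vcftools is on the PATH and the input exists.
if command -v vcftools >/dev/null 2>&1 && [ -f "$in_vcf" ]; then
    vcftools --vcf "$in_vcf" \
        --mac 2 --min-meanDP 7 --max-missing-count 1 --minQ 200 \
        --recode --recode-INFO-all \
        --out "$out_prefix"
fi

echo "Filtered VCF expected at: ${expected_out}"
```

If you keep your original prefix (CunSNPfilter.vcf), the run should still work, but the file would end up named CunSNPfilter.vcf.recode.vcf.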