Weird coverage profile for spike in fastqs #197

Open
jayaramanp opened this issue Apr 29, 2019 · 2 comments

@jayaramanp

Hello @yunfeiguo
I was able to run varsim_multi.py successfully and generate FASTQ files with variants "spiked in" from dbSNP SNPs.

However, the coverage of the samples looks very odd. Notice how the coverage right over the exon resembles two prongs rather than one uniform block?
Could we implement a parameter of some sort to fix this issue?

Screen Shot 2019-04-29 at 1 34 26 PM

Screen Shot 2019-04-29 at 1 32 14 PM

In this case, the top track is the original sequence and the bottom track is the spiked-in sequence.

Screen Shot 2019-04-29 at 2 40 38 PM

@yunfeiguo
Contributor

Hi @jayaramanp, several comments and questions:

  • Could you load the BED file used for simulation at the bottom of the IGV screenshot?
  • One strategy is to flank the regions (by 500 bp, say) and then use the flanked regions for simulation, to achieve better uniformity around the original regions.
  • Another solution is to increase the read length (the -l option for --simulator_options), because the green reads (I assume they are real data) appear to be slightly longer than the 100 bp you used in #196 (Invalid literal for float - while parsing dgv to vcf).
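The flanking strategy can be sketched in a few lines of Python. This is only an illustration and not part of VarSim: the function name `flank_and_merge` and the in-memory tuple format are assumptions made for the example. Each interval is padded by a fixed number of bases on both sides (clamped at position 0), and padded intervals that overlap are merged so that no base is targeted twice.

```python
from typing import Iterable, List, Tuple

# (chrom, start, end), following the BED convention of 0-based, half-open intervals
Interval = Tuple[str, int, int]


def flank_and_merge(intervals: Iterable[Interval], flank: int = 500) -> List[Interval]:
    """Pad every interval by `flank` bases on each side, then merge overlaps.

    Hypothetical helper for illustration. It clamps the start at 0 but does
    not clamp the end at the chromosome length, which would need a table of
    chromosome sizes.
    """
    padded = sorted(
        (chrom, max(0, start - flank), end + flank)
        for chrom, start, end in intervals
    )
    merged: List[Interval] = []
    for chrom, start, end in padded:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps or touches the previous interval: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# A few target intervals of the kind used in this thread.
targets = [
    ("1", 100316583, 100316695),
    ("1", 100316583, 100316695),
    ("1", 100318210, 100318274),
    ("1", 100327043, 100327284),
    ("1", 100327797, 100327994),
]

for chrom, start, end in flank_and_merge(targets):
    print(chrom, start, end, sep="\t")
```

Merging after padding matters here because neighbouring exons less than twice the flank apart would otherwise produce overlapping regions. If bedtools is available, its `slop` and `merge` subcommands perform the same two steps on a BED file.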

@jayaramanp
Author

@yunfeiguo

  1. Could you load the BED file used for simulation at the bottom of the IGV screenshot?

I essentially used the CDS regions of a gene as the targets:

1 | 100316583 | 100316695 | NM_000028.2_cds_1
1 | 100316583 | 100316695 | NM_000642.2_cds_1
1 | 100316583 | 100316695 | NM_000643.2_cds_1
1 | 100316583 | 100316695 | NM_000644.2_cds_1
1 | 100318210 | 100318274 | NM_000646.2_cds_1
1 | 100327043 | 100327284 | NM_000028.2_cds_2
1 | 100327043 | 100327284 | NM_000642.2_cds_2
1 | 100327043 | 100327284 | NM_000643.2_cds_2
1 | 100327043 | 100327284 | NM_000644.2_cds_2
1 | 100327043 | 100327284 | NM_000646.2_cds_2
1 | 100327797 | 100327994 | NM_000028.2_cds_3
1 | 100327797 | 100327994 | NM_000642.2_cds_3
1 | 100327797 | 100327994 | NM_000643.2_cds_3
1 | 100327797 | 100327994 | NM_000644.2_cds_3
1 | 100327797 | 100327994 | NM_000646.2_cds_3
1 | 100329926 | 100330160 | NM_000028.2_cds_4
1 | 100329926 | 100330160 | NM_000642.2_cds_4
1 | 100329926 | 100330160 | NM_000643.2_cds_4
1 | 100329926 | 100330160 | NM_000644.2_cds_4
1 | 100329926 | 100330160 | NM_000646.2_cds_4
1 | 100335940 | 100336152 | NM_000028.2_cds_5
1 | 100335940 | 100336152 | NM_000642.2_cds_5
1 | 100335940 | 100336152 | NM_000643.2_cds_5
1 | 100335940 | 100336152 | NM_000644.2_cds_5
1 | 100335940 | 100336152 | NM_000646.2_cds_5
1 | 100336298 | 100336440 | NM_000028.2_cds_6
1 | 100336298 | 100336440 | NM_000642.2_cds_6
1 | 100336298 | 100336440 | NM_000643.2_cds_6
1 | 100336298 | 100336440 | NM_000644.2_cds_6
1 | 100336298 | 100336440 | NM_000646.2_cds_6
1 | 100340227 | 100340381 | NM_000028.2_cds_7
1 | 100340227 | 100340381 | NM_000642.2_cds_7
1 | 100340227 | 100340381 | NM_000643.2_cds_7
1 | 100340227 | 100340381 | NM_000644.2_cds_7
1 | 100340227 | 100340381 | NM_000646.2_cds_7
1 | 100340694 | 100340827 | NM_000028.2_cds_8
1 | 100340694 | 100340827 | NM_000642.2_cds_8
1 | 100340694 | 100340827 | NM_000643.2_cds_8
1 | 100340694 | 100340827 | NM_000644.2_cds_8
1 | 100340694 | 100340827 | NM_000646.2_cds_8
1 | 100340898 | 100341026 | NM_000028.2_cds_9
1 | 100340898 | 100341026 | NM_000642.2_cds_9
1 | 100340898 | 100341026 | NM_000643.2_cds_9
1 | 100340898 | 100341026 | NM_000644.2_cds_9
1 | 100340898 | 100341026 | NM_000646.2_cds_9
1 | 100341998 | 100342168 | NM_000028.2_cds_10
1 | 100341998 | 100342168 | NM_000642.2_cds_10
1 | 100341998 | 100342168 | NM_000643.2_cds_10
1 | 100341998 | 100342168 | NM_000644.2_cds_10
1 | 100341998 | 100342168 | NM_000646.2_cds_10
1 | 100343181 | 100343399 | NM_000028.2_cds_11
1 | 100343181 | 100343399 | NM_000642.2_cds_11
1 | 100343181 | 100343399 | NM_000643.2_cds_11
1 | 100343181 | 100343399 | NM_000644.2_cds_11
1 | 100343181 | 100343399 | NM_000646.2_cds_11
1 | 100345463 | 100345617 | NM_000028.2_cds_12
1 | 100345463 | 100345617 | NM_000642.2_cds_12
1 | 100345463 | 100345617 | NM_000643.2_cds_12
1 | 100345463 | 100345617 | NM_000644.2_cds_12
1 | 100345463 | 100345617 | NM_000646.2_cds_12
1 | 100346172 | 100346366 | NM_000028.2_cds_13
1 | 100346172 | 100346366 | NM_000642.2_cds_13
1 | 100346172 | 100346366 | NM_000643.2_cds_13
1 | 100346172 | 100346366 | NM_000644.2_cds_13
1 | 100346172 | 100346366 | NM_000646.2_cds_13
1 | 100346616 | 100346748 | NM_000028.2_cds_14
1 | 100346616 | 100346748 | NM_000642.2_cds_14
1 | 100346616 | 100346748 | NM_000643.2_cds_14
1 | 100346616 | 100346748 | NM_000644.2_cds_14
1 | 100346616 | 100346748 | NM_000646.2_cds_14
1 | 100346832 | 100347018 | NM_000028.2_cds_15
1 | 100346832 | 100347018 | NM_000642.2_cds_15
1 | 100346832 | 100347018 | NM_000643.2_cds_15
1 | 100346832 | 100347018 | NM_000644.2_cds_15
1 | 100346832 | 100347018 | NM_000646.2_cds_15
1 | 100347081 | 100347262 | NM_000028.2_cds_16
1 | 100347081 | 100347262 | NM_000642.2_cds_16
1 | 100347081 | 100347262 | NM_000643.2_cds_16
1 | 100347081 | 100347262 | NM_000644.2_cds_16
1 | 100347081 | 100347262 | NM_000646.2_cds_16
1 | 100349660 | 100349815 | NM_000028.2_cds_17
1 | 100349660 | 100349815 | NM_000642.2_cds_17
1 | 100349660 | 100349815 | NM_000643.2_cds_17
1 | 100349660 | 100349815 | NM_000644.2_cds_17
1 | 100349660 | 100349815 | NM_000646.2_cds_17
1 | 100349879 | 100350022 | NM_000028.2_cds_18
1 | 100349879 | 100350022 | NM_000642.2_cds_18
1 | 100349879 | 100350022 | NM_000643.2_cds_18
1 | 100349879 | 100350022 | NM_000644.2_cds_18
1 | 100349879 | 100350022 | NM_000646.2_cds_18
1 | 100350109 | 100350274 | NM_000028.2_cds_19
1 | 100350109 | 100350274 | NM_000642.2_cds_19
1 | 100350109 | 100350274 | NM_000643.2_cds_19
1 | 100350109 | 100350274 | NM_000644.2_cds_19
1 | 100350109 | 100350274 | NM_000646.2_cds_19
1 | 100353518 | 100353679 | NM_000028.2_cds_20
1 | 100353518 | 100353679 | NM_000642.2_cds_20
1 | 100353518 | 100353679 | NM_000643.2_cds_20
1 | 100353518 | 100353679 | NM_000644.2_cds_20
1 | 100353518 | 100353679 | NM_000646.2_cds_20
1 | 100356760 | 100356927 | NM_000028.2_cds_21
1 | 100356760 | 100356927 | NM_000642.2_cds_21
1 | 100356760 | 100356927 | NM_000643.2_cds_21
1 | 100356760 | 100356927 | NM_000644.2_cds_21
1 | 100356760 | 100356927 | NM_000646.2_cds_21
1 | 100357146 | 100357310 | NM_000028.2_cds_22
1 | 100357146 | 100357310 | NM_000642.2_cds_22
1 | 100357146 | 100357310 | NM_000643.2_cds_22
1 | 100357146 | 100357310 | NM_000644.2_cds_22
1 | 100357146 | 100357310 | NM_000646.2_cds_22
1 | 100357972 | 100358178 | NM_000028.2_cds_23
1 | 100357972 | 100358178 | NM_000642.2_cds_23
1 | 100357972 | 100358178 | NM_000643.2_cds_23
1 | 100357972 | 100358178 | NM_000644.2_cds_23
1 | 100357972 | 100358178 | NM_000646.2_cds_23
1 | 100361826 | 100361959 | NM_000028.2_cds_24
1 | 100361826 | 100361959 | NM_000642.2_cds_24
1 | 100361826 | 100361959 | NM_000643.2_cds_24
1 | 100361826 | 100361959 | NM_000644.2_cds_24
1 | 100361826 | 100361959 | NM_000646.2_cds_24
1 | 100366176 | 100366432 | NM_000028.2_cds_25
1 | 100366176 | 100366432 | NM_000642.2_cds_25
1 | 100366176 | 100366432 | NM_000643.2_cds_25
1 | 100366176 | 100366432 | NM_000644.2_cds_25
1 | 100366176 | 100366432 | NM_000646.2_cds_25
1 | 100368223 | 100368365 | NM_000028.2_cds_26
1 | 100368223 | 100368365 | NM_000642.2_cds_26
1 | 100368223 | 100368365 | NM_000643.2_cds_26
1 | 100368223 | 100368365 | NM_000644.2_cds_26
1 | 100368223 | 100368365 | NM_000646.2_cds_26
1 | 100376252 | 100376418 | NM_000028.2_cds_27
1 | 100376252 | 100376418 | NM_000642.2_cds_27
1 | 100376252 | 100376418 | NM_000643.2_cds_27
1 | 100376252 | 100376418 | NM_000644.2_cds_27
1 | 100376252 | 100376418 | NM_000646.2_cds_27
1 | 100377945 | 100378088 | NM_000028.2_cds_28
1 | 100377945 | 100378088 | NM_000642.2_cds_28
1 | 100377945 | 100378088 | NM_000643.2_cds_28
1 | 100377945 | 100378088 | NM_000644.2_cds_28
1 | 100377945 | 100378088 | NM_000646.2_cds_28
1 | 100379067 | 100379309 | NM_000028.2_cds_29
1 | 100379067 | 100379309 | NM_000642.2_cds_29
1 | 100379067 | 100379309 | NM_000643.2_cds_29
1 | 100379067 | 100379309 | NM_000644.2_cds_29
1 | 100379067 | 100379309 | NM_000646.2_cds_29
1 | 100380929 | 100381057 | NM_000028.2_cds_30
1 | 100380929 | 100381057 | NM_000642.2_cds_30
1 | 100380929 | 100381057 | NM_000643.2_cds_30
1 | 100380929 | 100381057 | NM_000644.2_cds_30
1 | 100380929 | 100381057 | NM_000646.2_cds_30
1 | 100381950 | 100382068 | NM_000028.2_cds_31
1 | 100381950 | 100382068 | NM_000642.2_cds_31
1 | 100381950 | 100382068 | NM_000643.2_cds_31
1 | 100381950 | 100382068 | NM_000644.2_cds_31
1 | 100381950 | 100382068 | NM_000646.2_cds_31
1 | 100382138 | 100382302 | NM_000028.2_cds_32
1 | 100382138 | 100382302 | NM_000642.2_cds_32
1 | 100382138 | 100382302 | NM_000643.2_cds_32
1 | 100382138 | 100382302 | NM_000644.2_cds_32
1 | 100382138 | 100382302 | NM_000646.2_cds_32
1 | 100387074 | 100387222 | NM_000028.2_cds_33
1 | 100387074 | 100387222 | NM_000642.2_cds_33
1 | 100387074 | 100387222 | NM_000643.2_cds_33
1 | 100387074 | 100387222 | NM_000644.2_cds_33
1 | 100387074 | 100387222 | NM_000646.2_cds_33

  2. One strategy is to flank the regions (by 500 bp, say) and then use the flanked regions for simulation, to achieve better uniformity around the original regions.

I can do that.

  3. Another solution is to increase the read length (the -l option for --simulator_options), because the green reads (I assume they are real data) appear to be slightly longer than the 100 bp you used in #196 (Invalid literal for float - while parsing dgv to vcf).

Sounds good, will do that. Yes, the green reads are real data.
