dictionary - Contig Extension with Python -


I want to add a function to a program that creates dictionaries from DNA sequences. The function receives a contig (incon = initial contig; a DNA sequence) and extends it to the right by finding overlapping parts in the form of keys in the dictionaries and concatenating the values with the "+" operator.

I'll give a quick example:

gatttgaagc is the initial contig

atttgaagc:a is one of many entries in the dictionary

I want the function to search for such an overlapping part as a key in the dictionary (I asked about this here yesterday and it worked fine, but only with specific values, not within a function with variables), concatenate the value of that key to the initial sequence (extending the contig to the right), save the new sequence in incon, delete the dictionary entry, and repeat until there are no entries left (this part I haven't tried yet).

First I want the function to search for keys of length 9 with values of length 1 (atttgaagc:a), and if there are no overlapping parts, for keys of length 8 with values of length 2 (e.g. atttgaag:tg), and so on.
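The search order described above can be sketched as follows. This is only a minimal sketch under assumptions: the dictionary is taken to be one flat dict that maps an overlap key to its extension, and the names `smallest_extension` and `suffix_dict` are made up for illustration. The example keys are chosen so that both actually match the end of the contig.

```python
def smallest_extension(contig, suffix_dict):
    """Return (key, value) for the longest key matching the end of contig, or None."""
    # Try the longest keys first: a longer overlap means a shorter extension.
    key_lengths = sorted({len(key) for key in suffix_dict}, reverse=True)
    for key_len in key_lengths:
        if key_len > len(contig):
            continue  # a key longer than the contig cannot overlap its end
        tail = contig[-key_len:]
        if tail in suffix_dict:
            return tail, suffix_dict[tail]
    return None  # no key overlaps the end of the contig


# Hypothetical entries: both keys match the end of the contig,
# and the longer key (shorter extension) is preferred.
suffix_dict = {"atttgaagc": "a", "tttgaagc": "tg"}
print(smallest_extension("gatttgaagc", suffix_dict))
```

Looking up `tail in suffix_dict` uses the dictionary's hashing, so there is no need to loop over every key and compare it by hand.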

Additional info: the dictionary "suffixdicts" has such entries with values from length 1 (key has length 14) to length 10 (key has length 5).

"reads" is the list in which the sequences are stored.

When I try the steps one after another they work (like the search), but when I tried to build a function out of them, literally nothing happens. The function is supposed to return the smallest possible extension.

    def extendcontig (incon, reads, suffixdicts):
        incon = reads[0]
        for x in range(1,len(incon)):
            for key in suffixdicts.keys():
                if incon[x:] == key:
                    incon = incon+suffixdicts['key']
                    print(incon)
                else:
                    print("n")
        return()
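For comparison, here is one way the function could be written so that it returns the extended contig. It is a sketch, not a definitive version, and it rests on assumptions: `suffixdicts` is treated as a single flat dict, the `reads` parameter is dropped so the caller's contig is not overwritten by `reads[0]`, and the second dictionary entry in the example is invented. Two things differ from the attempt above: the value is looked up with the matched suffix instead of the literal string `'key'`, and the function returns `incon` instead of `()`, which is an empty tuple.

```python
def extend_contig(incon, suffixdicts):
    """Extend incon to the right until no dictionary key overlaps its end."""
    remaining = dict(suffixdicts)  # work on a copy so the caller's dict is untouched
    while remaining:
        # Slicing from x = 1 upward checks the longest suffix first,
        # so the first hit is the smallest possible extension.
        for x in range(1, len(incon)):
            suffix = incon[x:]
            if suffix in remaining:
                # pop() returns the value and deletes the used entry in one step.
                incon += remaining.pop(suffix)
                break
        else:
            break  # no suffix matched any key, so the contig cannot grow further
    return incon


# Hypothetical dictionary: the second key only matches after the first extension.
print(extend_contig("gatttgaagc", {"atttgaagc": "a", "ttgaagca": "tt"}))
# prints gatttgaagcatt
```

The `else` branch belongs to the `for` loop and only runs when the loop finishes without hitting `break`, which is what stops the outer `while` once nothing overlaps any more.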

I'm new to Python, so there are probably dire mistakes in this, and I would like to have them pointed out. I know I'm in way over my head: I understand parts of the existing code, but I still have problems implementing it myself, probably due to incorrect syntax. I know there are programs I could use, but I want to understand the whole thing behind it.

Edit: I was asked to add the given functions. Some of them were written for me, and parts I wrote based on the given code (basically copied with tweaks). Warning: it's quite a lot:

Reading the FASTA file. Additional info: the FASTA file contains a large number of sequences in the form:

> read 1

ttatgaatattacgcaatggacgtccaaggtacagcgtatttgtacgcta

> read 2

aactgctatctttcttgtccactcgaaaatccataacgtagcccataacg

> read 3

tcagttatcctatatactggatcccgactttaatcggcgtcggaattact
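A file in this form could be read into the "reads" list roughly like this. It is a minimal sketch that assumes every header line starts with ">" and that blank lines can be ignored; the function name `read_fasta` and the file name in the closing comment are hypothetical.

```python
def read_fasta(lines):
    """Collect the sequences from FASTA-formatted lines into a list of strings."""
    reads = []
    current = []  # pieces of the sequence currently being read
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # A header starts a new record, so store the previous sequence.
            if current:
                reads.append("".join(current))
                current = []
        else:
            current.append(line)
    if current:
        reads.append("".join(current))  # store the last sequence
    return reads


example = [
    "> read 1",
    "",
    "ttatgaatattacgcaatggacgtccaaggtacagcgtatttgtacgcta",
    "",
    "> read 2",
    "",
    "aactgctatctttcttgtccactcgaaaatccataacgtagcccataacg",
]
reads = read_fasta(example)
print(len(reads), reads[0])

# With a real file (hypothetical name), a file object can be passed directly:
# with open("reads.fasta") as handle:
#     reads = read_fasta(handle)
```

Joining the pieces per record also covers files in which one sequence is wrapped over several lines.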

I uploaded the file here: http://s000.tinyupload.com/?file_id=52090273537190816031

Edit: I edited the large blocks of code out, as they don't seem necessary.

