In the case of a putative GDP-mannose 4,6-dehydratase, the nonsense SNP, present only in strains from the TcI lineage, is located near the N terminus of the protein and would therefore, in theory, truncate it almost completely. Although a downstream ATG could be used to produce a product with only an 11% reduction in size, this product would lack the conserved NAD nucleotide-binding motif GGxGxxG, and we therefore believe it cannot yield a functional protein. In another case, a nonsense SNP in one CL Brener allele causes the shorter TcCLB.506801.70 allele to lose a potential glycosylphosphatidylinositol (GPI) C-terminal anchor sequence, potentially causing a significant change in the localization of the protein.
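The reasoning about a premature stop codon, a downstream ATG, and the loss of the GGxGxxG motif can be illustrated with a minimal sketch. The toy sequences `WILD` and `MUTANT` and the helper names below are hypothetical and are not taken from the T. cruzi data. The sketch assumes the standard genetic code, and the motif is matched as a plain regular expression in which `x` stands for any residue.

```python
import re

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# GGxGxxG, where x is any amino acid.
NAD_MOTIF = re.compile(r"GG.G..G")


def translate(cds):
    """Translate in frame 0 up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def downstream_product(cds):
    """Product starting at the first in-frame ATG after the first stop codon."""
    first_stop = 3 * len(translate(cds))  # nucleotide offset of the first stop
    for i in range(first_stop + 3, len(cds) - 2, 3):
        if cds[i:i + 3] == "ATG":
            return translate(cds[i:])
    return ""


def has_nad_motif(protein):
    """True if the protein contains a GGxGxxG match."""
    return NAD_MOTIF.search(protein) is not None


# Hypothetical toy alleles: the mutant carries an A>T nonsense SNP in codon 2.
WILD = "ATG" + "AAA" + "GGTGGAGCTGGTAAACTTGGA" + "ATG" + "CCT" + "GAA" + "TAA"
MUTANT = "ATG" + "TAA" + WILD[6:]

if __name__ == "__main__":
    print(translate(WILD), has_nad_motif(translate(WILD)))
    rescued = downstream_product(MUTANT)
    print(rescued, has_nad_motif(rescued))
```

In this toy case the product initiated at the downstream ATG starts after the motif, so the motif check fails even though a protein can still be made.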
The number of SNPs identified between these two sequences is approximately twice the average found in other sequences. This, together with the observed differences in subcellular targeting signals, suggests that these alleles may have divergent functions. Another case involving a potential change in subcellular localization due to a missing GPI anchor in one allele was identified in alignment tcsnp,442281, encoding a putative protein that belongs to the RNI-like superfamily of leucine-rich repeat-containing proteins, which are thought to mediate protein-protein interactions.

Distribution of SNPs in T. cruzi coding regions

Next, we analyzed the distribution of SNPs along the coding region and in the context of different sequence features: transmembrane domains, signal peptides, and globular vs. unstructured regions.
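A region-wise comparison of this kind can be sketched as a simple count of synonymous and non-synonymous SNPs inside an annotated feature versus the rest of the protein, normalized by length. The function name, the input layout (codon position plus a non-synonymous flag), and the per-codon normalization are assumptions made for illustration. The sketch includes no significance test and is not the authors' pipeline.

```python
def snp_densities(snps, region, protein_length):
    """Per-codon SNP densities inside and outside one feature.

    snps: iterable of (codon_position, is_nonsynonymous), 1-based positions.
    region: (start, end) codon coordinates of the feature, 1-based inclusive.
    protein_length: total number of codons in the coding sequence.
    """
    start, end = region
    inside_len = end - start + 1
    outside_len = protein_length - inside_len
    if inside_len <= 0 or outside_len <= 0:
        raise ValueError("region must be a proper sub-interval of the protein")

    counts = {"inside": {"syn": 0, "nonsyn": 0},
              "outside": {"syn": 0, "nonsyn": 0}}
    for position, is_nonsyn in snps:
        where = "inside" if start <= position <= end else "outside"
        counts[where]["nonsyn" if is_nonsyn else "syn"] += 1

    lengths = {"inside": inside_len, "outside": outside_len}
    return {
        where: {kind: n / lengths[where] for kind, n in kinds.items()}
        for where, kinds in counts.items()
    }


# Hypothetical example: a 100-codon protein with a feature at codons 1-20.
EXAMPLE_SNPS = (
    [(3, True), (15, True), (9, False)]                      # inside the feature
    + [(p, True) for p in (25, 40, 61, 90)]                  # outside, non-synonymous
    + [(p, False) for p in (22, 30, 35, 48, 55, 70, 82, 99)]  # outside, synonymous
)

if __name__ == "__main__":
    print(snp_densities(EXAMPLE_SNPS, (1, 20), 100))
```

The densities returned for each feature could then be compared across many genes with whatever statistical test suits the data.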
We reasoned that the selection acting on a gene might differ between these regions or domains. Based on this idea, we performed a number of comparisons, evaluating differences in the density of synonymous and non-synonymous changes in one of these domains vs. the rest of the protein. However, although some significant signal can be observed when performing pairwise comparisons, these differences are not significant when using the complete dataset, which includes alleles from TcI, TcII, TcIII, and TcVI. One of the features analyzed was the presence of SNPs in natively unstructured domains. Several recent papers report that natively unfolded domains can support higher non-synonymous substitution rates. Based on predictions made using IUPred, we identified globular and natively unstructured domains in T. cruzi proteins. A comparison of the SNP density found in these regions showed no statistically significant differences. However, we did observe a greater dispersion in the density of SNPs in non-globular regions, with more outliers showing higher densities of non-synonymous SNPs in this category. Analysis of the functional annotation of these outliers showed enrichment in transporters, kinases and hydrolases. A particularly striking outlier is the TcCLB. 506553.