<?xml version="1.0" encoding="UTF-8" ?><?xml-stylesheet type="text/xsl" href="oaicat.xsl"?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2009-11-21T06:28:43Z</responseDate><request metadataPrefix="oai_dc" verb="ListRecords" set="bioinfo:23">http://open-archive.highwire.org/handler</request><ListRecords>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2361</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>The global trace graph, a novel paradigm for searching protein sequence databases</dc:title>
<dc:creator>Heger, Andreas</dc:creator>
<dc:creator>Mallick, Swapan</dc:creator>
<dc:creator>Wilton, Christopher</dc:creator>
<dc:creator>Holm, Liisa</dc:creator>
<dc:subject>SEQUENCE ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Propagating functional annotations to sequence-similar, presumably homologous proteins lies at the heart of the bioinformatics industry. Correct propagation is crucially dependent on the accurate identification of subtle sequence motifs that are conserved in evolution. The evolutionary signal can be difficult to detect because functional sites may consist of non-contiguous residues while segments in-between may be mutated without affecting fold or function. &lt;b&gt;Results:&lt;/b&gt; Here, we report a novel graph clustering algorithm in which all known protein sequences simultaneously self-organize into hypothetical multiple sequence alignments. This eliminates noise so that non-contiguous sequence motifs can be tracked down between extremely distant homologues. The novel data structure enables fast sequence database searching methods which are superior to profile-profile comparison at recognizing distant homologues. This study will boost the leverage of structural and functional genomics and opens up new avenues for data mining a complete set of functional signature motifs. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://www.bioinfo.biocenter.helsinki.fi/gtg&quot; locator-type=&quot;url&quot;&gt;http://www.bioinfo.biocenter.helsinki.fi/gtg&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;liisa.holm@helsinki.fi&quot; locator-type=&quot;email&quot;&gt;liisa.holm@helsinki.fi&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2361</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm358</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2368</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>A quantitative genotype algorithm reflecting H5N1 Avian influenza niches</dc:title>
<dc:creator>Wan, Xiu-Feng</dc:creator>
<dc:creator>Chen, Guorong</dc:creator>
<dc:creator>Luo, Feng</dc:creator>
<dc:creator>Emch, Michael</dc:creator>
<dc:creator>Donis, Ruben</dc:creator>
<dc:subject>PHYLOGENETICS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Computational genotyping analyses are critical for characterizing molecular evolutionary footprints, thus providing important information for designing the strategies of influenza prevention and control. Most of the current methods that are available are based on multiple sequence alignment and phylogenetic tree construction, which are time consuming and limited by the number of taxa. Arbitrarily defining genotypes further complicates the interpretation of genotyping results. &lt;b&gt;Methods:&lt;/b&gt; In this study, we describe a quantitative influenza genotyping algorithm based on the theory of quasispecies. First, the complete composition vector (CCV) was utilized to calculate the pairwise evolutionary distance between genotypes. Next, Hierarchical Bayesian Modeling using the Gibbs Sampling algorithm was applied to identify the segment genotype threshold, which is used to identify influenza segment genotype through a modularity calculation. The viral genotype was defined by combining eight segment genotypes based on the genetic reassortment feature of influenza A viruses. &lt;b&gt;Results:&lt;/b&gt; We applied this method for H5N1 avian influenza viruses and identified 107 niches among 283 viruses with a complete genome set. The diversity of viral genotypes, and their correlation with geographic locations suggests that these viruses form local niches after being introduced to a new ecological environment through poultry trade or bird migration. This novel method allows us to define genotypes in a robust, quantitative as well as hierarchical manner. 
&lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;wanhenry@yahoo.com&quot; locator-type=&quot;email&quot;&gt;wanhenry@yahoo.com&lt;/inter-ref&gt; or &lt;inter-ref locator=&quot;fvq7@cdc.gov&quot; locator-type=&quot;email&quot;&gt;fvq7@cdc.gov&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2368</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm354</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2353</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Methods of remote homology detection can be combined to increase coverage by 10% in the midnight zone</dc:title>
<dc:creator>Reid, Adam James</dc:creator>
<dc:creator>Yeats, Corin</dc:creator>
<dc:creator>Orengo, Christine Anne</dc:creator>
<dc:subject>SEQUENCE ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; A recent development in sequence-based remote homologue detection is the introduction of profile&#8211;profile comparison methods. These are more powerful than previous technologies and can detect potentially homologous relationships missed by structural classifications such as CATH and SCOP. As structural classifications traditionally act as the gold standard of homology this poses a challenge in benchmarking them. &lt;b&gt;Results:&lt;/b&gt; We present a novel approach which allows an accurate benchmark of these methods against the CATH structural classification. We then apply this approach to assess the accuracy of a range of publicly available methods for remote homology detection including several profile&#8211;profile methods (COMPASS, HHSearch, PRC) from two perspectives. First, in distinguishing homologous domains from non-homologues and second, in annotating proteomes with structural domain families. PRC is shown to be the best method for distinguishing homologues. We show that SAM is the best practical method for annotating genomes, whilst using COMPASS for the most remote homologues would increase coverage. Finally, we introduce a simple approach to increase the sensitivity of remote homologue detection by up to 10 %. This is achieved by combining multiple methods with a jury vote. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;reid@bioichem.ucl.ac.uk&quot; locator-type=&quot;email&quot;&gt;reid@bioichem.ucl.ac.uk&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2353</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm355</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2433</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Modular decomposition of metabolic reaction networks based on flux analysis and pathway projection</dc:title>
<dc:creator>Yoon, Jeongah</dc:creator>
<dc:creator>Si, Yaguang</dc:creator>
<dc:creator>Nolan, Ryan</dc:creator>
<dc:creator>Lee, Kyongbum</dc:creator>
<dc:subject>SYSTEMS BIOLOGY</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; The rational decomposition of biochemical networks into sub-structures has emerged as a useful approach to study the design of these complex systems. A biochemical network is characterized by an inhomogeneous connectivity distribution, which gives rise to several organizational features, including modularity. To what extent the connectivity-based modules reflect the functional organization of the network remains to be further explored. In this work, we examine the influence of physiological perturbations on the modular organization of cellular metabolism. &lt;b&gt;Results:&lt;/b&gt; Modules were characterized for two model systems, liver and adipocyte primary metabolism, by applying an algorithm for top&#8211;down partition of directed graphs with non-uniform edge weights. The weights were set by the engagement of the corresponding reactions as expressed by the flux distribution. For the base case of the fasted rat liver, three modules were found, carrying out the following biochemical transformations: ketone body production, glucose synthesis and transamination. This basic organization was further modified when different flux distributions were applied that describe the liver&apos;s metabolic response to whole body inflammation. For the fully mature adipocyte, only a single module was observed, integrating all of the major pathways needed for lipid storage. Weaker levels of integration between the pathways were found for the early stages of adipocyte differentiation. Our results underscore the inhomogeneous distribution of both connectivity and connection strengths, and suggest that global activity data such as the flux distribution can be used to study the organizational flexibility of cellular metabolism. 
&lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;kyongbum.lee@tufts.edu&quot; locator-type=&quot;email&quot;&gt;kyongbum.lee@tufts.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2433</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm374</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2441</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Quantitative quality-assessment techniques to compare fractionation and depletion methods in SELDI-TOF mass spectrometry experiments</dc:title>
<dc:creator>Harezlak, Jaroslaw</dc:creator>
<dc:creator>Wang, Mike</dc:creator>
<dc:creator>Christiani, David</dc:creator>
<dc:creator>Lin, Xihong</dc:creator>
<dc:subject>DATA AND TEXT MINING</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Mass spectrometry (MS), such as the surface-enhanced laser desorption and ionization time-of-flight (SELDI-TOF) MS, provides a potentially promising proteomic technology for biomarker discovery. An important matter for such a technology to be used routinely is its reproducibility. It is of significant interest to develop quantitative measures to evaluate the quality and reliability of different experimental methods. &lt;b&gt;Results:&lt;/b&gt; We compare the quality of SELDI-TOF MS data using unfractionated, fractionated plasma samples and abundant protein depletion methods in terms of the numbers of detected peaks and reliability. Several statistical quality-control and quality-assessment techniques are proposed, including the Graeco&#8211;Latin square design for the sample allocation on a Protein chip, the use of the pairwise Pearson correlation coefficient as the similarity measure between the spectra in conjunction with multi-dimensional scaling (MDS) for graphically evaluating similarity of replicates and assessing outlier samples; and the use of the reliability ratio for evaluating reproducibility. Our results show that the number of peaks detected is similar among the three sample preparation technologies, and the use of the Sigma multi-removal kit does not improve peak detection. Fractionation of plasma samples introduces more experimental variability. The peaks detected using the unfractionated plasma samples have the highest reproducibility as determined by the reliability ratio. 
&lt;b&gt;Availability:&lt;/b&gt; Our algorithm for assessment of SELDI-TOF experiment quality is available at &lt;inter-ref locator=&quot;http://www.biostat.harvard.edu/~xlin&quot; locator-type=&quot;url&quot;&gt;http://www.biostat.harvard.edu/~xlin&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;harezlak@post.harvard.edu&quot; locator-type=&quot;email&quot;&gt;harezlak@post.harvard.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2441</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm346</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2376</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Natively unstructured regions in proteins identified from contact predictions</dc:title>
<dc:creator>Schlessinger, Avner</dc:creator>
<dc:creator>Punta, Marco</dc:creator>
<dc:creator>Rost, Burkhard</dc:creator>
<dc:subject>STRUCTURAL BIOINFORMATICS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Natively unstructured (also dubbed &lt;it&gt;intrinsically disordered&lt;/it&gt;) regions in proteins lack a defined 3D structure under physiological conditions and often adopt regular structures under particular conditions. Proteins with such regions are overly abundant in eukaryotes, they may increase functional complexity of organisms and they usually evade structure determination in the unbound form. Low propensity for the formation of internal residue contacts has been previously used to predict natively unstructured regions. &lt;b&gt;Results:&lt;/b&gt; We combined PROFcon predictions for protein-specific contacts with a generic pairwise potential to predict unstructured regions. This novel method, &lt;it&gt;Ucon&lt;/it&gt;, outperformed the best available methods in predicting proteins with long unstructured regions. Furthermore, &lt;it&gt;Ucon&lt;/it&gt; correctly identified cases missed by other methods. By computing the difference between predictions based on specific contacts (approach introduced here) and those based on generic potentials (realized in other methods), we might identify unstructured regions that are involved in protein&#8211;protein binding. We discussed one example to illustrate this ambitious aim. Overall, Ucon added quality and an orthogonal aspect that may help in the experimental study of unstructured regions in network hubs. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://www.predictprotein.org/submit_ucon.html&quot; locator-type=&quot;url&quot;&gt;http://www.predictprotein.org/submit_ucon.html&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;as2067@columbia.edu&quot; locator-type=&quot;email&quot;&gt;as2067@columbia.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2376</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm349</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2385</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>AffyProbeMiner: a web resource for computing or retrieving accurately redefined Affymetrix probe sets</dc:title>
<dc:creator>Liu, Hongfang</dc:creator>
<dc:creator>Zeeberg, Barry R.</dc:creator>
<dc:creator>Qu, Gang</dc:creator>
<dc:creator>Koru, A. Gunes</dc:creator>
<dc:creator>Ferrucci, Alessandro</dc:creator>
<dc:creator>Kahn, Ari</dc:creator>
<dc:creator>Ryan, Michael C.</dc:creator>
<dc:creator>Nuhanovic, Antej</dc:creator>
<dc:creator>Munson, Peter J.</dc:creator>
<dc:creator>Reinhold, William C.</dc:creator>
<dc:creator>Kane, David W.</dc:creator>
<dc:creator>Weinstein, John N.</dc:creator>
<dc:subject>GENE EXPRESSION</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Affymetrix microarrays are widely used to measure global expression of mRNA transcripts. That technology is based on the concept of a probe set. Individual probes within a probe set were originally designated by Affymetrix to hybridize with the same unique mRNA transcript. Because of increasing accuracy in knowledge of genomic sequences, however, a substantial number of the manufacturer&apos;s original probe groupings and mappings are now known to be inaccurate and must be corrected. Otherwise, analysis and interpretation of an Affymetrix microarray experiment will be in error. &lt;b&gt;Results:&lt;/b&gt; AffyProbeMiner is a computationally efficient platform-independent tool that uses all RefSeq mature RNA protein coding transcripts and validated complete coding sequences in GenBank to (1) regroup the individual probes into consistent probe sets and (2) remap the probe sets to the correct sets of mRNA transcripts. The individual probes are grouped into probe sets that are &#8216;transcript-consistent&#8217; in that they hybridize to the same mRNA transcript (or transcripts) and, therefore, measure the same entity (or entities). About 65.6 % of the probe sets on the HG-U133A chip were affected by the remapping. Pre-computed regrouped and remapped probe sets for many Affymetrix microarrays are made freely available at the AffyProbeMiner web site. Alternatively, we provide a web service that enables the user to perform the remapping for any type of short-oligo commercial or custom array that has an Affymetrix-format Chip Definition File (CDF). Important features that differentiate AffyProbeMiner from other approaches are flexibility in the handling of splice variants, computational efficiency, extensibility, customizability and user-friendliness of the interface. 
&lt;b&gt;Availability:&lt;/b&gt; The web interface and software (GPL open source license), are publicly-accessible at &lt;inter-ref locator=&quot;http://discover.nci.nih.gov/affyprobeminer&quot; locator-type=&quot;url&quot;&gt;http://discover.nci.nih.gov/affyprobeminer&lt;/inter-ref&gt;. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;hl224@georgetown.edu&quot; locator-type=&quot;email&quot;&gt;hl224@georgetown.edu&lt;/inter-ref&gt; or &lt;inter-ref locator=&quot;barry@discover.nci.nih.gov&quot; locator-type=&quot;email&quot;&gt;barry@discover.nci.nih.gov&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2385</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm360</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2391</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Selection and validation of normalization methods for c-DNA microarrays using within-array replications</dc:title>
<dc:creator>Fan, Jianqing</dc:creator>
<dc:creator>Niu, Yue</dc:creator>
<dc:subject>GENE EXPRESSION</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Normalization of microarray data is essential for multiple-array analyses. Several normalization protocols have been proposed based on different biological or statistical assumptions. A fundamental problem arises whether they have effectively normalized arrays. In addition, for a given array, the question arises how to choose a method to most effectively normalize the microarray data. &lt;b&gt;Results:&lt;/b&gt; We propose several techniques to compare the effectiveness of different normalization methods. We approach the problem by constructing statistics to test whether there are any systematic biases in the expression profiles among duplicated spots within an array. The test statistics involve estimating the genewise variances. This is accomplished by using several novel methods, including empirical Bayes methods for moderating the genewise variances and the smoothing methods for aggregating variance information. &lt;it&gt;P&lt;/it&gt;-values are estimated based on a normal or &#967;&#178; approximation. With estimated &lt;it&gt;P&lt;/it&gt;-values, we can choose a most appropriate method to normalize a specific array and assess the extent to which the systematic biases due to the variations of experimental conditions have been removed. The effectiveness and validity of the proposed methods are convincingly illustrated by a carefully designed simulation study. The method is further illustrated by an application to human placenta cDNAs comprising a large number of clones with replications, a customized microarray experiment carrying just a few hundred genes on the study of the molecular roles of Interferons on tumor, and the Agilent microarrays carrying tens of thousands of total RNA samples in the MAQC project on the study of reproducibility, sensitivity and specificity of the data. &lt;b&gt;Availability:&lt;/b&gt; Code to implement the method in the statistical package R is available from the authors. 
&lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;jqfan@princeton.edu&quot; locator-type=&quot;email&quot;&gt;jqfan@princeton.edu&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2391</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm361</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2415</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Robustness analysis and tuning of synthetic gene networks</dc:title>
<dc:creator>Batt, Gr&#233;gory</dc:creator>
<dc:creator>Yordanov, Boyan</dc:creator>
<dc:creator>Weiss, Ron</dc:creator>
<dc:creator>Belta, Calin</dc:creator>
<dc:subject>SYSTEMS BIOLOGY</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; The goal of synthetic biology is to design and construct biological systems that present a desired behavior. The construction of synthetic gene networks implementing simple functions has demonstrated the feasibility of this approach. However, the design of these networks is difficult, notably because existing techniques and tools are not adapted to deal with uncertainties on molecular concentrations and parameter values. &lt;b&gt;Results:&lt;/b&gt; We propose an approach for the analysis of a class of uncertain piecewise-multiaffine differential equation models. This modeling framework is well adapted to the experimental data currently available. Moreover, these models present interesting mathematical properties that allow the development of efficient algorithms for solving robustness analyses and tuning problems. These algorithms are implemented in the tool RoVerGeNe, and their practical applicability and biological relevance are demonstrated on the analysis of the tuning of a synthetic transcriptional cascade built in &lt;it&gt;Escherichia coli&lt;/it&gt;. &lt;b&gt;Availability:&lt;/b&gt; RoVerGeNe and the transcriptional cascade model are available at &lt;inter-ref locator=&quot;http://iasi.bu.edu/%7Ebatt/rovergene/rovergene.htm&quot; locator-type=&quot;url&quot;&gt;http://iasi.bu.edu/%7Ebatt/rovergene/rovergene.htm&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;gregory.batt@imag.fr&quot; locator-type=&quot;email&quot;&gt;gregory.batt@imag.fr&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2415</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm362</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2407</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>LICORN: learning cooperative regulation networks from gene expression data</dc:title>
<dc:creator>Elati, Mohamed</dc:creator>
<dc:creator>Neuvial, Pierre</dc:creator>
<dc:creator>Bolotin-Fukuhara, Monique</dc:creator>
<dc:creator>Barillot, Emmanuel</dc:creator>
<dc:creator>Radvanyi, Fran&#231;ois</dc:creator>
<dc:creator>Rouveirol, C&#233;line</dc:creator>
<dc:subject>SYSTEMS BIOLOGY</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; One of the most challenging tasks in the post-genomic era is the reconstruction of transcriptional regulation networks. The goal is to identify, for each gene expressed in a particular cellular context, the regulators affecting its transcription, and the co-ordination of several regulators in specific types of regulation. DNA microarrays can be used to investigate relationships between regulators and their target genes, through simultaneous observations of their RNA levels. &lt;b&gt;Results:&lt;/b&gt; We propose a &lt;it&gt;data mining&lt;/it&gt; system for inferring transcriptional regulation relationships from RNA expression values. This system is particularly suitable for the detection of cooperative transcriptional regulation. We model regulatory relationships as labelled two-layer gene regulatory networks, and describe a method for the efficient learning of these bipartite networks from discretized expression data sets. We also evaluate the statistical significance of such inferred networks and validate our methods on two public yeast expression data sets. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://www.lri.fr/~elati/licorn.html&quot; locator-type=&quot;url&quot;&gt;http://www.lri.fr/~elati/licorn.html&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;mohamed.elati@curie.fr&quot; locator-type=&quot;email&quot;&gt;mohamed.elati@curie.fr&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2407</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm352</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2399</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Haplotype inference for present&#8211;absent genotype data using previously identified haplotypes and haplotype patterns</dc:title>
<dc:creator>Yoo, Yun Joo</dc:creator>
<dc:creator>Tang, Jianming</dc:creator>
<dc:creator>Kaslow, Richard A.</dc:creator>
<dc:creator>Zhang, Kui</dc:creator>
<dc:subject>GENETICS AND POPULATION ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Killer immunoglobulin-like receptor (KIR) genes vary considerably in their presence or absence on a specific regional haplotype. Because presence or absence of these genes is largely detected using locus-specific genotyping technology, the distinction between homozygosity and hemizygosity is often ambiguous. The performance of methods for haplotype inference (e.g. PL-EM, PHASE) for KIR genes may be compromised due to the large portion of ambiguous data. At the same time, many haplotypes or partial haplotype patterns have been previously identified and can be incorporated to facilitate haplotype inference for unphased genotype data. To accommodate the increased ambiguity of present&#8211;absent genotyping of KIR genes, we developed a hybrid approach combining a greedy algorithm with the Expectation-Maximization (EM) method for haplotype inference based on previously identified haplotypes and haplotype patterns. &lt;b&gt;Results:&lt;/b&gt; We implemented this algorithm in a software package named HAPLO-IHP (Haplotype inference using identified haplotype patterns) and compared its performance with that of HAPLORE and PHASE on simulated KIR genotypes. We compared five measures in order to evaluate the reliability of haplotype assignments and the accuracy in estimating haplotype frequency. Our method outperformed the two existing techniques by all five measures when either 60 % or 25 % of previously identified haplotypes were incorporated into the analyses. 
&lt;b&gt;Availability:&lt;/b&gt; The HAPLO-IHP is available at &lt;inter-ref locator=&quot;http://www.soph.uab.edu/Statgenetics/People/KZhang/HAPLO-IHP/index.html&quot; locator-type=&quot;url&quot;&gt;http://www.soph.uab.edu/Statgenetics/People/KZhang/HAPLO-IHP/index.html&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;KZhang@ms.soph.uab.edu&quot; locator-type=&quot;email&quot;&gt;KZhang@ms.soph.uab.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2399</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm371</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2423</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>miniTUBA: medical inference by network integration of temporal data using Bayesian analysis</dc:title>
<dc:creator>Xiang, Zuoshuang</dc:creator>
<dc:creator>Minter, Rebecca M.</dc:creator>
<dc:creator>Bi, Xiaoming</dc:creator>
<dc:creator>Woolf, Peter J.</dc:creator>
<dc:creator>He, Yongqun</dc:creator>
<dc:subject>SYSTEMS BIOLOGY</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Many biomedical and clinical research problems involve discovering causal relationships between observations gathered from temporal events. Dynamic Bayesian networks are a powerful modeling approach to describe causal or apparently causal relationships, and support complex medical inference, such as future response prediction, automated learning, and rational decision making. Although many engines exist for creating Bayesian networks, most require a local installation and significant data manipulation to be practical for a general biologist or clinician. No software pipeline currently exists for interpretation and inference of dynamic Bayesian networks learned from biomedical and clinical data. &lt;b&gt;Results:&lt;/b&gt; miniTUBA is a web-based modeling system that allows clinical and biomedical researchers to perform complex medical/clinical inference and prediction using dynamic Bayesian network analysis with temporal datasets. The software allows users to choose different analysis parameters (e.g. Markov lags and prior topology), and continuously update their data and refine their results. miniTUBA can make temporal predictions to suggest interventions based on an automated learning process pipeline using all data provided. Preliminary tests using synthetic data and laboratory research data indicate that miniTUBA accurately identifies regulatory network structures from temporal data. &lt;b&gt;Availability:&lt;/b&gt; miniTUBA is available at &lt;inter-ref locator=&quot;http://www.minituba.org&quot; locator-type=&quot;url&quot;&gt;http://www.minituba.org&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;yongqunh@med.umich.edu&quot; locator-type=&quot;email&quot;&gt;yongqunh@med.umich.edu&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2423</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm372</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2463</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Robust smooth segmentation approach for array CGH data analysis</dc:title>
<dc:creator>Huang, Jian</dc:creator>
<dc:creator>Gusnanto, Arief</dc:creator>
<dc:creator>O&apos;Sullivan, Kathleen</dc:creator>
<dc:creator>Staaf, Johan</dc:creator>
<dc:creator>Borg, &#197;ke</dc:creator>
<dc:creator>Pawitan, Yudi</dc:creator>
<dc:subject>DATA AND TEXT MINING</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Array comparative genomic hybridization (aCGH) provides a genome-wide technique to screen for copy number alteration. The existing segmentation approaches for analyzing aCGH data are based on modeling data as a series of discrete segments with unknown boundaries and unknown heights. Although the biological process of copy number alteration is discrete, in reality a variety of biological and experimental factors can cause the signal to deviate from a stepwise function. To take this into account, we propose a smooth segmentation (smoothseg) approach. &lt;b&gt;Methods:&lt;/b&gt; To achieve a robust segmentation, we use a doubly heavy-tailed random-effect model. The first heavy-tailed structure on the errors deals with outliers in the observations, and the second deals with possible jumps in the underlying pattern associated with different segments. We develop a fast and reliable computational procedure based on the iterative weighted least-squares algorithm with band-limited matrix inversion. &lt;b&gt;Results:&lt;/b&gt; Using simulated and real data sets, we demonstrate how smoothseg can aid in identification of regions with genomic alteration and in classification of samples. For the real data sets, smoothseg leads to a smaller false discovery rate and classification error rate than the circular binary segmentation (CBS) algorithm. In a realistic simulation setting, smoothseg is better than wavelet smoothing and CBS in identification of regions with genomic alterations and better than CBS in classification of samples. For comparative analyses, we demonstrate that segmenting the &lt;it&gt;t&lt;/it&gt;-statistics performs better than segmenting the data. 
&lt;b&gt;Availability:&lt;/b&gt; The R package &lt;ty&gt;smoothseg&lt;/ty&gt; to perform smooth segmentation is available from &lt;inter-ref locator=&quot;http://www.meb.ki.se/~yudpaw&quot; locator-type=&quot;url&quot;&gt;http://www.meb.ki.se/~yudpaw&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;yudi.pawitan@ki.se&quot; locator-type=&quot;email&quot;&gt;yudi.pawitan@ki.se&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2463</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm359</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2477</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Medline search engine for finding genetic markers with biological significance</dc:title>
<dc:creator>Xuan, Weijian</dc:creator>
<dc:creator>Wang, Pinglang</dc:creator>
<dc:creator>Watson, Stanley J.</dc:creator>
<dc:creator>Meng, Fan</dc:creator>
<dc:subject>DATA AND TEXT MINING</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Genome-wide high density SNP association studies are expected to identify various SNP alleles associated with different complex disorders. Understanding the biological significance of these SNP alleles in the context of existing literature is a major challenge since existing search engines are not designed to search literature for SNPs or other genetic markers. The literature mining of gene and protein functions has received significant attention and effort while similar work on genetic markers and their related diseases is still in its infancy. Our goal is to develop a web-based tool that facilitates the mining of Medline literature related to genetic studies and gene/protein function studies. Our solution consists of four main function modules: (1) identification of different types of genetic markers or genetic variations in Medline records; (2) distinguishing positive versus negative linkage or association between genetic markers and diseases; (3) integrating marker genomic location data from different databases to enable the retrieval of Medline records related to markers in the same linkage disequilibrium region; and (4) a web interface called MarkerInfoFinder to search, display, sort and download Medline citation results. Tests using published data suggest MarkerInfoFinder can significantly increase the efficiency of finding genetic disorders and their underlying molecular mechanisms. The functions we developed will also be used to build a knowledge base for genetic markers and diseases. 
&lt;b&gt;Availability:&lt;/b&gt; The MarkerInfoFinder is publicly available at: &lt;inter-ref locator=&quot;http://brainarray.mbni.med.umich.edu/brainarray/datamining/MarkerInfoFinder&quot; locator-type=&quot;url&quot;&gt;http://brainarray.mbni.med.umich.edu/brainarray/datamining/MarkerInfoFinder&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;mengf@umich.edu&quot; locator-type=&quot;email&quot;&gt;mengf@umich.edu&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2477</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm375</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2449</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>An approach to predict transcription factor DNA binding site specificity based upon gene and transcription factor functional categorization</dc:title>
<dc:creator>Qian, Ziliang</dc:creator>
<dc:creator>Lu, Lingyi</dc:creator>
<dc:creator>Liu, XiaoJun</dc:creator>
<dc:creator>Cai, Yu-Dong</dc:creator>
<dc:creator>Li, Yixue</dc:creator>
<dc:subject>DATA AND TEXT MINING</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; To understand transcription regulatory mechanisms, it is indispensable to investigate transcription factor (TF) DNA binding preferences. We noted that the generally acknowledged information of functional annotations of TFs as well as that of their target genes should provide useful hints in determining TF DNA binding preferences. &lt;b&gt;Results:&lt;/b&gt; In this contribution, we developed an integrative method based on the Nearest Neighbor Algorithm to predict DNA binding preferences through integrating both the functional/structural information of TFs and the interaction between TFs and their targets. The accuracy of cross-validation tests on the dataset consisting of 3430 positive samples and 7000 negative samples reaches 87.0 % for 10-fold cross-validation and 87.9 % for the jackknife cross-validation test, which is a much better result than that in our previous work. The prediction result indicates that the improved method we developed could be a powerful approach to infer the TF DNA preference &lt;it&gt;in silico&lt;/it&gt;. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;cyd@picb.ac.cn&quot; locator-type=&quot;email&quot;&gt;cyd@picb.ac.cn&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2449</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm348</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2470</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Analysis of array CGH data for cancer studies using fused quantile regression</dc:title>
<dc:creator>Li, Youjuan</dc:creator>
<dc:creator>Zhu, Ji</dc:creator>
<dc:subject>DATA AND TEXT MINING</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; The identification of DNA copy number changes provides insights that may advance our understanding of initiation and progression of cancer. Array-based comparative genomic hybridization (array-CGH) has emerged as a technique allowing high-throughput genome-wide scanning for chromosomal aberrations. A number of statistical methods have been proposed for the analysis of array-CGH data. In this article, we consider a fused quantile regression model based on three motivations: (1) quantile regression may provide a more comprehensive picture for the ratio profile of copy numbers than the standard mean regression approach; (2) for simplicity, most available methods assume uniform spacing between neighboring clones, while incorporating the information of physical locations of clones may be helpful and (3) most current methods have a set of tuning parameters that must be carefully tuned, which introduces complexity to the implementation. &lt;b&gt;Results:&lt;/b&gt; We formulate the detection of regions of gains and losses in a fused regularized quantile regression framework, incorporating physical locations of clones. We derive an efficient algorithm that computes the entire solution path for the resulting optimization problem, and we propose a simple estimate for the complexity of the fitted model, which leads to convenient selection of the tuning parameter. Three published array-CGH datasets are used to demonstrate our approach. 
&lt;b&gt;Availability:&lt;/b&gt; R code is available at &lt;inter-ref locator=&quot;http://www.stat.lsa.umich.edu/~jizhu/code/cgh/&quot; locator-type=&quot;url&quot;&gt;http://www.stat.lsa.umich.edu/~jizhu/code/cgh/&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;jizhu@umich.edu&quot; locator-type=&quot;email&quot;&gt;jizhu@umich.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2470</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm364</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2488</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Quality estimation of multiple sequence alignments by Bayesian hypothesis testing</dc:title>
<dc:creator>Tomovic, Andrija</dc:creator>
<dc:creator>Oakeley, Edward J.</dc:creator>
<dc:subject>SEQUENCE ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; In this work we present a web-based tool for estimating multiple alignment quality using Bayesian hypothesis testing. The proposed method is very simple, easily implemented and not time-consuming, with linear complexity. We evaluated the method against a series of different alignments (a set of random and biologically derived alignments) and compared the results with tools based on classical statistical methods (such as sFFT and csFFT). Taking the correlation coefficient as an objective criterion of the true quality, we found that Bayesian hypothesis testing performed better on average than the classical methods we tested. This approach may be used independently or as a component of any tool in computational biology which is based on the statistical estimation of alignment quality. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://www.fmi.ch/groups/functional.genomics/tool.htm&quot; locator-type=&quot;url&quot;&gt;http://www.fmi.ch/groups/functional.genomics/tool.htm&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;edward.oakeley@fmi.ch&quot; locator-type=&quot;email&quot;&gt;edward.oakeley@fmi.ch&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available from &lt;inter-ref locator=&quot;http://www.fmi.ch/groups/functional.genomics/tool-Supp.htm&quot; locator-type=&quot;url&quot;&gt;http://www.fmi.ch/groups/functional.genomics/tool-Supp.htm&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2488</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm366</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2485</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>HMMoC - a compiler for hidden Markov models</dc:title>
<dc:creator>Lunter, Gerton</dc:creator>
<dc:subject>SEQUENCE ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; Hidden Markov models are widely applied within computational biology. The large data sets and complex models involved demand optimized implementations, while efficient exploration of model space requires rapid prototyping. These requirements are not met by existing solutions, and hand-coding is time-consuming and error-prone. Here, I present a compiler that takes over the mechanical process of implementing HMM algorithms, by translating high-level XML descriptions into efficient C++ implementations. The compiler is highly customizable, produces efficient and bug-free code, and includes several optimizations. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://genserv.anat.ox.ac.uk/software&quot; locator-type=&quot;url&quot;&gt;http://genserv.anat.ox.ac.uk/software&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;gerton.lunter@dpag.ox.ac.uk&quot; locator-type=&quot;email&quot;&gt;gerton.lunter@dpag.ox.ac.uk&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2485</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm350</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2455</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Mining complex genotypic features for predicting HIV-1 drug resistance</dc:title>
<dc:creator>Saigo, Hiroto</dc:creator>
<dc:creator>Uno, Takeaki</dc:creator>
<dc:creator>Tsuda, Koji</dc:creator>
<dc:subject>DATA AND TEXT MINING</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Human immunodeficiency virus type 1 (HIV-1) evolves in the human body, and its exposure to a drug often causes mutations that enhance the resistance against the drug. To design an effective pharmacotherapy for an individual patient, it is important to accurately predict the drug resistance based on genotype data. Notably, the resistance is not just the simple sum of the effects of all mutations. Structural biological studies suggest that the association of mutations is crucial: even if mutations A or B alone do not affect the resistance, a significant change might happen when the two mutations occur together. Linear regression methods cannot take the associations into account, while decision tree methods can reveal only limited associations. Kernel methods and neural networks implicitly use all possible associations for prediction, but cannot select salient associations explicitly. &lt;b&gt;Results:&lt;/b&gt; Our method, &lt;it&gt;itemset boosting&lt;/it&gt;, performs linear regression in the complete space of power sets of mutations. It implements a forward feature selection procedure where, in each iteration, one mutation combination is found by an efficient branch-and-bound search. This method uses all possible combinations, and salient associations are explicitly shown. In experiments, our method worked particularly well for predicting the resistance of nucleotide reverse transcriptase inhibitors (NRTIs). Furthermore, it successfully recovered many mutation associations known in the biological literature. 
&lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://www.kyb.mpg.de/bs/people/hiroto/iboost/&quot; locator-type=&quot;url&quot;&gt;http://www.kyb.mpg.de/bs/people/hiroto/iboost/&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;koji.tsuda@tuebingen.mpg.de&quot; locator-type=&quot;email&quot;&gt;koji.tsuda@tuebingen.mpg.de&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2455</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm353</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2491</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Con-Struct Map: a comparative contact map analysis tool</dc:title>
<dc:creator>Chung, Jo-Lan</dc:creator>
<dc:creator>Beaver, John E.</dc:creator>
<dc:creator>Scheeff, Eric D.</dc:creator>
<dc:creator>Bourne, Philip E.</dc:creator>
<dc:subject>STRUCTURAL BIOINFORMATICS</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; Con-Struct Map is a graphical tool for the comparative study of protein structures. The tool detects potential conserved residue contacts shared by multiple protein structures by superimposing their contact maps according to a multiple structure alignment. In general, Con-Struct Map allows the study of structural changes resulting from, e.g. sequence substitutions, or alternatively, the study of conserved components of a structure framework across structurally aligned proteins. Specific applications include the study of sequence-structure relationship in distantly related proteins and the comparisons of wild type and mutant proteins. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://pdbrs3.sdsc.edu/ConStructMap/viewer_argument_generator/singleArguments&quot; locator-type=&quot;url&quot;&gt;http://pdbrs3.sdsc.edu/ConStructMap/viewer_argument_generator/singleArguments&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;bourne@sdsc.edu&quot; locator-type=&quot;email&quot;&gt;bourne@sdsc.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2491</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm356</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2501</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>EcoProDB: the Escherichia coli protein database</dc:title>
<dc:creator>Yun, Hongseok</dc:creator>
<dc:creator>Lee, Jeong Wook</dc:creator>
<dc:creator>Jeong, Joonwoo</dc:creator>
<dc:creator>Chung, Jaesung</dc:creator>
<dc:creator>Park, Jong Myoung</dc:creator>
<dc:creator>Myoung, Han Na</dc:creator>
<dc:creator>Lee, Sang Yup</dc:creator>
<dc:subject>DATABASES AND ONTOLOGIES</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; EcoProDB is a web-based database for comparative proteomics of &lt;it&gt;Escherichia coli&lt;/it&gt;. The database contains information on &lt;it&gt;E. coli&lt;/it&gt; proteins identified on 2D gels along with other resources collected from various databases and published literature, with a special feature of showing the expression levels of &lt;it&gt;E. coli&lt;/it&gt; proteins under different genetic and environmental conditions. It also provides comparative information on subcellular localization, theoretical 2D map, experimental 2D map and integrated protein information via an interactive web interface and applications such as the Map Browser. Users can also upload their own 2D gels, extract core information associated with the proteins and 2D gel results from different experiments and consequently generate new knowledge and hypotheses for further studies. &lt;b&gt;Availability:&lt;/b&gt; The EcoProDB database system is accessible at &lt;inter-ref locator=&quot;http://eecoli.kaist.ac.kr&quot; locator-type=&quot;url&quot;&gt;http://eecoli.kaist.ac.kr&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;leesy@kaist.ac.kr&quot; locator-type=&quot;email&quot;&gt;leesy@kaist.ac.kr&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2501</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm351</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2493</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>RefPlus: an R package extending the RMA Algorithm</dc:title>
<dc:creator>Harbron, Chris</dc:creator>
<dc:creator>Chang, Kai-Ming</dc:creator>
<dc:creator>South, Marie C.</dc:creator>
<dc:subject>GENE EXPRESSION</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; RMA has become a widely used methodology to pre-process Affymetrix gene expression microarrays. A limitation of RMA is that the calculated probeset intensities change when a set of microarrays is re-pre-processed after the inclusion of additional microarrays into the analysis set. Here we report the availability of the RefPlus package containing functions to perform the Extrapolation Strategy and Extrapolation Averaging algorithms which address these issues. &lt;b&gt;Availability:&lt;/b&gt; The software is implemented in the R language and can be downloaded from the Bioconductor project website (&lt;inter-ref locator=&quot;http://www.bioconductor.org&quot; locator-type=&quot;url&quot;&gt;http://www.bioconductor.org&lt;/inter-ref&gt;). &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;Chris.Harbron@AstraZeneca.Com&quot; locator-type=&quot;email&quot;&gt;Chris.Harbron@AstraZeneca.Com&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Further details of the workings and evaluation of these functions are given in the documentation available on the Bioconductor website. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2493</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm357</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2498</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>sMOL Explorer: an open source, web-enabled database and exploration tool for Small MOLecules datasets</dc:title>
<dc:creator>Ingsriswang, Supawadee</dc:creator>
<dc:creator>Pacharawongsakda, Eakasit</dc:creator>
<dc:subject>DATA AND TEXT MINING</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; sMOL Explorer is a 2D ligand-based computational tool that provides three major functionalities through a Web interface: data management, information retrieval and extraction, and statistical analysis and data mining. With sMOL Explorer, users can create personal databases by adding each small molecule via a drawing interface or uploading the data files from internal and external projects into the sMOL database. Then, the database can be browsed and queried with textual and structural similarity searches. The molecule can also be submitted to search against external public databases including PubChem, KEGG, DrugBank and eMolecules. Moreover, users can easily access a variety of data mining tools from Weka and R packages to perform analysis including (1) finding the frequent substructure, (2) clustering the molecular fingerprints, (3) identifying and removing irrelevant attributes from the data and (4) building the classification model of biological activity. &lt;b&gt;Availability:&lt;/b&gt; sMOL Explorer is an Open Source project and is freely available to all interested users at &lt;inter-ref locator=&quot;http://www.biotec.or.th/ISL/SMOL/&quot; locator-type=&quot;url&quot;&gt;http://www.biotec.or.th/ISL/SMOL/&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;supawadee@biotec.or.th&quot; locator-type=&quot;email&quot;&gt;supawadee@biotec.or.th&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2498</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm363</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2504</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>A Laboratory Information Management System (LIMS) for a high throughput genetic platform aimed at candidate gene mutation screening</dc:title>
<dc:creator>Voegele, C.</dc:creator>
<dc:creator>Tavtigian, S.V.</dc:creator>
<dc:creator>de Silva, D.</dc:creator>
<dc:creator>Cuber, S.</dc:creator>
<dc:creator>Thomas, A.</dc:creator>
<dc:creator>Le Calvez-Kelm, F.</dc:creator>
<dc:subject>DATABASES AND ONTOLOGIES</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; High throughput mutation screening in an automated environment generates large data sets that have to be organized and stored reliably. Complex multistep workflows require strict process management and careful data tracking. We have developed a Laboratory Information Management System (LIMS) tailored to high throughput candidate gene mutation scanning and resequencing that respects these requirements. Designed with a client/server architecture, our system is platform independent and based on open-source tools from the database to the web application development strategy. Flexible, expandable and secure, the LIMS, by communicating with most of the laboratory instruments and robots, tracks samples and laboratory information, capturing data at every step of our automated mutation screening workflow. An important feature of our LIMS is that it enables tracking of information through a laboratory workflow where the process at one step is contingent on results from a previous step. &lt;b&gt;Availability:&lt;/b&gt; Script for MySQL database table creation and source code of the whole JSP application are freely available on our website: &lt;inter-ref locator=&quot;http://www-gcs.iarc.fr/lims/&quot; locator-type=&quot;url&quot;&gt;http://www-gcs.iarc.fr/lims/&lt;/inter-ref&gt;. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;voegele@iarc.fr&quot; locator-type=&quot;email&quot;&gt;voegele@iarc.fr&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; System server configuration, database structure and additional details on the LIMS and the mutation screening workflow are available on our website: &lt;inter-ref locator=&quot;http://www-gcs.iarc.fr/lims/&quot; locator-type=&quot;url&quot;&gt;http://www-gcs.iarc.fr/lims/&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2504</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm365</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/18/2495</identifier><datestamp>2007-09-17</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:18</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>APID2NET: unified interactome graphic analyzer</dc:title>
<dc:creator>Hernandez-Toro, Juan</dc:creator>
<dc:creator>Prieto, Carlos</dc:creator>
<dc:creator>De Las Rivas, Javier</dc:creator>
<dc:subject>SYSTEMS BIOLOGY</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Exploration and analysis of interactome networks at systems level require unification of the biomolecular elements and annotations that come from many different high-throughput or small-scale proteomic experiments. Only such integration can provide a non-redundant and consistent identification of proteins and interactions. APID2NET is a new tool that works with Cytoscape to allow surfing unified interactome data by querying the APID server (&lt;inter-ref locator=&quot;http://bioinfow.dep.usal.es/apid/&quot; locator-type=&quot;url&quot;&gt;http://bioinfow.dep.usal.es/apid/&lt;/inter-ref&gt;) to provide interactive analysis of protein&#8211;protein interaction (PPI) networks. The program is designed to visualize, explore and analyze the proteins and interactions retrieved, including the annotations and attributes associated with them, such as: GO terms, InterPro domains, experimental methods that validate each interaction, PubMed IDs, UniProt IDs, etc. The tool provides interactive graphical representation of the networks with all Cytoscape capabilities, plus new automatic tools to find concurrent functional and structural attributes along all protein pairs in a network. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://bioinfow.dep.usal.es/apid/apid2net.html&quot; locator-type=&quot;url&quot;&gt;http://bioinfow.dep.usal.es/apid/apid2net.html&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;jrivas@usal.es&quot; locator-type=&quot;email&quot;&gt;jrivas@usal.es&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Installation Guide and User&apos;s Guide are supplied at the Web site indicated above. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-09-17</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/18/2495</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm373</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/538</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Improving the accuracy of transmembrane protein topology prediction using evolutionary information</dc:title>
<dc:creator>Jones, David T.</dc:creator>
<dc:subject>SEQUENCE ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Many important biological processes such as cell signaling, transport of membrane-impermeable molecules, cell&#8211;cell communication, cell recognition and cell adhesion are mediated by membrane proteins. Unfortunately, as these proteins are not water soluble, it is extremely hard to experimentally determine their structure. Therefore, improved methods for predicting the structure of these proteins are vital in biological research. In order to improve transmembrane topology prediction, we evaluate the combined use of both integrated signal peptide prediction and evolutionary information in a single algorithm. &lt;b&gt;Results:&lt;/b&gt; A new method (MEMSAT3) for predicting transmembrane protein topology from sequence profiles is described and benchmarked with full cross-validation on a standard data set of 184 transmembrane proteins. The method is found to predict both the correct topology and the locations of transmembrane segments for 80% of the test set. This compares with accuracies of 62&#8211;72% for other popular methods on the same benchmark. By using a second neural network specifically to discriminate transmembrane from globular proteins, a very low overall false positive rate (0.5%) can also be achieved in detecting transmembrane proteins. &lt;b&gt;Availability:&lt;/b&gt; An implementation of the described method is available both as a web server (&lt;inter-ref locator=&quot;http://www.psipred.net&quot; locator-type=&quot;url&quot;&gt;http://www.psipred.net&lt;/inter-ref&gt;) and as downloadable source code from &lt;inter-ref locator=&quot;http://bioinf.cs.ucl.ac.uk/memsat&quot; locator-type=&quot;url&quot;&gt;http://bioinf.cs.ucl.ac.uk/memsat&lt;/inter-ref&gt;. Both the server and source code files are free to non-commercial users. Benchmark and training data are also available from &lt;inter-ref locator=&quot;http://bioinf.cs.ucl.ac.uk/memsat&quot; locator-type=&quot;url&quot;&gt;http://bioinf.cs.ucl.ac.uk/memsat&lt;/inter-ref&gt;. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;dtj@cs.ucl.ac.uk&quot; locator-type=&quot;email&quot;&gt;dtj@cs.ucl.ac.uk&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/538</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl677</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/527</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>I-Ssp6803I: the first homing endonuclease from the PD-(D/E)XK superfamily exhibits an unusual mode of DNA recognition</dc:title>
<dc:creator>Orlowski, Jerzy</dc:creator>
<dc:creator>Boniecki, Michal</dc:creator>
<dc:creator>Bujnicki, Janusz M.</dc:creator>
<dc:subject>STRUCTURAL BIOINFORMATICS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Restriction endonucleases (REases) and homing endonucleases (HEases) are biotechnologically important enzymes. Nearly all structurally characterized REases belong to the PD-(D/E)XK superfamily of nucleases, while most HEases belong to an unrelated LAGLIDADG superfamily. These two protein folds are typically associated with very different modes of protein-DNA recognition, consistent with the different mechanisms of action required to achieve high specificity. REases recognize short DNA sequences using multiple contacts per base pair, while HEases recognize very long sites using a few contacts per base pair, thereby allowing for partial degeneracy of the target sequence. Thus far, neither REases with the LAGLIDADG fold, nor HEases with the PD-(D/E)XK fold, have been found. &lt;b&gt;Results:&lt;/b&gt; Using protein fold recognition, we have identified the first member of the PD-(D/E)XK superfamily among homing endonucleases, a cyanobacterial enzyme I-Ssp6803I. We present a model of the I-Ssp6803I-DNA complex based on the structure of Type II restriction endonuclease R.BglI and predict the active site and residues involved in specific DNA sequence recognition by I-Ssp6803I. Our finding reveals a new unexpected evolutionary link between HEases and REases and suggests how PD-(D/E)XK nucleases may develop a &#8216;HEase-like&#8217; way of interacting with the extended DNA sequence. This in turn may be exploited to study the evolution of DNA sequence specificity and to engineer nucleases with new substrate specificities. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;iamb@genesilico.pl&quot; locator-type=&quot;email&quot;&gt;iamb@genesilico.pl&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/527</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm007</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/555</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>A new protein-protein docking scoring function based on interface residue properties</dc:title>
<dc:creator>Bernauer, J.</dc:creator>
<dc:creator>Az&#233;, J.</dc:creator>
<dc:creator>Janin, J.</dc:creator>
<dc:creator>Poupon, A.</dc:creator>
<dc:subject>STRUCTURAL BIOINFORMATICS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Protein&#8211;protein complexes are known to play key roles in many cellular processes. However, they are often not accessible to experimental study because of their low stability and the difficulty of producing the proteins and assembling them in their native conformation. Thus, docking algorithms have been developed to provide an &lt;it&gt;in silico&lt;/it&gt; approach to the problem. A protein&#8211;protein docking procedure traditionally consists of two successive tasks: a search algorithm generates a large number of candidate solutions, and then a scoring function is used to rank them. &lt;b&gt;Results:&lt;/b&gt; To address the second step, we developed a scoring function based on a Vorono&#239; tessellation of the protein three-dimensional structure. We showed that the Vorono&#239; representation may be used to describe, in a simplified but useful manner, the geometric and physico-chemical complementarities of two molecular surfaces. We measured a set of parameters on native protein&#8211;protein complexes and on decoys, and used them as attributes in several statistical learning procedures: a logistic function, Support Vector Machines (SVM), and a genetic algorithm. For the latter, we used ROGER, a genetic algorithm designed to optimize the area under the receiver operating characteristics curve. To further test the scores derived with ROGER, we ranked models generated by two different docking algorithms on targets of a blind prediction experiment, improving in almost all cases the rank of native-like solutions. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://genomics.eu.org/spip/-Bioinformatics-tools-&quot; locator-type=&quot;url&quot;&gt;http://genomics.eu.org/spip/-Bioinformatics-tools-&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/555</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl654</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/545</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>The Treeterbi and Parallel Treeterbi algorithms: efficient, optimal decoding for ordinary, generalized and pair HMMs</dc:title>
<dc:creator>Keibler, Evan</dc:creator>
<dc:creator>Arumugam, Manimozhiyan</dc:creator>
<dc:creator>Brent, Michael R.</dc:creator>
<dc:subject>SEQUENCE ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Hidden Markov models (HMMs) and generalized HMMs have been successfully applied to many problems, but the standard Viterbi algorithm for computing the most probable interpretation of an input sequence (known as decoding) requires memory proportional to the length of the sequence, which can be prohibitive. Existing approaches to reducing memory usage either sacrifice optimality or trade increased running time for reduced memory. &lt;b&gt;Results:&lt;/b&gt; We developed two novel decoding algorithms, Treeterbi and Parallel Treeterbi, and implemented them in the TWINSCAN/N-SCAN gene-prediction system. The worst case asymptotic space and time are the same as for standard Viterbi, but in practice, Treeterbi optimally decodes arbitrarily long sequences with generalized HMMs in bounded memory without increasing running time. Parallel Treeterbi uses the same ideas to split optimal decoding across processors, dividing latency to completion by approximately the number of available processors with constant average overhead per processor. Using these algorithms, we were able to optimally decode all human chromosomes with N-SCAN, which increased its accuracy relative to heuristic solutions. We also implemented Treeterbi for Pairagon, our pair HMM based cDNA-to-genome aligner. &lt;b&gt;Availability:&lt;/b&gt; The TWINSCAN/N-SCAN/PAIRAGON open source software package is available from &lt;inter-ref locator=&quot;http://genes.cse.wustl.edu&quot; locator-type=&quot;url&quot;&gt;http://genes.cse.wustl.edu&lt;/inter-ref&gt;. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;brent@cse.wustl.edu&quot; locator-type=&quot;email&quot;&gt;brent@cse.wustl.edu&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/545</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl659</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/531</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Computing exact P-values for DNA motifs</dc:title>
<dc:creator>Zhang, Jing</dc:creator>
<dc:creator>Jiang, Bo</dc:creator>
<dc:creator>Li, Ming</dc:creator>
<dc:creator>Tromp, John</dc:creator>
<dc:creator>Zhang, Xuegong</dc:creator>
<dc:creator>Zhang, Michael Q.</dc:creator>
<dc:subject>SEQUENCE ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Many heuristic algorithms have been designed to approximate &lt;it&gt;P&lt;/it&gt;-values of DNA motifs described by position weight matrices, for evaluating their statistical significance. They often significantly deviate from the true &lt;it&gt;P&lt;/it&gt;-value by orders of magnitude. Exact &lt;it&gt;P&lt;/it&gt;-value computation is needed for ranking the motifs. Furthermore, surprisingly, the complexity of the problem is unknown. &lt;b&gt;Results:&lt;/b&gt; We show the problem to be NP-hard, and present MotifRank, software based on dynamic programming, to calculate exact &lt;it&gt;P&lt;/it&gt;-values of motifs. We define the exact &lt;it&gt;P&lt;/it&gt;-value on a general and more precise model. Asymptotically, MotifRank is faster than the best exact &lt;it&gt;P&lt;/it&gt;-value computing algorithm, and is in fact practical. Our experiments clearly demonstrate that MotifRank significantly improves the accuracy of existing approximation algorithms. &lt;b&gt;Availability:&lt;/b&gt; MotifRank is available from &lt;inter-ref locator=&quot;http://bio.dlg.cn&quot; locator-type=&quot;url&quot;&gt;http://bio.dlg.cn&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;mzhang@cshl.edu&quot; locator-type=&quot;email&quot;&gt;mzhang@cshl.edu&lt;/inter-ref&gt; &lt;inter-ref locator=&quot;mli@uwaterloo.ca&quot; locator-type=&quot;email&quot;&gt;mli@uwaterloo.ca&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/531</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl662</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/627</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>SNPchip: R classes and methods for SNP array data</dc:title>
<dc:creator>Scharpf, Robert B.</dc:creator>
<dc:creator>Ting, Jason C.</dc:creator>
<dc:creator>Pevsner, Jonathan</dc:creator>
<dc:creator>Ruczinski, Ingo</dc:creator>
<dc:subject>GENOME ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; High-density single nucleotide polymorphism microarrays (SNP chips) provide information on a subject&apos;s genome, such as copy number and genotype (heterozygosity/homozygosity) at a SNP. While fluorescence &lt;it&gt;in situ&lt;/it&gt; hybridization and karyotyping reveal many abnormalities, SNP chips provide a higher resolution map of the human genome that can be used to detect, e.g., aneuploidies, microdeletions, microduplications and loss of heterozygosity (LOH). As a variety of diseases are linked to such chromosomal abnormalities, SNP chips promise new insights for these diseases by aiding in the discovery of such regions, and may suggest targets for intervention. The R package &lt;it&gt;SNPchip&lt;/it&gt; contains classes and methods useful for storing, visualizing and analyzing high density SNP data. Originally developed from the SNPscan web-tool, &lt;it&gt;SNPchip&lt;/it&gt; utilizes S4 classes and extends other open source R tools available at Bioconductor. This has numerous advantages, including the ability to build statistical models for SNP-level data that operate on instances of the class, and to communicate with other R packages that add additional functionality. &lt;b&gt;Availability:&lt;/b&gt; The package is available from the Bioconductor web page at &lt;inter-ref locator=&quot;www.bioconductor.org&quot; locator-type=&quot;url&quot;&gt;www.bioconductor.org&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;ingo@jhu.edu&quot; locator-type=&quot;email&quot;&gt;ingo@jhu.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; The supplementary material as described in this article (case studies, installation guidelines and R code) is available from &lt;inter-ref locator=&quot;http://biostat.jhsph.edu/~iruczins/publications/sm/&quot; locator-type=&quot;url&quot;&gt;http://biostat.jhsph.edu/~iruczins/publications/sm/&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/627</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl638</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/582</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Analysis of E.coli promoter recognition problem in dinucleotide feature space</dc:title>
<dc:creator>Rani, T. Sobha</dc:creator>
<dc:creator>Bhavani, S. Durga</dc:creator>
<dc:creator>Bapi, Raju S.</dc:creator>
<dc:subject>STRUCTURAL BIOINFORMATICS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Patterns in the promoter sequences within a species are known to be conserved but there exist many exceptions to this rule, which makes promoter recognition a complex problem. Although many complex feature extraction schemes coupled with several classifiers have been proposed for promoter recognition in the current literature, the problem is still open. &lt;b&gt;Results:&lt;/b&gt; A dinucleotide global feature extraction method is proposed for the recognition of sigma-70 promoters in &lt;it&gt;Escherichia coli&lt;/it&gt; in this article. The positive data set consists of sigma-70 promoters with known transcription starting points which are part of regulonDB and promec databases. Four different kinds of negative data sets are considered, two of them biological sets (Gordon &lt;it&gt;et al&lt;/it&gt;., &lt;cross-ref type=&quot;bib&quot; refid=&quot;B5&quot;&gt;2003&lt;/cross-ref&gt;) and the other two synthetic data sets. Our results reveal that a single-layer perceptron using dinucleotide features is able to achieve an accuracy of 80% against a background of biological non-promoters and 96% for random data sets. A scheme for locating the promoter regions in a given genome sequence is proposed. A deeper analysis of the data set shows that there is a bifurcation of the data set into two distinct classes, a majority class and a minority class. Our results point out that the majority class, constituting the majority promoter and the majority non-promoter signal, is linearly separable. Also the minority class is linearly separable. We further show that the feature extraction and classification methods proposed in the paper are generic enough to be applied to the more complex problem of eucaryotic promoter recognition. We present Drosophila promoter recognition as a case study. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://202.41.85.117/htmfiles/faculty/tsr/tsr.html&quot; locator-type=&quot;url&quot;&gt;http://202.41.85.117/htmfiles/faculty/tsr/tsr.html&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;tsrcs@uohyd.ernet.in&quot; locator-type=&quot;email&quot;&gt;tsrcs@uohyd.ernet.in&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/582</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl670</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/605</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Comparison of human protein-protein interaction maps</dc:title>
<dc:creator>Futschik, Matthias E.</dc:creator>
<dc:creator>Chaurasia, Gautam</dc:creator>
<dc:creator>Herzel, Hanspeter</dc:creator>
<dc:subject>SYSTEMS BIOLOGY</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Large-scale mappings of protein&#8211;protein interactions have started to give us new views of the complex molecular mechanisms inside a cell. After initial projects to systematically map protein interactions in model organisms such as yeast, worm and fly, researchers have begun to focus on the mapping of the human interactome. To tackle this enormous challenge, different approaches have been proposed and pursued. While several large-scale human protein interaction maps have recently been published, their quality remains to be critically assessed. &lt;b&gt;Results:&lt;/b&gt; We present here a first comparative analysis of eight currently available large-scale maps with a total of over 10&#8201;000 unique proteins and 57&#8201;000 interactions included. They are based on literature search, orthology or yeast-two-hybrid assays. Comparison reveals only a small but statistically significant overlap. More importantly, our analysis gives clear indications that all interaction maps imply considerable selection and detection biases. These results have to be taken into account for future assembly of the human interactome. &lt;b&gt;Availability:&lt;/b&gt; An integrated human interaction network called Unified Human Interactome (&lt;it&gt;UniHI&lt;/it&gt;) is made publicly accessible at &lt;inter-ref locator=&quot;http://www.mdc-berlin.de/unihi&quot; locator-type=&quot;url&quot;&gt;http://www.mdc-berlin.de/unihi&lt;/inter-ref&gt;. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;m.futschik@biologie.hu-berlin.de&quot; locator-type=&quot;email&quot;&gt;m.futschik@biologie.hu-berlin.de&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/605</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl683</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/612</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Speeding up tandem mass spectrometry database search: metric embeddings and fast near neighbor search</dc:title>
<dc:creator>Dutta, Debojyoti</dc:creator>
<dc:creator>Chen, Ting</dc:creator>
<dc:subject>DATA AND TEXT MINING</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Due to the recent advances in technology of mass spectrometry, there has been an exponential increase in the amount of data being generated in the past few years. Database searches have not been able to keep up with this data explosion. Thus, speeding up the data searches becomes increasingly important in mass-spectrometry-based applications. Traditional database search methods use one-against-all comparisons of a query spectrum against a very large number of peptides generated from &lt;it&gt;in silico&lt;/it&gt; digestion of protein sequences in a database, to filter potential candidates from this database followed by a detailed scoring and ranking of those filtered candidates. &lt;b&gt;Results:&lt;/b&gt; In this article, we show that we can avoid the one-against-all comparisons. The basic idea is to design a set of hash functions to pre-process peptides in the database such that for each query spectrum we can use the hash functions to find only a small subset of peptide sequences that are most likely to match the spectrum. The construction of each hash function is based on a random spectrum and the hash value of a peptide is the normalized shared peak counts score (cosine) between the random spectrum and the hypothetical spectrum of the peptide. To implement this idea, we first embed each peptide into a unit vector in a high-dimensional metric space. The random spectrum is represented by a random vector, and we use random vectors to construct a set of hash functions called locality sensitive hashing (LSH) for preprocessing. We demonstrate that our mapping is accurate. We show that our method can filter out &gt;95.65% of the spectra without missing any correct sequences, or gain 111 times speedup by filtering out 99.64% of spectra while missing at most 0.19% (2 out of 1014) of the correct sequences. 
In addition, we show that our method can be effectively used for other mass spectra mining applications such as finding clusters of spectra efficiently and accurately. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;tingchen@usc.edu&quot; locator-type=&quot;email&quot;&gt;tingchen@usc.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/612</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl645</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/619</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Peak selection from MALDI-TOF mass spectra using ant colony optimization</dc:title>
<dc:creator>Ressom, H. W.</dc:creator>
<dc:creator>Varghese, R. S.</dc:creator>
<dc:creator>Drake, S. K.</dc:creator>
<dc:creator>Hortin, G. L.</dc:creator>
<dc:creator>Abdel-Hamid, M.</dc:creator>
<dc:creator>Loffredo, C. A.</dc:creator>
<dc:creator>Goldman, R.</dc:creator>
<dc:subject>DATA AND TEXT MINING</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Due to the large number of peaks in mass spectra of low-molecular-weight (LMW) enriched sera, a systematic method is needed to select a parsimonious set of peaks to facilitate biomarker identification. We present computational methods for matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) spectral data preprocessing and peak selection. In particular, we propose a novel method that combines ant colony optimization (ACO) with support vector machines (SVM) to select a small set of useful peaks. &lt;b&gt;Results:&lt;/b&gt; The proposed hybrid ACO-SVM algorithm selected a panel of eight peaks out of 228 candidate peaks from MALDI-TOF spectra of LMW enriched sera. An SVM classifier built with these peaks achieved 94% sensitivity and 100% specificity in distinguishing hepatocellular carcinoma from cirrhosis in a blind validation set of 69 samples. Area under the receiver operating characteristic (ROC) curve was 0.996. The classification capability of these peaks is compared with those selected by the SVM-recursive feature elimination method. &lt;b&gt;Availability:&lt;/b&gt; Supplementary material and MATLAB scripts to implement the methods described in this article are available at &lt;inter-ref locator=&quot;http://microarray.georgetown.edu/web/files/bioinf.htm&quot; locator-type=&quot;url&quot;&gt;http://microarray.georgetown.edu/web/files/bioinf.htm&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;hwr@georgetown.edu&quot; locator-type=&quot;email&quot;&gt;hwr@georgetown.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/619</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl678</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/597</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Protein-protein interaction site prediction based on conditional random fields</dc:title>
<dc:creator>Li, Ming-Hui</dc:creator>
<dc:creator>Lin, Lei</dc:creator>
<dc:creator>Wang, Xiao-Long</dc:creator>
<dc:creator>Liu, Tao</dc:creator>
<dc:subject>SYSTEMS BIOLOGY</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; We are motivated by the fast-growing number of protein structures in the Protein Data Bank with necessary information for prediction of protein&#8211;protein interaction sites to develop methods for identification of residues participating in protein&#8211;protein interactions. We would like to compare conditional random fields (CRFs)-based method with conventional classification-based methods that omit the relation between two labels of neighboring residues to show the advantages of CRFs-based method in predicting protein&#8211;protein interaction sites. &lt;b&gt;Results:&lt;/b&gt; The prediction of protein&#8211;protein interaction sites is solved as a sequential labeling problem by applying CRFs with features including protein sequence profile and residue accessible surface area. The CRFs-based method can achieve a comparable performance with state-of-the-art methods, when 1276 nonredundant hetero-complex protein chains are used as training and test set. Experimental result shows that CRFs-based method is a powerful and robust protein&#8211;protein interaction site prediction method and can be used to guide biologists to make specific experiments on proteins. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://www.insun.hit.edu.cn/~mhli/site_CRFs/index.html&quot; locator-type=&quot;url&quot;&gt;http://www.insun.hit.edu.cn/~mhli/site_CRFs/index.html&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;mhli@insun.hit.edu.cn&quot; locator-type=&quot;email&quot;&gt;mhli@insun.hit.edu.cn&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/597</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl660</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/573</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Structure-based evaluation of in silico predictions of protein-protein interactions using Comparative Docking</dc:title>
<dc:creator>Cockell, Simon J.</dc:creator>
<dc:creator>Oliva, Baldo</dc:creator>
<dc:creator>Jackson, Richard M.</dc:creator>
<dc:subject>STRUCTURAL BIOINFORMATICS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Due to the limitations in experimental methods for determining binary interactions and structure determination of protein complexes, the need exists for computational models to fill the increasing gap between genome sequence information and protein annotation. Here we describe a novel method that uses structural models to reduce a large number of &lt;it&gt;in silico&lt;/it&gt; predictions to a high confidence subset that is amenable to experimental validation. &lt;b&gt;Results:&lt;/b&gt; A two-stage evaluation procedure was developed. First, a sequence-based method assessed the conservation of protein interface patches used in the original &lt;it&gt;in silico&lt;/it&gt; prediction method, both in terms of position within the primary sequence, and in terms of sequence conservation. When applying the most stringent conditions it was found that 20.5% of the data set being assessed passed this test. Secondly, a high-throughput structure-based docking evaluation procedure assessed the soundness of three dimensional models produced for the putative interactions. Of the data set being assessed, 8264 interactions or over 70% could be modelled in this way, and 27% of these can be considered &#8216;valid&#8217; by the applied criteria. In all, 6.9% of the interactions passed both the tests and can be considered to be a high confidence set of predicted interactions, several of which are described. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://bioinformatics.leeds.ac.uk/~bmb4sjc&quot; locator-type=&quot;url&quot;&gt;http://bioinformatics.leeds.ac.uk/~bmb4sjc&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;r.m.jackson@leeds.ac.uk&quot; locator-type=&quot;email&quot;&gt;r.m.jackson@leeds.ac.uk&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/573</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl661</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/589</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Automatic recognition and annotation of gene expression patterns of fly embryos</dc:title>
<dc:creator>Zhou, Jie</dc:creator>
<dc:creator>Peng, Hanchuan</dc:creator>
<dc:subject>GENE EXPRESSION</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Gene expression patterns obtained by &lt;it&gt;in situ&lt;/it&gt; mRNA hybridization provide important information about different genes during &lt;it&gt;Drosophila&lt;/it&gt; embryogenesis. So far, annotations of these images are done by manually assigning a subset of anatomy ontology terms to an image. This time-consuming process depends heavily on the consistency of experts. &lt;b&gt;Results:&lt;/b&gt; We develop a system to automatically annotate a fruitfly&apos;s embryonic tissue in which a gene has expression. We formulate the task as an image pattern recognition problem. For a new fly embryo image, our system answers two questions: (1) Which stage range does an image belong to? (2) Which annotations should be assigned to an image? We propose to identify the wavelet embryo features by multi-resolution 2D wavelet discrete transform, followed by min-redundancy max-relevance feature selection, which yields optimal distinguishing features for an annotation. We then construct a series of parallel bi-class predictors to solve the multi-objective annotation problem since each image may correspond to multiple annotations. &lt;b&gt;Supplementary information:&lt;/b&gt; The complete annotation prediction results are available at: &lt;inter-ref locator=&quot;http://www.cs.niu.edu/~jzhou/papers/fruitfly&quot; locator-type=&quot;url&quot;&gt;http://www.cs.niu.edu/~jzhou/papers/fruitfly&lt;/inter-ref&gt; and &lt;inter-ref locator=&quot;http://research.janelia.org/peng/proj/fly_embryo_annotation/&quot; locator-type=&quot;url&quot;&gt;http://research.janelia.org/peng/proj/fly_embryo_annotation/&lt;/inter-ref&gt;. The datasets used in experiments will be available upon request to the correspondence author. 
&lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;jzhou@cs.niu.edu&quot; locator-type=&quot;email&quot;&gt;jzhou@cs.niu.edu&lt;/inter-ref&gt; and &lt;inter-ref locator=&quot;pengh@janelia.hhmi.org&quot; locator-type=&quot;email&quot;&gt;pengh@janelia.hhmi.org&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/589</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl680</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/563</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Molecular basis for specificity in the druggable kinome: sequence-based analysis</dc:title>
<dc:creator>Chen, Jianping</dc:creator>
<dc:creator>Zhang, Xi</dc:creator>
<dc:creator>Fern&#225;ndez, Ariel</dc:creator>
<dc:subject>STRUCTURAL BIOINFORMATICS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Rational design of kinase inhibitors remains a challenge partly because there is no clear delineation of the molecular features that direct the pharmacological impact towards clinically relevant targets. Standard factors governing ligand affinity, such as potential for intermolecular hydrophobic interactions or for intermolecular hydrogen bonding do not provide good markers to assess cross reactivity. &lt;it&gt;Thus, a core question in the informatics of drug design is what type of molecular similarity among targets promotes promiscuity and what type of molecular difference governs specificity&lt;/it&gt;. This work answers the question for a sizable screened sample of the human pharmacokinome including targets with unreported structure. &lt;b&gt;Results:&lt;/b&gt; We show that drug design aimed at promoting pairwise interactions between ligand and kinase target actually fosters promiscuity because of the high conservation of the partner groups on or around the ATP-binding site of the kinase. Alternatively, we focus on a structural marker that may be reliably determined from sequence and measures dehydration propensities mostly localized on the loopy regions of kinases. Based on this marker, we construct a sequence-based kinase classifier that enables the accurate prediction of pharmacological differences. Our indicator is a microenvironmental descriptor that quantifies the propensity for water exclusion around preformed polar pairs. The results suggest that targeting polar dehydration patterns heralds a new generation of drugs that enable a tighter control of specificity than designs aimed at promoting ligand&#8211;kinase pairwise interactions. 
&lt;b&gt;Availability:&lt;/b&gt; The predictor of polar hot spots for dehydration propensity, or solvent-accessible hydrogen bonds in soluble proteins, named YAPView, may be freely downloaded from the University of Chicago website &lt;inter-ref locator=&quot;http://protlib.uchicago.edu/dloads.html&quot; locator-type=&quot;url&quot;&gt;http://protlib.uchicago.edu/dloads.html&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;arifer@rice.edu&quot; locator-type=&quot;email&quot;&gt;arifer@rice.edu&lt;/inter-ref&gt;, &lt;inter-ref locator=&quot;ariel@uchicago.edu&quot; locator-type=&quot;email&quot;&gt;ariel@uchicago.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/563</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl666</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/648</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>APOPTO-CELL--a simulation tool and interactive database for analyzing cellular susceptibility to apoptosis</dc:title>
<dc:creator>Huber, Heinrich J.</dc:creator>
<dc:creator>Rehm, Markus</dc:creator>
<dc:creator>Plchut, Martin</dc:creator>
<dc:creator>D&#252;ssmann, Heiko</dc:creator>
<dc:creator>Prehn, Jochen H. M.</dc:creator>
<dc:subject>SYSTEMS BIOLOGY</dc:subject>
<dc:description> We have developed a web service that provides a comprehensive analysis of the susceptibility of cells to undergo apoptosis in response to an activation of the mitochondrial apoptotic pathway. Based on ordinary differential equations, (pre-determined) protein concentrations and release kinetics of mitochondrial pro-apoptotic factors, a network of 52 reactions and 19 reaction partners can be employed as a tool to display temporal protein profiles, to identify key regulatory proteins and to determine critical threshold concentrations required for the execution of apoptosis in HeLa cancer cells or other cell types. The web service also provides an interactive database function for the deposition of cell-type-specific quantitative data. In addition, the web service provides an output that can be compared directly to experimental results obtained from real-time single-cell experiments, making this a widely applicable systems biology tool for apoptosis and cancer researchers. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://systemsbiology.rcsi.ie/apopto-cell.html&quot; locator-type=&quot;url&quot;&gt;http://systemsbiology.rcsi.ie/apopto-cell.html&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;mrehm@rcsi.ie&quot; locator-type=&quot;email&quot;&gt;mrehm@rcsi.ie&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/648</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl684</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/631</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>OMWSA: detection of DNA repeats using moving window spectral analysis</dc:title>
<dc:creator>Du, Liping</dc:creator>
<dc:creator>Zhou, Hongxia</dc:creator>
<dc:creator>Yan, Hong</dc:creator>
<dc:subject>SEQUENCE ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; Repetitive DNA sequences play paramount biological roles, such as gene variation and regulatory functions on gene expressions. Accurate detection of the various kinds of DNA repeats is still an open problem. In this article, we propose a new method and a visualization tool for detecting DNA repeats in a 2D plane of location and frequency by using optimized moving window spectral analysis. The spectrogram can display the general distribution of repetitive sequences while showing the repeat period, length and location without any prior knowledge. Experimental results demonstrate that our method is accurate and robust even under the condition of excessive mutating and interleaving. &lt;b&gt;Availability:&lt;/b&gt; Available on &lt;inter-ref locator=&quot;http://www.hy8.com/~tec/sw01/omwsa01.zip&quot; locator-type=&quot;url&quot;&gt;http://www.hy8.com/~tec/sw01/omwsa01.zip&lt;/inter-ref&gt;. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;h.yan@cityu.edu.hk&quot; locator-type=&quot;email&quot;&gt;h.yan@cityu.edu.hk&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/631</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm008</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/651</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>BioWeka--extending the Weka framework for bioinformatics</dc:title>
<dc:creator>Gewehr, Jan E.</dc:creator>
<dc:creator>Szugat, Martin</dc:creator>
<dc:creator>Zimmer, Ralf</dc:creator>
<dc:subject>DATA AND TEXT MINING</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; Given the growing amount of biological data, data mining methods have become an integral part of bioinformatics research. Unfortunately, standard data mining tools are often not sufficiently equipped for handling raw data such as amino acid sequences. One popular and freely available framework that contains many well-known data mining algorithms is the Waikato Environment for Knowledge Analysis (Weka). In the BioWeka project, we introduce various input formats for bioinformatics data and bioinformatics methods like alignments to Weka. This allows users to easily combine them with Weka&apos;s classification, clustering, validation and visualization facilities on a single platform and therefore reduces the overhead of converting data between different data formats as well as the need to write custom evaluation procedures that can deal with many different programs. We encourage users to participate in this project by adding their own components and data formats to BioWeka. &lt;b&gt;Availability:&lt;/b&gt; The software, documentation and tutorial are available at &lt;inter-ref locator=&quot;http://www.bioweka.org&quot; locator-type=&quot;url&quot;&gt;http://www.bioweka.org&lt;/inter-ref&gt;. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;support@bioweka.org&quot; locator-type=&quot;email&quot;&gt;support@bioweka.org&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/651</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl671</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/634</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>DP-Bind: a web server for sequence-based prediction of DNA-binding residues in DNA-binding proteins</dc:title>
<dc:creator>Hwang, Seungwoo</dc:creator>
<dc:creator>Gou, Zhenkun</dc:creator>
<dc:creator>Kuznetsov, Igor B.</dc:creator>
<dc:subject>SEQUENCE ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; This article describes DP-Bind, a web server for predicting DNA-binding sites in a DNA-binding protein from its amino acid sequence. The web server implements three machine learning methods: support vector machine, kernel logistic regression and penalized logistic regression. Prediction can be performed using either the input sequence alone or an automatically generated profile of evolutionary conservation of the input sequence in the form of a PSI-BLAST position-specific scoring matrix (PSSM). PSSM-based kernel logistic regression achieves an accuracy of 77.2%, a sensitivity of 76.4% and a specificity of 76.6%. The outputs of all three individual methods are combined into a consensus prediction to help identify positions predicted with a high level of confidence. &lt;b&gt;Availability:&lt;/b&gt; Freely available at &lt;inter-ref locator=&quot;http://lcg.rit.albany.edu/dp-bind&quot; locator-type=&quot;url&quot;&gt;http://lcg.rit.albany.edu/dp-bind&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;IKuznetsov@albany.edu&quot; locator-type=&quot;email&quot;&gt;IKuznetsov@albany.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; &lt;inter-ref locator=&quot;http://lcg.rit.albany.edu/dp-bind/dpbind_supplement.html&quot; locator-type=&quot;url&quot;&gt;http://lcg.rit.albany.edu/dp-bind/dpbind_supplement.html&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/634</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl672</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/639</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Software package for automatic microarray image analysis (MAIA)</dc:title>
<dc:creator>Novikov, Eugene</dc:creator>
<dc:creator>Barillot, Emmanuel</dc:creator>
<dc:subject>GENE EXPRESSION</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; Although various software solutions are currently available for microarray image analysis, one would still expect to develop algorithms ensuring a higher level of intelligence and robustness. We present a fully functional software package for automatic processing of two-color microarray images including spot localization, quantification and quality control. The developed algorithms aim at making ratio estimates more resistant to array contamination and offer automatic tools to evaluate spot quality. &lt;b&gt;Availability:&lt;/b&gt; A demo version of the software can be downloaded from &lt;inter-ref locator=&quot;http://bioinfo.curie.fr/projects/maia&quot; locator-type=&quot;url&quot;&gt;http://bioinfo.curie.fr/projects/maia&lt;/inter-ref&gt;. A full version is freely available to non-commercial users upon request from the authors. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;eugene.novikov@curie.fr&quot; locator-type=&quot;email&quot;&gt;eugene.novikov@curie.fr&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/639</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl644</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/641</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>R/qtlbim: QTL with Bayesian Interval Mapping in experimental crosses</dc:title>
<dc:creator>Yandell, Brian S.</dc:creator>
<dc:creator>Mehta, Tapan</dc:creator>
<dc:creator>Banerjee, Samprit</dc:creator>
<dc:creator>Shriner, Daniel</dc:creator>
<dc:creator>Venkataraman, Ramprasad</dc:creator>
<dc:creator>Moon, Jee Young</dc:creator>
<dc:creator>Neely, W. Whipple</dc:creator>
<dc:creator>Wu, Hao</dc:creator>
<dc:creator>von Smith, Randy</dc:creator>
<dc:creator>Yi, Nengjun</dc:creator>
<dc:subject>GENETICS AND POPULATION ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; R/qtlbim is an extensible, interactive environment for the Bayesian Interval Mapping of QTL, built on top of R/qtl (Broman &lt;it&gt;et al&lt;/it&gt;., &lt;cross-ref type=&quot;bib&quot; refid=&quot;B3&quot;&gt;2003&lt;/cross-ref&gt;), providing Bayesian analysis of multiple interacting quantitative trait loci (QTL) models for continuous, binary and ordinal traits in experimental crosses. It includes several efficient Markov chain Monte Carlo (MCMC) algorithms for evaluating the posterior of genetic architectures, i.e. the number and locations of QTL, their main and epistatic effects and gene&#8211;environment interactions. R/qtlbim provides extensive informative graphical and numerical summaries, and model selection and convergence diagnostics of the MCMC output, illustrated through the vignette, example and demo capabilities of R (R Development Core Team &lt;cross-ref type=&quot;bib&quot; refid=&quot;B6&quot;&gt;2006&lt;/cross-ref&gt;). &lt;b&gt;Availability:&lt;/b&gt; The package is freely available from &lt;inter-ref locator=&quot;cran.r-project.org&quot; locator-type=&quot;url&quot;&gt;cran.r-project.org&lt;/inter-ref&gt;. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;byandell@wisc.edu&quot; locator-type=&quot;email&quot;&gt;byandell@wisc.edu&lt;/inter-ref&gt; or &lt;inter-ref locator=&quot;nyi@ms.soph.uab.edu&quot; locator-type=&quot;email&quot;&gt;nyi@ms.soph.uab.edu&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/641</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm011</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/654</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>The ESID Online Database network</dc:title>
<dc:creator>Guzman, D.</dc:creator>
<dc:creator>Veit, D.</dc:creator>
<dc:creator>Knerr, V.</dc:creator>
<dc:creator>Kindle, G.</dc:creator>
<dc:creator>Gathmann, B.</dc:creator>
<dc:creator>Eades-Perner, A. M.</dc:creator>
<dc:creator>Grimbacher, B.</dc:creator>
<dc:subject>DATABASES AND ONTOLOGIES</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; Primary immunodeficiencies (PIDs) belong to the group of rare diseases. The European Society for Immunodeficiencies (ESID) is establishing an innovative European patient and research database network for continuous long-term documentation of patients, in order to improve the diagnosis, classification, prognosis and therapy of PIDs. The ESID Online Database is a web-based system aimed at data storage, data entry, reporting and the import of pre-existing data sources in an enterprise business-to-business integration (B2B). The online database is based on the Java 2 Platform, Enterprise Edition (J2EE) with high-standard security features, which comply with data protection laws and the demands of a modern research platform. &lt;b&gt;Availability:&lt;/b&gt; The ESID Online Database is accessible via the official website (&lt;inter-ref locator=&quot;http://www.esid.org/&quot; locator-type=&quot;url&quot;&gt;http://www.esid.org/&lt;/inter-ref&gt;). &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;b.grimbacher@medsch.ucl.ac.uk&quot; locator-type=&quot;email&quot;&gt;b.grimbacher@medsch.ucl.ac.uk&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/654</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl675</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/654-a</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>SNPassoc: an R package to perform whole genome association studies</dc:title>
<dc:creator>Gonz&#225;lez, Juan R.</dc:creator>
<dc:creator>Armengol, Llu&#237;s</dc:creator>
<dc:creator>Sol&#233;, Xavier</dc:creator>
<dc:creator>Guin&#243;, Elisabet</dc:creator>
<dc:creator>Mercader, Josep M.</dc:creator>
<dc:creator>Estivill, Xavier</dc:creator>
<dc:creator>Moreno, V&#237;ctor</dc:creator>
<dc:subject>GENETICS AND POPULATION ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; The popularization of large-scale genotyping projects has led to the widespread adoption of genetic association studies as the tool of choice in the search for single nucleotide polymorphisms (SNPs) underlying susceptibility to complex diseases. Although the analysis of individual SNPs is a relatively trivial task, when the number is large and multiple genetic models need to be explored, a tool to automate the analyses becomes necessary. In order to address this issue, we developed SNPassoc, an R package to carry out the most common analyses in whole genome association studies. These analyses include descriptive statistics and exploratory analysis of missing values, calculation of Hardy&#8211;Weinberg equilibrium, analysis of association based on generalized linear models (either for quantitative or binary traits), and analysis of multiple SNPs (haplotype and epistasis analysis). &lt;b&gt;Availability:&lt;/b&gt; Package SNPassoc is available at CRAN from &lt;inter-ref locator=&quot;http://cran.r-project.org&quot; locator-type=&quot;url&quot;&gt;http://cran.r-project.org&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;juanramon.gonzalez@crg.es&quot; locator-type=&quot;email&quot;&gt;juanramon.gonzalez@crg.es&lt;/inter-ref&gt; or &lt;inter-ref locator=&quot;v.moreno@iconcologia.net&quot; locator-type=&quot;email&quot;&gt;v.moreno@iconcologia.net&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; A tutorial is available on &lt;it&gt;Bioinformatics&lt;/it&gt; online and in &lt;inter-ref locator=&quot;http://davinci.crg.es/estivill_lab/snpassoc&quot; locator-type=&quot;url&quot;&gt;http://davinci.crg.es/estivill_lab/snpassoc&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/654-a</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm025</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/637</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>SMotif: a server for structural motifs in proteins</dc:title>
<dc:creator>Pugalenthi, Ganesan</dc:creator>
<dc:creator>Suganthan, P. N.</dc:creator>
<dc:creator>Sowdhamini, R.</dc:creator>
<dc:creator>Chakrabarti, Saikat</dc:creator>
<dc:subject>STRUCTURAL BIOINFORMATICS</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; SMotif is a server that identifies important structural segments or motifs for one or more given protein structures based on conservation of both sequence and important structural features such as solvent inaccessibility, secondary structural content, hydrogen bonding pattern and residue packing. This server also provides three-dimensional orientation patterns of the identified motifs in terms of inter-motif distances and torsion angles. These motifs may form the common core and can therefore also be employed to design and rationalize protein engineering and folding experiments. &lt;b&gt;Availability:&lt;/b&gt; SMotif server is available via the URL &lt;inter-ref locator=&quot;http://caps.ncbs.res.in/SMotif/index.html&quot; locator-type=&quot;url&quot;&gt;http://caps.ncbs.res.in/SMotif/index.html&lt;/inter-ref&gt;. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;chakraba@mail.nih.gov&quot; locator-type=&quot;email&quot;&gt;chakraba@mail.nih.gov&lt;/inter-ref&gt;, &lt;inter-ref locator=&quot;mini@ncbs.res.in&quot; locator-type=&quot;email&quot;&gt;mini@ncbs.res.in&lt;/inter-ref&gt; or &lt;inter-ref locator=&quot;EPNSugan@ntu.edu.sg&quot; locator-type=&quot;email&quot;&gt;EPNSugan@ntu.edu.sg&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/637</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl679</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/629</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Compressed suffix tree--a basis for genome-scale sequence analysis</dc:title>
<dc:creator>V&#228;lim&#228;ki, Niko</dc:creator>
<dc:creator>Gerlach, Wolfgang</dc:creator>
<dc:creator>Dixit, Kashyap</dc:creator>
<dc:creator>M&#228;kinen, Veli</dc:creator>
<dc:subject>SEQUENCE ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; The suffix tree is one of the most fundamental data structures in string algorithms and biological sequence analysis. Unfortunately, when it comes to implementing those algorithms and applying them to real genomic sequences, main memory size often becomes the bottleneck. This is easily explained by the fact that while a DNA sequence of length &lt;it&gt;n&lt;/it&gt; from alphabet &#931; = {&lt;it&gt;A&lt;/it&gt;,&#8201;&lt;it&gt;C&lt;/it&gt;,&#8201;&lt;it&gt;G&lt;/it&gt;,&#8201;&lt;it&gt;T&lt;/it&gt;&#8201;} can be stored in &lt;it&gt;n&lt;/it&gt;&#8201;log&#8201;|&#931;|&#8201;=&#8201;2&lt;it&gt;n&lt;/it&gt; bits, its suffix tree occupies &lt;it&gt;O&lt;/it&gt;(&lt;it&gt;n&lt;/it&gt; log &lt;it&gt;n&lt;/it&gt;) bits. In practice, the size difference easily reaches a factor of 50. We provide an implementation of the &lt;it&gt;compressed suffix tree&lt;/it&gt; very recently proposed by Sadakane (&lt;it&gt;Theory of Computing Systems&lt;/it&gt;, in press). The compressed suffix tree occupies space proportional to the text size, i.e. &lt;it&gt;O&lt;/it&gt;(&lt;it&gt;n&lt;/it&gt; log |&#931;|) bits, and supports all typical suffix tree operations with at most a log &lt;it&gt;n&lt;/it&gt; factor slowdown. Our experiments show that, e.g. on a 10 MB DNA sequence, the compressed suffix tree takes 10% of the space of a normal suffix tree. Typical operations are slowed down by a factor of 60. &lt;b&gt;Availability:&lt;/b&gt; The C++ implementation under GNU license is available at &lt;inter-ref locator=&quot;http://www.cs.helsinki.fi/group/suds/cst/&quot; locator-type=&quot;url&quot;&gt;http://www.cs.helsinki.fi/group/suds/cst/&lt;/inter-ref&gt;. An example program implementing a typical pattern discovery task is included. Experimental results in this note correspond to version 0.95. 
&lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;vmakinen@cs.helsinki.fi&quot; locator-type=&quot;email&quot;&gt;vmakinen@cs.helsinki.fi&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/629</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl681</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/5/646</identifier><datestamp>2007-03-21</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:5</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>SBaddon: high performance simulation for the Systems Biology Toolbox for MATLAB</dc:title>
<dc:creator>Schmidt, Henning</dc:creator>
<dc:subject>SYSTEMS BIOLOGY</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; We present the SBaddon package as an extension to the Systems Biology Toolbox for MATLAB (SBtoolbox). The goal of this extension is to provide the users of the SBtoolbox with important functionality that is needed for parameter estimation applications. While simulation in the SBtoolbox relies on the MATLAB ODE solvers, the SBaddon package provides considerably increased simulation performance through automatic generation of compiled simulation functions. Furthermore, the package contains improved optimization algorithms, forward parameter sensitivity analysis and basic numeric parameter identifiability analysis. &lt;b&gt;Availability:&lt;/b&gt; The SBaddon package is open source and freely available for non-commercial use. Commercial use of SBaddon is only possible through a specific licensing agreement (contact &lt;inter-ref locator=&quot;sbaddon@sbtoolbox.org&quot; locator-type=&quot;email&quot;&gt;sbaddon@sbtoolbox.org&lt;/inter-ref&gt;). SBaddon can be obtained from &lt;inter-ref locator=&quot;http://www.sbtoolbox.org/SBaddon&quot; locator-type=&quot;url&quot;&gt;http://www.sbtoolbox.org/SBaddon&lt;/inter-ref&gt;. The website also contains extensive documentation, and examples. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;henning@fcc.chalmers.se&quot; locator-type=&quot;email&quot;&gt;henning@fcc.chalmers.se&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-03-21</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/5/646</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl668</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1188</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>GAPWM: a genetic algorithm method for optimizing a position weight matrix</dc:title>
<dc:creator>Li, Leping</dc:creator>
<dc:creator>Liang, Yu</dc:creator>
<dc:creator>Bass, Robert L.</dc:creator>
<dc:subject>SEQUENCE ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Position weight matrices (PWMs) are simple models commonly used in motif-finding algorithms to identify short functional elements, such as &lt;it&gt;cis&lt;/it&gt;-regulatory motifs, on genes. When few experimentally verified motifs are available, estimation of the PWM may be poor. The resultant PWM may not reliably discriminate a true motif from a false one. While experimentally identifying such motifs remains time-consuming and expensive, low-resolution binding data from techniques such as ChIP-on-chip and ChIP-PET have become available. We propose a novel but simple method to improve a poorly estimated PWM using ChIP data. &lt;b&gt;Methodology:&lt;/b&gt; Starting from an existing PWM, a set of ChIP sequences, and a set of background sequences, our method, GAPWM, derives an improved PWM via a genetic algorithm that maximizes the area under the receiver operating characteristic (ROC) curve. GAPWM can easily incorporate prior information such as base conservation. We tested our method on two PWMs (Oct4/Sox2 and p53) using three recently published ChIP data sets (human Oct4, mouse Oct4 and human p53). &lt;b&gt;Results:&lt;/b&gt; GAPWM substantially increased the sensitivity/specificity of a poorly estimated PWM and further improved the quality of a good PWM. Furthermore, it still functioned when the starting PWM contained a major error. The ROC performance of GAPWM compared favorably with that of MEME and others. With increasing availability of ChIP data, our method provides an alternative for obtaining high-quality PWMs for genome-wide identification of transcription factor binding sites. 
&lt;b&gt;Availability:&lt;/b&gt; The C source code and all data used in this report are available at &lt;inter-ref locator=&quot;http://dir.niehs.nih.gov/dirbb/gapwm&quot; locator-type=&quot;url&quot;&gt;http://dir.niehs.nih.gov/dirbb/gapwm&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;li3@niehs.nih.gov&quot; locator-type=&quot;email&quot;&gt;li3@niehs.nih.gov&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1188</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm080</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1181</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>IMEx: Imperfect Microsatellite Extractor</dc:title>
<dc:creator>Mudunuri, Suresh B.</dc:creator>
<dc:creator>Nagarajaram, Hampapathalu A.</dc:creator>
<dc:subject>GENOME ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Microsatellites, also known as simple sequence repeats, are tandem repeats of nucleotide motifs of size 1&#8211;6&#8201;bp found in every genome known so far. Their importance in genomes is well known. Microsatellites are associated with various disease genes, have been used as molecular markers in linkage analysis and DNA fingerprinting studies, and also seem to play an important role in genome evolution. Therefore, it is important to study the distribution, enrichment and polymorphism of microsatellites in the genomes of interest. For this, the prerequisite is the availability of a computational tool for extraction of microsatellites (perfect as well as imperfect) and their related information from whole genome sequences. Examination of available tools revealed certain lacunae in them and prompted us to develop a new tool. &lt;b&gt;Results:&lt;/b&gt; In order to efficiently screen genome sequences for microsatellites (perfect as well as imperfect), we developed a new tool called IMEx (Imperfect Microsatellite Extractor). IMEx uses a simple string-matching algorithm with a sliding-window approach to screen DNA sequences for microsatellites and reports the motif, copy number, genomic location, nearby genes, mutational events and many other features useful for in-depth studies. IMEx is more sensitive, efficient and useful than the available widely used tools. IMEx is available in the form of a stand-alone program as well as in the form of a web-server. 
&lt;b&gt;Availability:&lt;/b&gt; A World Wide Web server and the stand-alone program are available for free access at &lt;inter-ref locator=&quot;http://203.197.254.154/IMEX/&quot; locator-type=&quot;url&quot;&gt;http://203.197.254.154/IMEX/&lt;/inter-ref&gt; or &lt;inter-ref locator=&quot;http://www.cdfd.org.in/imex&quot; locator-type=&quot;url&quot;&gt;http://www.cdfd.org.in/imex&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;han@cdfd.org.in&quot; locator-type=&quot;email&quot;&gt;han@cdfd.org.in&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1181</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm097</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1195</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>A fast and flexible approach to oligonucleotide probe design for genomes and gene families</dc:title>
<dc:creator>Feng, Shengzhong</dc:creator>
<dc:creator>Tillier, Elisabeth R.M.</dc:creator>
<dc:subject>SEQUENCE ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; With hundreds of completely sequenced microbial genomes available, and advancements in DNA microarray technology, the detection of genes in microbial communities consisting of hundreds of thousands of sequences may be possible. The existing strategies developed for DNA probe design, geared toward identifying specific sequences, are not suitable because they lack the coverage, flexibility and efficiency necessary for applications in metagenomics. &lt;b&gt;Methods:&lt;/b&gt; ProDesign is a tool developed for the selection of oligonucleotide probes to detect members of gene families present in environmental samples. Gene family-specific probe sequences are generated based on specific and shared words, which are found with the spaced seed hashing algorithm. To detect more sequences, those sharing some common words are re-clustered into new families, then probes specific for the new families are generated. &lt;b&gt;Results:&lt;/b&gt; The program is very flexible in that it can be used for designing probes for detecting many gene families simultaneously and specifically in one or more genomes. Neither the length nor the melting temperature of the probes needs to be predefined. We have found that ProDesign provides more flexibility, coverage and speed than other software programs used in the selection of probes for genomic and gene family arrays. &lt;b&gt;Availability:&lt;/b&gt; ProDesign is licensed free of charge to academic users. ProDesign and Supplementary Material can be obtained by contacting the authors. 
A web server for ProDesign is available at &lt;inter-ref locator=&quot;http://www.uhnresearch.ca/labs/tillier/ProDesign/ProDesign.html&quot; locator-type=&quot;url&quot;&gt;http://www.uhnresearch.ca/labs/tillier/ProDesign/ProDesign.html&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;e.tillier@utoronto.ca&quot; locator-type=&quot;email&quot;&gt;e.tillier@utoronto.ca&lt;/inter-ref&gt; or &lt;inter-ref locator=&quot;fsz@ncic.ac.cn&quot; locator-type=&quot;email&quot;&gt;fsz@ncic.ac.cn&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1195</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm114</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1203</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>AutoSCOP: automated prediction of SCOP classifications using unique pattern-class mappings</dc:title>
<dc:creator>Gewehr, Jan E.</dc:creator>
<dc:creator>Hintermair, Volker</dc:creator>
<dc:creator>Zimmer, Ralf</dc:creator>
<dc:subject>STRUCTURAL BIOINFORMATICS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; The sequence patterns contained in the available motif and hidden Markov model (HMM) databases are a valuable source of information for protein sequence annotation. For structure prediction and fold recognition purposes, we computed mappings from such pattern databases to the protein domain hierarchy given by the ASTRAL compendium and applied them to the prediction of SCOP classifications. Our aim is to make highly confident predictions also for non-trivial cases if possible and abstain from a prediction otherwise, and thus to provide a method that can be used as a first step in a pipeline of prediction methods. We describe two successful examples for such pipelines. With the AutoSCOP approach, it is possible to make predictions in a large-scale manner for many domains of the available sequences in the well-known protein sequence databases. &lt;b&gt;Results:&lt;/b&gt; AutoSCOP computes unique sequence patterns and pattern combinations for SCOP classifications. For instance, we assign a SCOP superfamily to a pattern found in its members whenever the pattern does not occur in any other SCOP superfamily. Especially on the fold and superfamily level, our method achieves both high sensitivity (above 93%) and high specificity (above 98%) on the difference set between two ASTRAL versions, due to being able to abstain from unreliable predictions. Further, on a harder test set filtered at low sequence identity, the combination with profile&#8211;profile alignments improves accuracy and performs comparably even to structure alignment methods. Integrating our method with structure alignment, we are able to achieve an accuracy of 99% on SCOP fold classifications on this set. 
In an analysis of false assignments of domains from new folds/superfamilies/families to existing SCOP classifications, AutoSCOP correctly abstains for more than 70% of the domains belonging to new folds and superfamilies, and more than 80% of the domains belonging to new families. These findings show that our approach is a useful additional filter for SCOP classification prediction of protein domains in combination with well-known methods such as profile&#8211;profile alignment. &lt;b&gt;Availability:&lt;/b&gt; A web server where users can input their domain sequences is available at &lt;inter-ref locator=&quot;http://www.bio.ifi.lmu.de/autoscop&quot; locator-type=&quot;url&quot;&gt;http://www.bio.ifi.lmu.de/autoscop&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;jan.gewehr@ifi.lmu.de&quot; locator-type=&quot;email&quot;&gt;jan.gewehr@ifi.lmu.de&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1203</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm089</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1282</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>UniRef: comprehensive and non-redundant UniProt reference clusters</dc:title>
<dc:creator>Suzek, Baris E.</dc:creator>
<dc:creator>Huang, Hongzhan</dc:creator>
<dc:creator>McGarvey, Peter</dc:creator>
<dc:creator>Mazumder, Raja</dc:creator>
<dc:creator>Wu, Cathy H.</dc:creator>
<dc:subject>DATABASES AND ONTOLOGIES</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Redundant protein sequences in biological databases hinder sequence similarity searches and make interpretation of search results difficult. Clustering of protein sequence space based on sequence similarity helps organize all sequences into manageable datasets and reduces sampling bias and overrepresentation of sequences. &lt;b&gt;Results:&lt;/b&gt; The UniRef (UniProt Reference Clusters) provide clustered sets of sequences from the UniProt Knowledgebase (UniProtKB) and selected UniProt Archive records to obtain complete coverage of sequence space at several resolutions while hiding redundant sequences. Currently covering &gt;4 million source sequences, the UniRef100 database combines identical sequences and subfragments from any source organism into a single UniRef entry. UniRef90 and UniRef50 are built by clustering UniRef100 sequences at the 90 or 50% sequence identity levels. UniRef100, UniRef90 and UniRef50 yield a database size reduction of &#8764;10, 40 and 70%, respectively, from the source sequence set. The reduced redundancy increases the speed of similarity searches and improves detection of distant relationships. UniRef entries contain summary cluster and membership information, including the sequence of a representative protein, member count and common taxonomy of the cluster, the accession numbers of all the merged entries and links to rich functional annotation in UniProtKB to facilitate biological discovery. UniRef has already been applied to broad research areas ranging from genome annotation to proteomics data analysis. 
&lt;b&gt;Availability:&lt;/b&gt; UniRef is updated biweekly and is available for online search and retrieval at &lt;inter-ref locator=&quot;http://www.uniprot.org&quot; locator-type=&quot;url&quot;&gt;http://www.uniprot.org&lt;/inter-ref&gt;, as well as for download at &lt;inter-ref locator=&quot;ftp://ftp.uniprot.org/pub/databases/uniprot/uniref&quot; locator-type=&quot;url&quot;&gt;ftp://ftp.uniprot.org/pub/databases/uniprot/uniref&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;bes23@georgetown.edu&quot; locator-type=&quot;email&quot;&gt;bes23@georgetown.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1282</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm098</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1243</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>A mixture model approach to the tests of concordance and discordance between two large-scale experiments with two-sample groups</dc:title>
<dc:creator>Lai, Yinglei</dc:creator>
<dc:creator>Adam, Bao-ling</dc:creator>
<dc:creator>Podolsky, Robert</dc:creator>
<dc:creator>She, Jin-Xiong</dc:creator>
<dc:subject>GENE EXPRESSION</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Due to advances in experimental technologies, such as microarray, mass spectrometry and nuclear magnetic resonance, it is feasible to obtain large-scale data sets, in which measurements for a large number of features can be simultaneously collected. However, the sample sizes of these data sets are usually small due to their relatively high costs, which leads to the issue of concordance among different data sets collected for the same study: features should have consistent behavior in different data sets. There is a lack of rigorous statistical methods for evaluating this concordance or discordance. &lt;b&gt;Methods:&lt;/b&gt; Based on a three-component normal-mixture model, we propose two likelihood ratio tests for evaluating the concordance and discordance between two large-scale data sets with two sample groups. The parameter estimation is achieved through the expectation-maximization (E-M) algorithm. A normal-distribution-quantile-based method is used for data transformation. &lt;b&gt;Results:&lt;/b&gt; To evaluate the proposed tests, we conducted some simulation studies, which suggested their satisfactory performance. As applications, the proposed tests were applied to three SELDI-MS data sets with replicates. One data set has replicates from different platforms and the other two have replicates from the same platform. We found that data generated by SELDI-MS showed satisfactory concordance between replicates from the same platform but unsatisfactory concordance between replicates from different platforms. &lt;b&gt;Availability:&lt;/b&gt; The R code is freely available at &lt;inter-ref locator=&quot;http://home.gwu.edu/~ylai/research/Concordance&quot; locator-type=&quot;url&quot;&gt;http://home.gwu.edu/~ylai/research/Concordance&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;ylai@gwu.edu&quot; locator-type=&quot;email&quot;&gt;ylai@gwu.edu&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1243</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm103</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1217</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Pooling mRNA in microarray experiments and its effect on power</dc:title>
<dc:creator>Zhang, Wuyan</dc:creator>
<dc:creator>Carriquiry, Alicia</dc:creator>
<dc:creator>Nettleton, Dan</dc:creator>
<dc:creator>Dekkers, Jack C.M.</dc:creator>
<dc:subject>GENE EXPRESSION</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Microarrays can simultaneously measure the expression levels of many genes and are widely applied to study complex biological problems at the genetic level. To contain costs, instead of obtaining a microarray on each individual, mRNA from several subjects can be first pooled and then measured with a single array. mRNA pooling is also necessary when there is not enough mRNA from each subject. Several studies have investigated the impact of pooling mRNA on inferences about gene expression, but have typically modeled the process of pooling as if it occurred in some transformed scale. This assumption is unrealistic. &lt;b&gt;Results:&lt;/b&gt; We propose modeling the gene expression levels in a pool as a weighted average of mRNA expression of all individuals in the pool on the original measurement scale, where the weights correspond to individual sample contributions to the pool. Based on these improved statistical models, we develop the appropriate F statistics to test for differentially expressed genes. We present formulae to calculate the power of various statistical tests under different strategies for pooling mRNA and compare resulting power estimates to those that would be obtained by following the approach proposed by Kendziorski &lt;it&gt;et al&lt;/it&gt;. (&lt;cross-ref type=&quot;bib&quot; refid=&quot;B5&quot;&gt;2003&lt;/cross-ref&gt;). We find that the Kendziorski estimate tends to exceed true power and that the estimate we propose, while somewhat conservative, is less biased. We argue that it is possible to design a study that includes mRNA pooling at a significantly reduced cost but with little loss of information. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;alicia@iastate.edu&quot; locator-type=&quot;email&quot;&gt;alicia@iastate.edu&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1217</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm081</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1258</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Metabolic systems cost-benefit analysis for interpreting network structure and regulation</dc:title>
<dc:creator>Carlson, Ross P.</dc:creator>
<dc:subject>SYSTEMS BIOLOGY</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Interpretation of bioinformatics data in terms of cellular function is a major challenge facing systems biology. This question is complicated by robust metabolic networks filled with structural features like parallel pathways and isozymes. Under conditions of nutrient sufficiency, metabolic networks are well known to be regulated for thermodynamic efficiency; however, efficient biochemical pathways are anabolically expensive to construct. While parameters like thermodynamic efficiency have been extensively studied, a systems-based analysis of anabolic proteome synthesis &#8216;costs&#8217; and the cellular function implications of these costs has not been reported. &lt;b&gt;Results:&lt;/b&gt; A cost-benefit analysis of an &lt;it&gt;in silico Escherichia coli&lt;/it&gt; network revealed the relationship between metabolic pathway proteome synthesis requirements, DNA-coding sequence length, thermodynamic efficiency and substrate affinity. The results highlight basic metabolic network design principles. Pathway proteome synthesis requirements appear to have shaped biochemical network structure and regulation. Under conditions of nutrient scarcity and other general stresses, &lt;it&gt;E.coli&lt;/it&gt; expresses pathways with relatively inexpensive proteome synthesis requirements instead of more efficient but also anabolically more expensive pathways. This evolutionary strategy provides a cellular function-based explanation for common network motifs like isozymes and parallel pathways and possibly explains &#8216;overflow&#8217; metabolisms observed during nutrient scarcity. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;alicia@iastate.edu&quot; locator-type=&quot;email&quot;&gt;alicia@iastate.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1258</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm082</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1251</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Modeling sequence-sequence interactions for drug response</dc:title>
<dc:creator>Lin, Min</dc:creator>
<dc:creator>Li, Hongying</dc:creator>
<dc:creator>Hou, Wei</dc:creator>
<dc:creator>Johnson, Julie A.</dc:creator>
<dc:creator>Wu, Rongling</dc:creator>
<dc:subject>GENETICS AND POPULATION ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Genetic interactions or epistasis may play an important role in the genetic etiology of drug response. With the availability of large-scale, high-density single nucleotide polymorphism markers, a great challenge is how to associate haplotype structures and complex drug response through its underlying pharmacodynamic mechanisms. &lt;b&gt;Results:&lt;/b&gt; We have derived a general statistical model for detecting an interactive network of DNA sequence variants that encode pharmacodynamic processes based on the haplotype map constructed by single nucleotide polymorphisms. The model was validated by a pharmacogenetic study for two predominant beta-adrenergic receptor (&#946;AR) subtypes expressed in the heart, &#946;1AR and &#946;2AR. Haplotypes from these two receptors trigger significant interaction effects on the response of heart rate to different dose levels of dobutamine. This model will have implications for pharmacogenetic and pharmacogenomic research and drug discovery. &lt;b&gt;Availability:&lt;/b&gt; A computer program written in Matlab can be downloaded from the webpage of the statistical genetics group at the University of Florida. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;rwu@mail.ifas.ufl.edu&quot; locator-type=&quot;email&quot;&gt;rwu@mail.ifas.ufl.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1251</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm110</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1235</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Using DNA microarrays to study gene expression in closely related species</dc:title>
<dc:creator>Oshlack, Alicia</dc:creator>
<dc:creator>Chabot, Adrien E.</dc:creator>
<dc:creator>Smyth, Gordon K.</dc:creator>
<dc:creator>Gilad, Yoav</dc:creator>
<dc:subject>GENE EXPRESSION</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Comparisons of gene expression levels within and between species have become a central tool in the study of the genetic basis for phenotypic variation, as well as in the study of the evolution of gene regulation. DNA microarrays are a key technology that enables these studies. Currently, however, microarrays are only available for a small number of species. Thus, in order to study gene expression levels in species for which microarrays are not available, researchers face three sets of choices: (i) use a microarray designed for another species, but only compare gene expression levels within species, (ii) construct a new microarray for every species whose gene expression profiles will be compared or (iii) build a multi-species microarray with probes from each species of interest. Here, we use data collected using a multi-primate cDNA array to evaluate the reliability of each approach. &lt;b&gt;Results:&lt;/b&gt; We find that, for inter-species comparisons, estimates of expression differences based on multi-species microarrays are more accurate than those based on multiple species-specific arrays. We also demonstrate that within-species expression differences can be estimated using a microarray for a closely related species, without discernible loss of information. &lt;b&gt;Contact:&lt;/b&gt; A.O. (&lt;inter-ref locator=&quot;oshlack@wehi.edu.au&quot; locator-type=&quot;email&quot;&gt;oshlack@wehi.edu.au&lt;/inter-ref&gt;) or Y.G. (&lt;inter-ref locator=&quot;gilad@uchicago.edu&quot; locator-type=&quot;email&quot;&gt;gilad@uchicago.edu&lt;/inter-ref&gt;) &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1235</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm111</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1274</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>A new method to measure the semantic similarity of GO terms</dc:title>
<dc:creator>Wang, James Z.</dc:creator>
<dc:creator>Du, Zhidian</dc:creator>
<dc:creator>Payattakool, Rapeeporn</dc:creator>
<dc:creator>Yu, Philip S.</dc:creator>
<dc:creator>Chen, Chin-Fu</dc:creator>
<dc:subject>DATA AND TEXT MINING</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Although controlled biochemical or biological vocabularies, such as Gene Ontology (GO) (&lt;inter-ref locator=&quot;http://www.geneontology.org&quot; locator-type=&quot;url&quot;&gt;http://www.geneontology.org&lt;/inter-ref&gt;), address the need for consistent descriptions of genes in different data sources, there is still no effective method to determine the functional similarities of genes based on gene annotation information from heterogeneous data sources. &lt;b&gt;Results:&lt;/b&gt; To address this critical need, we proposed a novel method to encode a GO term&apos;s semantics (biological meanings) into a numeric value by aggregating the semantic contributions of its ancestor terms (including this specific term) in the GO graph and, in turn, designed an algorithm to measure the semantic similarity of GO terms. Based on the semantic similarities of GO terms used for gene annotation, we designed a new algorithm to measure the functional similarity of genes. The results of using our algorithm to measure the functional similarities of genes in pathways retrieved from the Saccharomyces Genome Database (SGD), and the outcomes of clustering these genes based on the similarity values obtained by our algorithm are shown to be consistent with human perspectives. Furthermore, we developed a set of online tools for gene similarity measurement and knowledge discovery. 
&lt;b&gt;Availability:&lt;/b&gt; The online tools are available at: &lt;inter-ref locator=&quot;http://bioinformatics.clemson.edu/G-SESAME&quot; locator-type=&quot;url&quot;&gt;http://bioinformatics.clemson.edu/G-SESAME&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;jzwang@cs.clemson.edu&quot; locator-type=&quot;email&quot;&gt;jzwang@cs.clemson.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; &lt;inter-ref locator=&quot;http://bioinformatics.clemson.edu/Publication/Supplement/gsp.htm&quot; locator-type=&quot;url&quot;&gt;http://bioinformatics.clemson.edu/Publication/Supplement/gsp.htm&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1274</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm087</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1211</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Glycan classification with tree kernels</dc:title>
<dc:creator>Yamanishi, Yoshihiro</dc:creator>
<dc:creator>Bach, Francis</dc:creator>
<dc:creator>Vert, Jean-Philippe</dc:creator>
<dc:subject>STRUCTURAL BIOINFORMATICS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Glycans are covalent assemblies of sugar that play crucial roles in many cellular processes. Recently, comprehensive data about the structure and function of glycans have been accumulated; therefore, the need for methods and algorithms to analyze these data is growing fast. &lt;b&gt;Results:&lt;/b&gt; This article presents novel methods for classifying glycans and detecting discriminative glycan motifs with support vector machines (SVM). We propose a new class of tree kernels to measure the similarity between glycans. These kernels are based on the comparison of tree substructures, and take into account several glycan features such as the sugar type, the sugar bound type or layer depth. The proposed methods are tested on their ability to classify human glycans into four blood components: leukemia cells, erythrocytes, plasma and serum. They are shown to outperform a previously published method. We also applied a feature selection approach to extract glycan motifs which are characteristic of each blood component. We confirmed that some leukemia-specific glycan motifs detected by our method corresponded to several results in the literature. &lt;b&gt;Availability:&lt;/b&gt; Software is available upon request. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;yoshi@kuicr.kyoto-u.ac.jp&quot; locator-type=&quot;email&quot;&gt;yoshi@kuicr.kyoto-u.ac.jp&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Datasets are available at the following website: &lt;inter-ref locator=&quot;http://web.kuicr.kyoto-u.ac.jp/supp/yoshi/glycankernel/&quot; locator-type=&quot;url&quot;&gt;http://web.kuicr.kyoto-u.ac.jp/supp/yoshi/glycankernel/&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1211</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm090</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1289</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Enhancements and modifications of primer design program Primer3</dc:title>
<dc:creator>Koressaar, Triinu</dc:creator>
<dc:creator>Remm, Maido</dc:creator>
<dc:subject>SEQUENCE ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; The determination of annealing temperature is a critical step in PCR design. This parameter is typically derived from the melting temperature of the PCR primers, so for successful PCR work it is important to determine the melting temperature of the primers accurately. We introduced several enhancements in the widely used primer design program Primer3. The improvements include a formula for calculating melting temperature and a salt correction formula. Also, the new version can take into account the effects of divalent cations, which are included in most PCR buffers. Another modification enables using lowercase masked template sequences for primer design. &lt;b&gt;Availability:&lt;/b&gt; Features described in this article have been implemented into the development code of Primer3 and will be available in future versions (version 1.1 and newer) of Primer3. Also, a modified version is compiled under the name of mPrimer3 which is distributed independently. The web-based version of mPrimer3 is available at &lt;inter-ref locator=&quot;http://bioinfo.ebc.ee/mprimer3/&quot; locator-type=&quot;url&quot;&gt;http://bioinfo.ebc.ee/mprimer3/&lt;/inter-ref&gt; and the binary code is freely downloadable from the URL &lt;inter-ref locator=&quot;http://bioinfo.ebc.ee/download/&quot; locator-type=&quot;url&quot;&gt;http://bioinfo.ebc.ee/download/&lt;/inter-ref&gt;. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;maido.remm@ut.ee&quot; locator-type=&quot;email&quot;&gt;maido.remm@ut.ee&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1289</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm091</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1225</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Domain-enhanced analysis of microarray data using GO annotations</dc:title>
<dc:creator>Liu, Jiajun</dc:creator>
<dc:creator>Hughes-Oliver, Jacqueline M.</dc:creator>
<dc:creator>Menius, J. Alan</dc:creator>
<dc:subject>GENE EXPRESSION</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; New biological systems technologies give scientists the ability to measure thousands of bio-molecules including genes, proteins, lipids and metabolites. We use domain knowledge, e.g. the Gene Ontology, to guide analysis of such data. By focusing on domain-aggregated results at, say, the molecular function level, increased interpretability is available to biological scientists beyond what is possible if results are presented at the gene level. &lt;b&gt;Results:&lt;/b&gt; We use a &#8216;top&#8211;down&#8217; approach to perform domain aggregation by first combining gene expressions before testing for differentially expressed patterns. This is in contrast to the more standard &#8216;bottom&#8211;up&#8217; approach, where genes are first tested individually then aggregated by domain knowledge. The benefits are greater sensitivity for detecting signals. Our method, domain-enhanced analysis (DEA), is assessed and compared to other methods using simulation studies and analysis of two publicly available leukemia data sets. &lt;b&gt;Availability:&lt;/b&gt; Our DEA method uses functions available in R (&lt;inter-ref locator=&quot;http://www.r-project.org/&quot; locator-type=&quot;url&quot;&gt;http://www.r-project.org/&lt;/inter-ref&gt;) and SAS (&lt;inter-ref locator=&quot;http://www.sas.com/&quot; locator-type=&quot;url&quot;&gt;http://www.sas.com/&lt;/inter-ref&gt;). The two experimental data sets used in our analysis are available in R as Bioconductor packages, &#8216;ALL&#8217; and &#8216;golubEsets&#8217; (&lt;inter-ref locator=&quot;http://www.bioconductor.org/&quot; locator-type=&quot;url&quot;&gt;http://www.bioconductor.org/&lt;/inter-ref&gt;). &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;jliu6@stat.ncsu.edu&quot; locator-type=&quot;email&quot;&gt;jliu6@stat.ncsu.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1225</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm092</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1265</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>The impact of function perturbations in Boolean networks</dc:title>
<dc:creator>Xiao, Yufei</dc:creator>
<dc:creator>Dougherty, Edward R.</dc:creator>
<dc:subject>SYSTEMS BIOLOGY</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; A network is said to be &lt;it&gt;robust&lt;/it&gt; relative to a certain network characteristic if a small change in network structure does not significantly affect the characteristic. From the perspective of network stability, robustness is desirable; however, from the perspective of intervention to exert influence on network behavior, it is undesirable. For Boolean networks, there are two fundamental types of robustness. One type pertains to perturbing the state of the network and the other to perturbing the rule-based structure. &lt;b&gt;Results:&lt;/b&gt; This article explores the impact of function perturbations in Boolean networks from two aspects: (1) analysis: predict the impact on network state transitions and attractors via analytical approaches or identify a perturbation by observing its consequences; (2) synthesis: preserve or modify the network characteristics, especially attractors, by introducing a judicious change to the functions. The results are applied to achieve intervention that structurally alters the network to achieve a more favorable steady-state distribution and to identify the function perturbation that has led to altered observed behavior. The intervention procedure is applied to a WNT5A network to reduce the risk of metastasis in melanoma, and the identification procedure is applied to a &lt;it&gt;Drosophila melanogaster&lt;/it&gt; segmentation polarity gene network to identify regulatory function perturbation. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;edward@ece.tamu.edu&quot; locator-type=&quot;email&quot;&gt;edward@ece.tamu.edu&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1265</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm093</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1301</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>BioGuideSRS: querying multiple sources with a user-centric perspective</dc:title>
<dc:creator>Cohen-Boulakia, Sarah</dc:creator>
<dc:creator>Biton, Olivier</dc:creator>
<dc:creator>Davidson, Susan</dc:creator>
<dc:creator>Froidevaux, Christine</dc:creator>
<dc:subject>DATABASES AND ONTOLOGIES</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; Biologists are frequently faced with the problem of integrating information from multiple heterogeneous sources with their own experimental data. Given the large number of public sources, it is difficult to choose which sources to integrate without assistance. When doing this manually, biologists differ in their &lt;it&gt;preferences&lt;/it&gt; concerning the sources to be queried as well as the &lt;it&gt;strategies&lt;/it&gt;, i.e. the querying process they follow for navigating through the sources. In response to these findings, we have developed BioGuide to assist scientists in searching for relevant data within external sources while taking their preferences and strategies into account. In this article, we present BioGuideSRS, a user-friendly system which automatically retrieves instances of data by using BioGuide on top of the sequence retrieval system (SRS). BioGuideSRS is an Applet that can be run from its web page on any system with Java 5.0. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://www.bioguide-project.net&quot; locator-type=&quot;url&quot;&gt;http://www.bioguide-project.net&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;sarahcb@seas.upenn.edu&quot; locator-type=&quot;email&quot;&gt;sarahcb@seas.upenn.edu&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1301</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm088</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1292</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>iPTREE-STAB: interpretable decision tree based method for predicting protein stability changes upon mutations</dc:title>
<dc:creator>Huang, Liang-Tsung</dc:creator>
<dc:creator>Gromiha, M. Michael</dc:creator>
<dc:creator>Ho, Shinn-Ying</dc:creator>
<dc:subject>STRUCTURAL BIOINFORMATICS</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; We have developed a web server, iPTREE-STAB for discriminating the stability of proteins (stabilizing or destabilizing) and predicting their stability changes (&#916;&#916;G) upon single amino acid substitutions from amino acid sequence. The discrimination and prediction are mainly based on decision tree coupled with adaptive boosting algorithm, and classification and regression tree, respectively, using three neighboring residues of the mutant site along N- and C-terminals. Our method showed an accuracy of 82% for discriminating the stabilizing and destabilizing mutants, and a correlation of 0.70 for predicting protein stability changes upon mutations. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://bioinformatics.myweb.hinet.net/iptree.htm&quot; locator-type=&quot;url&quot;&gt;http://bioinformatics.myweb.hinet.net/iptree.htm&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;michael-gromiha@aist.go.jp&quot; locator-type=&quot;email&quot;&gt;michael-gromiha@aist.go.jp&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Dataset and other details are given. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1292</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm100</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1297</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>SBML export interface for the systems biology toolbox for MATLAB</dc:title>
<dc:creator>Schmidt, Henning</dc:creator>
<dc:creator>Drews, Gunnar</dc:creator>
<dc:creator>Vera, Julio</dc:creator>
<dc:creator>Wolkenhauer, Olaf</dc:creator>
<dc:subject>SYSTEMS BIOLOGY</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; In this application note, we present a Systems Biology Markup Language (SBML) export interface for the Systems Biology Toolbox for MATLAB. This interface allows modelers to automatically convert models, represented in the toolbox&apos;s own format (SBmodels), to SBML files. Since SBmodels do not explicitly contain all the information that is required to generate SBML, the necessary information is gathered by parsing SBmodels. The export can be done in two different ways. First, it is possible to call the export from the command line, thereby directly converting a model to an SBML file. The second option is to inspect and edit the conversion results with the help of a graphical user interface and to subsequently export the model to SBML. &lt;b&gt;Availability:&lt;/b&gt; The SBML export interface has been integrated into the Systems Biology Toolbox for MATLAB, which is open source and freely available from &lt;inter-ref locator=&quot;http://www.sbtoolbox.org&quot; locator-type=&quot;url&quot;&gt;http://www.sbtoolbox.org&lt;/inter-ref&gt;. The website also contains a tutorial, extensive documentation and examples. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;henning@fcc.chalmers.se&quot; locator-type=&quot;email&quot;&gt;henning@fcc.chalmers.se&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1297</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm105</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1304</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Mediante: a web-based microarray data manager</dc:title>
<dc:creator>Le Brigand, Kevin</dc:creator>
<dc:creator>Barbry, Pascal</dc:creator>
<dc:subject>DATABASES AND ONTOLOGIES</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; Mediante is a MIAME-compliant microarray data manager that links together annotations and experimental data. Developed as a J2EE three-tier application, Mediante integrates a management system for production of long oligonucleotide microarrays, an experimental data repository suitable for home made or commercial microarrays, and a user interface dedicated to the management of microarrays projects. Several tools allow quality control of hybridizations and submission of validated data to public repositories. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://www.microarray.fr&quot; locator-type=&quot;url&quot;&gt;http://www.microarray.fr&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;barbry@ipmc.cnrs.fr&quot; locator-type=&quot;email&quot;&gt;barbry@ipmc.cnrs.fr&lt;/inter-ref&gt; or &lt;inter-ref locator=&quot;lebrigand@ipmc.cnrs.fr&quot; locator-type=&quot;email&quot;&gt;lebrigand@ipmc.cnrs.fr&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; &lt;inter-ref locator=&quot;http://www.microarray.fr/SP/lebrigand2007/&quot; locator-type=&quot;url&quot;&gt;http://www.microarray.fr/SP/lebrigand2007/&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1304</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm106</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1299</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Cyclone: java-based querying and computing with Pathway/Genome databases</dc:title>
<dc:creator>F&#232;vre, Fran&#231;ois Le</dc:creator>
<dc:creator>Smidtas, Serge</dc:creator>
<dc:creator>Sch&#228;chter, Vincent</dc:creator>
<dc:subject>SYSTEMS BIOLOGY</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; Cyclone aims at facilitating the use of BioCyc, a collection of Pathway/Genome Databases (PGDBs). Cyclone provides a fully extensible Java Object API to analyze and visualize these data. Cyclone can read and write PGDBs, and can write its own data in the CycloneML format. This format is automatically generated from the BioCyc ontology by Cyclone itself, ensuring continued compatibility. Cyclone objects can also be stored in a relational database CycloneDB. Queries can be written in SQL, and in an intuitive and concise object-oriented query language, Hibernate Query Language (HQL). In addition, Cyclone interfaces easily with Java software including the Eclipse IDE for HQL edition, the Jung API for graph algorithms or Cytoscape for graph visualization. &lt;b&gt;Availability:&lt;/b&gt; Cyclone is freely available under an open source license at: &lt;inter-ref locator=&quot;http://sourceforge.net/projects/nemo-cyclone&quot; locator-type=&quot;url&quot;&gt;http://sourceforge.net/projects/nemo-cyclone&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;cyclone@genoscope.cns.fr&quot; locator-type=&quot;email&quot;&gt;cyclone@genoscope.cns.fr&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; For download and installation instructions, tutorials, use cases and examples, see &lt;inter-ref locator=&quot;http://nemo-cyclone.sourceforge.net&quot; locator-type=&quot;url&quot;&gt;http://nemo-cyclone.sourceforge.net&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1299</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm107</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1294</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>GenABEL: an R library for genome-wide association analysis</dc:title>
<dc:creator>Aulchenko, Yurii S.</dc:creator>
<dc:creator>Ripke, Stephan</dc:creator>
<dc:creator>Isaacs, Aaron</dc:creator>
<dc:creator>van Duijn, Cornelia M.</dc:creator>
<dc:subject>GENETICS AND POPULATION ANALYSIS</dc:subject>
<dc:description> Here we describe an R library for genome-wide association (GWA) analysis. It implements effective storage and handling of GWA data, fast procedures for genetic data quality control, testing of association of single nucleotide polymorphisms with binary or quantitative traits, visualization of results and also provides easy interfaces to standard statistical and graphical procedures implemented in base R and special R libraries for genetic analysis. We evaluated GenABEL using one simulated and two real data sets. We conclude that GenABEL enables the analysis of GWA data on desktop computers. &lt;b&gt;Availability:&lt;/b&gt; &lt;inter-ref locator=&quot;http://cran.r-project.org&quot; locator-type=&quot;url&quot;&gt;http://cran.r-project.org&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;i.aoultchenko@erasmusmc.nl&quot; locator-type=&quot;email&quot;&gt;i.aoultchenko@erasmusmc.nl&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1294</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm108</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/10/1307</identifier><datestamp>2007-05-28</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:10</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>DPTF: a database of poplar transcription factors</dc:title>
<dc:creator>Zhu, Qi-Hui</dc:creator>
<dc:creator>Guo, An-Yuan</dc:creator>
<dc:creator>Gao, Ge</dc:creator>
<dc:creator>Zhong, Ying-Fu</dc:creator>
<dc:creator>Xu, Meng</dc:creator>
<dc:creator>Huang, Minren</dc:creator>
<dc:creator>Luo, Jinchu</dc:creator>
<dc:subject>DATABASES AND ONTOLOGIES</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; The database of poplar transcription factors (DPTF) is a plant transcription factor (TF) database containing 2576 putative poplar TFs distributed in 64 families. These TFs were identified from both computational prediction and manual curation. We have provided extensive annotations including sequence features, functional domains, GO assignment and expression evidence for all TFs. In addition, DPTF contains cross-links to the &lt;it&gt;Arabidopsis&lt;/it&gt; and rice transcription factor databases making it a unique resource for genome-scale comparative studies of transcriptional regulation in model plants. &lt;b&gt;Availability:&lt;/b&gt; DPTF is available at &lt;inter-ref locator=&quot;http://dptf.cbi.pku.edu.cn&quot; locator-type=&quot;url&quot;&gt;http://dptf.cbi.pku.edu.cn&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;dptf@mail.cbi.pku.edu.cn&quot; locator-type=&quot;email&quot;&gt;dptf@mail.cbi.pku.edu.cn&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-05-28</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/10/1307</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btm113</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e191</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>TOPP--the OpenMS proteomics pipeline</dc:title>
<dc:creator>Kohlbacher, Oliver</dc:creator>
<dc:creator>Reinert, Knut</dc:creator>
<dc:creator>Gr&#246;pl, Clemens</dc:creator>
<dc:creator>Lange, Eva</dc:creator>
<dc:creator>Pfeifer, Nico</dc:creator>
<dc:creator>Schulz-Trieglaff, Ole</dc:creator>
<dc:creator>Sturm, Marc</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Experimental techniques in proteomics have seen rapid development over the last few years. Volume and complexity of the data have both been growing at a similar rate. Accordingly, data management and analysis are among the major challenges in proteomics. Flexible algorithms are required to handle changing experimental setups and to assist in developing and validating new methods. In order to facilitate these studies, it would be desirable to have a flexible &#8216;toolbox&#8217; of versatile and user-friendly applications allowing for rapid construction of computational workflows in proteomics. &lt;b&gt;Results:&lt;/b&gt; We describe a set of tools for proteomics data analysis&#8212;TOPP, The OpenMS Proteomics Pipeline. TOPP provides a set of computational tools which can be easily combined into analysis pipelines even by non-experts and can be used in proteomics workflows. These applications range from useful utilities (file format conversion, peak picking) over wrapper applications for known applications (e.g. Mascot) to completely new algorithmic techniques for data reduction and data analysis. We anticipate that TOPP will greatly facilitate rapid prototyping of proteomics data evaluation pipelines. As such, we describe the basic concepts and the current abilities of TOPP and illustrate these concepts in the context of two example applications: the identification of peptides from a raw dataset through database search and the complex analysis of a standard addition experiment for the absolute quantitation of biomarkers. The latter example demonstrates TOPP&apos;s ability to construct flexible analysis pipelines in support of complex experimental setups. &lt;b&gt;Availability:&lt;/b&gt; The TOPP components are available as open-source software under the lesser GNU public license (LGPL). Source code is available from the project website at &lt;inter-ref locator=&quot;www.OpenMS.de&quot; locator-type=&quot;url&quot;&gt;www.OpenMS.de&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;oliver.kohlbacher@uni-tuebingen.de&quot; locator-type=&quot;email&quot;&gt;oliver.kohlbacher@uni-tuebingen.de&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e191</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl299</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e163</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Rediscovering secondary structures as network motifs--an unsupervised learning approach</dc:title>
<dc:creator>Raveh, Barak</dc:creator>
<dc:creator>Rahat, Ofer</dc:creator>
<dc:creator>Basri, Ronen</dc:creator>
<dc:creator>Schreiber, Gideon</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Secondary structures are key descriptors of a protein fold and its topology. In recent years, they facilitated intensive computational tasks for finding structural homologues, fold prediction and protein design. Their popularity stems from an appealing regularity in patterns of geometry and chemistry. However, the definition of secondary structures is subjective in nature. An unsupervised de-novo discovery of these structures would shed light on their nature, and improve the way we use these structures in algorithms of structural bioinformatics. &lt;b&gt;Methods:&lt;/b&gt; We developed a new method for unsupervised partitioning of undirected graphs, based on patterns of small recurring network motifs. Our input was the network of all H-bonds and covalent interactions of protein backbones. This method can also be used for other biological and non-biological networks. &lt;b&gt;Results:&lt;/b&gt; In a fully unsupervised manner, and without assuming any explicit prior knowledge, we were able to rediscover the existence of conventional &#945;-helices, parallel &#946;-sheets, anti-parallel sheets and loops, as well as various non-conventional hybrid structures. The relation between connectivity and crystallographic temperature factors establishes the existence of novel secondary structures. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;barak.raveh@weizmann.ac.il&quot; locator-type=&quot;email&quot;&gt;barak.raveh@weizmann.ac.il&lt;/inter-ref&gt;; &lt;inter-ref locator=&quot;gideon.schreiber@weizmann.ac.il&quot; locator-type=&quot;email&quot;&gt;gideon.schreiber@weizmann.ac.il&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e163</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl290</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e225</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>A tale of two tails: why are terminal residues of proteins exposed?</dc:title>
<dc:creator>Jacob, Etai</dc:creator>
<dc:creator>Unger, Ron</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; It is widely known that terminal residues of proteins (i.e. the N- and C-termini) are predominantly located on the surface of proteins and exposed to the solvent. However, there is no good explanation as to the forces driving this phenomenon. The common explanation that terminal residues are charged, and charged residues prefer to be on the surface, cannot explain the magnitude of the phenomenon. Here, we survey a large number of proteins from the PDB in order to explore, quantitatively, this phenomenon, and then we use a lattice model to study the mechanisms involved. &lt;b&gt;Results:&lt;/b&gt; The location of the termini was examined for 425 small monomeric proteins (50&#8211;200 amino acids) and it was found that the average solvent accessibility of termini residues is 87.1% compared with 49.2% of charged residues and 35.9% of all residues. Using a cutoff of 50% of the maximal possible exposure, 80.3% of the N-terminal and 86.1% of the C-terminal residues are exposed compared to 32% for all residues. In addition, terminal residues are much more distant from the center of mass of their proteins than other residues. Using a 2D lattice, a large population of model proteins was studied on three levels: structural selection of compact structures, thermodynamic selection of conformations with a pronounced energy gap and kinetic selection of fast folding proteins using Monte-Carlo simulations. Progressively, each selection raises the proportion of proteins with termini on the surface, resulting in similar proportions to those observed for real proteins. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;ron@biocom1.ls.biu.ac.il&quot; locator-type=&quot;email&quot;&gt;ron@biocom1.ls.biu.ac.il&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e225</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl318</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e177</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Biological network comparison using graphlet degree distribution</dc:title>
<dc:creator>Przulj, Natasa</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Analogous to biological sequence comparison, comparing cellular networks is an important problem that could provide insight into biological understanding and therapeutics. For technical reasons, comparing large networks is computationally infeasible, and thus heuristics, such as the degree distribution, clustering coefficient, diameter, and relative graphlet frequency distribution have been sought. It is easy to demonstrate that two networks are different by simply showing a short list of properties in which they differ. It is much harder to show that two networks are similar, as it requires demonstrating their similarity in &lt;it&gt;all&lt;/it&gt; of their exponentially many properties. Clearly, it is computationally prohibitive to analyze all network properties, but the larger the number of constraints we impose in determining network similarity, the more likely it is that the networks will truly be similar. &lt;b&gt;Results:&lt;/b&gt; We introduce a new systematic measure of a network&apos;s local structure that imposes a large number of similarity constraints on networks being compared. In particular, we generalize the degree distribution, which measures the number of nodes &#8216;touching&#8217; &lt;it&gt;k&lt;/it&gt; edges, into distributions measuring the number of nodes &#8216;touching&#8217; &lt;it&gt;k graphlets&lt;/it&gt;, where graphlets are small connected non-isomorphic subgraphs of a large network. Our new measure of network local structure consists of 73 &lt;it&gt;graphlet degree distributions&lt;/it&gt; of graphlets with 2&#8211;5 nodes, but it is easily extendible to a greater number of constraints (i.e. graphlets), if necessary, and the extensions are limited only by the available CPU. 
Furthermore, we show a way to combine the 73 graphlet degree distributions into a network &#8216;agreement&#8217; measure which is a number between 0 and 1, where 1 means that networks have identical distributions and 0 means that they are far apart. Based on this new network agreement measure, we show that almost all of the 14 eukaryotic PPI networks, including human, resulting from various high-throughput experimental techniques, as well as from curated databases, are better modeled by geometric random graphs than by Erd&#246;s&#8211;R&#233;nyi, random scale-free, or Barab&#225;si&#8211;Albert scale-free networks. &lt;b&gt;Availability:&lt;/b&gt; Software executables are available upon request. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;natasha@ics.uci.edu&quot; locator-type=&quot;email&quot;&gt;natasha@ics.uci.edu&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e177</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl301</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e212</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Prediction and simulation of motion in pairs of transmembrane &#945;-helices</dc:title>
<dc:creator>Enosh, Angela</dc:creator>
<dc:creator>Fleishman, Sarel J.</dc:creator>
<dc:creator>Ben-Tal, Nir</dc:creator>
<dc:creator>Halperin, Dan</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Motion in transmembrane (TM) proteins plays an essential role in a variety of biological phenomena. Thus, developing an automated method for predicting and simulating motion in this class of proteins should result in an increased level of understanding of crucial physiological mechanisms. We have developed an algorithm for predicting and simulating motion in TM proteins of the &#945;-helix bundle type. Our method employs probabilistic motion-planning techniques to suggest possible collision-free motion paths. The resulting paths are ranked according to the quality of the van der Waals interactions between the TM helices. Our algorithm considers a wide range of degrees of freedom (dofs) involved in the motion, including external and internal moves. However, in order to handle the vast dimensionality of the problem, we employ some constraints on these dofs in a way that is unlikely to rule out the native motion of the protein. Our algorithm simulates the motion, including all the dofs, and automatically produces a movie that demonstrates it. &lt;b&gt;Results:&lt;/b&gt; Overexpression of the RTK ErbB2 was implicated in causing a variety of human cancers. Recently, a molecular mechanism for rotation-coupled activation of the receptor was suggested. We applied our algorithm to investigate the TM domain of this protein, and compared our results with this mechanism. A motion pathway that was similar to the proposed mechanism ranked first, and motions with partial overlap to this pathway followed in rank order. In addition, we conducted a negative-control computational-experiment using Glycophorin A. Our results confirmed the immobility of this TM protein, resulting in degenerate paths comprising native-like conformations. 
&lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;inter-ref locator=&quot;http://www.cs.tau.ac.il/~angela/EGFR.html&quot; locator-type=&quot;url&quot;&gt;http://www.cs.tau.ac.il/~angela/EGFR.html&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;angela@post.tau.ac.il&quot; locator-type=&quot;email&quot;&gt;angela@post.tau.ac.il&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e212</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl325</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e198</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Difference detection in LC-MS data for protein biomarker discovery</dc:title>
<dc:creator>Listgarten, Jennifer</dc:creator>
<dc:creator>Neal, Radford M.</dc:creator>
<dc:creator>Roweis, Sam T.</dc:creator>
<dc:creator>Wong, Peter</dc:creator>
<dc:creator>Emili, Andrew</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; There is a pressing need for improved proteomic screening methods allowing for earlier diagnosis of disease, systematic monitoring of physiological responses and the uncovering of fundamental mechanisms of drug action. The combined platform of LC-MS (Liquid-Chromatography-Mass-Spectrometry) has shown promise in moving toward a solution in these areas. In this paper we present a technique for discovering differences in protein signal between two classes of samples of LC-MS serum proteomic data without use of tandem mass spectrometry, gels or labeling. This method works on data from a lower-precision MS instrument, the type routinely used by and available to the community at large today. We test our technique on a controlled (spike-in) but realistic (serum biomarker discovery) experiment which is therefore verifiable. We also develop a new method for helping to assess the difficulty of a given spike-in problem. Lastly, we show that the problem of class prediction, sometimes mistaken as a solution to biomarker discovery, is actually a much simpler problem. &lt;b&gt;Results:&lt;/b&gt; Using precision&#8211;recall curves with experimentally extracted ground truth, we show that (1) our technique has good performance using seven replicates from each class, (2) performance degrades with decreasing number of replicates, (3) the signal that we are teasing out is not trivially available (i.e. the differences are not so large that the task is easy). Lastly, we easily obtain perfect classification results for data in which the problem of extracting differences does not produce absolutely perfect results. This emphasizes the different nature of the two problems and also their relative difficulties. 
&lt;b&gt;Availability:&lt;/b&gt; Our data are publicly available as a benchmark for further studies of this nature at &lt;inter-ref locator=&quot;http://www.cs.toronto.edu/~jenn/LCMS&quot; locator-type=&quot;url&quot;&gt;http://www.cs.toronto.edu/~jenn/LCMS&lt;/inter-ref&gt; &lt;b&gt;Supplementary Information:&lt;/b&gt; &lt;inter-ref locator=&quot;http://www.cs.toronto.edu/~jennl/LCMS&quot; locator-type=&quot;url&quot;&gt;http://www.cs.toronto.edu/~jennl/LCMS&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;jenn@cs.toronto.edu&quot; locator-type=&quot;email&quot;&gt;jenn@cs.toronto.edu&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e198</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl326</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e205</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Vorolign--fast structural alignment using Voronoi contacts</dc:title>
<dc:creator>Birzele, Fabian</dc:creator>
<dc:creator>Gewehr, Jan E.</dc:creator>
<dc:creator>Csaba, Gergely</dc:creator>
<dc:creator>Zimmer, Ralf</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; Vorolign, a fast and flexible structural alignment method for two or more protein structures is introduced. The method aligns protein structures using double dynamic programming and measures the similarity of two residues based on the evolutionary conservation of their corresponding Voronoi-contacts in the protein structure. This similarity function allows aligning protein structures even in cases where structural flexibilities exist. Multiple structural alignments are generated from a set of pairwise alignments using a consistency-based, progressive multiple alignment strategy. &lt;b&gt;Results:&lt;/b&gt; The performance of Vorolign is evaluated for different applications of protein structure comparison, including automatic family detection as well as pairwise and multiple structure alignment. Vorolign accurately detects the correct family, superfamily or fold of a protein with respect to the SCOP classification on a set of difficult target structures. A scan against a database of &gt;4000 proteins takes on average 1 min per target. The performance of Vorolign in calculating pairwise and multiple alignments is found to be comparable with other pairwise and multiple protein structure alignment methods. &lt;b&gt;Availability:&lt;/b&gt; Vorolign is freely available for academic users as a web server at &lt;inter-ref locator=&quot;http://www.bio.ifi.lmu.de/Vorolign&quot; locator-type=&quot;url&quot;&gt;http://www.bio.ifi.lmu.de/Vorolign&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;fabian.birzele@ifi.lmu.de&quot; locator-type=&quot;email&quot;&gt;fabian.birzele@ifi.lmu.de&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Datasets used throughout the article are available at &lt;inter-ref locator=&quot;http://www.bio.ifi.lmu.de/Vorolign/supplement.html&quot; locator-type=&quot;url&quot;&gt;http://www.bio.ifi.lmu.de/Vorolign/supplement.html&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e205</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl294</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e184</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Similarities and differences of gene expression in yeast stress conditions</dc:title>
<dc:creator>Rokhlenko, Oleg</dc:creator>
<dc:creator>Wexler, Ydo</dc:creator>
<dc:creator>Yakhini, Zohar</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation and Methods:&lt;/b&gt; All living organisms and the survival of all cells critically depend on their ability to sense and quickly adapt to changes in the environment and to other stress conditions. We study stress response mechanisms in &lt;it&gt;Saccharomyces cerevisiae&lt;/it&gt; by identifying genes that, according to very stringent criteria, have persistent co-expression under a variety of stress conditions. This is enabled through a fast clique search method applied to the intersection of several co-expression graphs calculated over the data of Gasch &lt;it&gt;et al&lt;/it&gt;. This method exploits the topological characteristics of these graphs. &lt;b&gt;Results:&lt;/b&gt; We observe cliques in the intersection graphs that are much larger than expected under a null model of changing gene identities for different stress conditions but maintaining the co-expression topology within each one. Persistent cliques are analyzed to identify enriched function as well as enriched regulation by a small number of TFs. These TFs, therefore, characterize a universal and persistent reaction to stress response. We further demonstrate that the vertices (genes) of many cliques in the intersection graphs are co-localized in the yeast genome, to a degree far beyond the random expectation. Co-localization can hypothetically contribute to a quick co-ordinated response. We propose the use of persistent cliques in further study of properties of co-regulation. &lt;b&gt;Supplementary information:&lt;/b&gt; &lt;inter-ref locator=&quot;http://www.cs.technion.ac.il/~olegro/stress.html&quot; locator-type=&quot;url&quot;&gt;http://www.cs.technion.ac.il/~olegro/stress.html&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;olegro@cs.technion.ac.il&quot; locator-type=&quot;email&quot;&gt;olegro@cs.technion.ac.il&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e184</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl308</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e170</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Identification of conserved protein complexes based on a model of protein network evolution</dc:title>
<dc:creator>Hirsh, Eitan</dc:creator>
<dc:creator>Sharan, Roded</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Data on protein&#8211;protein interactions (PPIs) are increasing exponentially. To date, large-scale protein interaction networks are available for human and most model species. The arising challenge is to organize these networks into models of cellular machinery. As in other biological domains, a comparative approach provides a powerful basis for addressing this challenge. &lt;b&gt;Results:&lt;/b&gt; We develop a probabilistic model for protein complexes that are conserved across two species. The model describes the evolution of conserved protein complexes from an ancestral species by protein interaction attachment and detachment and gene duplication events. We apply our model to search for conserved protein complexes within the PPI networks of yeast and fly, which are the largest networks in public databases. We detect 150 conserved complexes that match well-known complexes in yeast and are coherent in their functional annotations both in yeast and in fly. In comparison with two previous approaches, our model yields higher specificity and sensitivity levels in protein complex detection. &lt;b&gt;Availability:&lt;/b&gt; The program is available upon request. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;roded@tau.ac.il&quot; locator-type=&quot;email&quot;&gt;roded@tau.ac.il&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e170</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl295</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e219</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Using an alignment of fragment strings for comparing protein structures</dc:title>
<dc:creator>Friedberg, Iddo</dc:creator>
<dc:creator>Harder, Tim</dc:creator>
<dc:creator>Kolodny, Rachel</dc:creator>
<dc:creator>Sitbon, Einat</dc:creator>
<dc:creator>Li, Zhanwen</dc:creator>
<dc:creator>Godzik, Adam</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Most methods that are used to compare protein structures use three-dimensional (3D) structural information. At the same time, it has been shown that a 1D string representation of local protein structure retains a degree of structural information. This type of representation can be a powerful tool for protein structure comparison and classification, given the arsenal of sequence comparison tools developed by computational biology. However, in order to do so, there is a need to first understand how much information is contained in various possible 1D representations of protein structure. &lt;b&gt;Results:&lt;/b&gt; Here we describe the use of a particular structure fragment library, denoted here as KL-strings, for the 1D representation of protein structure. Using KL-strings, we develop an infrastructure for comparing protein structures with a 1D representation. This study focuses on the added value gained from such a description. We show the new local structure language adds resolution to the traditional three-state (helix, strand and coil) secondary structure description, and provides a high degree of accuracy in recognizing structural similarities when used with a pairwise alignment benchmark. The results of this study have immediate applications towards fast structure recognition, and for fold prediction and classification. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;idoerg@burnham.org&quot; locator-type=&quot;email&quot;&gt;idoerg@burnham.org&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; &lt;inter-ref locator=&quot;http://iddo-friedberg.org/&quot; locator-type=&quot;url&quot;&gt;http://iddo-friedberg.org/&lt;/inter-ref&gt;ECCB06-supplement </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e219</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl310</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e17</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Incremental window-based protein sequence alignment algorithms</dc:title>
<dc:creator>Rangwala, Huzefa</dc:creator>
<dc:creator>Karypis, George</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Protein sequence alignment plays a critical role in computational biology as it is an integral part in many analysis tasks designed to solve problems in comparative genomics, structure and function prediction, and homology modeling. &lt;b&gt;Methods:&lt;/b&gt; We have developed novel sequence alignment algorithms that compute the alignment between a pair of sequences based on short fixed- or variable-length high-scoring subsequences. Our algorithms build the alignments by repeatedly selecting the highest scoring pairs of subsequences and using them to construct small portions of the final alignment. We utilize PSI-BLAST generated sequence profiles and employ a profile-to-profile scoring scheme derived from PICASSO. &lt;b&gt;Results:&lt;/b&gt; We evaluated the performance of the computed alignments on two recently published benchmark datasets and compared them against the alignments computed by existing state-of-the-art dynamic programming-based profile-to-profile local and global sequence alignment algorithms. Our results show that the new algorithms achieve alignments that are comparable with or better than those achieved by existing algorithms. Moreover, our results also showed that these algorithms can be used to provide better information as to which of the aligned positions are more reliable&#8212;a critical piece of information for comparative modeling applications. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;rangwala@cs.umn.edu&quot; locator-type=&quot;email&quot;&gt;rangwala@cs.umn.edu&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; &lt;inter-ref locator=&quot;http://bioinfo.cs.umn.edu/supplements/win-aln/&quot; locator-type=&quot;url&quot;&gt;http://bioinfo.cs.umn.edu/supplements/win-aln/&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e17</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl297</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/134</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Predicting transcription factor affinities to DNA from a biophysical model</dc:title>
<dc:creator>Roider, Helge G.</dc:creator>
<dc:creator>Kanhere, Aditi</dc:creator>
<dc:creator>Manke, Thomas</dc:creator>
<dc:creator>Vingron, Martin</dc:creator>
<dc:subject>GENOME ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Theoretical efforts to understand the regulation of gene expression are traditionally centered around the identification of transcription factor binding sites at specific DNA positions. More recently these efforts have been supplemented by experimental data for relative binding affinities of proteins to longer intergenic sequences. The question arises to what extent these two approaches converge. In this paper, we adopt a physical binding model to predict the relative binding affinity of a transcription factor for a given sequence. &lt;b&gt;Results:&lt;/b&gt; We find that a significant fraction of genome-wide binding data in yeast can be accounted for by simple count matrices and a physical model with only two parameters. We demonstrate that our approach is both conceptually and practically more powerful than traditional methods, which require selection of a cutoff. Our analysis yields biologically meaningful parameters, suitable for predicting relative binding affinities in the absence of experimental binding data. &lt;b&gt;Availability:&lt;/b&gt; The C source code for our TRAP program is freely available for non-commercial use at &lt;inter-ref locator=&quot;http://www.molgen.mpg.de/~manke/papers/TFaffinities/&quot; locator-type=&quot;url&quot;&gt;http://www.molgen.mpg.de/~manke/papers/TFaffinities/&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;vingron@molgen.mpg.de&quot; locator-type=&quot;email&quot;&gt;vingron@molgen.mpg.de&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/134</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl565</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/142</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Reliable prediction of Drosha processing sites improves microRNA gene prediction</dc:title>
<dc:creator>Helvik, Snorre A.</dc:creator>
<dc:creator>Sn&#248;ve, Ola</dc:creator>
<dc:creator>S&#230;trom, P&#229;l</dc:creator>
<dc:subject>GENOME ANALYSIS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Mature microRNAs (miRNAs) are processed from long hairpin transcripts. Even though it is only the first of several steps, the initial Drosha processing defines the mature product and is characteristic for all miRNA genes. Methods that can separate between true and false processing sites are therefore essential to miRNA gene discovery. &lt;b&gt;Results:&lt;/b&gt; We present a classifier that predicts 5&#8242; Drosha processing sites in hairpins that are candidate miRNAs. The classifier, called Microprocessor SVM, correctly predicts the processing site for 50% of known human 5&#8242; miRNAs, and 90% of its predictions are within two nucleotides of the true site. Another classifier that is trained on the output from the Microprocessor SVM outperforms existing methods for prediction of unconserved miRNAs. Reanalysis of characteristics and supporting evidence for a set of newly annotated miRNAs shows that some miRNAs may be misannotated. This suggests that expressed hairpins should not be annotated as miRNAs until they are verified to be Drosha and Dicer substrates. &lt;b&gt;Availability:&lt;/b&gt; The classifiers are publicly available at &lt;inter-ref locator=&quot;https://demo1.interagon.com/miRNA/&quot; locator-type=&quot;url&quot;&gt;https://demo1.interagon.com/miRNA/&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;paal.saetrom@interagon.com&quot; locator-type=&quot;email&quot;&gt;paal.saetrom@interagon.com&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data is available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/142</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl570</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/133</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>EDITORIAL</dc:title>
<dc:creator>Valencia, Alfonso</dc:creator>
<dc:creator>Bateman, Alex</dc:creator>
<dc:creator>Executive Editors</dc:creator>
<dc:subject>EDITORIALS</dc:subject>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/133</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl635</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e99</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Electrostatic potentials of proteins in water: a structured continuum approach</dc:title>
<dc:creator>Hildebrandt, Andreas</dc:creator>
<dc:creator>Blossey, Ralf</dc:creator>
<dc:creator>Rjasanow, Sergej</dc:creator>
<dc:creator>Kohlbacher, Oliver</dc:creator>
<dc:creator>Lenhof, Hans-Peter</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> Electrostatic interactions play a crucial role in many biomolecular processes, including molecular recognition and binding. Biomolecular electrostatics is modulated to a large extent by the water surrounding the molecules. Here, we present a novel approach to the computation of electrostatic potentials which allows the inclusion of water structure into the classical theory of continuum electrostatics. Based on our recent purely differential formulation of nonlocal electrostatics [&lt;cross-ref type=&quot;bib&quot; refid=&quot;b8&quot;&gt;Hildebrandt, &lt;it&gt;et al.&lt;/it&gt; (2004)&lt;/cross-ref&gt; &lt;it&gt;Phys. Rev. Lett&lt;/it&gt;., &lt;b&gt;93&lt;/b&gt;, 108104] we have developed a new algorithm for its efficient numerical solution. The key component of this algorithm is a boundary element solver, having the same computational complexity as established boundary element methods for local continuum electrostatics. This allows, for the first time, the computation of electrostatic potentials and interactions of large biomolecular systems immersed in water including effects of the solvent&apos;s structure in a continuum description. We illustrate the applicability of our approach with two examples, the enzymes trypsin and acetylcholinesterase. The approach is applicable to all problems requiring precise prediction of electrostatic interactions in water, such as protein&#8211;ligand and protein&#8211;protein docking, folding and chromatin regulation. Initial results indicate that this approach may shed new light on biomolecular electrostatics and on aspects of molecular recognition that classical local electrostatics cannot reveal. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;anhi@bioinf.uni-sb.de&quot; locator-type=&quot;email&quot;&gt;anhi@bioinf.uni-sb.de&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e99</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl312</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e71</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Family relationships: should consensus reign?--consensus clustering for protein families</dc:title>
<dc:creator>Nikolski, Macha</dc:creator>
<dc:creator>Sherman, David J.</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Reliable identification of protein families is key to phylogenetic analysis, functional annotation and the exploration of protein function diversity in a given phylogenetic branch. As more and more complete genomes are sequenced, there is a need for powerful and reliable algorithms that facilitate the construction of protein families. &lt;b&gt;Results:&lt;/b&gt; We have formulated the problem of constructing protein families as an instance of consensus clustering, for which we designed a novel algorithm that is computationally efficient in practice and produces high-quality results. Our algorithm uses an election method to construct consensus families from competing clustering computations. Our consensus clustering algorithm is tailored to serve the specific needs of comparative genomics projects. First, it provides a robust means to incorporate results from different and complementary clustering methods, thus avoiding the need for an a priori choice that may introduce computational bias in the results. Second, it is suited to large-scale projects owing to its practical efficiency. Third, it produces high-quality results where families tend to represent groupings by biological function. &lt;b&gt;Availability:&lt;/b&gt; This method has been used for the G&#233;nolevures project to compute protein families of Hemiascomycetous yeasts. The data are available online at &lt;inter-ref locator=&quot;http://cbi.labri.fr/Genolevures/fam/&quot; locator-type=&quot;url&quot;&gt;http://cbi.labri.fr/Genolevures/fam/&lt;/inter-ref&gt; &lt;b&gt;Supplementary information:&lt;/b&gt; Supplementary data are available at &lt;inter-ref locator=&quot;http://cbi.labri.fr/Genolevures/fam/&quot; locator-type=&quot;url&quot;&gt;http://cbi.labri.fr/Genolevures/fam/&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;macha@labri.fr&quot; locator-type=&quot;email&quot;&gt;macha@labri.fr&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e71</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl314</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e84</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Discovering tightly regulated and differentially expressed gene sets in whole genome expression data</dc:title>
<dc:creator>Ye, Chun</dc:creator>
<dc:creator>Eskin, Eleazar</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Recently, a new type of expression data is being collected which aims to measure the effect of genetic variation on gene expression in pathways. In these datasets, expression profiles are constructed for multiple strains of the same model organism under the same condition. The goal of analyses of these data is to find differences in regulatory patterns due to genetic variation between strains, often without a phenotype of interest in mind. We present a new method based on notions of tight regulation and differential expression to look for sets of genes which appear to be significantly affected by genetic variation. &lt;b&gt;Results:&lt;/b&gt; When we use categorical phenotype information, as in the Alzheimer&apos;s and diabetes datasets, our method finds many of the same gene sets as gene set enrichment analysis. In addition, our notion of correlated gene sets allows us to focus our efforts on biological processes subjected to tight regulation. In murine hematopoietic stem cells, we are able to discover significant gene sets independent of a phenotype of interest. Some of these gene sets are associated with several blood-related phenotypes. &lt;b&gt;Availability:&lt;/b&gt; The programs are available by request from the authors. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;cye@bioinf.ucsd.edu&quot; locator-type=&quot;email&quot;&gt;cye@bioinf.ucsd.edu&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e84</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl315</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e5</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Simulating multiplexed SNP discovery rates using base-specific cleavage and mass spectrometry</dc:title>
<dc:creator>B&#246;cker, Sebastian</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Single Nucleotide Polymorphisms (SNPs) are believed to contribute strongly to the genetic variability in living beings, and SNP and mutation discovery are of great interest in today&apos;s life sciences. A comparatively new method to discover such polymorphisms is based on base-specific cleavage, where resulting cleavage products are analyzed by mass spectrometry (MS). One particular advantage of this method is the possibility of multiplexing the biochemical reactions, i.e. examining multiple genomic regions in parallel. Simulations can help estimate the performance of a method for polymorphism discovery, allow us to evaluate the influence of method parameters on the discovery rate, and let us investigate whether the method is well suited for a certain genomic region. &lt;b&gt;Results:&lt;/b&gt; We show how to efficiently conduct such simulations for polymorphism discovery using base-specific cleavage and MS. Simulating multiplexed polymorphism discovery leads us to the problem of uniformly drawing a multiplex. Given a multiset of natural numbers, we want to uniformly draw a subset of fixed cardinality so that the elements sum up to some fixed total length. We show how to enumerate multiplex layouts using dynamic programming, which allows us to uniformly draw a multiplex. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;boecker@minet.uni-jena.de&quot; locator-type=&quot;email&quot;&gt;boecker@minet.uni-jena.de&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e5</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl291</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e77</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Optimization of probe coverage for high-resolution oligonucleotide aCGH</dc:title>
<dc:creator>Lipson, Doron</dc:creator>
<dc:creator>Yakhini, Zohar</dc:creator>
<dc:creator>Aumann, Yonatan</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; The resolution at which genomic alterations can be mapped by means of oligonucleotide aCGH (array-based comparative genomic hybridization) is limited by two factors: the availability of high-quality probes for the target genomic sequence and the array real-estate. Optimization of the probe selection process is required for arrays that are designed to probe specific genomic regions in very high resolution without compromising probe quality constraints. &lt;b&gt;Results:&lt;/b&gt; In this paper we describe a well-defined optimization problem associated with the problem of probe selection for high-resolution aCGH arrays. We propose the whenever possible &#949;-cover as a formulation that faithfully captures the requirements of the probe selection problem, and provide a fast randomized algorithm that solves the optimization problem in &lt;it&gt;O&lt;/it&gt;(&lt;it&gt;n&lt;/it&gt; log &lt;it&gt;n&lt;/it&gt;) time, as well as a deterministic algorithm with the same asymptotic performance. We apply the method in a typical high-definition array design scenario and demonstrate its superiority with respect to alternative approaches. &lt;b&gt;Availability:&lt;/b&gt; Address requests to the authors. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;dlipson@cs.technion.ac.il&quot; locator-type=&quot;email&quot;&gt;dlipson@cs.technion.ac.il&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e77</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl316</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e237</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>EBIMed--text crunching to gather facts for proteins from Medline</dc:title>
<dc:creator>Rebholz-Schuhmann, Dietrich</dc:creator>
<dc:creator>Kirsch, Harald</dc:creator>
<dc:creator>Arregui, Miguel</dc:creator>
<dc:creator>Gaudan, Sylvain</dc:creator>
<dc:creator>Riethoven, Mark</dc:creator>
<dc:creator>Stoehr, Peter</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Summary:&lt;/b&gt; To allow efficient and systematic retrieval of statements from Medline we have developed EBIMed, a service that combines document retrieval with co-occurrence-based analysis of Medline abstracts. Upon keyword query, EBIMed retrieves the abstracts from EMBL-EBI&apos;s installation of Medline and filters for sentences that contain biomedical terminology maintained in public bioinformatics resources. The extracted sentences and terminology are used to generate an overview table on proteins, Gene Ontology (GO) annotations, drugs and species used in the same biological context. All terms in retrieved abstracts and extracted sentences are linked to their entries in biomedical databases. We assessed the quality of the identification of terms and relations in the retrieved sentences. More than 90% of the protein names found indeed represented a protein. According to the analysis of four protein&#8211;protein pairs from the Wnt pathway we estimated that 37% of the statements containing such a pair mentioned a meaningful interaction and clarified the interaction of Dkk with LRP. We conclude that EBIMed improves access to information where proteins and drugs are involved in the same biological process, e.g. statements with GO annotations of proteins, protein&#8211;protein interactions and effects of drugs on proteins. &lt;b&gt;Availability:&lt;/b&gt; Available at &lt;inter-ref locator=&quot;http://www.ebi.ac.uk/Rebholz-srv/ebimed&quot; locator-type=&quot;url&quot;&gt;http://www.ebi.ac.uk/Rebholz-srv/ebimed&lt;/inter-ref&gt; &lt;b&gt;Supplementary Data:&lt;/b&gt; Supplementary Data are available at &lt;it&gt;Bioinformatics&lt;/it&gt; online. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;Rebholz@ebi.ac.uk&quot; locator-type=&quot;email&quot;&gt;Rebholz@ebi.ac.uk&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e237</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl302</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e44</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Simultaneous alignment and annotation of cis-regulatory regions</dc:title>
<dc:creator>Bais, Abha Singh</dc:creator>
<dc:creator>Grossmann, Steffen</dc:creator>
<dc:creator>Vingron, Martin</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Current methods that annotate conserved transcription factor binding sites in an alignment of two regulatory regions perform the alignment and annotation steps separately and combine the results in the end. If the site descriptions are weak or the sequence similarity is low, the local gap structure of the alignment poses a problem in detecting the conserved sites. It is therefore desirable to have an approach that is able to simultaneously consider the alignment as well as possibly matching site locations. &lt;b&gt;Results:&lt;/b&gt; With SimAnn we have developed a tool that serves exactly this purpose. By combining the annotation step and the alignment of the two sequences into one algorithm, it detects conserved sites more clearly. It has the additional advantage that all parameters are calculated based on statistical considerations. This allows for its successful application with any binding site model of interest. We present the algorithm and the approach for parameter selection and compare its performance with that of other, non-simultaneous methods on both simulated and real data. &lt;b&gt;Availability:&lt;/b&gt; A command-line based C++ implementation of SimAnn is available from the authors upon request. In addition, we provide Perl scripts for calculating the input parameters based on statistical considerations. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;bais@molgen.mpg.de&quot; locator-type=&quot;email&quot;&gt;bais@molgen.mpg.de&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e44</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl305</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e231</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>A novel pattern recognition algorithm to classify membrane protein unfolding pathways with high-throughput single-molecule force spectroscopy</dc:title>
<dc:creator>Marsico, Annalisa</dc:creator>
<dc:creator>Labudde, Dirk</dc:creator>
<dc:creator>Sapra, Tanuj</dc:creator>
<dc:creator>Muller, Daniel J.</dc:creator>
<dc:creator>Schroeder, Michael</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Misfolding of membrane proteins plays an important role in many human diseases such as &lt;it&gt;retinitis pigmentosa&lt;/it&gt;, hereditary deafness and &lt;it&gt;diabetes insipidus&lt;/it&gt;. Little is known about membrane proteins as there are only very few high-resolution structures. Single-molecule force spectroscopy is a novel technique, which measures the force necessary to pull a protein out of a membrane. Such force curves contain valuable information on the protein structure, conformation, and inter- and intra-molecular forces. High-throughput force spectroscopy experiments generate hundreds of force curves including spurious ones and good curves, which correspond to different unfolding pathways. Manual analysis of these data is a bottleneck and source of inconsistent and subjective annotation. &lt;b&gt;Results:&lt;/b&gt; We propose a novel algorithm for the identification of spurious curves and curves representing different unfolding pathways. Our algorithm proceeds in three stages: first, we reduce noise in the curves by applying dimension reduction; second, we align the curves with dynamic programming and compute pairwise distances and third, we cluster the curves based on these distances. We apply our method to a hand-curated dataset of 135 force curves of bacteriorhodopsin mutant P50A. Our algorithm achieves a success rate of 81% distinguishing spurious from good curves and a success rate of 76% classifying unfolding pathways. As a result, we discuss five different unfolding pathways of bacteriorhodopsin including three main unfolding events and several minor ones. Finally, we link folding barriers to the degree of conservation of residues. Overall, the algorithm tackles the force spectroscopy bottleneck and leads to more consistent and reproducible results paving the way for high-throughput analysis of structural features of membrane proteins. 
&lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;annalisa.marsico@biotec.tu-dresden.de&quot; locator-type=&quot;email&quot;&gt;annalisa.marsico@biotec.tu-dresden.de&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e231</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl293</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e57</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Genetic code symmetry and efficient design of GC-constrained coding sequences</dc:title>
<dc:creator>Gavish, Matan</dc:creator>
<dc:creator>Peled, Amnon</dc:creator>
<dc:creator>Chor, Benny</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Cloning of long DNA sequences (40&#8211;60 bases) into phage display libraries using polymerase chain reaction (PCR) is a low-efficiency process, in which PCR is used to incorporate a DNA insert, coding for a certain peptide, into the amplified sequence. The PCR efficiency in this process is strongly affected by the distribution of G&#8211;C bases in the amplified sequence. As any DNA insert coding for the target peptide may be attempted, there is flexibility in choosing part of the amplified sequence. Since the number of inserts coding for the same peptide is exponential in the peptide length, a computational problem naturally arises&#8212;that of efficiently finding an insert whose parameters are optimal for PCR cloning. &lt;b&gt;Results:&lt;/b&gt; The GC distribution requirements are formulated as a search problem. We developed an efficient, linear-time &#8216;one pass&#8217; algorithm for this problem. Interestingly, our algorithm relies strongly on a symmetry that we observed in the standard genetic code. Most non-standard genetic codes examined possess this symmetry as well, yet some do not. We generalize the search problem and consider the case of a non-standard, or arbitrary, genetic code where this symmetry does not necessarily hold. We solve the generalized problem in polynomial, but nonlinear, time. &lt;b&gt;Availability:&lt;/b&gt; An implementation of the proposed algorithm is available upon request from the authors. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;benny@cs.tau.ac.il&quot; locator-type=&quot;email&quot;&gt;benny@cs.tau.ac.il&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e57</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl317</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e64</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Merging microarray cell synchronization experiments through curve alignment</dc:title>
<dc:creator>Hermans, Filip</dc:creator>
<dc:creator>Tsiporkova, Elena</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; The validity of periodic cell cycle regulation studies in plants is seriously compromised by the relatively poor quality of cell synchrony that is achieved for plant suspension cultures in comparison to yeast and mammals. The present state-of-the-art plant synchronization techniques cannot offer a complete cell cycle coverage and moreover a considerable loss of cell synchrony may occur toward the end of the sampling. One possible solution is to consider combining multiple datasets, produced by different synchronization techniques and thus covering different phases of the cell cycle, in order to arrive at a better cell cycle coverage. &lt;b&gt;Results&lt;/b&gt;: We propose a method that enables pasting expression profiles from different plant cell synchronization experiments and results in an expression curve that spans more than one cell cycle. The optimal pasting overlap is determined via a dynamic time warping alignment. Consequently, the different expression time series are merged together by aggregating the corresponding expression values lying within the overlap area. We demonstrate that the periodic analysis of the merged expression profiles produces more reliable &lt;it&gt;p&lt;/it&gt;-values for periodicity. Subsequent Gene Ontology analysis of the results confirms that merging synchronization experiments is a more robust strategy for the selection of potentially periodic genes. Additional validation of the proposed algorithm on yeast data is also presented. 
&lt;b&gt;Availability&lt;/b&gt;: Results, benchmark sets and scripts are freely available at our website: &lt;inter-ref locator=&quot;http://www.psb.ugent.be/cbd/publications.php&quot; locator-type=&quot;url&quot;&gt;http://www.psb.ugent.be/cbd/publications.php&lt;/inter-ref&gt; &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;elena.tsiporkova@ugent.be&quot; locator-type=&quot;email&quot;&gt;elena.tsiporkova@ugent.be&lt;/inter-ref&gt;, &lt;inter-ref locator=&quot;fiher@psb.ugent.be&quot; locator-type=&quot;email&quot;&gt;fiher@psb.ugent.be&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e64</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl320</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e50</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>A comparative genome approach to marker ordering</dc:title>
<dc:creator>Faraut, T.</dc:creator>
<dc:creator>de Givry, S.</dc:creator>
<dc:creator>Chabrier, P.</dc:creator>
<dc:creator>Derrien, T.</dc:creator>
<dc:creator>Galibert, F.</dc:creator>
<dc:creator>Hitte, C.</dc:creator>
<dc:creator>Schiex, T.</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Genome maps are fundamental to the study of an organism and essential in the process of genome sequencing which in turn provides the ultimate map of the genome. The increased number of genomes being sequenced offers new opportunities for the mapping of closely related organisms. We propose here an algorithmic formalization of a genome comparison approach to marker ordering. &lt;b&gt;Results:&lt;/b&gt; In order to integrate a comparative mapping approach in the algorithmic process of map construction and selection, we propose to extend the usual statistical model describing the experimental data, here radiation hybrids (RH) data, in a statistical framework that models additionally the evolutionary relationships between a proposed map and a reference map: an existing map of the corresponding orthologous genes or markers in a closely related organism. This has concretely the effect of exploiting, in the process of map selection, the information of marker adjacencies in the related genome when the information provided by the experimental data is not conclusive for the purpose of ordering. In order to compute efficiently the map, we proceed to a reduction of the maximum likelihood estimation to the Traveling Salesman Problem. Experiments on simulated RH datasets as well as on a real RH dataset from the canine RH project show that maps produced using the likelihood defined by the new model are significantly better than maps built using the traditional RH model. &lt;b&gt;Availability:&lt;/b&gt; The comparative mapping approach is available in the last version of de Givry,S. &lt;it&gt;et al&lt;/it&gt;. [(2004) &lt;it&gt;Bioinformatics&lt;/it&gt;, 21, 1703&#8211;1704, &lt;inter-ref locator=&quot;www.inra.fr/mia/T/CarthaGene&quot; locator-type=&quot;url&quot;&gt;www.inra.fr/mia/T/CarthaGene&lt;/inter-ref&gt;], a free (the LKH part is free for academic use only) mapping software in C++, including LKH (Helsgaun,K. 
(2000) &lt;it&gt;Eur. J. Oper. Res&lt;/it&gt;., 126, 106&#8211;130, &lt;inter-ref locator=&quot;www.dat.ruc.dk/keld/research/LKH&quot; locator-type=&quot;url&quot;&gt;www.dat.ruc.dk/keld/research/LKH&lt;/inter-ref&gt;) for maximum likelihood computation. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;thomas.faraut@toulouse.inra.fr&quot; locator-type=&quot;email&quot;&gt;thomas.faraut@toulouse.inra.fr&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e50</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl321</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e30</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Tandem repeats over the edit distance</dc:title>
<dc:creator>Sokol, Dina</dc:creator>
<dc:creator>Benson, Gary</dc:creator>
<dc:creator>Tojeira, Justin</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation&lt;/b&gt;: A tandem repeat in DNA is a sequence of two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats occur in the genomes of both eukaryotic and prokaryotic organisms. They are important in numerous fields including disease diagnosis, mapping studies, human identity testing (DNA fingerprinting), sequence homology and population studies. Although tandem repeats have been used by biologists for many years, there are few tools available for performing an exhaustive search for all tandem repeats in a given sequence. &lt;b&gt;Results:&lt;/b&gt; In this paper we describe an efficient algorithm for finding all tandem repeats within a sequence, under the edit distance measure. The contributions of this paper are two-fold: theoretical and practical. We present a precise definition for tandem repeats over the edit distance and an efficient, deterministic algorithm for finding these repeats. &lt;b&gt;Availability:&lt;/b&gt; The algorithm has been implemented in C++, and the software is available upon request and can be used at &lt;inter-ref locator=&quot;http://www.sci.brooklyn.cuny.edu/~sokol/trepeats&quot; locator-type=&quot;url&quot;&gt;http://www.sci.brooklyn.cuny.edu/~sokol/trepeats&lt;/inter-ref&gt;. The use of this tool will assist biologists in discovering new ways that tandem repeats affect both the structure and function of DNA and protein molecules. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;sokol@sci.brooklyn.cuny.edu&quot; locator-type=&quot;email&quot;&gt;sokol@sci.brooklyn.cuny.edu&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e30</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl309</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e36</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Designing patterns for profile HMM search</dc:title>
<dc:creator>Sun, Yanni</dc:creator>
<dc:creator>Buhler, Jeremy</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; Profile HMMs are a powerful tool for modeling conserved motifs in proteins. These models are widely used by search tools to classify new protein sequences into families based on domain architecture. However, the proliferation of known motifs and new proteomic sequence data poses a computational challenge for search, requiring days of CPU time to annotate an organism&apos;s proteome. &lt;b&gt;Results:&lt;/b&gt; We use PROSITE-like patterns as a filter to speed up the comparison between protein sequence and profile HMM. A set of patterns is designed starting from the HMM, and only sequences matching one of these patterns are compared to the HMM by full dynamic programming. We give an algorithm to design patterns with maximal sensitivity subject to a bound on the false positive rate. Experiments show that our patterns typically retain at least 90% of the sensitivity of the source HMM while accelerating search by an order of magnitude. &lt;b&gt;Availability:&lt;/b&gt; Contact the first author at the address below. &lt;b&gt;Contact:&lt;/b&gt; &lt;inter-ref locator=&quot;yanni@cse.wustl.edu&quot; locator-type=&quot;email&quot;&gt;yanni@cse.wustl.edu&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e36</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl323</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<record><header><identifier>oai:open-archive.highwire.org:bioinfo:23/2/e24</identifier><datestamp>2007-01-19</datestamp><setSpec>HighWire</setSpec><setSpec>OUP</setSpec><setSpec>bioinfo:23:2</setSpec></header><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Multiple alignment by sequence annealing</dc:title>
<dc:creator>Schwartz, Ariel S.</dc:creator>
<dc:creator>Pachter, Lior</dc:creator>
<dc:subject>ORIGINAL PAPERS</dc:subject>
<dc:description> &lt;b&gt;Motivation:&lt;/b&gt; We introduce a novel approach to multiple alignment that is based on an algorithm for rapidly checking whether single matches are consistent with a partial multiple alignment. This leads to a &lt;it&gt;sequence annealing&lt;/it&gt; algorithm, which is an incremental method for building multiple sequence alignments one match at a time. Our approach improves significantly on the standard progressive alignment approach to multiple alignment. &lt;b&gt;Results&lt;/b&gt;: The sequence annealing algorithm performs well on benchmark test sets of protein sequences. It is not only sensitive, but also specific, drastically reducing the number of incorrectly aligned residues in comparison to other programs. The method allows for adjustment of the sensitivity/specificity tradeoff and can be used to reliably identify homologous regions among protein sequences. &lt;b&gt;Availability&lt;/b&gt;: An implementation of the sequence annealing algorithm is available at &lt;inter-ref locator=&quot;http://bio.math.berkeley.edu/amap/&quot; locator-type=&quot;url&quot;&gt;http://bio.math.berkeley.edu/amap/&lt;/inter-ref&gt; &lt;b&gt;Contact&lt;/b&gt;: &lt;inter-ref locator=&quot;sariel@cs.berkeley.edu&quot; locator-type=&quot;email&quot;&gt;sariel@cs.berkeley.edu&lt;/inter-ref&gt; </dc:description>
<dc:publisher>Oxford University Press</dc:publisher>
<dc:date>2007-01-19</dc:date>
<dc:type>TEXT</dc:type>
<dc:format>text/html</dc:format>
<dc:identifier>http://bioinformatics.oxfordjournals.org/cgi/content/short/23/2/e24</dc:identifier>
<dc:identifier>http://dx.doi.org/10.1093/bioinformatics/btl311</dc:identifier>
<dc:language>en</dc:language>
<dc:rights>Copyright (C) 2007, Oxford University Press</dc:rights>
</oai_dc:dc>
</metadata></record>
<resumptionToken expirationDate="2009-11-21T07:28:48Z">1258784928457!0001-01-01!9999-12-31!bioinfo:23!100!oai_dc</resumptionToken></ListRecords></OAI-PMH>