Sunday, June 03, 2007

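The Secret to Writing Good Scientific Papers in English (repost)

The Secret to Writing Good Scientific Papers in English:
Actively Meet Readers' Expectations and Answer Experts' Likely Objections in Advance

Yaoqi Zhou
School of Informatics, Indiana University
Center for Computational Biology and Bioinformatics, Indiana University School of Medicine

Dedicated to my alma mater, the University of Science and Technology of China, on its fiftieth anniversary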

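Preface
My first attempt at writing a scientific paper in English was translating my bachelor's thesis from USTC into English. By the time I received my Ph.D. from the State University of New York in 1990, I had published more than 20 papers in English. Yet my understanding of how to write a high-quality scientific paper was still rudimentary: all I knew was to keep grammatical errors to a minimum. This was because, most of the time, I gladly accepted the revisions made by my Ph.D. advisors, Dr. George Stell and Dr. Harold Friedman, without knowing why they made those changes and without asking. This continued until I went to North Carolina State University as a postdoc. My postdoctoral advisor, Dr. Carol Hall, suggested that I attend a two-day writing workshop at nearby Duke University. The workshop, run by Professor Gopen, was a revelation. For the first time, I learned that readers bring expectations to their reading, and that the most effective way to write a good scientific paper is to meet those expectations. That writing course helped me win my first postdoctoral fellowship, which gave me the chance to join Dr. Martin Karplus's group at Harvard University. During my five years at Harvard, under Professor Karplus's guidance, I came to realize that a good paper requires thorough self-review, inside and out, in both depth and breadth. Now I am a professor with my own research group, and I often review manuscripts. I feel it is necessary for my Ph.D. students and postdocs to learn to write well.
I do not consider myself a writing expert. My own papers are often rejected for one reason or another. But I believe that if I share my understanding of writing and the lessons I have learned, others may avoid some of the detours I took. As I have not written in Chinese for many years, corrections are welcome. Please send correspondence to:
yqzhou@iupui.edu. You are welcome to visit my website: http://sparks.informatics.iupui.edu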

Related Links
1. Download Yaoqi Zhou's article: http://csbl.bmb.uga.edu/~ffzhou/how-to-write.pdf
2. The English article that Zhou drew on: The Science of Scientific Writing by G. D. Gopen and J. A. Swan, American Scientist, 78, 550-558, 1990. http://www.americanscientist.org/template/AssetDetail/assetid/23947?fulltext=true&print=yes

Wednesday, April 18, 2007

A Pro-Linux Cartoon

I saw this picture at http://www.whylinuxisbetter.net/. Creative and interesting!

Monday, March 12, 2007

Recommended Firefox Extensions

1. Tab Mix Plus

Tab Mix Plus enhances Firefox's tab browsing capabilities. It includes such features as duplicating tabs, controlling tab focus, tab clicking options, undoing closed tabs and windows, and much more. I strongly recommend the "Select the tab pointed" option (Tool->Tab Mix Plus Option->Mouse->Mouse Gestures). It greatly decreases the number of clicks needed when browsing.
2. Session Manager
Session Manager saves and restores the state of all windows, either when you want it or automatically at startup and after crashes. For example, if you start your work every day by opening the same webpages, you can store them in a session. It also lets you reopen (accidentally) closed windows and tabs. If you're afraid of losing data while browsing, this extension allows you to relax.
3. Firefox Showcase
If you habitually find yourself awash in open tabs (one of my classmates often ends up in this situation), clicking around looking for the page you need, Firefox Showcase will save you a lot of aggravation. Once you install the extension, you'll have a new Showcase submenu under the View menu. From here you can choose to show thumbnails of all tabs in the current window or all tabs in all windows. The extension has lots of options and keyboard shortcuts, but I never dive into them. Simply press F12 to get the thumbnail view, use the up and down arrow keys to select the tab you want, and press Enter to see the full view of the selected tab. You can also exit the thumbnail view by pressing Esc. That's all. For more, see View->Showcase, and for a sidebar view, follow View->Sidebar->Showcase Sidebar.
4. Download Statusbar
If you're tired of that sometimes-pesky Downloads window that pops up whenever you download a file in Firefox, Download Statusbar suppresses it and instead provides the same information in the status bar at the bottom of the browser window. You can also roll your mouse over the filename to get a pop-up tooltip with some extra information about your download.
5. DownThemAll
DownThemAll is a download manager and accelerator. It lets you download all the links or images contained in a webpage and much more: you can refine your downloads by fully customizable criteria to get only what you really want. Simply put, it saves you the time it takes to open a shell and use wget.
6. Fotofox
Using Fotofox, you can grab a picture from any page into the Fotofox sidebar, title it, tag it and upload it to your Flickr album with a single click. It is also compatible with other picture hosting sites such as Tabblo, 23hq, Smugmug, Marela, and Kodak EasyShare Gallery.
7. Fasterfox
I don't know exactly how well it performs.
8. Google Browser Sync
Google Browser Sync for Firefox is an extension that continuously synchronizes your browser settings, including bookmarks, history, persistent cookies, and saved passwords, when you use Firefox on several computers. It also allows you to restore open tabs and windows across different machines and browser sessions. You can choose whether or not to sync cookies, but you can't make that decision for individual cookies. Suppose that you are using a public computer. First, install this extension and sign in with your Google account. Google Browser Sync then synchronizes your Firefox settings from the server, so you get a familiar Firefox on the public computer. When you are about to leave, choose Tools->Google Browser Sync->Stop Syncing and click Tools->Clear Private Data... (Ctrl+Shift+Delete) to remove the personal data you left on the public computer.
9. Google Notebook
In my opinion Google Notebook is a simple but great product, though most people ignore it. This is possibly because there has been no convenient one-click client interface and because the usability of the web interface is poor. For example, I have long been hoping for tagging and for per-note sharing. The Google Notebook extension is that client interface. It simplifies the process of making notes: just select any part of a webpage, whether text or picture, and click Note This in the right-click pop-up menu. I believe Google Notebook will outrun Clipmarks and other similar services.
10. Firefox Google Bookmarks
Firefox Google Bookmarks (GBookmarks) creates a menu for accessing your Google bookmarks from any computer. (Your Google bookmarks reside in your search history.) It also serves as a backup mechanism for all bookmarks. This extension may overlap with Google Browser Sync. Personally, I treat Firefox's built-in bookmarks as a lightweight store for the links I use every day, and GBookmarks as a heavyweight repository of all links that may be useful some day.
11. StumbleUpon
StumbleUpon is also a bookmark service, one that incorporates social network elements. It resembles delicious. StumbleUpon lets you "channelsurf" the best-reviewed sites on the web. It is a collaborative surfing tool for browsing, reviewing and sharing great sites with like-minded people. This helps you find interesting webpages you wouldn't think to search for. You can also share pages of interest within a community. Notably, it also adds other people's ratings and reviews to your Google search results.
12. Greasemonkey
Greasemonkey basically allows you to add JavaScript to any web page, which gives you a great deal of control over the behavior of web pages. Greasemonkey is not for the faint of heart. The good news is that there are many generous souls out there who share the scripts they create. Check out userscripts.org for a script repository. If you want to write your own scripts, try diveintogreasemonkey.org or pick up Mark Pilgrim's Greasemonkey Hacks from O'Reilly Media. Personally I strongly recommend Gmail Macros. Once installed, it gives Gmail convenient keyboard shortcuts like those in Google Reader.
13. Adblock
Adblock is a content filtering plug-in for the Mozilla and Firefox browsers. It allows the user to specify filters, which remove unwanted content based on the source address. Adblock supports two types of filters: simple and regular expression. Adblock also provides a default list of filters, which is enough for a lazy person like me.
14. ScrapBook
ScrapBook is a Firefox extension that helps you save web pages and easily manage collections. This enables you to browse saved pages offline. You can also copy the ScrapBook folder directly from your Firefox profile folder to other computers and view what you saved there.
References:
20 must-have Firefox extensions
Firefox Add-ons Recommended Add-ons
千种风情千种树: My Firefox Extensions

Friday, February 09, 2007

Alon Halevy and Peter Norvig


Alon Halevy and Peter Norvig, two Googlers, have been selected for the 2006 class of ACM Fellows.

Peter, who was Google's first director of search quality and is currently director of Google Research, has been recognized for his many contributions to the disciplines of artificial intelligence and information retrieval. His personal website is http://norvig.com/.

Alon, who leads one of Google's structured data initiatives, has been honored for his contributions to data integration and knowledge representation. His old personal website is http://www.cs.washington.edu/homes/alon/, his new personal website is http://alonhalevy.googlepages.com/, and his blog is http://www.alonhalevy.blogspot.com/

Saturday, February 03, 2007

An Elegant Blog: Designer's Block


Designer's Block is an elegant blog. The writer is from the UK. I really like the design (see left) and its graceful, mysterious style. I will use it as my blog's background.

Update: In fact, the painter of the paintings above is Melissa Mossart. Her website has many more paintings in a similar style.

See more paintings at http://www.melissamossart.com/paint.htm
Melissa Mossart and nine other women artists' gallery: http://www.tenwomen.org/venicegallery.html

Thursday, February 01, 2007

Research Projects on Microarray Analysis


BioConductor


Bioconductor is an open source and open development software project for the analysis and comprehension of genomic data.

Bioconductor is primarily based on the R programming language but we do accept contributions in any programming language. Although initial efforts focused primarily on DNA microarray data analysis, many of the software tools are general and can be used broadly for the analysis of genomic data, such as SAGE, sequence, or SNP data.

The broad goals of the projects are to

  • provide access to a wide range of powerful statistical and graphical methods for the analysis of genomic data;
  • facilitate the integration of biological metadata in the analysis of experimental data: e.g. literature data from PubMed, annotation data from LocusLink;
  • allow the rapid development of extensible, scalable, and interoperable software;
  • promote high-quality documentation and reproducible research;
  • provide training in computational and statistical methods for the analysis of genomic data.
If you are new to Bioconductor, you might consider buying the book Bioinformatics and Computational Biology Solutions Using R and Bioconductor.


Gene Ontology


The Gene Ontology (GO) project is a collaborative effort to address the need for consistent descriptions of gene products in different databases. The GO project has developed three structured controlled vocabularies (ontologies) that describe gene products in terms of their associated biological processes, cellular components and molecular functions in a species-independent manner. There are three separate aspects to this effort: first, the development and maintenance of the ontologies themselves; second, the annotation of gene products, which entails making associations between the ontologies and the genes and gene products in the collaborating databases; and third, development of tools that facilitate the creation, maintenance and use of ontologies.

The use of GO terms by collaborating databases facilitates uniform queries across them. The controlled vocabularies are structured so that they can be queried at different levels: for example, you can use GO to find all the gene products in the mouse genome that are involved in signal transduction, or you can zoom in on all the receptor tyrosine kinases. This structure also allows annotators to assign properties to genes or gene products at different levels, depending on the depth of knowledge about that entity.
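The idea of querying at different levels can be sketched in a few lines of Python. This is only a toy illustration: the term names, the gene names and the `PARENTS` and `ANNOTATIONS` tables are invented for the example and are not real GO data or a real GO API. A query for a general term collects every gene product annotated to that term or to any of its more specific descendants.

```python
# Hypothetical toy ontology (not real GO data): child term -> parent terms.
PARENTS = {
    "signal transduction": ["biological process"],
    "receptor tyrosine kinase signaling": ["signal transduction"],
    "G-protein coupled receptor signaling": ["signal transduction"],
}

# Hypothetical annotations: gene product -> most specific known terms.
ANNOTATIONS = {
    "Egfr": ["receptor tyrosine kinase signaling"],
    "Kit": ["receptor tyrosine kinase signaling"],
    "Adrb2": ["G-protein coupled receptor signaling"],
    "Actb": ["biological process"],
}


def ancestors(term):
    """Return the term itself plus every term reachable through PARENTS."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def genes_for(term):
    """Genes annotated to `term` or to any of its descendants."""
    return sorted(
        gene
        for gene, terms in ANNOTATIONS.items()
        if any(term in ancestors(t) for t in terms)
    )


print(genes_for("signal transduction"))
print(genes_for("receptor tyrosine kinase signaling"))
```

The first query returns all three signaling genes, while the second "zooms in" on the two receptor tyrosine kinase entries only.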

International HapMap Project


The HapMap is a catalog of common genetic variants that occur in human beings. It describes what these variants are, where they occur in our DNA, and how they are distributed among people within populations and among populations in different parts of the world. The International HapMap Project is not using the information in the HapMap to establish connections between particular genetic variants and diseases. Rather, the Project is designed to provide information that other researchers can use to link genetic variants to the risk for specific illnesses, which will lead to new methods of preventing, diagnosing, and treating diseases.

Microarray Gene Expression Data Society - MGED Society

The Microarray Gene Expression Data (MGED) Society is an international organisation of biologists, computer scientists, and data analysts that aims to facilitate the sharing of microarray data generated by functional genomics and proteomics experiments. The current focus is on establishing standards for microarray data annotation and exchange, facilitating the creation of microarray databases and related software implementing these standards, and promoting the sharing of high quality, well annotated data within the life sciences community. A long-term goal is to extend the mission to other functional genomics and proteomics high throughput technologies.


The MolTools consortium



The MolTools consortium started on January 1st 2004, as a joint research programme bringing together 12 leading European academic groups, four biotech SMEs and one US laboratory working in the area of postgenomic technology development. The partners have pioneered a series of important molecular techniques and will now work together to establish next-generation tools for molecular analysis. Its scientific aims are to establish genome analysis technologies set to monitor extensive molecular repertoires, and with the capacity to investigate even single molecules. Its current research projects include:

Tuesday, January 30, 2007

Resources of Genomics and Microarray Analysis

Genomics and Microarrays:
Stanford Microarray Database:
storing lots of raw and normalized data from microarray experiments

Genomics tutorial at Genome Canada
http://www.genomecanada.ca/xpublic/dnaBasics/index.asp?l=e

Introductions to microarray at NCBI
http://www.ncbi.nlm.nih.gov/About/primer/microarrays.html

Microarray (movie)
http://www.broad.harvard.edu/chembio/lab_schreiber/anims/videos/microarray.html

Other resources for microarrays
http://www.learner.org/channel/courses/biology/units/genom/images.html

Image analysis for microarrays
http://www.maths.usyd.edu.au/u/jeany/ (publication)
http://www.stat.berkeley.edu/users/terry/zarray/Talks/image/jpegindex.html
http://cmm.ensmp.fr/~angulo/research/dnamicro.htm

Bioconductor
The richest source of freely available packages for genomic data analysis

Nature article
the perspective of biologists facing heaps of noisy genomic data including their urgent need for better methods and computationally and statistically skilled support.

dChip Software
Analysis and visualization of gene expression and SNP microarrays
http://biosun1.harvard.edu/complab/dchip/



Biology:

Retroviruses
http://www.whfreeman.com/kuby/content/anm/kb03an01.htm (FLASH)

Human Genome Project (movies)
http://www.genome.gov/Pages/EducationKit/download.html

the central dogma of molecular biology (wonderful movie):
http://www.genome.gov/Pages/EducationKit/video/qt/3D.mov

EBM & Clinical Research Workstation menu
http://www.shdem.com/ebm/default.asp

Biochemistry & Epidemiology useful link:
http://www.med-ed.virginia.edu/menu/otherMedEd.cfm
 
Statistics:

online textbook for statistics
http://www.stat.berkeley.edu/~stark/SticiGui/Text/toc.htm

Terry Speed's Microarray homepage
statistical challenges related to microarray data
new webpage: http://www.stat.berkeley.edu/~terry/Group/home.html

Statistics
http://www.bettycjung.net/statsiteS.htm

Computing technology

Introduction to R for biologists (by Natalie Roberts, WEHI, Melbourne)
R manuals under link "Manuals" (left column)
manuals: http://cran.r-project.org/manuals.html
R tutorial: http://www.personality-project.org/r/
R package: Statistics for Microarray Analysis

Directory of Blogs About Microarray Analysis: Draft

Biodefense Bioinformatics Blog http://ai59694.blogspot.com/
Rotten bananas - http://heathermaughan.blogspot.com/index.html
Synthetic Biology and Gene Synthesis - http://syntheticbio.blogspot.com
Genomics Online - http://genomics-info.blogspot.com/index.html
formerscienceguy - http://formerscienceguy.blogspot.com/index.html

Friday, January 26, 2007

Alan Perelson

Dr. Perelson received B.S. degrees in Life Science and Electrical Engineering from MIT in 1967, and a Ph.D. in Biophysics, under the supervision of Aharon Katchalsky-Katzir, from UC Berkeley in 1972. He was Acting Assistant Professor, Division of Medical Physics, Berkeley, in 1973 and a postdoctoral fellow at the Department of Chemical Engineering, University of Minnesota, in 1974. He was a staff member in the Theoretical Biology and Biophysics Group at Los Alamos National Laboratory from 1974 to 1991, a Laboratory Fellow from 1991 to 2002, and head of the Theoretical Biology and Biophysics Group from 1995 to 2001, and he is currently a Los Alamos National Laboratory Senior Fellow. He spent the 1978 and 1979 academic years at Brown University as an Assistant Professor of Medical Sciences in the Division of Biology and Medicine and the Lefschetz Center for Dynamical Systems, was a visiting scientist at the Mathematical Institute, Oxford University in 1986, and was a visiting professor of Physics at Ecole Normale Superieure, Paris in 1990 and at the University of Paris VII in 1992. He is also a member of the Science Board and head of the Theoretical Immunology Program at the Santa Fe Institute, an adjunct professor of Bioinformatics at Boston University, and an adjunct professor of biology at the University of New Mexico.

Research Interests

Mathematical and theoretical biology, with an emphasis on problems in immunology, virology,
and cell and molecular biology.

Time Zones of the United States

PST: Washington, Oregon, Nevada, California

MT: Montana, Wyoming, Idaho, Utah, Colorado, Arizona, New Mexico and parts of
North Dakota, South Dakota and Nebraska

CT: Parts of North Dakota, South Dakota and Nebraska, Kansas, Oklahoma, Texas,
Minnesota, Iowa, Missouri, Arkansas, Louisiana, Wisconsin, Illinois,
Tennessee, Mississippi, Alabama

EST: Michigan, Indiana, Ohio, Kentucky, Georgia, New York, Pennsylvania, West
Virginia, Virginia, North Carolina, South Carolina, Florida, Washington DC,
New Jersey, Connecticut, Rhode Island, Massachusetts, New Hampshire,
Vermont, Maine

Thursday, January 25, 2007

Friday, January 19, 2007

A Haplotype Map of the Human Genome

David Altshuler
Harvard Medical School, Massachusetts General Hospital, Whitehead Institute

Eric Lander
Whitehead Institute and MIT

Goal

The next key step of the Human Genome Project (HGP) (following the creation of the genetic, physical, sequence and SNP maps) is the generation of a "haplotype" map of the human genome. Such a "haplotype" map consists of a high density of SNPs defining the small number of ancestral haplotypes (blocks of tightly correlated genetic variants) in each region of the human genome. Knowledge of these haplotypes will allow comprehensive and efficient testing of the association of human genes with human diseases. The haplotype map can and should be generated rapidly and should be made freely available to researchers worldwide.

Background

A haplotype map of the human genome has become both justified and practical due to significant advances over the last two years.

Specifically, these advances include:

  • Genomic Sequence: The development of a complete genome sequence - integrated with human genes and annotations - providing a reference framework on which to layer knowledge about allelic variation.

  • Genetic Variants: The development of a dense (and rapidly growing) map of 1.4 million human SNPs provides a genome-wide resource of genetic variation adequate to uniquely tag the vast majority of human haplotypes.

  • Genotyping Technology: The development of high-throughput methods, allowing a rapid, efficient and cost-effective experimental approach to a project of the required scale.

  • Long-range LD: The discovery that human SNPs display strong linkage disequilibrium (LD or allelic association) over large distances. LD is detectable over distances in the range of 100kb and is extremely strong over regions spanning several tens of kb (the size of typical genes). For such regions, the vast majority of chromosomes in the population carry one of a handful of highly conserved haplotypes. As a result, genetic diversity in the region can be represented by a small number of well-chosen SNPs.
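To make "linkage disequilibrium" concrete, here is a minimal Python sketch of the standard pairwise LD statistics (D, D' and r^2) for two biallelic SNPs, computed from counts of the four possible two-SNP haplotypes. The function name `ld_stats` and the example counts are made up for illustration; the white paper itself does not specify which LD measure was used.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) for two biallelic SNPs.

    Arguments are counts of the four haplotypes, where A/a are the
    alleles at the first SNP and B/b the alleles at the second.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total  # frequency of allele A
    p_B = (n_AB + n_aB) / total  # frequency of allele B

    # D: how far the AB haplotype frequency is from independence.
    d = p_AB - p_A * p_B

    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    r2 = d * d / denom

    # D' rescales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime, r2


# Two SNPs that always travel together (only AB and ab chromosomes observed).
print(ld_stats(50, 0, 0, 50))
# Two SNPs that are statistically independent.
print(ld_stats(25, 25, 25, 25))
```

When only two of the four haplotypes are ever observed, r^2 is 1 and one SNP is a perfect proxy for the other, which is the situation that makes a small number of well-chosen SNPs sufficient.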

Impact on biomedical research

The availability of a haplotype map of the human genome will have a substantial impact on human genetic studies.

Specifically, these studies include:

  • Comprehensive association studies of individual genes. The association of genes with disease has traditionally been probed by testing individual SNPs one at a time. The drawback to this approach is that the task is never-ending: one can exclude particular SNPs as playing a role, but one cannot exclude a gene. Once the haplotype structure of the genome is defined, one can (1) comprehensively test all significant haplotypes in the gene, and (2) decrease the number of SNPs needed by selecting a subset that defines the population variability. This will allow haplotype studies of individual genomic loci in an unbiased manner, without assumptions about the locations of causal mutations in coding regions, promoters or regulatory sites at a significant distance away. It will also greatly decrease the technical and financial barriers faced by laboratories in undertaking such work.

  • Genome-wide association studies. A genome-wide haplotype map will make possible whole-genome scans for association in the population. Rather than focusing only on 'candidate' genes, it will become possible to search the genome in an unbiased manner for genes whose common variation contributes to disease in the population. Routine use of genome-wide association studies will also require further decreases in genotyping costs, but such decreases are likely to be driven by the development of the haplotype map.

  • Human population structure and history. Knowledge of haplotypes will transform our understanding of human population structure and history. The LD pattern turns out to be an extremely sensitive indicator of population history, because the multi-allelic nature of haplotypes provides rich detail and because the breakdown of haplotypes follows a predictable clock set by recombination rates. In particular, LD patterns are more powerful than traditional studies of allele frequencies per se. Information about human population history is interesting in its own right, but is also very valuable in the design of medical studies (such as admixture mapping).
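The first item above mentions selecting a subset of SNPs that defines the population variability. One simple way to picture this is a greedy search for "tag" SNPs that are enough to tell the common haplotypes apart. The Python sketch below is an illustration under that assumption, not the method proposed by the authors; the function names and the toy haplotypes (strings of 0/1 alleles) are hypothetical.

```python
from itertools import combinations


def _pairs_split(snp, pairs, haplotypes):
    """Count haplotype pairs in `pairs` that differ at position `snp`."""
    return sum(1 for i, j in pairs if haplotypes[i][snp] != haplotypes[j][snp])


def select_tag_snps(haplotypes):
    """Greedily pick SNP positions that distinguish all distinct haplotypes."""
    distinct = sorted(set(haplotypes))
    if not distinct:
        return []
    n_snps = len(distinct[0])
    # Every pair of distinct haplotypes still needs a SNP that separates it.
    unresolved = set(combinations(range(len(distinct)), 2))
    tags = []
    while unresolved:
        # Choose the SNP that separates the most unresolved pairs.
        best = max(range(n_snps), key=lambda s: _pairs_split(s, unresolved, distinct))
        tags.append(best)
        unresolved = {
            (i, j) for i, j in unresolved if distinct[i][best] == distinct[j][best]
        }
    return sorted(tags)


# Six SNPs in two perfectly correlated groups (0-2 and 3-5).
haplotypes = ["000000", "000111", "111000", "111111"]
print(select_tag_snps(haplotypes))
```

In this toy region, two SNPs (one per correlated group) carry the same information as all six, which is the kind of saving the haplotype map is meant to provide. A greedy choice is not guaranteed to find the smallest possible set in general.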

Technical Issues

Generating a haplotype map would involve the following components:

  • Population Samples. Development of appropriate population samples, consisting of parent-offspring trios (to allow inference of haplotypes). We estimate that a total of about 300 samples will be needed, representing major ethnic groups in a manner appropriate for generating a map that can be used for medical studies in all populations. The population samples should be a renewable resource (i.e., immortalized cell lines).

  • Sample and Data Availability. The samples should be made freely available so that any interested scientific group can contribute data (in the manner of the CEPH panel and the DNA Polymorphism Discovery Resource). Conversely, all data generated by the project should be immediately released into the public domain without restrictions of any kind.

  • Numbers of SNPs to be genotyped. It is estimated that generating the haplotype map will require successful genotyping of 450,000 SNPs, which will in turn require initial testing of some 800,000 to 900,000 SNPs. The required scale is now well within reach: the Whitehead and Sanger Centre are each currently engaged in pilot projects involving 25,000 SNPs using automated genotyping setup and MALDI-TOF-based detection. Given the required scale and efficiencies, it is likely that the bulk of the work should be performed by a few large groups, but all groups should be encouraged to participate in the project by analyzing genes and regions of interest.

  • Analytical Tools. The project will require various analytical tools to readily define haplotype blocks from genotype data, software systems to aid in the hierarchical selection of SNPs to fill in blocks, and databases to make the information maximally useful to the community. Prototype systems have been developed, but focused effort will be needed to develop mature systems.
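The first item above notes that parent-offspring trios allow inference of haplotypes. A minimal Python sketch of the underlying idea follows: at each SNP, a child's two alleles are assigned to the father and the mother whenever the parental genotypes leave only one possibility. The function name, the genotype encoding (a pair of alleles per SNP) and the example data are assumptions made for this illustration; real phasing software handles missing data, genotyping errors and statistical resolution of ambiguous sites.

```python
def phase_child(father, mother, child):
    """Split a child's genotypes into paternal and maternal haplotypes.

    Each argument is a list of (allele, allele) tuples, one per SNP.
    Sites where both assignments are possible are reported as None.
    """
    paternal, maternal = [], []
    for f, m, (a, b) in zip(father, mother, child):
        # Try both ways of assigning the child's alleles to the parents.
        options = {
            (x, y) for x, y in ((a, b), (b, a)) if x in f and y in m
        }
        if not options:
            raise ValueError("genotypes are not consistent with Mendelian inheritance")
        if len(options) == 1:
            pat, mat = next(iter(options))
        else:
            pat, mat = None, None  # e.g. all three individuals heterozygous
        paternal.append(pat)
        maternal.append(mat)
    return paternal, maternal


father = [("A", "A"), ("C", "T"), ("G", "G"), ("A", "G")]
mother = [("A", "G"), ("T", "T"), ("G", "T"), ("A", "G")]
child = [("A", "G"), ("C", "T"), ("G", "G"), ("A", "G")]

pat, mat = phase_child(father, mother, child)
print(pat)
print(mat)
```

The last SNP stays unresolved because father, mother and child are all heterozygous there, which is the one configuration a trio alone cannot phase.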