Announcement: Lecture by Dr. Gene Myers

Dr. Gene Myers laid the foundations of modern genome science.

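He co-developed BLAST, and the suffix array he proposed has become a foundational data structure for a wide range of alignment tools (such as BWA). To demonstrate the effectiveness of the whole genome shotgun sequencing method he advocated, he developed a human genome assembler at Celera, and the method has since become the standard approach to genome assembly.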

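A quarter of a century has passed since whole-genome sequencing became possible, yet important regions that are difficult to sequence remain. Examples include centromeres, duplicated genomic regions such as HLA/MSA, homologous chromosomes, and polyploid genomes.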

Is complete sequencing possible? We have asked Dr. Myers to give a lecture on this question.

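* Date and time: Friday, March 22, 2019, from 1:00 p.m., about 2 hours (including Q&A)
* Venue: Room 412, Faculty of Science Bldg. 3, The University of Tokyo
  https://www.u-tokyo.ac.jp/campusmap/cam01_06_03_j.html
* Organizer: Shinichi Morishita Laboratory, The University of Tokyo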

------------------------------------------------------------
Is Perfect de novo DNA Assembly Possible?
Gene Myers
Director & Founding Chair of Systems Biology
Max Planck Institute for Molecular Cell Biology and Genetics

We are about to enter an era of DNA sequencing where one can, in the near future, produce a de novo reference-quality genome of any living species for 1,000 euros. This ability will revolutionize ecology, evolution, and conservation science and effectively mark the beginning of a new exploration of the natural world.

The technological driver is the advent of long-read sequencers such as the PacBio Sequel and the Oxford Nanopore PromethION. The long reads in effect make assembly easier, and one sees corresponding improvements in the continuity of the results, but the underlying algorithms are effectively the same as those first developed 20 years ago, and repeats at the scale of the read length are still an issue. Indeed, truly better assembly requires finding all artifacts in the reads and resolving repeat families, topics that I don’t think have received sufficient attention and that are particularly critical issues for long reads.

Therefore we are developing algorithms that carefully analyze a long-read shotgun data set before assembly in an attempt to perfect and haplotype-phase the reads beforehand. This has proved particularly difficult in the face of an 11-13% sequencing error rate. But using a circular consensus protocol, one can effectively start with reads that have only a 0.5% error rate but are only 15 kbp long, versus the 30-40 kbp read length possible at higher error. Solutions to the problems of perfecting reads and resolving repeats are still required, but are substantially easier. An interesting question is which kind of data is better. And can one assemble the data perfectly?


Dr. Myers taught at UC Berkeley and, after conducting research at the Howard Hughes Medical Institute, now leads the systems biology division at the Max Planck Institute. He also personally develops algorithms and software for genome assembly.
Reference URL: https://dazzlerblog.wordpress.com/author/thegenemyers/

Published: 2019.02.04