This NSF-funded project is a collaborative effort between the Goldberg laboratory at UCLA and the Harada laboratory at UC Davis to identify all the genes required to make a soybean seed. We used soybean and Arabidopsis Affymetrix GeneChips, Laser Capture Microdissection (LCM), and next-generation high-throughput sequencing technologies to profile the mRNA sets present in different seed regions and compartments throughout development. Our long-term goal is to understand the genes and regulatory networks required to make a seed. Click here to learn more about this project and what has been accomplished.


To date, we have profiled the mRNA sets present in 71 soybean and Arabidopsis seed compartments from preglobular- to early maturation-stage seeds. All GeneChip data are stored in this web-based database. The Soybean GeneChip Experiments and Arabidopsis GeneChip Experiments sections at the top of the page provide built-in analysis tools that let users browse the database by probe identification, gene ontology, and functional category, and compare gene activity in different seed compartments during development.


Furthermore, we used next-generation sequencing technology (RNA-Seq) to profile mRNAs and non-coding small RNAs on a whole-genome basis during soybean seed development, including (a) whole seeds at all stages of development, from fertilization to dormancy, (b) isolated cotyledons from maturation-stage seeds and germinating seedlings, (c) isolated axis and seed coat at the early-maturation stage, (d) 40 different LCM-captured seed compartments from the globular to the early-maturation stage, (e) post-germination seedlings, and (f) mature plant leaves, roots, stems, and flowers. To date, we have generated ~7 million reads, or 400 gigabytes of sequences. All RNA-Seq and small RNA-Seq data generated for this NSF-sponsored project have been deposited in GEO. Click here to learn more.


To study the functions of transcription factor genes active during soybean seed development, we collaborated with Dr. David Somers (Monsanto) to generate a collection of soybean seed RNAi knock-out lines for 63 transcription factor genes that are expressed in specific seed regions at the globular, heart, cotyledon, and early-maturation stages of development. Click here to view the complete list of RNAi lines.


We also annotated the Affymetrix soybean array and the Arabidopsis ATH1 array. To download the annotations and summaries, click Soybean Annotation or Arabidopsis Annotation.

To associate soybean array probe sets with soybean gene models, we also mapped individual probes to the predicted soybean gene models generated by the Department of Energy (DOE) Joint Genome Institute (Glyma version 1.01, released April 7, 2009) using BLASTN, requiring ≥ 23/25 nucleotide identity. Click here to download the association of soybean array probe sets and soybean gene models under the Annotation section.
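The ≥ 23/25 identity filter can be illustrated with a minimal Python sketch. It is not the project's actual pipeline. It assumes the BLASTN hits were exported as tab-separated lines of probe ID, gene model ID, and number of identical nucleotides (for example, BLAST+ can report these columns with `-outfmt "6 qseqid sseqid nident"`). The function names, the probe-to-probe-set lookup, and all IDs in the example are hypothetical, and counting passing probes per gene model is only one possible way to summarize the association.

```python
from collections import defaultdict

PROBE_LENGTH = 25    # Affymetrix GeneChip probes are 25-mers
MIN_IDENTICAL = 23   # threshold stated in the text: >= 23 of 25 nucleotides identical


def parse_hits(lines):
    """Yield (probe_id, gene_model, n_identical) from tab-separated hit lines.

    Assumed column order (an assumption, not the project's format):
    probe ID, gene model ID, number of identical nucleotides.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        probe_id, gene_model, n_identical = line.split("\t")[:3]
        yield probe_id, gene_model, int(n_identical)


def associate(lines, probe_to_set):
    """Return {probe_set: {gene_model: number of probes passing the filter}}."""
    passing = defaultdict(lambda: defaultdict(set))
    for probe_id, gene_model, n_identical in parse_hits(lines):
        if n_identical >= MIN_IDENTICAL:
            passing[probe_to_set[probe_id]][gene_model].add(probe_id)
    return {
        probe_set: {gene: len(probes) for gene, probes in genes.items()}
        for probe_set, genes in passing.items()
    }


if __name__ == "__main__":
    # Hypothetical probe and gene model IDs, for illustration only.
    example_hits = [
        "probe1\tGlyma01g00010\t25",
        "probe2\tGlyma01g00010\t23",
        "probe3\tGlyma01g00020\t22",  # below threshold, discarded
    ]
    example_probe_to_set = {"probe1": "setA", "probe2": "setA", "probe3": "setA"}
    print(associate(example_hits, example_probe_to_set))
```

In this sketch a probe supports a gene model only when at least 23 of its 25 nucleotides match, and each probe set is then summarized by how many of its probes support each gene model.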

A Soybean Whole Transcript (WT) Array was created by a collaborative effort between the Goldberg laboratory and Affymetrix to interrogate all the genes in the genome. Click here to learn more.


Professor Goldberg created and currently teaches a novel course series that is part of this NSF-sponsored project. The series consists of a lecture course, HC70A - Genetic Engineering in Medicine, Agriculture, and Law, and a lab course, HC70AL - Gene Discovery Lab. It targets non-science majors and entering life science students. Professor Goldberg's objective is to teach undergraduates about the excitement of discovery, the process by which science is carried out, how advances in biology affect our daily lives, and how science is taught. Professor Goldberg and Professor Harada also used long-distance learning to teach students simultaneously at UCLA, UC Davis, and Tuskegee University. Click here to learn more details about this unique teaching program.
