Cancer Res 2011;71:4443-4453. Published OnlineFirst May 17, 2011.
Thalia A. Farazi, Hugo M. Horlings, Jelle J. ten Hoeve, et al.
MicroRNA Sequence and Expression Analysis in Breast Tumors by Deep Sequencing
Updated Version: Access the most recent version of this article at doi: 10.1158/0008-5472.CAN-11-0608
Supplementary Material: Access the most recent supplemental material at http://cancerres.aacrjournals.org/content/suppl/2011/05/17/0008-5472.CAN-11-0608.DC1.html
Cited Articles: This article cites 46 articles, 14 of which you can access for free at http://cancerres.aacrjournals.org/content/71/13/4443.full.html#ref-list-1
Copyright © 2011 American Association for Cancer Research
Downloaded from cancerres.aacrjournals.org on July 1, 2011
Published OnlineFirst May 17, 2011; DOI:10.1158/0008-5472.CAN-11-0608
Molecular and Cellular Pathobiology
MicroRNA Sequence and Expression Analysis in Breast
Tumors by Deep Sequencing
Thalia A. Farazi¹, Hugo M. Horlings²,³, Jelle J. ten Hoeve⁴, Aleksandra Mihailovic¹, Hans Halfwerk²,³,
Pavel Morozov¹, Miguel Brown¹, Markus Hafner¹, Fabien Reyal³, Marieke van Kouwenhove³, Bas Kreike³,⁵,
Daoud Sie³,⁶, Volker Hovestadt¹, Lodewyk F.A. Wessels⁴, Marc J. van de Vijver²,³, and Thomas Tuschl¹
Abstract
MicroRNAs (miRNA) regulate many genes critical for tumorigenesis. We profiled miRNAs from 11 normal
breast tissues, 17 noninvasive, 151 invasive breast carcinomas, and 6 cell lines by in-house–developed barcoded
Solexa sequencing. miRNAs were organized in genomic clusters representing promoter-controlled miRNA
expression and sequence families representing seed sequence–dependent miRNA target regulation. Unsupervised
clustering of samples by miRNA sequence families best reflected the clustering based on mRNA expression
available for this sample set. Clustering and comparative analysis of miRNA read frequencies showed that
normal breast samples were separated from most noninvasive ductal carcinoma in situ and invasive carcinomas
by increased miR-21 (the most abundant miRNA in carcinomas) and multiple decreased miRNA families
(including miR-98/let-7), with most miRNA changes apparent already in the noninvasive carcinomas. In
addition, patients that went on to develop metastasis showed increased expression of mir-423, and
triple-negative breast carcinomas were most distinct from other tumor subtypes due to upregulation of the mir~17-92
cluster. However, absolute miRNA levels between normal breast and carcinomas did not reveal any significant
differences. We also discovered two polymorphic nucleotide variations among the more abundant miRNAs
miR-181a (T19G) and miR-185 (T16G), but we did not identify nucleotide variations expected for classical tumor
suppressor function associated with miRNAs. The differentiation of tumor subtypes and prediction of metastasis
based on miRNA levels is statistically possible but is not driven by deregulation of abundant miRNAs, implicating
far fewer miRNAs in tumorigenic processes than previously suggested. Cancer Res; 71(13); 4443–53. ©2011 AACR.
Introduction
Breast cancer is a heterogeneous disease involving various
oncogenic pathways and/or genetic alterations. Although
prognostic mRNA expression signatures have been defined
for some invasive breast carcinomas (1, 2), the underlying
pathways regulating breast cancer aggressiveness remain
poorly understood.
MicroRNAs (miRNA) are 19- to 23-nucleotide (nt) RNAs that
regulate gene expression (3–5). miRNAs have been suggested
to act as oncogenes (or tumor suppressors) contributing to
distinct tumor characteristics (6–9). Some miRNAs are located
in genomic regions exhibiting copy number alterations (10),
and miRNA levels can be dramatically deregulated in tumors
[e.g., upregulation of mir-21 (reviewed in ref. 11), mir~17-92
cluster (reviewed in ref. 12), and mir-26a (ref. 13) and
downregulation of let-7 family members (reviewed in ref. 14)].
miRNAs control key processes in tumorigenesis such as tumor
initiation (13), metastasis (15, 16), inflammation (17), and
differentiation (18). Components of the miRNA biogenesis
pathway have been implicated in tumorigenesis, suggesting
that miRNAs are less abundant in tumors (19, 20). For
example, DICER1 was found to function as a haploinsufficient
tumor suppressor (20, 21); 27% of various tumors were
suggested to have a hemizygous deletion of the gene that encodes
DICER1 (20). Furthermore, miRNA profiling in a variety of
tissues and cancers has not only identified cell-type–specific
and cancer-deregulated miRNAs (19, 22) but also profiles that
correlate with prognosis (23).
Since 2005, when miRNA deregulation was described in
breast tumors (24), more than 400 studies have been published
about changes in miRNA levels, their regulation, and role in
breast cancer (reviewed in ref. 25). Most of these studies were
Authors' Affiliations: ¹Howard Hughes Medical Institute, Laboratory of
RNA Molecular Biology, The Rockefeller University, New York; ²Academic
Medical Center, Department of Pathology; ³Division of Experimental
Therapy, ⁴Department of Molecular Biology, ⁵Division of Radiation
Oncology, and ⁶Central Microarray Facility, Netherlands Cancer Institute,
Amsterdam, The Netherlands
Note: Supplementary data for this article are available at Cancer Research
Online (http://cancerres.aacrjournals.org/).
T.A. Farazi and H.M. Horlings contributed equally to the work.
Corresponding Authors: Thomas Tuschl, The Rockefeller University,
1230 York Avenue, Box 186, New York, NY 10065. Phone: 212-327-7651;
Fax: 212-327-7652; E-mail: ttuschl@mail.rockefeller.edu; or
Marc J. van de Vijver. Phone: 31-20-5666646; Fax: 31-20-5669523;
E-mail: m.j.vandevijver@amc.uva.nl; or Lodewyk Wessels. Phone:
31-20-5127987; Fax: 31-20-6691383; E-mail: l.wessels@nki.nl
doi: 10.1158/0008-5472.CAN-11-0608
©2011 American Association for Cancer Research.
conducted for a specified subset of miRNAs using microarrays
or real-time PCR, and did not identify miRNA abundance or
sequence variation. Studies investigating the role of specific
miRNAs in metastasis were conducted in cell lines and animal
models and were supported by small patient cohorts. For
example, overexpression/knockdown of several miRNAs,
including miR-10b, miR-9, miR-31, miR-126, and miR-335,
was shown to play a role in metastasis (15, 16, 26, 27).
However, correlation between these miRNAs and metastasis
was not identified in large clinical studies (28, 29).
Here, we surveyed by barcoded Solexa sequencing 185
breast specimens, including 11 normal breast, 17 ductal
carcinoma in situ (DCIS), 151 invasive carcinomas [including
126 invasive ductal carcinomas (IDC)], and 6 ductal cell lines
and obtained expression levels based on sequence read num-
ber. This approach not only provided the fold differences but
also characterized miRNA abundance and nucleotide varia-
tion, accommodating a large number of clinical samples to
address interpatient variation in a cost-effective manner. We
conducted unsupervised clustering to assess whether miRNAs
organized in genomic clusters or sequence families could
resolve breast cancer types, as classified by immunohisto-
chemical (IHC) analyses, that is, estrogen receptor (ER),
progesterone receptor (PR), and HER2 status, and molecular
subtypes based on mRNA profiles (30). By comparing HER2-
positive noninvasive (DCIS) and invasive breast carcinomas,
we identified miRNAs with altered levels in tumor invasion. By
comparing breast tumor types with different IHC character-
istics, we identified miRNAs unique to triple-negative breast
carcinomas (TNBC), lacking expression for ER, PR, and HER2.
Ninety-seven of 151 invasive breast carcinoma patients had
clinical follow-up records available, allowing us to evaluate the
role of miRNAs as prognostic markers.
Materials and Methods
Patient clinical and tumor histopathologic
characteristics
Samples were obtained from patients treated at The Netherlands
Cancer Institute between 1985 and 2006, comprising
DCIS (n = 17) and invasive carcinoma (n = 151) patients with
nonmetastatic disease at diagnosis. Eleven normal breast
specimens were obtained from mammoplasty patients treated
at the Institut Curie. The medical ethical committee of The
Netherlands Cancer Institute approved this study. For detailed
clinical and pathologic information, see Supplementary
Table S1. Clinical follow-up data were available for 48 TNBC
and 49 HER2-positive patients. Therapy for these patients
consisted of breast-conserving surgery or modified radical
mastectomy, with or without adjuvant therapy (chemother-
apy, radiation, or hormonal therapy). Twenty-two of 49 HER2-
positive and 28 of 48 TNBC patients received anthracycline-
based, cyclophosphamide, methotrexate, and fluorouracil che-
motherapy, hormonal treatment, or a combination modality.
The MCF7, HCC38, MCF10A, and BT-474 cell lines were
purchased from American Type Culture Collection. ZR-751
was a gift from John Hilkens and MDA-MB134 from Marleen
Kok (both from The Netherlands Cancer Institute).
RNA isolation
Fresh-frozen (frozen immediately after surgery and
stored at −80°C) normal breast and tumor tissue was
collected. RNA was isolated from approximately thirty
30-μm cryosections corresponding to approximately
20 mg of tissue, using the first and last section to assess
tumor content; only samples containing 50% or above
tumor content were further characterized. The tissues were
homogenized in TRIzol (Invitrogen) using a Polytron
instrument (polytron, PT, MR2100; Kinematica AG) for 1
minute and total RNA was isolated by a modified TRIzol
protocol (Supplementary Methods). Quality of isolated
RNA was assessed on a 1% agarose gel based on the relative
abundance of 18S and 28S subunits of ribosomal RNA.
Small RNA sequencing and bioinformatics analysis
We used a barcoded small RNA sequencing approach
(described in ref. 31 and summarized in Supplementary
Methods). We employed 21 different barcodes, obtaining
1.8 to 26.5 million reads per sequence run (18%–87% of reads
with extractable barcodes; Supplementary Table S2). We
selected reads with an insert of 16 to 25 nt. Adapter
sequences were extracted from sequence reads using the
following criteria: a 4-nt minimum overlap with the 3′ adapter, or a
5-nt minimum overlap with the 3′ adapter with 1 mismatch, excluding
insertions and deletions in the first nucleotide of the
adapter past the barcode; barcodes were assigned without
allowing any mismatches. On average, 80% of the
extracted reads represented prototypical miRNAs (based
on our annotation database; see below), 0.005% viral miRNAs,
and 0.2% piRNAs (based on National Center for Biotechnology
Information definitions). The samples with lower
miRNA content had a higher percentage of rRNA, likely
reflecting sample quality. The miRNA expression profiles
were submitted to the Gene Expression Omnibus (GEO) with
accession number GSE28884.
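The read-processing rules described above can be illustrated with a minimal Python sketch. It assumes that each read is laid out as insert, then barcode, then 3′ adapter, and it accepts exact matches only, so the 1-mismatch rule for 5-nt overlaps is not implemented. The barcode and adapter sequences, the sample names, and the function name `extract_insert` are hypothetical placeholders and are not the sequences or software used in this study.

```python
def extract_insert(read, barcodes, adapter, min_overlap=4,
                   min_insert=16, max_insert=25):
    """Split a read assumed to be laid out as insert + barcode + 3' adapter.

    Returns (sample, insert) or None. Only exact matches are accepted;
    the 1-mismatch rule for 5-nt adapter overlaps is deliberately omitted.
    """
    for insert_len in range(min_insert, max_insert + 1):
        for sample, barcode in barcodes.items():
            bc_end = insert_len + len(barcode)
            if read[insert_len:bc_end] != barcode:
                continue  # barcodes are assigned without any mismatch
            tail = read[bc_end:]
            overlap = min(len(tail), len(adapter))
            # Require at least `min_overlap` adapter bases after the barcode.
            if overlap >= min_overlap and tail[:overlap] == adapter[:overlap]:
                return sample, read[:insert_len]
    return None


# Hypothetical barcodes, adapter, and insert, for illustration only.
BARCODES = {"sample_A": "ACGTA", "sample_B": "TGCAT"}
ADAPTER = "TCGTATGCCG"
INSERT = "ACGGTTCAAGGCTTACGATCCA"  # arbitrary 22-nt insert

result = extract_insert(INSERT + "TGCAT" + "TCGTAT", BARCODES, ADAPTER)
print(result)
```

Scanning insert lengths from 16 to 25 nt enforces the insert-size selection and the barcode position in one pass; a production pipeline would also need the mismatch-tolerant adapter search.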
Setting a threshold of 10 or more reads per miRNA for the
pool of all 49,479,978 miRNA sequence reads (179 patient
samples and 6 cell lines), we identified a total of 888 mature
and miRNA star sequences (representing the 2 strands
obtained after miRNA precursor processing). Resetting the
threshold to 5,000 reads, we identified 231 miRNA and miRNA
star species together constituting 99% of all miRNA reads.
To assess whether experimental variables affected miRNA
profiles, we profiled 54 samples in replicate and conducted
Spearman correlation yielding a median correlation coeffi-
cient of 0.92 using the top expressed miRNAs (represented
with >5,000 sequence reads across all samples; Supplementary
Table S3).
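For readers unfamiliar with the replicate comparison, the following sketch shows how a Spearman correlation coefficient can be computed from the read counts of two replicate libraries. It is a plain-Python illustration written for this text, not the software used in the study; the counts are invented, and inputs with zero variance are not handled.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented read counts for five miRNAs in two replicate libraries.
replicate_1 = [52000, 8100, 6400, 5200, 300]
replicate_2 = [47000, 9000, 5100, 6000, 450]
print(round(spearman(replicate_1, replicate_2), 2))
```

Because only ranks enter the calculation, the coefficient is insensitive to differences in sequencing depth between the two replicates.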
miRNA genomic clusters were redefined taking into con-
sideration expressed sequence tag (EST) evidence and levels of
miRNA expression from our data (similar to that described in
ref. 22). Typically, the greatest genomic distance between
clustered miRNAs was 5 kb. Sequence families were defined
on the basis of seed sequence similarity (positions 2–8), allowing
only 1 transition in these positions, as well as 3′ end similarity
(position 9 through the 3′ end), allowing up to 50% mismatches,
with additional manual curation (Supplementary Table S4).
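The sequence family rule above can be expressed as a short Python sketch. It is an illustration only: the function names are hypothetical, the comparison of 3′ ends of unequal length over their shared positions is an assumption, and the manual curation step is not represented. The example sequences are invented.

```python
# Transitions are purine-purine or pyrimidine-pyrimidine changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "U"), ("U", "C")}


def seed_compatible(a, b):
    """Seeds (positions 2-8, 1-based) may differ by at most one transition."""
    diffs = [(x, y) for x, y in zip(a[1:8], b[1:8]) if x != y]
    return len(diffs) == 0 or (len(diffs) == 1 and diffs[0] in TRANSITIONS)


def tail_compatible(a, b, max_mismatch_fraction=0.5):
    """3' ends (position 9 onward) may mismatch at up to 50% of positions.

    Assumption: only positions present in both sequences are compared.
    """
    tail_a, tail_b = a[8:], b[8:]
    shared = min(len(tail_a), len(tail_b))
    if shared == 0:
        return False
    mismatches = sum(1 for x, y in zip(tail_a, tail_b) if x != y)
    return mismatches / shared <= max_mismatch_fraction


def same_family(a, b):
    return seed_compatible(a, b) and tail_compatible(a, b)


# Invented 22-nt sequences, not real miRNAs.
reference = "UACCGGAUUCAGGCUAAGCUAG"
seed_transition = "UACUGGAUUCAGGCUAAGCUAG"    # C-to-U transition in the seed
seed_transversion = "UACGGGAUUCAGGCUAAGCUAG"  # C-to-G transversion in the seed
divergent_tail = reference[:8] + "AGUCCGAUUCGAUC"

print(same_family(reference, seed_transition))
print(same_family(reference, seed_transversion))
print(same_family(reference, divergent_tail))
```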
Farazi et al.
Cancer Res; 71(13) July 1, 2011 Cancer Research
Northern blot analysis
Five micrograms of total RNA was used for Northern blot
analysis as previously described (22). Equal loading of the
lanes was confirmed by ethidium bromide staining of the
tRNA band. Synthetic standards containing 2, 10, and 50 fmol
oligoribonucleotide corresponding to each miRNA were run
on each gel and used as references to quantify miRNA levels.
After each experiment, blots were stripped in 0.1% SSC at 85°C
for 5 minutes and reprobed up to 4 times, the final time for U6
small nuclear RNA (snRNA) as control for loading and
degradation. The spot intensities were quantified using ImageGauge
software in units of photostimulated luminescence and
corrected for background intensity.
Statistical analysis
Kaplan–Meier analysis was conducted for the top 5 significantly
deregulated miRNAs (α ≤ 0.001 as multiple testing
correction for 20 comparisons) in patients who developed
metastasis. For each miRNA tested, the patients were split into
2 groups at the median. Difference between the 2 Kaplan–
Meier curves was assessed with the log-rank test.
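The median split and log-rank comparison can be sketched in plain Python as follows. This is a textbook two-group log-rank test written for illustration, not the statistical software used in the study. The follow-up data are invented, and placing samples that fall exactly on the median in the low group is an assumption.

```python
import math
from statistics import median


def split_at_median(levels):
    """True for samples above the median miRNA level, False otherwise."""
    cutoff = median(levels)
    return [level > cutoff for level in levels]


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value) with 1 df."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_a = expected_a = variance = 0.0
    for t in sorted({u for u, e, _ in data if e}):
        n = sum(1 for u, _, _ in data if u >= t)                # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)   # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)          # events at time t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi_square, math.erfc(math.sqrt(chi_square / 2))


# Invented data: miRNA level, months of follow-up, metastasis (1) or censored (0).
levels = [0.8, 2.9, 1.1, 3.4, 0.5, 2.2, 3.9, 1.6]
months = [96, 22, 110, 31, 120, 40, 18, 88]
events = [0, 1, 0, 1, 0, 1, 1, 1]
high = split_at_median(levels)
chi_square, p_value = logrank_test(
    [m for m, h in zip(months, high) if h],
    [e for e, h in zip(events, high) if h],
    [m for m, h in zip(months, high) if not h],
    [e for e, h in zip(events, high) if not h],
)
print(f"chi-square = {chi_square:.2f}, P = {p_value:.4f}")
```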
Results
Small RNA cDNA library preparation and miRNA
abundance analysis
We isolated total RNA from 11 normal breast samples, 17
DCIS, and 151 invasive breast carcinomas (Supplementary
Table S1). IHC typing of the breast carcinomas showed that 71
lacked ER, PR, and HER2 expression (TNBC), whereas 97 were
positive for a single or multiple of the 3 markers. Twenty-three
TNBCs represented special histologic types: metaplastic (n = 9),
atypical medullary (n = 8), adenoid cystic (n = 2), and
apocrine carcinomas (n = 4). In addition, we isolated RNA
from 6 ductal breast cell lines (MCF7, MCF10A, HCC38, BT474,
MDA-MB134, and ZR-751).
We processed all samples by 3′-adapter barcoded small
RNA cDNA library Solexa sequencing (31). A total of 61,319,767
barcoded sequence reads were extracted, yielding a median of
134,022 reads per sample (range: 7,403–1,608,855; Supplementary
Table S2); these reads were mapped to the genome
allowing 1 mismatch/insertion/deletion and then to our noncoding
RNA and miRNA databases allowing up to 2 mismatches
or 1 insertion/deletion (22). We constructed a
miRNA read database from more than 1,000 human samples
sequenced in the Tuschl laboratory, and defined prototypical
miRNAs (557 precursors, corresponding to 1,112 mature and
star sequences, miR-451 and miR-618 being the only miRNAs
without a star sequence). We added 269 not yet reported star
sequences, ignored putative miRNAs from miRBase for which
we did not obtain read evidence, and renamed specific miR-
NAs according to the read ratio between mature and star
sequences (Supplementary Table S4).
Previous reports suggested that miRNAs were less abun-
dant in tumor compared with normal breast samples (19, 20,
32). For a representative set of 31 samples (5 normals, 4 DCIS,
18 IDC, and 4 cell lines), we determined the absolute amount
of miRNA per μg of total RNA. We added a cocktail of an
equimolar amount of 10 synthetic 22-nt 5′-phosphorylated
RNAs distinct from human sequences per μg of total RNA, on
average representing 18% of the total reads (range: 7%–57%;
ref. 31). We were not able to detect significant changes in
miRNA content among normal breast, disease tissues, and
cell lines, assuming that calibrator RNAs and miRNAs were
cloned with similar average efficiency. Normal breast contained
an average of 16 ± 4, DCIS 14 ± 4, IDC 15 ± 5, and cell
lines 9 ± 5 fmol miRNA per μg of total RNA. We did not
account for tumor cell type heterogeneity, but tumor samples
were selected to comprise 50% or above tumor cells. Con-
sistent with these results, examination of mRNA levels in the
same samples for miRNA pathway components or other
factors implicated in miRNA biogenesis (reviewed in ref. 9)
did not suggest globally differential miRNA processing in
normal breast and carcinomas (Supplementary Fig. S1). On
the basis of calculations in MCF7 cells, each tumor cell
contained approximately 145,000 miRNA molecules, so a miRNA
expressed at 1% of the total miRNA content (see below)
would be present at roughly 1,500 copies per cell. Considering
that miRNAs regulate many mRNAs, each represented by
many transcripts, less abundant miRNAs would be insufficient
to confer measurable target mRNA regulation, unless they act
like siRNAs on nearly fully complementary target mRNAs.
Quantitative Western blot analysis for EIF2C2/AGO2 protein,
the main component of the miRNA effector complex, in MCF7
showed the presence of approximately 42,000 copies per cell
(Supplementary Fig. S1). Assuming similar abundance for the
often coexpressed EIF2C/AGO members, the number of effec-
tor complexes matches the miRNA copy number.
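The arithmetic behind the calibrator-based quantification and the copy number comparison above can be written out as a short Python sketch. The spiked-in calibrator amount is not given in this section, so it is left as a parameter; the example read counts are invented, and the count of four EIF2C/AGO members at similar abundance is stated here as an assumption.

```python
def mirna_amount(mirna_reads, calibrator_reads, calibrator_fmol):
    """Scale miRNA reads by spiked-in calibrator reads.

    `calibrator_fmol` is the total calibrator amount added per unit of
    total RNA (a parameter, because this section does not state it).
    Assumes calibrators and miRNAs are cloned with similar average
    efficiency, as in the text.
    """
    return mirna_reads / calibrator_reads * calibrator_fmol


# Invented library: 82% miRNA reads, 18% calibrator reads, 2 fmol spiked in.
print(mirna_amount(mirna_reads=8200, calibrator_reads=1800, calibrator_fmol=2.0))

MIRNA_COPIES_PER_CELL = 145_000  # MCF7 estimate from the text
AGO2_COPIES_PER_CELL = 42_000    # quantitative Western blot estimate from the text

# A miRNA at 1% of total content: 1,450 copies, rounded to 1,500 in the text.
copies_at_one_percent = MIRNA_COPIES_PER_CELL // 100

# Assumption: four EIF2C/AGO members, each about as abundant as AGO2.
AGO_MEMBERS = 4
effector_complexes = AGO_MEMBERS * AGO2_COPIES_PER_CELL

print(copies_at_one_percent, effector_complexes)
```

Under these assumptions the estimated number of effector complexes (168,000) is of the same order as the estimated miRNA copy number (145,000), which is the sense in which the two figures match.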
Prior to the development of the barcoded sequencing with
addition of calibrator RNAs, we had conducted quantitative
Northern blotting for a subset of 10 miRNAs in 84 tumor
samples from our collection (Supplementary Table S5). The
Spearman correlation coefficients for miRNA expression
based on sequence reads compared with Northern quantitation
varied between 0.20 (miR-96) and 0.72 (miR-375) when
comparing across all 84 samples. The absolute amount of each
miRNA per μg of total RNA derived from Northern blotting
was in general agreement with calibrator-assisted sequencing
calculations.
miRNA profiles in normal and tumor specimens
miRNA profiles of a sample can be presented as relative
percentage of miRNA read frequencies (rf) by dividing miRNA
read counts by total miRNA reads per library. Furthermore,
miRNA profiles can be condensed either by assigning individual
miRNA or miRNA star reads to their originating miRNA geno-
mic clusters or to sequence families (denoted cluster-mir and
sf-miR respectively, listing number of cluster/family members
in parenthesis; Supplementary Table S4). Either approach
reduces the complexity of the data. The genomic cluster profiles
represent promoter-controlled miRNA expression, whereas the
sequence families are most informative for characterization of
seed sequence–dependent miRNA target regulation. For sam-
ple comparison, we required 5,000 or above total miRNA
sequence reads per replicate merged library, resulting in inclu-
sion of 179 samples (from 183 sequenced) and 6 cell lines.
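The read frequency calculation and the condensing of individual miRNAs into clusters or families described above can be sketched as follows. The function names, the read counts, and the membership mapping are hypothetical; the actual assignments are those of Supplementary Table S4.

```python
from collections import defaultdict


def passes_depth_filter(counts, minimum=5000):
    """Keep a library only if it has at least `minimum` total miRNA reads."""
    return sum(counts.values()) >= minimum


def read_frequencies(counts):
    """Relative read frequency (rf): miRNA reads divided by total miRNA reads."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def condense(frequencies, membership):
    """Sum rf over the members of each genomic cluster or sequence family."""
    grouped = defaultdict(float)
    for name, rf in frequencies.items():
        # miRNAs missing from the mapping are kept under their own name.
        grouped[membership.get(name, name)] += rf
    return dict(grouped)


# Invented counts and an illustrative family mapping.
counts = {"miR-21": 6000, "let-7a": 2000, "let-7f": 1500, "miR-22": 500}
families = {
    "miR-21": "sf-miR-21(1)",
    "let-7a": "sf-let-7a-1(12)",
    "let-7f": "sf-let-7a-1(12)",
    "miR-22": "sf-miR-22(1)",
}

if passes_depth_filter(counts):
    print(condense(read_frequencies(counts), families))
```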
We conducted unsupervised hierarchical clustering for 179
clinical samples using individual miRNAs, precursor clusters,
and sequence families that made up the top 85% of the total miRNA
reads in at least 1 sample (Fig. 1 and Supplementary Fig. S2).
To enhance visualization of changes in miRNA expression,
miRNA rf was transformed by standardization across each
miRNA. Normal breast samples clustered together, close to a
small group of ER- and HER2-positive tumors characterized by
lower expression of sf-miR-21(1) and higher expression of sf-
miR-22(1) compared with the remainder of tumor samples.
Some DCIS samples also clustered together, breaking up
groups of invasive tumor samples, suggesting that DCIS
samples accumulate changes in miRNA expression early in
the tumorigenic process. Invasive tumor samples positive for 1
or more IHC marker clustered together, whereas TNBCs
emerged as several groups distinct from the other tumors.
Samples did not cluster according to other pathologic and
clinical characteristics.
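The feature selection and standardization steps that precede clustering can be sketched in Python under one stated interpretation: within each sample, miRNAs are ranked by read frequency and kept until their cumulative frequency reaches 85%, and a miRNA is retained if it is kept in at least 1 sample. This interpretation, the use of the population standard deviation, the function names, and the example frequencies are all assumptions made for illustration. The hierarchical clustering itself is not shown.

```python
from statistics import mean, pstdev


def top_fraction_members(frequencies, fraction=0.85):
    """Most abundant miRNAs whose cumulative rf first reaches `fraction`."""
    selected, cumulative = set(), 0.0
    ranked = sorted(frequencies.items(), key=lambda kv: kv[1], reverse=True)
    for name, rf in ranked:
        if cumulative >= fraction:
            break
        selected.add(name)
        cumulative += rf
    return selected


def select_features(samples, fraction=0.85):
    """Keep a miRNA if it is in the top set of at least one sample."""
    keep = set()
    for frequencies in samples.values():
        keep |= top_fraction_members(frequencies, fraction)
    return keep


def standardize(values):
    """Z-score one miRNA's rf across samples (mean 0, unit SD)."""
    mu, sd = mean(values), pstdev(values)
    return [0.0 for _ in values] if sd == 0 else [(v - mu) / sd for v in values]


# Invented read frequencies for two samples.
samples = {
    "tumor_1": {"miR-21": 0.70, "let-7a": 0.20, "miR-143": 0.06, "miR-22": 0.04},
    "normal_1": {"miR-21": 0.10, "let-7a": 0.05, "miR-143": 0.80, "miR-22": 0.05},
}
features = select_features(samples)
print(sorted(features))
print(standardize([s["miR-21"] for s in samples.values()]))
```

Standardizing each miRNA separately puts abundant and rare miRNAs on the same scale, so the heat map reflects relative changes across samples and not absolute abundance.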
To understand miRNA expression in the context of tissue
heterogeneity, we compared the 6 ductal cell lines to human
samples to identify miRNAs expressed in tissues, but not
ductal cell lines, likely contributed by other cell types. We
[Figure 1 heat map. Annotation tracks: histological type (DCIS; ER+/HER2− IDC; HER2+/ER+ IDC; HER2+/ER− IDC; normal; TNBC special type; ER+ special type; TNBC IDC), molecular type (basal, HER2, LumA, LumB, normal, <0.1), ER IHC, PR IHC, HER2 IHC, event death, and distant metastasis (negative or positive). miRNA labels: sequence families from sf-miR-424(1) to sf-miR-21(1). Color scale: relative expression, low to high.]
Figure 1. Unsupervised hierarchical clustering with complete linkage and Spearman correlation for patient samples conducted using the miRNA sequence
families making up 85% of the sequence reads within at least 1 sample. Color histogram represents miRNA rf standardized across each miRNA.
visualized miRNA abundance in normal breast, DCIS, and
IDC IHC types. To simplify the figure, we selected 4 representative
samples from each IHC category that showed
miRNA expression closest to the average expression for each
category, including miRNAs that made up the top 85% of the total
miRNA reads in at least 1 sample or cell line (Fig. 2). We
identified miRNAs present at similar levels in cell lines and
tumors that may be involved in tumorigenic processes, and
miRNAs absent from cell lines that are likely expressed by
other cell types, such as cluster-mir-126(1) (cardiovascular
system), cluster-mir-143(2) (adipose tissue), and cluster-mir-144(2)
and cluster-mir-142(1) (hematopoietic system; refs. 9, 22).
Clustering of 179 samples based on miRNAs (using 98%
of miRNAs expressed in at least 1 sample) and clustering of
161 of these samples according to their mRNA profiles (using
the same number of genes as miRNAs, selecting genes with
most variance) is depicted in Fig. 3. mRNA profiles better
separated HER2-positive samples (P = 1.90 × 10⁻¹⁵), suggesting
that the HER2 pathway is not related to miRNA
expression changes. miRNA clustering was weakly correlated
with TNBC. Three TNBC groups emerged by clustering
miRNA profiles, one of which included mostly special his-
tologic types; these groups did not show distinct patient
characteristics or outcome.
[Figure 2 heat maps, panels A to C. Annotation tracks: histological type (cell line; DCIS; ER+/HER2− IDC; HER2+/ER+ IDC; HER2+/ER− IDC; normal; TNBC IDC), molecular type (basal, HER2, LumA, LumB, normal, <0.1), ER IHC, PR IHC, HER2 IHC, event death, and distant metastasis (negative or positive). miRNA labels: mature miRNAs from miR-452* to miR-21 (A), genomic clusters from cluster-mir-181c(2) to cluster-mir-21(1) (B), and sequence families from sf-miR-376c(1) to sf-miR-21(1) (C), each with an "other" category. Color scale: log2 abundance. Lineage annotations: cardiovascular, hematopoietic, endocrine gland, adipose.]
Figure 2. Unsupervised hierarchical clustering with complete linkage and Spearman correlation depicting (A) the 85% top expressed mature miRNAs,
(B) miRNA genomic clusters, and (C) sequence families. Every clustering includes 4 samples from each ER/HER2 IHC category, normal breast, and
6 cell lines. Color histogram represents log2 miRNA abundance (rf). Examples of lineage-specific miRNA genomic clusters are noted.
We used the edgeR package (33) to identify individual
mature miRNAs, miRNA genomic clusters, and miRNA
sequence families differentially represented between normal,
DCIS, and invasive tumor samples. We included miRNAs
that were among the top 85% of sequence reads in at
least 1 sample, to limit our analysis to miRNAs of likely regulatory
importance and potential biomarkers. The edgeR algorithm
compensates for the difference in sequence reads between
samples, as well as the different number of samples within
the categories compared.
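To make the quantities compared between groups concrete, the sketch below computes, for each miRNA, the log2 fold change of the second group over the first and the log2 of the average read frequency in the 2 groups. It is not edgeR and does not reproduce its statistical model or P values; it only illustrates how working with read frequencies removes the dependence on library size and how group means remove the dependence on group size. The function name, the pseudo-frequency, and the example values are assumptions.

```python
import math


def fold_change_and_abundance(group_a, group_b, pseudo=1e-6):
    """Per miRNA: (log2 of the average rf of both groups, log2 fold change B/A).

    `group_a` and `group_b` map miRNA names to lists of read frequencies,
    one value per sample. A small pseudo-frequency avoids log2(0).
    """
    result = {}
    for name in group_a:
        mean_a = sum(group_a[name]) / len(group_a[name]) + pseudo
        mean_b = sum(group_b[name]) / len(group_b[name]) + pseudo
        log_fold_change = math.log2(mean_b / mean_a)
        log_abundance = math.log2((mean_a + mean_b) / 2)
        result[name] = (log_abundance, log_fold_change)
    return result


# Invented read frequencies: 2 normal samples and 3 tumor samples.
normal = {"sf-miR-21(1)": [0.05, 0.07], "sf-miR-22(1)": [0.04, 0.06]}
tumor = {"sf-miR-21(1)": [0.25, 0.30, 0.20], "sf-miR-22(1)": [0.01, 0.02, 0.015]}

for name, (abundance, change) in fold_change_and_abundance(normal, tumor).items():
    print(f"{name}: log2 abundance {abundance:.2f}, log2 fold change {change:.2f}")
```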
First, we compared normal samples to HER2-positive IDC
and DCIS samples. sf-miR-21(1) and cluster-mir-142(1)
members [sf-miR-142-3p(1), sf-miR-142-5p(1)] had higher
abundance in IDC compared with normal breast with a
value of P ≤ 0.001 (Fig. 4A and Table 1 and Supplementary
Table S6). Cluster-mir-98(13) members [sf-miR-125a(3), sf-
miR-99a(3)], sf-miR-22(1), and cluster-mir-143(2) members
[sf-miR-145(1), sf-miR-143*(1)], sf-miR-378(1), sf-miR-497(1),
sf-miR-320(1), and cluster-mir-144(2) members [sf-miR-451
(1), sf-miR-144(1)] were abundant miRNAs in normal breast
reduced in IDC. miRNAs present both in ductal cell lines and
tumors likely represent tumor downregulated miRNAs
[cluster-mir-98/let-7(13) and cluster-mir-22(1)]. miRNAs
absent in ductal cell lines likely reflect differences in tissue
composition [i.e., adipose-specific cluster-mir-143(2) is likely
related to the presence of adipose tissue in the biopsies (ref.
9)]. Most miRNA changes in IDC were already apparent in
DCIS samples.
To identify miRNAs altered in tumor invasion, we compared
HER2-positive DCIS and IDC samples. Cluster-mir-142(1)
members [sf-miR-142-3p(1) and sf-miR-142-5p(1)] were over-
represented in HER2-positive IDC compared with DCIS. Given
that cluster-mir-142(1) is hematopoietic lineage specific, this
most likely reflects a change in tissue composition (22).
[Figure 3 heat maps. Panel A: miRNA clustering. Panel B: mRNA clustering. Annotation tracks: histological type (DCIS; ER+/HER2− IDC; HER2+/ER+ IDC; HER2+/ER− IDC; normal; TNBC special type; ER+ special type; TNBC IDC), molecular type (basal, HER2, LumA, LumB, normal, <0.1), ER IHC, PR IHC, HER2 IHC, event death, and distant metastasis (negative or positive).]
Figure 3. Comparison of clustering using miRNA sequence families (characterizing seed sequence–dependent miRNA target regulation) and mRNA profiles.
A, unsupervised hierarchical clustering conducted for miRNA sequence families, using the top 98% expressed families within at least 1 sample (221 families;
Spearman correlation). B, unsupervised hierarchical clustering of 161 samples with available mRNA profiles, using the 221 genes with the highest
variance (Pearson correlation).
[Figure 4 scatter plots. Panel titles: Normal (11) vs. DCIS (17); DCIS (17) vs. HER2+/ER−/IDC (34); Normal (11) vs. HER2+/ER−/IDC (34); ER+ (19) vs. ER− (48) (HER2−/IDC); HER2+ (34) vs. HER2− (48) (ER−/IDC); HER2− (19) vs. HER2+ (25) (ER+/IDC); ER− (34) vs. ER+ (25) (HER2+/IDC); No metastasis (66) vs. metastasis (31) (all IDC); No metastasis (28) vs. metastasis (21) (HER2+); No metastasis (38) vs. metastasis (10) (TNBC). x-axis: average log2 miRNA read frequency. y-axis: log2 fold change in read frequency. Labeled points are miRNA sequence families.]
Figure 4. Results of edgeR comparison analysis between groups of samples. Results are plotted as log2 of the fold change between normal and/or tumor
categories as a function of the log2 of the average miRNA abundance in the 2 categories compared. Colored dots represent miRNAs that are significantly
differentially expressed (P ≤ 0.001). miRNAs in green are overexpressed, whereas miRNAs in red are underexpressed in the second category of samples
within each comparison. Abundant miRNAs (log2 concentration ≥ −8) are labeled in black, whereas low abundant miRNAs are labeled in gray.
A, miRNA sequence families (sf) differentially represented in tumor stages. B, miRNA sf that differentiate tumor IHC types. C, miRNA sf that predict
metastasis in IDC, also separately evaluated in HER2-positive and TNBC patients.
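As a rough illustration of the two quantities plotted in Figure 4, the sketch below derives the coordinates of a single point from group-mean read frequencies. It is not the edgeR computation itself, which fits a statistical model to read counts; the function name `ma_coordinates` is hypothetical, and taking the average abundance as the mean of the two log2 values is an assumption. The input values are the sf-miR-21(1) read frequencies for normal breast and HER2-positive IDC listed in Table 1.

```python
import math

def ma_coordinates(rf_first, rf_second):
    """Return (average log2 rf, log2 fold change) for one miRNA.

    rf_first and rf_second are mean read frequencies (fractions of all
    miRNA reads) in the first and second category of a comparison.
    A positive fold change means higher abundance in the second category.
    """
    if rf_first <= 0 or rf_second <= 0:
        raise ValueError("read frequencies must be positive")
    log_first = math.log2(rf_first)
    log_second = math.log2(rf_second)
    average = (log_first + log_second) / 2   # assumed definition of the x-axis
    fold_change = log_second - log_first     # y-axis
    return average, fold_change

# sf-miR-21(1): normal breast rf 0.035, HER2-positive IDC rf 0.357 (Table 1)
avg, lfc = ma_coordinates(0.035, 0.357)
print(round(avg, 2), round(lfc, 2), round(2 ** lfc, 1))
```

Back-transforming the log2 fold change gives a ratio of about 10.2, in line with the fold change reported for this comparison in Table 1.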
Table 1. Comparison between normal breast, DCIS, and IDC specimens using edgeR

| miRNA levels in carcinomas | miRNA genomic cluster | Normal (rf) | DCIS (rf) | HER2-positive IDC (rf) | Normal vs. DCIS fold change (P) | Normal vs. HER2-positive IDC fold change (P) | DCIS vs. HER2-positive IDC fold change (P) |
|---|---|---|---|---|---|---|---|
| Downregulated miRNAs | | | | | | | |
| sf-miR-22(1) | cluster-mir-22(1) | 0.120 | 0.033 | 0.042 | 3.7 (0.0006) | 2.9 (0.0008) | 0.8 (1) |
| sf-miR-125a(3) | cluster-mir-98(13) | 0.088 | 0.029 | 0.013 | 3.1 (0.007) | 6.8 (1.3E-13) | 2.2 (0.02) |
| sf-miR-99a(3) | cluster-mir-98(13) | 0.066 | 0.018 | 0.015 | 3.6 (0.0008) | 4.5 (3.9E-08) | 1.2 (1) |
| sf-let-7a-1(12) | cluster-mir-98(13) | 0.148 | 0.076 | 0.120 | 2.0 (1) | 1.2 (1) | 0.6 (1) |
| sf-miR-451-DICER1(1) | cluster-mir-144(2) | 0.053 | 0.009 | 0.004 | 5.7 (2.4E-07) | 11.8 (8.8E-23) | 2.1 (0.06) |
| sf-miR-144(1) | cluster-mir-144(2) | 0.003 | 0.001 | 0.000 | 2.6 (0.067) | 7.0 (5.67E-14) | 2.7 (0.0005) |
| sf-miR-145(1) | cluster-mir-143(2) | 0.046 | 0.015 | 0.009 | 3.1 (0.006) | 4.9 (3.0E-09) | 1.6 (1) |
| sf-miR-143(1) | cluster-mir-143(2) | 0.042 | 0.043 | 0.022 | 1.0 (1) | 1.9 (0.559) | 1.9 (0.130) |
| sf-miR-143*(1) | cluster-mir-143(2) | 0.008 | 0.001 | 0.001 | 8.0 (4.48E-10) | 7.9 (6.54E-16) | 1.0 (1) |
| sf-miR-320-RNASEN(1) | cluster-mir-320-RNASEN(1) | 0.036 | 0.006 | 0.005 | 6.5 (2.5E-08) | 7.7 (1.2E-15) | 1.2 (1) |
| sf-miR-378(1) | cluster-mir-378(1) | 0.022 | 0.004 | 0.003 | 5.7 (3.3E-07) | 8.5 (4.6E-17) | 1.5 (1) |
| sf-miR-497(1) | cluster-mir-195(2) | 0.011 | 0.003 | 0.003 | 4.3 (0.00004) | 4.2 (1.7E-07) | 1.0 (1) |
| sf-miR-16-1(3) | cluster-mir-195(2) | 0.002 | 0.007 | 0.006 | 0.3 (0.012) | 0.3 (0.052) | 1.3 (1) |
| Upregulated miRNAs | | | | | | | |
| sf-miR-21(1) | cluster-mir-21(1) | 0.035 | 0.365 | 0.357 | 10.4 (2.5E-09) | 10.2 (1.4E-09) | 1.0 (1) |
| sf-miR-142-3p(1) | cluster-mir-142(1) | 0.002 | 0.003 | 0.010 | 1.5 (1) | 5.6 (0.00001) | 3.9 (8.2E-06) |
| sf-miR-142-5p(1) | cluster-mir-142(1) | 0.001 | 0.006 | 0.008 | 4.7 (0.0003) | 5.9 (7.31E-06) | 1.3 (1) |

NOTE: sf-miRNAs expressed at >1% rf, P ≤ 0.001 for at least 1 comparison are in bold type. sf-miRNAs that belong to the same miRNA genomic cluster irrespective of expression
and P values are in plain text.
We then employed edgeR to determine whether miRNA levels
correlated with IHC or special histologic characteristics (Fig. 4B
and Supplementary Table S6). Differences in miRNA representation
between IHC subtypes involved less abundant miRNAs.
TNBC showed the largest number of miRNA level changes.
Three sequence families were more highly expressed in TNBC than
in ER-positive and/or HER2-positive tumors: the cluster-mir-17(12)
member sf-miR-19a(3), which is present in ductal cell lines, and
sf-miR-205(1) and sf-miR-146a(2), which are less well represented
in ductal cell lines. sf-miR-451(1) was less abundant in TNBC.
Special histologic TNBCs (atypical medullary, metaplastic, adenoid
cystic, and apocrine carcinomas) were differentiated from IDC
TNBCs by a lower abundance of sf-miR-224(1), which is absent from
some ductal cell lines.
Sequence-specific biases in the efficiency of cDNA library
preparation can reproducibly distort the representation of
individual miRNAs as measured by sequence read counts. We used
an equimolar pool of 770 synthetic miRNAs to address possible
biases in our method (M. Hafner, unpublished data). None of the
miRNAs meeting our analysis cutoff value was overrepresented
more than 5-fold above the median rf, but miR-193a, miR-193b,
miR-26b, miR-29c, and miR-30b were underrepresented more than
5-fold. miR-31, previously implicated in metastasis, was also
underrepresented more than 5-fold but did not meet our analysis
cutoff value. These biases in absolute abundance would not affect
the comparisons across sample groups. Supplementary Table S7
lists edgeR comparisons for all detected miRNAs, irrespective of
abundance, including a summary of our findings for miRNAs
investigated in animal models (see Introduction). For example,
miR-520c showed statistically significant upregulation in
patients who developed metastasis, consistent with its proposed
prometastatic roles.
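The bias check on an equimolar pool can be sketched as follows. This is a minimal illustration, not the unpublished analysis referred to above: the function name `representation_bias` and the toy read counts are invented, and only the 5-fold threshold relative to the median rf comes from the text.

```python
import statistics

def representation_bias(read_counts, fold=5.0):
    """Flag miRNAs whose read frequency deviates from the pool median.

    read_counts maps miRNA name -> read count from a library prepared
    from an equimolar pool, where every miRNA would ideally yield the
    same read frequency. Returns (over, under): names whose read
    frequency is more than `fold` above or below the median.
    """
    total = sum(read_counts.values())
    rf = {name: count / total for name, count in read_counts.items()}
    median_rf = statistics.median(rf.values())
    over = sorted(n for n, f in rf.items() if f > fold * median_rf)
    under = sorted(n for n, f in rf.items() if f < median_rf / fold)
    return over, under

# Toy counts, invented for illustration only (not data from the study)
toy_counts = {"miR-A": 1000, "miR-B": 1100, "miR-C": 900,
              "miR-D": 150, "miR-E": 6000}
over, under = representation_bias(toy_counts)
print(over, under)
```

Because the same library preparation bias applies to every sample, a miRNA flagged here would be distorted in absolute abundance but not in a ratio between sample groups.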
miRNAs as prognostic markers within TNBC- and
HER2-positive tumors
We evaluated the prognostic capacity of miRNAs in 48
TNBC patients and 49 HER2-positive patients, for whom
clinical follow-up information (distant metastasis-free survival
and overall survival) was available with a mean follow-up
of 5.6 years. Less abundant cluster-mir-423(1) members
[sf-miR-423-3p(1) and sf-miR-423-5p(1)] and
cluster-mir-375(1)/sf-miR-375(1) were more abundant in patients
who went on to develop metastasis, whereas
cluster-mir-184(1)/sf-miR-184(1) was less abundant (P ≤ 0.001;
Fig. 4C and Supplementary Table S6). Kaplan–Meier analyses
supported the findings for cluster-mir-423(2), P = 0.013, and
sf-miR-184(1)/cluster-mir-184(1), P = 0.041, in HER2-positive
patients (Supplementary Fig. S3). Univariate and multivariate Cox
regression indicated that cluster-mir-423(1) is an independent
predictor of outcome in the presence of other clinical parameters
(tumor size, grade, and lymph node status; data not shown).
However, these findings should be interpreted with caution, as
these P values do not account for multiple testing.
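For readers unfamiliar with the Kaplan–Meier method mentioned above, the following sketch computes the product-limit survival estimate from follow-up times and event indicators. It is a generic textbook implementation, not the analysis pipeline used in this study; the function name and the toy follow-up data are invented for illustration, and no significance test is included.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times: follow-up time for each patient (e.g., in years).
    events: True if the event (e.g., distant metastasis) occurred at
    that time, False if the patient was censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set
        at_risk -= n_removed
    return curve

# Toy follow-up data, invented for illustration only
times = [1.0, 2.0, 2.0, 3.5, 4.0, 5.6]
events = [True, False, True, True, False, False]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

In practice, one curve would be estimated per group (for example, patients with high versus low levels of a given miRNA) and the curves compared with a log-rank test.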
Analysis of miRNA sequence variation in clinical
specimens and cell lines
miRNAs have been proposed to act as tumor suppressors or
oncogenes, yet no somatic mutations supporting this proposal
have been detected in cancer patients. For identification of
nucleotide variations relative to the reference human genome,
we required at least 10 variant sequence reads covering a
given position per sample, with a variation frequency of at
least 25% (≥40× coverage). On the basis of analysis of deep
sequencing data from a pool of 770 synthetic miRNAs
(M. Hafner, unpublished data), a variation frequency of 10% or
higher guaranteed that 98% of the identified variations were
not random events due to sequencing errors but were likely due to
mismapping between miRNAs similar in sequence. We
excluded the 3′-most terminal residue of all sequence reads
from somatic mutational analysis if it was altered because it
frequently contained a 3′ untemplated nucleotide addition (22).
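The two thresholds described above (at least 10 variant reads and a variation frequency of at least 25% at a given position in a sample) can be expressed as a simple filter. This is a sketch under assumptions: the input layout, the function name `call_variations`, and the example entries are hypothetical, and the exclusion of the 3′-most residue is not modeled.

```python
def call_variations(position_counts, min_variant_reads=10, min_frequency=0.25):
    """Apply read-count and frequency thresholds to candidate variations.

    position_counts maps (miRNA, position, change) ->
    (variant_reads, total_reads) for one sample. A variation is kept
    when it is supported by at least `min_variant_reads` reads and its
    frequency among reads covering the position is at least
    `min_frequency`. Returns a dict of kept variations and frequencies.
    """
    calls = {}
    for key, (variant_reads, total_reads) in position_counts.items():
        if total_reads == 0:
            continue  # position not covered in this sample
        frequency = variant_reads / total_reads
        if variant_reads >= min_variant_reads and frequency >= min_frequency:
            calls[key] = frequency
    return calls

# Hypothetical counts for one sample, invented for illustration only
sample = {
    ("miR-X", 5, "A>G"): (12, 40),    # 30% with 12 reads: kept
    ("miR-Y", 3, "C>T"): (9, 20),     # 45% but only 9 reads: rejected
    ("miR-Z", 7, "A>G"): (15, 100),   # 15 reads but only 15%: rejected
}
print(call_variations(sample))
```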
We identified 144 distinct nucleotide variations located
within 117 mature and star miRNA sequences. There were
109 variations in the last 2 positions of the predominant
mature sequence read, likely representing instances of
untemplated 3′ terminal addition that were insufficiently
suppressed by our computational approach of not considering the
3′-most nucleotide (102 variations represented changes into A or U;
Supplementary Table S8A and B). Further targeted analysis of
the 3′ end variations is included in Supplementary Table S8C.
None of the remaining 35 variations was detected in abundant
miRNAs (≥1% rf). These sequence variations could represent
RNA editing events (including deamination, polyuridylation,
and polyadenylation), single-nucleotide polymorphisms (SNP), or
somatic mutations. The most common of the 35 variations
observed in the mature and star miRNA sequences were A to
G, likely representing A to I RNA editing by dsRNA-specific
adenosine deaminases (22, 34, 35, 36). This observation was
further supported by a well-represented unimodal distribution
of the nucleotide variation frequency for miR-376a and
miR-376c, previously reported as edited in normal tissues (35).
miR-625 (detected in 17 samples), miR-497* (n = 9), and miR-381
[member of cluster-mir-134(41), which also includes miR-376;
n = 17] could represent editing events not previously reported
(Supplementary Figs. S4 and S5). Deamination events were not
observed in cell lines.
We detected 2 known SNPs (dbSNP version 131) in the less
abundant miR-196a-2* and miR-146a, which have been studied
in the context of breast cancer risk (rs11614913 and rs2910164;
refs. 37–40). rs2910164 (C5G) was detected in 5 carcinomas,
whereas rs11614913 (C18T) was detected in 39 samples
including 1 normal breast sample and 3 cell lines. We identified
10 nucleotide variations that are candidate new SNPs,
based on the trimodal distribution of their variation
frequency. Two of these variations were identified in miRNAs
present within 85% of all miRNA reads in at least 1 sample:
miR-181a (T19G) observed in 10 carcinomas and miR-185
(T16G) observed in 3 carcinomas. There was no evidence for
somatic mutations in miRNAs.
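The text does not describe how the shape of the variation frequency distribution was assessed. One simple heuristic, assumed here purely for illustration, is to check whether per-sample variation frequencies fall close to the three values expected for a germline SNP (near 0, 0.5, or 1), in contrast to the intermediate values that editing would produce. The function names, the tolerance, and the example frequencies are all hypothetical.

```python
def classify_modes(frequencies, tolerance=0.15):
    """Assign variation frequencies to expected genotype modes.

    Modes: 0.0 (homozygous reference), 0.5 (heterozygous),
    1.0 (homozygous variant). Values farther than `tolerance` from
    every mode are counted as 'unassigned'. The tolerance is an
    assumed value, not one taken from the study.
    """
    modes = {0.0: "hom_ref", 0.5: "het", 1.0: "hom_var"}
    counts = {"hom_ref": 0, "het": 0, "hom_var": 0, "unassigned": 0}
    for f in frequencies:
        nearest = min(modes, key=lambda m: abs(f - m))
        if abs(f - nearest) <= tolerance:
            counts[modes[nearest]] += 1
        else:
            counts["unassigned"] += 1
    return counts

def looks_snp_like(frequencies, tolerance=0.15, max_unassigned=0.1):
    """True if almost all frequencies sit near one of the three modes."""
    counts = classify_modes(frequencies, tolerance)
    return counts["unassigned"] <= max_unassigned * len(frequencies)

# Invented per-sample frequencies for two hypothetical variations
snp_like = [0.0, 0.02, 0.48, 0.55, 0.97, 1.0, 0.05]
editing_like = [0.1, 0.2, 0.3, 0.25, 0.22]
print(classify_modes(snp_like), looks_snp_like(snp_like))
print(classify_modes(editing_like), looks_snp_like(editing_like))
```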
Discussion
Do miRNAs hold potential as diagnostic and prognostic
markers in breast cancer?
Our sequencing approach using a large diverse sample
collection allowed us to evaluate miRNA deregulation in
the context of miRNA abundance, sequence variation, and
tissue heterogeneity, important elements in identifying miRNAs
that would be useful prognostic and diagnostic markers, or
prioritizing miRNAs for further studies. The first study on
miRNA deregulation in breast cancer, by Iorio and colleagues,
used microarrays to compare a variety of breast carcinomas
(n = 76) to normal (n = 10) breast tissue (24), identifying 17
upregulated and 12 downregulated miRNAs in carcinomas, and
miRNAs differentially expressed in ER-positive (11 miRNAs)
and PR-positive (7 miRNAs) samples. Follow-up studies
validated some of these miRNAs (41), suggested more than 30
miRNAs differentiating tumor subtypes (42, 43), and defined the
cell-type–specific localization of some of these miRNAs (44).
When comparing miRNA levels between normal breast and
carcinomas, we identified 10 abundant miRNA sequence
families (≥1% rf) showing 3- to 12-fold changes (Table 1).
We confirmed upregulation of sf-miR-21(1) and downregulation
of cluster-mir-98(13) members, sf-miR-22(1), sf-miR-145(1),
sf-miR-378(1), sf-miR-497(1), sf-miR-320(1), and sf-miR-451(1),
and identified upregulation of sf-miR-142-3p(1), which had not
been previously reported. Oncogenic cluster-mir-17-92 member
sf-miR-19a(3) showed approximately 3-fold higher levels in TNBCs,
suggesting regulatory potential, but this change is modest
for a robust diagnostic marker. Mostly less abundant
miRNAs showed changes between different categories of ER/
HER2 IHC groups, which calls into question their direct
regulatory role in ER/HER2-related pathways. Moreover, when we
compared the potential of overall miRNA profiles for
differentiating tumor types with that of mRNA profiles, miRNAs
did not further clarify the assignment.
In the 2 large patient studies (>100 patients), only miR-210
was shown to be inversely correlated with time to metastasis,
disease-free survival, and overall survival (28, 29). Foekens and
colleagues (28) focused on ER-positive lymph node–negative breast
tumors but also extended their findings to TNBC, whereas
Camps and colleagues (29) focused on ER-positive patients. Our
study showed higher levels of miR-210 in invasive compared
with noninvasive carcinomas but did not confirm a significant
prognostic role. Other studies investigating miRNAs in breast
cancer metastasis based on cell line and animal models were
validated with smaller (n < 20) patient sample collections (15,
16, 26, 27). The differences in miRNA levels in patients who
developed metastases involve less abundant miRNAs, which are
challenging to translate into prognostic markers given the
detection limits of currently available experimental methods.
By comparing ductal cell lines with ductal tumors, we
identified miRNAs expressed in ductal versus other cell types.
Changes in miRNA levels reflect either up- or downregulation
in ductal cells likely signifying associated tumor cell oncogenic
pathways, or changes in tissue composition (i.e., presence of
lymphocytic infiltrate). Given this lineage specificity, miRNA
levels may allow estimation of cell types present in hetero-
geneous biopsy samples to normalize molecular array–based
diagnostic tests.
Understanding miRNA involvement in oncogenic
processes
Viewing miRNAs in the context of their abundance
defined miRNAs whose levels suggest regulatory functions
(1% rf roughly equivalent to 1,500 copies per cell). miRNA
nucleotide variations are implicated in tumorigenesis; how-
ever, evidence for their significance is limited (45). In our
study, nucleotide variations were not identified in highly
abundant miRNAs. We were not able to detect statistically
significant differences between the occurrence of sequence
variations in normal breast compared with carcinomas
(Supplementary Table S8). It is important, however, to note
that less abundant miRNAs could be highly expressed in a
subpopulation of cells responsible for tumor invasion.
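The rough equivalence quoted above (1% rf corresponding to about 1,500 copies per cell) implies a simple linear conversion from read frequency to approximate copy number. The sketch below applies it to the sf-miR-21(1) read frequencies from Table 1. The function name is hypothetical, and the linear scaling is an assumption that ignores library preparation biases and differences in total miRNA content between cell types.

```python
COPIES_PER_CELL_AT_ONE_PERCENT = 1500  # approximate figure quoted in the text

def approx_copies_per_cell(rf):
    """Scale a read frequency (fraction of all miRNA reads) linearly,
    assuming 1% rf corresponds to roughly 1,500 copies per cell."""
    return rf / 0.01 * COPIES_PER_CELL_AT_ONE_PERCENT

# sf-miR-21(1) read frequencies from Table 1
for label, rf in [("normal breast", 0.035), ("HER2-positive IDC", 0.357)]:
    print(label, round(approx_copies_per_cell(rf)))
```

Under the same assumption, the total miRNA pool would correspond to roughly 150,000 copies per cell.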
We noted the drastic upregulation in carcinomas of miR-21,
recently identified as an oncogene in mouse models (46, 47).
We showed that overall miRNA levels were comparable in normal
breast and tumors, suggesting that the downregulation of other
abundant miRNAs in tumors could be a consequence of
competition for processing by the RNA interference machinery.
Upregulation of miR-21 may drive tumorigenesis, both through
a direct effect on targets involved in repressor functions and
through downregulation of tumor suppressor miRNAs, such as
members of the cluster-mir-98(13), as suggested by our data. In
conclusion, our abundance-based view of miRNA expression in
breast cancer supports a focus on oncogenic miR-21, and on
miR-21 targets with potential tumor suppressor functions, as
promising therapeutic targets. Given that miR-21 is upregulated in
many other malignancies, identifying the tumorigenic pathways
it regulates has broad implications in oncology.
Disclosure of Potential Conflicts of Interest
T. Tuschl is cofounder and scientific advisor to Alnylam Pharmaceuticals and
scientific advisor to Regulus Therapeutics and has ownership interests (includ-
ing patents) in Alnylam Pharmaceuticals and Regulus Therapeutics. No poten-
tial conflicts of interest have been disclosed by the other authors.
Acknowledgments
The authors thank Scott Dewell at the Rockefeller University Genomics
Center and Iddo Ben-Dov and Sean McGeary for editing the manuscript.
Grant Support
The study was supported by grants from the Dutch Cancer Society
(NKB2002-2575) and the NIH (1RC1CA145442).
The costs of publication of this article were defrayed in part by the payment
of page charges. This article must therefore be hereby marked advertisement in
accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Received February 24, 2011; revised May 2, 2011; accepted May 9, 2011;
published OnlineFirst May 17, 2011.
References
1. van de Vijver MJ, He YD, van't Veer LJ, Dai H, Hart AA, Voskuil DW,
et al. A gene-expression signature as a predictor of survival in breast
cancer. N Engl J Med 2002;347:1999–2009.
2. Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA,
et al. Molecular portraits of human breast tumours. Nature 2000;406:
747–52.
3. Bartel DP. MicroRNAs: target recognition and regulatory functions.
Cell 2009;136:215–33.
4. Fabian MR, Sonenberg N, Filipowicz W. Regulation of mRNA transla-
tion and stability by microRNAs. Annu Rev Biochem 2010;79:351–79.
5. Krol J, Loedige I, Filipowicz W. The widespread regulation of micro-
RNA biogenesis, function and decay. Nat Rev Genet 2010;11:
597–610.
6. Ventura A, Jacks T. MicroRNAs and cancer: short RNAs go a long
way. Cell 2009;136:586–91.
7. Medina PP, Slack FJ. microRNAs and cancer: an overview. Cell Cycle
2008;7:2485–92.
8. Garzon R, Marcucci G, Croce CM. Targeting microRNAs in cancer:
rationale, strategies and challenges. Nat Rev Drug Discov 2010;9:
775–89.
9. Farazi TA, Spitzer JI, Morozov P, Tuschl T. miRNAs in human cancer. J
Pathol 2011;223:102–15.
10. Calin GA, Sevignani C, Dumitru CD, Hyslop T, Noch E, Yendamuri S,
et al. Human microRNA genes are frequently located at fragile sites
and genomic regions involved in cancers. Proc Natl Acad Sci U S A
2004;101:2999–3004.
11. Jazbutyte V, Thum T. MicroRNA-21: from cancer to cardiovascular
disease. Curr Drug Targets 2010;11:926–35.
12. Xiang J, Wu J. Feud or friend? The role of the miR-17–92 cluster in
tumorigenesis. Curr Genomics 2010;11:129–35.
13. Huse JT, Brennan C, Hambardzumyan D, Wee B, Pena J, Rouhanifard
SH, et al. The PTEN-regulating microRNA miR-26a is amplified in
high-grade glioma and facilitates gliomagenesis in vivo. Genes Dev
2009;23:1327–37.
14. Roush S, Slack FJ. The let-7 family of microRNAs. Trends Cell Biol
2008;18:505–16.
15. Ma L, Teruya-Feldstein J, Weinberg RA. Tumour invasion and metas-
tasis initiated by microRNA-10b in breast cancer. Nature 2007;449:
682–8.
16. Tavazoie SF, Alarcon C, Oskarsson T, Padua D, Wang Q, Bos PD,
et al. Endogenous human microRNAs that suppress breast cancer
metastasis. Nature 2008;451:147–52.
17. Iliopoulos D, Jaeger SA, Hirsch HA, Bulyk ML, Struhl K. STAT3
activation of miR-21 and miR-181b-1 via PTEN and CYLD are part
of the epigenetic switch linking inflammation to cancer. Mol Cell
2010;39:493–506.
18. Gregory PA, Bert AG, Paterson EL, Barry SC, Tsykin A, Farshid G,
et al. The miR-200 family and miR-205 regulate epithelial to mesench-
ymal transition by targeting ZEB1 and SIP1. Nat Cell Biol 2008;10:
593–601.
19. Lu J, Getz G, Miska EA, Alvarez-Saavedra E, Lamb J, Peck D, et al.
MicroRNA expression profiles classify human cancers. Nature
2005;435:834–8.
20. Kumar MS, Pester RE, Chen CY, Lane K, Chin C, Lu J, et al. Dicer1
functions as a haploinsufficient tumor suppressor. Genes Dev 2009;
23:2700–4.
21. Lambertz I, Nittner D, Mestdagh P, Denecker G, Vandesompele J,
Dyer MA, et al. Monoallelic but not biallelic loss of Dicer1 promotes
tumorigenesis in vivo. Cell Death Differ 2010;17:633–41.
22. Landgraf P, Rusu M, Sheridan R, Sewer A, Iovino N, Aravin A, et al. A
mammalian microRNA expression atlas based on small RNA library
sequencing. Cell 2007;129:1401–14.
23. Takamizawa J, Konishi H, Yanagisawa K, Tomida S, Osada H, Endoh
H, et al. Reduced expression of the let-7 microRNAs in human lung
cancers in association with shortened postoperative survival. Cancer
Res 2004;64:3753–6.
24. Iorio MV, Ferracin M, Liu CG, Veronese A, Spizzo R, Sabbioni S, et al.
MicroRNA gene expression deregulation in human breast cancer.
Cancer Res 2005;65:7065–70.
25. O’Day E, Lal A. MicroRNAs and their target gene networks in breast
cancer. Breast Cancer Res 2010;12:201.
26. Ma L, Young J, Prabhala H, Pan E, Mestdagh P, Muth D, et al. miR-9, a
MYC/MYCN-activated microRNA, regulates E-cadherin and cancer
metastasis. Nat Cell Biol 2010;12:247–56.
27. Valastyan S, Reinhardt F, Benaich N, Calogrias D, Szász AM, Wang
ZC, et al. A pleiotropically acting microRNA, miR-31, inhibits breast
cancer metastasis. Cell 2009;137:1032–46.
28. Foekens JA, Sieuwerts AM, Smid M, Look MP, de Weerd V, Boersma
AW, et al. Four miRNAs associated with aggressiveness of lymph
node-negative, estrogen receptor-positive human breast cancer.
Proc Natl Acad Sci U S A 2008;105:13021–6.
29. Camps C, Buffa FM, Colella S, Moore J, Sotiriou C, Sheldon H, et al.
hsa-miR-210 Is induced by hypoxia and is an independent prognostic
factor in breast cancer. Clin Cancer Res 2008;14:1340–8.
30. Hu Z, Fan C, Oh DS, Marron JS, He X, Qaqish BF, et al. The molecular
portraits of breast tumors are conserved across microarray platforms.
BMC Genomics 2006;7:96.
31. Hafner M, Renwick N, Pena J, Mihailovic A, Tuschl T. Barcoded cDNA
libraries for miRNA profiling by next-generation sequencing. In:
Hartmann RK, Bindereif A, Schon A, Westhof E, editors. Handbook of RNA
biochemistry. Weinheim, Germany: VCh-Wiley; 2010.
32. Kumar MS, Lu J, Mercer KL, Golub TR, Jacks T. Impaired microRNA
processing enhances cellular transformation and tumorigenesis. Nat
Genet 2007;39:673–7.
33. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor
package for differential expression analysis of digital gene expression
data. Bioinformatics 2010;26:139–40.
34. Blow MJ, Grocock RJ, van Dongen S, Enright AJ, Dicks E, Futreal PA,
et al. RNA editing of human microRNAs. Genome Biol 2006;7:R27.
35. Kawahara Y, Zinshteyn B, Sethupathy P, Iizasa H, Hatzigeorgiou AG,
Nishikura K. Redirection of silencing targets by adenosine-to-inosine
editing of miRNAs. Science 2007;315:1137–40.
36. Chiang HR, Schoenfeld LW, Ruby JG, Auyeung VC, Spies N, Baek D,
et al. Mammalian microRNAs: experimental evaluation of novel and
previously annotated genes. Genes Dev 2010;24:992–1009.
37. Gao LB, Bai P, Pan XM, Jia J, Li LJ, Liang WB, et al. The association
between two polymorphisms in pre-miRNAs and breast cancer risk: a
meta-analysis. Breast Cancer Res Treat 2011;125:571–4.
38. Hoffman AE, Zheng T, Yi C, Leaderer D, Weidhaas J, Slack F, et al.
microRNA miR-196a-2 and breast cancer: a genetic and epigenetic
association study and functional analysis. Cancer Res 2009;69:
5970–7.
39. Catucci I, Yang R, Verderio P, Pizzamiglio S, Heesen L, Hemminki K,
et al. Evaluation of SNPs in miR-146a, miR196a2 and miR-499 as low-
penetrance alleles in German and Italian familial breast cancer cases.
Hum Mutat 2010;31:E1052–7.
40. Shen J, Ambrosone CB, DiCioccio RA, Odunsi K, Lele SB, Zhao H. A
functional polymorphism in the miR-146a gene and age of familial
breast/ovarian cancer diagnosis. Carcinogenesis 2008;29:1963–6.
41. Volinia S, Calin GA, Liu CG, Ambs S, Cimmino A, Petrocca F, et al. A
microRNA expression signature of human solid tumors defines cancer
gene targets. Proc Natl Acad Sci U S A 2006;103:2257–61.
42. Blenkiron C, Goldstein LD, Thorne NP, Spiteri I, Chin SF, Dunning MJ,
et al. MicroRNA expression profiling of human breast cancer identifies
new markers of tumor subtype. Genome Biol 2007;8:R214.
43. Mattie MD, Benz CC, Bowers J, Sensinger K, Wong L, Scott GK, et al.
Optimized high-throughput microRNA expression profiling provides
novel biomarker assessment of clinical prostate and breast cancer
biopsies. Mol Cancer 2006;5:24.
44. Sempere LF, Christensen M, Silahtaroglu A, Bak M, Heath CV,
Schwartz G, et al. Altered MicroRNA expression confined to specific
epithelial cell subpopulations in breast cancer. Cancer Res 2007;67:
11612–20.
45. Ryan BM, Robles AI, Harris CC. Genetic variation in microRNA net-
works: the implications for cancer research. Nat Rev Cancer 2010;
10:389–402.
46. Medina PP, Nolde M, Slack FJ. OncomiR addiction in an in vivo model
of microRNA-21-induced pre-B-cell lymphoma. Nature 2010;467:
86–90.
47. Hatley ME, Patrick DM, Garcia MR, Richardson JA, Bassel-Duby R,
van Rooij E, et al. Modulation of K-Ras-dependent lung tumorigenesis
by MicroRNA-21. Cancer Cell 2010;18:282–93.