Institute of Experimental Hematology, Hannover Medical School, Hannover, Germany; REBIRTH Cluster of Excellence, Hannover Medical School, Hannover, Germany; Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
Combinatorial expression of fluorescent marker proteins produces color-coded cell populations.
Various lentiviral vector systems can be employed for the production of complex color codes.
Flow cytometry enables the deconvolution of input samples in color-coded cell mixtures.
Color coding facilitates the multiplexed longitudinal characterization of cell fate decisions.
Hematopoiesis depends on the controlled differentiation of hematopoietic stem cells to mature cells with defined functions. Although each cell population within the hematopoietic hierarchy can be described by phenotypic markers, isolation of marker-pure populations does not necessarily result in cells with homogeneous functionality. However, techniques that enable the efficient characterization of cell behavior with high resolution are limited. Whereas single-cell transplantation assays demand high mouse numbers and workload, sequencing-based fate tracking techniques require the destruction of the host cell, substantial financial resources, and bioinformatics expertise and suffer from a delay between sample acquisition and data interpretation. To make analyses more efficient, several laboratories recently developed flow cytometry-driven, fluorescence-based multiplexing approaches that enable parallel analysis of longitudinal behavior from multiple clonally derived cells or polyclonal populations. Although these fluorescent genetic barcoding systems are still in their infancy, their power lies in the use of retroviral vectors for gene marking of multiple populations with unique fluorescent color codes. Tracing of color-coded cells by flow cytometry guarantees the accessibility of information on population behavior in real time and at low cost, supports the prospective isolation of cells for downstream analyses, and can be applied to cell line models as well as to human- and animal-derived primary cells. Here, we discuss recent progress in the emerging field of fluorescent genetic barcoding for longitudinal multiplex cell tracking in biomedical research and how this technique will help to uncover mechanisms regulating cell behavior with clonal resolution in a reduced number of experimental samples.
Hematopoiesis depends on the hierarchical production of mature cells from a pool of self-renewing hematopoietic stem cells (HSCs). Due to their high regenerative capacity, great interest focuses on the understanding of HSC biology, which can be best assessed in functional transplantation assays that allow for the read-out of HSC numbers and their potential to produce cells of disparate lineages. To track individual HSC fate decisions, initial transplantation studies employed retroviral gene marking of bulk cells and subsequent Southern blot analyses to assess clonal hematopoietic contributions [
]. Given the low sensitivity of this assay, it remained unknown whether the low number of bands observed on autoradiograms that were representative of putative clones originated from a low number of engraftment compatible cells, low HSC gene-marking efficiencies, clonal fluctuations or a combination of these factors. Subsequently, the development of improved purification techniques facilitated the prospective isolation of murine and human HSCs to near purity [
]. In contrast to experiments with bulk cells, transplantation of single murine HSCs revealed differences in their potency to produce mature cells of the lymphoid and myeloid lineages over long periods of time [
]. However, the requirement for efficient cell-sorting and transplantation strategies, along with high mouse numbers to sufficiently monitor clonal fate decisions, makes single-cell transplantations demanding. To facilitate parallel analysis of transplantation experiments with single-cell resolution, retroviral gene-marking techniques were thus further optimized to enable DNA barcoding and integration site analysis, respectively, both of which allow for highly sensitive longitudinal lineage tracing by molecular techniques and bioinformatics analysis [
]. To circumvent artifacts possibly associated with the purification, genetic modification, and transplantation of cells, transgenic mice with endogenous gene-marking cassettes have also been developed and supported elaborate cell-fate-tracking applications of the undisturbed hematopoiesis [
]. However, all of these approaches depend on the extraction of cellular DNA for sequencing-based cell fate tracking and thus render subsequent functional assays with desired cells impossible.
The ideal system for the investigation of cell characteristics would therefore support the parallel tracing of multiple viable (clonally derived) populations with standard laboratory equipment, increased throughput, and high efficiency.
Advent of fluorescence-based fate tracking
Over the last decades, flow cytometers have become an integral device to phenotype hematopoietic cells and routinely separate 10–15 fluorescent parameters. Nevertheless, this number of parameters does not always suffice for complex analyses and requires high amounts of antibodies to characterize large sample sets. An elegant approach to overcoming these limitations is multiplexing, which describes the combined analysis of multiple samples from a single acquisition tube. In fluorescent cell barcoding (FCB), multiplexing is facilitated through the staining of each sample with a unique combination of amine-reactive dye dilutions prior to sample mixing combined with antibody staining and acquisition [
]. Although FCB reduces processing and acquisition times of large sample numbers, cell labeling requires cellular destruction so that longitudinal analyses as required for complex cell-fate-tracking approaches cannot be accomplished.
The replacement of exogenous labels with intrinsically expressed genetic reporters such as fluorescent proteins (XFPs) enables long-term cell marking in biomedical research [
]. A prototypical example is the transgenic Brainbow mouse and its derivatives that employ stochastic recombination of multicopy XFP reporter cassettes to create complex and stably inherited color codes in individual cell clones [
]. The subsequent assessment of clonal complexities requires the fluorescence-microscopy-assisted integration of fluorescence signals and their localization so that RGB-mediated cell tracing conventionally works best in solid organs.
The adaptation of multiplex fluorescent labeling approaches to the flow-cytometric analysis of hematopoiesis was first achieved through the recombination-mediated production of complex color codes and their bioinformatics deconvolution [
]. This approach demonstrated the feasibility of real-time clonal fate tracking and the opportunity to isolate clones of interest by fluorescence-activated cell sorting (FACS) for secondary transplantation and for molecular investigations that correlate cell behavior with transcriptional/epigenetic states.
Retroviral fluorescent genetic barcoding
Transgenic lineage tracing depends on appropriate animal models and does not permit studies in human cells. However, the use of color-coding-compatible retroviral gene-marking constructs offers greater flexibility in experimental design strategies because these vectors can: (1) be modified according to experimental requirements, (2) be produced with standard laboratory equipment, and (3) support gene transfer into cells from multiple species. With the exception of RGB marking [
], typical gene-marking experiments utilize one or two fluorescent reporters for labeling of test cells and control cells, respectively, thereby reserving fluorescent channels for additional flow cytometric phenotyping investigations. To increase color code complexities with a limited set of XFPs, Smurthwaite et al. proposed a fluorescent genetic barcoding (FGB) approach that employs combinatorial staggered expression of retrovirally encoded XFP variants to create flow-cytometry-compatible color codes [
]. Based on theoretical calculations, a “mixing matrix” consisting of three XFPs expressed alone, in pairs, or in triplets with two defined intensities already yields 26 unique color codes, which means that this concept may overcome potential obstacles associated with random color code mixtures created by stochastic recombination and RGB-marking strategies (Table 1). However, one critical component to the application of FGB is the establishment of staggered fluorescent expression levels, which, in the original description, depended upon retroviral copy number variations and sorting for populations with the desired fluorescent properties [
]. In an attempt to ease the application of FGB for hematopoietic cell research, the Humphries group subsequently developed three dedicated lentiviral vector systems based on the expression of green fluorescent protein (GFP), yellow fluorescent protein (YFP), and monomeric Kusabira Orange 2 (mKO2) XFPs with color code complexities of 26 (26xFGB), 14 (14xFGB), and 6 (6xFGB), respectively, that do not necessarily rely on elaborate sorting procedures and thus enabled color coding of primary cells (Figures 1A and 1B) [
Table 1. Maximal complexity of “color codes” as a function of input XFPs and expression intensities
No. of XFPs per cell mixture
Experimental design strategies for the production of color codes with vectors encoding for fluorescent proteins expressed at one or two different intensities are depicted in Figure 1A (two intensities) and Figure 1C (one intensity).
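The complexities quoted in the text (26 codes from three XFPs at two intensities, 63 from six XFPs at one intensity, 15 from four XFPs at one intensity) are consistent with a simple count: each XFP is either absent or present at one of k intensity levels, and the unlabeled state is excluded, giving (k + 1)^n − 1 codes for n XFPs. The following Python sketch illustrates this count; the function names are illustrative and do not come from the original work.

```python
from itertools import product

def max_color_codes(n_xfps: int, n_intensities: int) -> int:
    """Count color codes: each XFP is absent or at one of n_intensities
    levels; the all-negative (unlabeled) state is excluded."""
    return (n_intensities + 1) ** n_xfps - 1

def enumerate_color_codes(xfps, levels):
    """List every color code as a dict {xfp: level}, omitting absent XFPs."""
    states = (None,) + tuple(levels)
    codes = []
    for combo in product(states, repeat=len(xfps)):
        code = {x: s for x, s in zip(xfps, combo) if s is not None}
        if code:  # skip the unlabeled state
            codes.append(code)
    return codes

codes_3x2 = enumerate_color_codes(("GFP", "YFP", "mKO2"),
                                  ("intermediate", "bright"))
print(len(codes_3x2))         # three XFPs, two intensities
print(max_color_codes(6, 1))  # six XFPs, one intensity
print(max_color_codes(4, 1))  # four XFPs, one intensity
```

Enumerating the codes explicitly, rather than only counting them, can be convenient when planning which vector combination belongs to which sample.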
The 26xFGB system utilizes a set of six lentiviral vectors to express GFP, YFP, and mKO2 at bright or intermediate fluorescent intensity (Figure 1A). Staggered expression levels were achieved through the utilization of two different translational initiation elements, the Kozak consensus sequence and an ACC/ACC upstream open reading frame (uORF), respectively, which result in a 10-fold difference in XFP brightness [
]. Using these vectors, the production of the whole panel of color codes requires a combinatorial transduction approach with eight different combinations of FGB vectors in separate wells prior to pooling of all samples. However, due to the combinatorial transduction approach, cell mixes inherently exhibit an underrepresentation of double- and triple-XFP-expressing cells. Therefore, pure color-coded populations need to be generated by FACS prior to differential treatment or genetic modification of individual samples, and their subsequent characterization in longitudinal multiplex experiments.
To ease the production of color-coded populations independently from additional purification steps, the 14xFGB vector series expresses each color code from a dedicated vector (Figure 1B). These vectors are equipped with monocistronic or 2A cleavage-site-dependent bicistronic and tricistronic fluorescent marker cassettes. By equipping each of these seven vectors with either Kozak or ACC/ACC uORF-dependent translational initiation sequences, constructs will either yield a bright or intermediate bright signal for all encoded fluorescent proteins. This explains the lower complexity of the polycistronic (14xFGB) vector series compared with the combinatorial transduction approach (26xFGB). Therefore, transduction of each of the 14xFGB vectors into a separate well will generate all color codes independently of one, two, or three XFPs with comparable efficiency and offers the opportunity for color-code-specific cell modification prior to multiplex tracking over time. Advantageously, silencing of any of the 14xFGB vectors will lead to the extinction of the respective color code rather than the acquisition of an alternative color-code-dependent cell identity as possible in color-coding systems that utilize combinatorial XFP transductions.
Because the 26xFGB and 14xFGB systems both depend on staggered expression levels to achieve their full complexity, the best resolution between bright and intermediate bright populations depends on single vector integrations. Because gene transfer rates into primary cells cannot always be tightly controlled, the 6xFGB system only employs the six Kozak-dependent color codes composed of one or two XFPs from the 14xFGB system to enable flow cytometric deconvolution of color-coded populations regardless of vector copy numbers and associated differences in fluorescent intensity (Figure 1B) [
]. As before, multiplexing first requires gene transfer of each individual vector into a dedicated culture well prior to mixing of all samples and population tracking.
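As a rough illustration of how the 14xFGB and 6xFGB panel sizes arise from the description above, the sketch below enumerates mono-, bi-, and tricistronic cassettes of the three XFPs and pairs them with the two translational initiation elements. The labels "Kozak" and "uORF" and all function names are shorthand chosen here for illustration only.

```python
from itertools import combinations

XFPS = ("GFP", "YFP", "mKO2")

def cassettes(xfps, max_size):
    """All non-empty XFP subsets with up to max_size members
    (mono-, bi-, and tricistronic cassettes)."""
    return [c for r in range(1, max_size + 1) for c in combinations(xfps, r)]

# 14xFGB: seven cassettes, each driven by either initiation element
# (Kozak = bright, ACC/ACC uORF = intermediate).
panel_14 = [(c, init) for c in cassettes(XFPS, 3) for init in ("Kozak", "uORF")]

# 6xFGB: Kozak-driven cassettes containing one or two XFPs only.
panel_6 = [(c, "Kozak") for c in cassettes(XFPS, 2)]

print(len(panel_14), len(panel_6))
```

Because every code in these panels corresponds to exactly one vector, the list doubles as a checklist of the vector stocks needed for a full experiment.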
The opportunity to label and track cell populations or clones by FGB over time promises a significant reduction in cell culture efforts and animal numbers because multiple samples can be monitored in the same vessel under competitive conditions. This may be especially advantageous for branches of biomedical research that attempt to correlate phenotype and function [
]. This supported the exclusive detection of bone-marrow-derived color codes in myeloid or T-cell subsets, respectively, arguing for fate tracking of lineage-biased HSC clones over time. Similar experiments also demonstrated efficient multiplexing of human hematopoietic stem and progenitor cells in xenograft models.
To allow investigation of acute myeloid leukemia (AML), a disease still fatal to a high number of patients, multiplexing experiments were also performed with Hoxa9- and Meis1-transformed cells [
]. These experiments demonstrated the dynamic competitive behavior of individual color-coded input samples over time as well as the outgrowth of a single dominant color code in the majority of mice. The potential for flow-cytometric assessment of color code identities as well as their population sizes furthermore encouraged the establishment of multiplex limiting dilution assays (MLDA). Conventional LDAs require the transplantation of various cell doses into multiple groups of mice to estimate stem cell numbers through correlation of the engraftment (HSCs) and lethality (leukemic stem cells, LSCs) rate in relation to input cell numbers, respectively [
]. To make these assays more efficient, MLDA utilizes six different cell mixes with predefined color code ratios (e.g., a threefold dilution series). Therefore, six cell dilutions could be assayed in sixfold repetitions in just six mice compared with 36 mice required for conventional LDA. Likewise, multiplexing reduced the number of recipient mice of in vitro drug-treated cells for the assessment of anti-leukemic properties in short-term in vivo assays by 12-fold [
]. However, it should be clear that MLDA and short-term drug assays both have limitations in that MLDA will preferentially identify populations with the highest LSC frequency that outcompete smaller input populations and that the potential of multiplex drug assays to quantify long-term in vivo effects of in vitro drug-treated cells must still be demonstrated.
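The mouse-number arithmetic of MLDA can be made concrete with a small design sketch. It assumes, as one plausible layout rather than the published protocol, that the six color codes are rotated across the six dose levels in a Latin square, so that every mouse receives all six doses and every color code is tested once at each dose. The top dose and the color code labels are hypothetical placeholders.

```python
def mlda_design(color_codes, top_dose, fold=3):
    """Return one cell mix per mouse as {color_code: cell_dose}.
    Doses form a `fold`-fold dilution series; codes rotate across mixes
    (Latin square) so each code is tested once at every dose."""
    n = len(color_codes)
    doses = [top_dose / fold ** i for i in range(n)]
    return [
        {code: doses[(i + m) % n] for i, code in enumerate(color_codes)}
        for m in range(n)
    ]

codes = ["C1", "C2", "C3", "C4", "C5", "C6"]   # placeholder labels
mixes = mlda_design(codes, top_dose=72900)      # hypothetical top dose

print(len(mixes))               # number of recipient mice
print(len(mixes) * len(codes))  # dose-by-replicate data points
```

With six mixes of six codes each, the design yields 36 dose-by-replicate data points from six recipients, which matches the 36 mice that a conventional LDA of the same size would need.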
Clonal fate mapping in glioblastoma
Currently, the 26xFGB, 14xFGB, and 6xFGB systems have not been applied to experimental systems that aimed for clonal fate tracking, although this may have been achieved through transplantation of limiting HSC numbers [
]. However, the Fehse group employed an advanced generation of their widely used RGB marking system called “optical barcoding” (OBC) to demonstrate flow-cytometry-assisted tracking of up to 41 and 21 glioblastoma clones in vitro and in vivo, respectively [
]. This high degree of clonal complexity in OBC experiments was achieved through combinatorial transductions of bulk cells with collections of up to six lentivirally encoded fluorescent proteins followed by single-cell sorting, clonal expansion, and mixing (Figure 1C). Similar to limitations of the 26xFGB system, fluorescent populations that expressed multiple XFPs were typically underrepresented in bulk transduction scenarios and the creation of multi-fluorescent color codes could be increased through sequential rounds of transduction with different fluorescent protein vectors and FACS for desired populations. Due to the requirement for sorting clones with defined fluorescent characteristics, it remains to be determined whether OBC could also be applied to primary HSC or LSC populations given the rapid loss of stemness during in vitro cultivation [
]. Furthermore, the utilization of six fluorescent proteins for the combinatorial creation of up to 63 color codes may already occupy the majority of channels in flow cytometers so that a reduction to four fluorescent proteins (up to 15 color codes) may be advisable to accommodate additional phenotyping options (Table 1).
Theoretical and practical considerations
Based on the above examples, it is becoming increasingly clear that color-coding-mediated multiplexing permits efficient investigation of cell properties not currently achievable with conventional in vitro and in vivo analyses of cell populations or clones at moderate costs.
A complete color-coding experiment with cell lines can be set up in less than 4 weeks, including the production of concentrated vector stocks, titration of viral particles, and transduction of target cells with defined vector doses for the preparation of color-coded cell mixes from bulk cultures or single-cell-derived clones. Although polyclonal cell mixtures can be generated immediately after sorting, the expansion of single-cell-derived clones can be tedious, especially when experiencing low subcloning efficiencies and slow cell growth. Furthermore, primary cells might require shorter transduction procedures without further enrichment of genetically modified cells before transplantation. We therefore embedded our FGB color-coding cassettes into a lentiviral backbone to allow highly efficient gene transfer into dividing and nondividing cells, including the (mainly) quiescent long-term repopulating HSC population [
]. In contrast to cell lines, genetic modification of HSCs ideally requires vector libraries consisting of single vectors for each color code, such as the 14xFGB and 6xFGB systems, which circumvents the need for additional sorting steps. Additionally, these vectors allow for the direct coupling of color codes to factors that influence cell fate decisions (e.g., single-guide RNA, microRNA, or short-hairpin RNA) [
This leads to the question of how many labels suffice for efficient tracking applications. Although the experimental comparison of cell populations may already be informative with just two to six color codes [
]. Based on these examples and our own experience, the tracking of ∼20–30 color codes seems reasonable and yields initial population sizes of ∼3–5% within the whole donor cell fraction. However, caution must be taken to acquire sufficiently large cell numbers to achieve distinct color code signals in flow cytometric analyses. Additionally, our own experiments with 6xFGB-transduced HSCs and AML cells displayed variations in color code expression intensities, not only among different cell types, which to a certain extent can be circumvented when assessing color code distributions within cell-type-specific gates rather than investigating cell types within color code gates, but also among different mice. Although this effect may be associated with variegation of vector expression and retroviral copy number variations [
], it complicated analyses and thus required time-consuming manual expert gating of each individual sample. In part, ease of data interpretation can be achieved by utilizing proper compensation controls. We achieved the best compensation settings when first using single color-coded/stained controls followed by fine-tuning of compensations based on “fluorescence-minus-one” controls as well as samples containing all expected fluorescent protein combinations and surface marker stains, respectively. Although single-labeled compensation controls and mixed samples can be propagated over time for cell culture experiments, the production of frozen cell aliquots (e.g., peripheral blood, spleen, and bone marrow) containing single markers or mixtures of markers possibly derived through pooling of multiple primary samples may be advised for murine experiments. Regardless, when using more than 20 color codes, cumbersome data interpretation should be anticipated.
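The relationship between the number of color codes, their initial population sizes, and the number of events to acquire can be estimated with simple arithmetic. The sketch below assumes equally represented color codes; the minimum of 200 events per code and the donor fraction of 0.5 are hypothetical planning values, not recommendations from the text.

```python
import math

def initial_share(n_color_codes: int) -> float:
    """Fraction of donor cells per color code if all codes are mixed equally."""
    return 1.0 / n_color_codes

def events_to_acquire(min_events_per_code: int, n_color_codes: int,
                      donor_fraction: float = 1.0) -> int:
    """Total events needed so that each equally represented color code
    contributes at least min_events_per_code donor-derived events."""
    return math.ceil(min_events_per_code * n_color_codes / donor_fraction)

for n in (20, 30):
    print(n, f"{initial_share(n):.1%}")

# Hypothetical planning values: 200 events per code, 50% donor chimerism.
print(events_to_acquire(200, 30, donor_fraction=0.5))
```

The shares for 20 and 30 codes fall at roughly 5% and 3%, in line with the range given above. Clonal drift will push individual codes well below these starting values, so the estimate is a lower bound for acquisition planning.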
Some of the FGB-associated challenges (e.g., the production of multiple color-coded vector stocks, complicated gating procedures, and variegation of vector expression/silencing) may not apply to DNA-barcoding experiments. However, barcoding requires the extraction of DNA from sorted cells of disparate lineages as well as amplification and sequencing of the barcode repertoire (estimated at a total of $20 per sample), which delays the availability of data compared with pure color-coding experiments by 3–5 days under optimal conditions. Although accurate barcode calling may be influenced by multiple factors, especially in the absence of standardized protocols [
], the sensitivity of cell tracking and clonal resolution and the ease of sample handling in DNA barcoding experiments are still unmatched compared with alternative techniques such as integration site analysis [
]. However, one should not consider FGB and DNA barcoding as competing technologies, but rather be aware of the advantages and limitations of each technique (Table 2). As a first step toward closing the gap between flow cytometric and sequencing-based multiplexing, Cornils et al. developed a lentiviral RGB vector platform in which each of the three fluorescent markers was combined with a unique DNA barcode library [
]. Therefore, advanced FGB vector designs in combination with complex DNA libraries may guide the way toward the development of next-generation cell-tracking tools that should be accompanied by the implementation of algorithm-assisted automated gating routines to fully unleash the potential of high-definition FGB in hematopoietic research [
The ongoing discovery of new and improved fluorescent proteins and the availability of reasonably priced multicolor flow cytometers will further spur high-complexity FGB-mediated clonal fate mapping and multiplexing of experimental samples. Although setup and operation of flow cytometers for FGB do not differ from conventional analyses, the amount of generated data will increase significantly, especially when including phenotypical analysis of complex samples in the experiment. Therefore, FGB will greatly benefit from automated gating and analysis strategies to ease data interpretation, along with the development of improved color coding vector libraries.
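To indicate what a very basic automated gating routine might look like, the following sketch assigns each event to a color code by comparing compensated fluorescence intensities against two cutoffs per XFP (negative/intermediate and intermediate/bright). This is a deliberately simplified illustration under stated assumptions: the cutoff values and event data are invented, and real data would require compensation, transformation, and sample-specific thresholds as discussed above.

```python
from collections import Counter

def decode_event(intensities, thresholds):
    """Map compensated intensities {xfp: value} to a color code.
    thresholds: {xfp: (negative_cutoff, bright_cutoff)}.
    Returns a sorted tuple of (xfp, level); an empty tuple means unlabeled."""
    code = {}
    for xfp, value in intensities.items():
        neg_cut, bright_cut = thresholds[xfp]
        if value >= bright_cut:
            code[xfp] = "bright"
        elif value >= neg_cut:
            code[xfp] = "intermediate"
    return tuple(sorted(code.items()))

def color_code_frequencies(events, thresholds):
    """Relative frequency of each color code among labeled events."""
    counts = Counter(decode_event(e, thresholds) for e in events)
    counts.pop((), None)  # drop unlabeled events
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}

# Hypothetical cutoffs and events, for illustration only.
cutoffs = {x: (1e3, 1e4) for x in ("GFP", "YFP", "mKO2")}
events = [
    {"GFP": 5e4, "YFP": 10, "mKO2": 20},
    {"GFP": 3e3, "YFP": 2e4, "mKO2": 5},
    {"GFP": 5, "YFP": 8, "mKO2": 2},
    {"GFP": 6e4, "YFP": 12, "mKO2": 30},
]
freqs = color_code_frequencies(events, cutoffs)
```

Fixed cutoffs of this kind would likely fail in the presence of the sample-to-sample variation in expression intensity described earlier, which is one reason why more adaptive, algorithm-assisted approaches are desirable.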
The authors thank Nontsikelelo Mpofu-Maetzig for critical discussion of the manuscript.
This work was supported by the German Research Council (MA 7010/1-1), by the Cluster of Excellence REBIRTH (EXC 62/1), and the SFB738.