diff --git a/vignettes/single_cell_rnaseq_basic.Rmd b/vignettes/single_cell_rnaseq_basic.Rmd
deleted file mode 100644
index 9bc10b1..0000000
--- a/vignettes/single_cell_rnaseq_basic.Rmd
+++ /dev/null
@@ -1,329 +0,0 @@
----
-title: "Analysis of single-cell RNA-seq data using a topic model, Part 1: basic concepts"
-author: Peter Carbonetto
-date: "`r Sys.Date()`"
-output:
-  html_document:
-    toc: yes
-    toc_float: no
-    highlight: textmate
-    theme: readable
-vignette: >
-  %\VignetteIndexEntry{Analysis of single-cell RNA-seq data using a topic model, Part 1: basic concepts}
-  %\VignetteEngine{knitr::rmarkdown}
-  %\VignetteEncoding{UTF-8}
----
-
-The aim of this vignette is to introduce the basic concepts behind an
-analysis of single-cell RNA-seq data using a topic model, and to show
-how to use 'fastTopics' to implement this analysis. This first
-vignette explains the analysis steps at only a high level---see
-[Part 2][vignette-part-2] for additional explanations and guidance.
-
-Since a topic model analysis is quite different from most conventional
-analyses of single-cell RNA-seq data, we point out key
-differences.
-
-One important difference is that a topic model is a model of count
-data, so *the topic model should be applied directly to the count
-data.* In contrast, many methods require preprocessing of the count
-data.
-
-```{r knitr-opts, include=FALSE}
-knitr::opts_chunk$set(comment = "#",collapse = TRUE,results = "hold",
-                      fig.align = "center",dpi = 120)
-```
-
-We begin our analysis by loading the packages. Then we set the seed so
-that the results can be reproduced.
-
-```{r load-pkgs, message=FALSE, warning=FALSE}
-library(Matrix)
-library(fastTopics)
-library(ggplot2)
-library(cowplot)
-set.seed(1)
-```
-
-The example data set
---------------------
-
-We will illustrate the concepts using a single-cell RNA-seq data set
-from [Zheng *et al* (2017)][zheng-2017]. These data are reference
-transcriptome profiles from 10 bead-enriched subpopulations of
-peripheral blood mononuclear cells (PBMCs). The original data set is
-much larger---for this introduction, we have taken a small subset of
-approximately 3,700 cells.
-
-The data we will analyze are unique molecular identifier (UMI)
-counts. These data are stored as an $n \times m$ sparse matrix, where
-$n$ is the number of cells and $m$ is the number of genes:
-
-```{r load-data}
-data(pbmc_facs)
-counts <- pbmc_facs$counts
-dim(counts)
-```
-
-The UMI counts are "sparse"---that is, most of the counts are
-zero. Indeed, over 95% of the UMI counts are zero:
-
-```{r nonzeros}
-mean(counts > 0)
-```
-
-No need to normalize or transform
----------------------------------
-
-Most analyses of single-cell RNA-seq data involve a preprocessing step
-in which the UMI counts are log-transformed and normalized. *When
-analyzing UMI counts with a topic model, you should not normalize or
-transform the counts.*
-
-Additionally, analyses of single-cell RNA-seq data typically select
-only the most highly variable genes. *We recommend instead using all
-or most genes.* The exception are genes in which all the UMI counts
-are zero---these genes should be removed prior to a topic model
-analysis.
-
-Fit the topic model
--------------------
-
-Since no pre-processing is needed, we can move directly to the next
-step of the analysis: fitting the topic model to the UMI count
-data. This is accomplished with a call to `fit_topic_model`:
-
-```{r fit-topic-model, eval=FALSE}
-fit <- fit_topic_model(counts,k = 6)
-```
-
-Note it may take several minutes to fit the topic model on this data
-set. For convenience, we saved the output from this call:
-
-```{r load-fit}
-fit <- pbmc_facs$fit
-```
-
-To fit a topic model, we must specify $K$, the number of topics.
-Here, we have chosen $K = 6$ topics. In most settings, a good choice
-of $K$ will not be known in advance, so you will you want to explore
-the results from topic models at different settings of $K$.
-
-The `fit_topic_model` interface is intended to hide the details of
-model fitting, and it should work well for many data sets. Larger or
-more complex data sets my require some fine-tuning of the model
-fitting. See [Part 2][vignette-part-2] for more on this.
-
-Each cell is represented as a unique combination of topics
-----------------------------------------------------------
-
-A key feature of the topic model is that each cell $i$ is represented
-as a *unique combination of the topics.* Therefore, each cell can be
-summarized by the $K$ topic proportions. These cell-specific
-proportions are learned from the data. In 'fastTopics', the topic
-proportions for all cells are stored in an $n \times K$ matrix:
-
-```{r loadings-1}
-dim(fit$L)
-```
-
-To illustrate, here is a cell in which the pattern of expression is
-almost fully captured by the fourth topic:
-
-```{r loadings-2}
-rows <- "GATATATGTCAGTG-1-b_cells"
-round(fit$L[rows,],digits = 3)
-```
-
-Here are two more examples:
-
-```{r loadings-3}
-rows <- c("GACAGTACCTGTGA-1-memory_t",
-          "TGAAGCACACAGCT-1-b_cells")
-round(fit$L[rows,],digits = 3)
-```
-
-The last example is interesting because the observed expression is
-best captured by a combination of topics 4 and 6.
-
-To make sense of these results, we first need to understand the
-biological relevance of the topics, which we explore next.
-
-Interpreting topics using available cell labels
------------------------------------------------
-
-In some cases, you may have additional information about the cells,
-such as the tissue the cells were sampled from. (For interpreting
-topics when cells are not labeled, see [Part 2][vignette-part-2].)  In
-the PBMC data, the cells were labeled using fluorescence-activated
-cell sorting (FACS). Each cell is assigned one of five labels (each
-corresponding to a cell type): B cells, CD14+ monocytes, CD34+ cells,
-natural killer (NK) cells, and T cells.
-
-```{r summary-subpop}
-samples <- pbmc_facs$samples
-summary(samples$subpop)
-```
-
-Create a "Structure plot" to visualize the relationship between the
-cell labels and the topic proportions:
-
-```{r structure-plot-with-celltype-labels, fig.width=7.5, fig.height=1.75, results="hide", message=FALSE}
-topic_colors <- c("skyblue","forestgreen","darkmagenta","dodgerblue",
-                  "gold","darkorange")
-structure_plot(fit,colors = topic_colors,topics = 1:6,gap = 25,
-               grouping = samples$subpop)
-```
-
-The Structure plot is a stacked bar chart in which each topic is
-represented as a bar of a different colour, and the bar heights are
-the topic proportions. Patterns begin to emerge when the cells are
-arranged so that cells with similar topic proportions are positioned
-close to each other. This arrangement is automated in `structure_plot`.
-
-From the Structure plot, we see that topics 1 (light blue), 2 (green)
-and 4 (dark blue) closely correspond to NK, CD14+ and B cell
-types, respectively; expression in these cells is largely
-explained by a single topic.
-
-The topic model also captures interesting substructure within the T
-cells: most T cells are explained by at least two topics, with wide
-variation in the proportions for these two topics; also, there is a
-subset of T cells that are explained by three topics (topics 1, 5 and
-6).
-
-Predicting topics in other cells
---------------------------------
-
-Having fitted a topic model to a single-cell data set, we can use that
-model to predict topics for cells that were not used to train the
-model. We illustrate this with a separate collection of 100 cells
-drawn from the same PBMC data set:
-
-```{r test-data}
-counts_test <- pbmc_facs$counts_test
-dim(counts_test)
-```
-
-The "predict" function estimates the topic proportions based on a
-previously fit topic model.
-
-```{r predict, results="hide", message=FALSE}
-Ltest <- predict(fit,counts_test)
-```
-
-The predictions appear to align well with the provided cell labels:
-
-```{r structure-plot-test, fig.width=7.5, fig.height=1.75, results="hide", message=FALSE}
-fit_test <- list(F = fit$F,L = Ltest)
-class(fit_test) <- c("multinom_topic_model_fit","list")
-structure_plot(fit_test,topics = 1:6,colors = topic_colors,gap = 2,
-               grouping = pbmc_facs$samples_test$subpop)
-```
-
-Topics capture patterns of relative expression
-----------------------------------------------
-
-Each topic is represented as a vector of $m$ expression levels which
-are stored as an $m \times K$ matrix:
-
-```{r factors-1}
-dim(fit$F)
-```
-
-These expression levels are comparable across topics. For example,
-genes *CD79A* are *CD79B* are important to B cells, so we would expect
-higher expression in topic 4:
-
-```{r factors-2}
-genes <- pbmc_facs$genes
-rbind(fit$F[genes$symbol == "CD79B",],
-      fit$F[genes$symbol == "CD79A",])
-```
-
-The most informative genes are those with higher expression in one
-topic compared to the other topics. In the next section we explain how
-to identify these genes.
-
-Annotating topics by differentially expressed genes
----------------------------------------------------
-
-Consider a standard differential expression (DE) analysis of cell
-types: for a given cell type, the log-fold change (LFC) is estimated
-as the (base-2) log-ratio of two expression levels, the expression in
-cells belonging to the cell type over the expression in cells not
-belonging to the cell type.
-
-'fastTopics' extends the standard DE analysis to allow for *partial
-membership* to multiple groups (the methods are described
-[here][single-cell-topics-paper]). This "grade of membership" (GoM)
-differential expression analysis is implemented by the
-function `de_analysis`:
-
-```{r de-analysis-1, eval=FALSE}
-set.seed(1)
-de <- de_analysis(fit,counts,pseudocount = 0.1,
-                  control = list(ns = 1e4,nc = 4))
-```
-
-Since this computation can take 10 minutes or more to run, we have
-provided the output of this `de_analysis` call in the `pbmc_facs` data
-set:
-
-```{r de-analysis-2}
-de <- pbmc_facs$de
-```
-
-The GoM DE analysis generates LFC estimates that have a similar
-interpretation to a standard DE analysis. To illustrate, the GoM DE
-analysis estimates strong overexpression of B-cell genes *CD79A* and
-*CD79B* in topic 4:
-
-```{r diff-count-analysis-3}
-rbind(de$postmean[genes$symbol == "CD79A",],
-      de$postmean[genes$symbol == "CD79B",])
-```
-
-Ultimately, we would like to discover genes most relevant to a topic,
-not just report LFC estimates for known genes. A common approach is to
-visualize the results of the DE analysis using a volcano plot. For
-example, here is the volcano plot for topic 4:
-
-```{r volcano-plot-bcells, fig.height=4, fig.width=4.5}
-volcano_plot(de,k = 4,labels = genes$symbol)
-```
-
-Typically, the most interesting genes are found on the right-hand side
-of the volcano plot---that is, genes with large LFC---so long as they
-have a small lfsr (local false sign rate). Indeed, B cell gene *CD79A*
-is on the far right-hand side of the plot.
-
-Likewise, natural killer genes such as *NKG7* and *GNLY* emerge on the
-right-hand side of the volcano plot for topic 1:
-
-```{r volcano-plot-nk, fig.height=4, fig.width=4.5}
-volcano_plot(de,k = 1,labels = genes$symbol,ymax = 100)
-```
-
-Some of the genes with the largest LFCs are *CD3D*, *CD3E* and *CD8B*
-which suggest topic 6 is capturing expression specific to T cells:
-
-```{r volcano-plot-tcells, fig.height=4, fig.width=4.5}
-volcano_plot(de,k = 6,labels = genes$symbol,ymax = 100)
-```
-
-See also [Part 2][vignette-part-2] for more DE analysis examples.
-
-Session info
-------------
-
-This is the version of R and the packages that were used to generate
-these results.
-
-```{r session-info}
-sessionInfo()
-```
-
-[vignette-part-2]: https://stephenslab.github.io/fastTopics/articles/single_cell_rnaseq_practical.html
-[zheng-2017]: https://doi.org/10.1038/ncomms14049
-[single-cell-topics-paper]: https://doi.org/10.1101/2023.03.03.531029 
diff --git a/vignettes/single_cell_rnaseq_practical.Rmd b/vignettes/single_cell_rnaseq_practical.Rmd
deleted file mode 100644
index aea4bd1..0000000
--- a/vignettes/single_cell_rnaseq_practical.Rmd
+++ /dev/null
@@ -1,231 +0,0 @@
----
-title: "Analysis of single-cell RNA-seq data using a topic model, Part 2: practical implementation"
-author: Peter Carbonetto
-date: "`r Sys.Date()`"
-output:
-  html_document:
-    toc: yes
-    toc_float: no
-    highlight: textmate
-    theme: readable
-vignette: >
-  %\VignetteIndexEntry{Analysis of single-cell RNA-seq data using a topic model, Part 2: practical implementation}
-  %\VignetteEngine{knitr::rmarkdown}
-  %\VignetteEncoding{UTF-8}
----
-
-This vignette continues from [Part 1][vignette-part-1], where we
-introduced the basic steps of a topic modeling analysis for
-single-cell data. Here we give more detailed guidance and discuss
-complications that may arise.
-
-```{r knitr-opts, include=FALSE}
-knitr::opts_chunk$set(comment = "#",collapse = TRUE,results = "hold",
-                      fig.align = "center",dpi = 120)
-```
-
-We begin our analysis by loading the packages. Then we set the seed so
-that the results can be reproduced.
-
-```{r load-pkgs, message=FALSE, warning=FALSE}
-library(Matrix)
-library(fastTopics)
-library(ggplot2)
-library(cowplot)
-set.seed(1)
-```
-
-Now we load the data set and the $K = 6$ pre-fitted topic model.
-
-```{r load-data}
-data(pbmc_facs)
-counts <- pbmc_facs$counts
-fit <- pbmc_facs$fit
-```
-
-Assessing convergence of model fitting
---------------------------------------
-
-The topic model was fitted using the `fit_topic_model` function. This
-function, as we mentioned, hides most of the complexities of model
-fitting. Nevertheless, it is a good idea to check that the model
-fitting has converged to a maximum-likelihood solution. This is
-easily done with 'fastTopics':
-
-```{r plot-loglik, fig.height=2, fig.width=4}
-plot_progress(fit,x = "iter",add.point.every = 10,colors = "black") +
-  theme_cowplot(font_size = 10)
-```
-
-Internally, `fit_topic_model` first fits a non-negative matrix
-factorization (NMF) to the count data, then converts the fitted NMF to
-a topic model, so the plot shows the improvement in the model fit over
-time. Judging by this plot, the parameter estimates get close to a
-maxium-likelihood solution after about 150 updates.
-
-For larger or more complex data sets, more updates may be needed to
-obtain a good fit. For guidance, see `help(fit_topic_model)` and
-`help(fit_poisson_nmf)`.
-
-Evaluating fit: single-cell likelihoods
----------------------------------------
-
-The topic model can be used to calculate *a likelihood for each cell:*
-
-```{r loglik-1}
-loglik <- loglik_multinom_topic_model(counts,fit)
-```
-
-This can be used to assess how well the topic model "fits" each cell.
-
-```{r loglik-2, fig.height=2, fig.width=4.5}
-pdat <- data.frame(loglik)
-ggplot(pdat,aes(loglik)) +
-  geom_histogram(bins = 64,color = "white",fill = "black",size = 0.25) +
-  labs(y = "number of cells") +
-  theme_cowplot(font_size = 10)
-```
-
-Most of the poorly fit cells are in the CD34+ subpopulation:
-
-```{r loglik-3, fig.height=2, fig.width=5.5}
-subpop_colors <- c("dodgerblue","forestgreen","darkmagenta","skyblue","gold")
-pdat <- data.frame(loglik = loglik,subpop = pbmc_facs$samples$subpop)
-ggplot(pdat,aes(x = loglik,fill = subpop)) +
-  geom_histogram(bins = 64,color = "white",size = 0.25) +
-  scale_fill_manual(values = subpop_colors) +
-  labs(y = "number of cells") +
-  theme_cowplot(font_size = 10)
-```
-
-Visualizing the topics without cell labels
-------------------------------------------
-
-In [Part 1][single_cell_rnaseq_practical], we made use of additional
-information about the cells to help us interpret the topics. Even
-without the cell labels, the Structure plot can still be effective:
-
-```{r structure-plot-without-labels, fig.width=6.5, fig.height=1.5, results="hide", message=FALSE}
-topic_colors <- c("skyblue","forestgreen","darkmagenta","dodgerblue",
-                  "gold","darkorange")
-structure_plot(fit,colors = topic_colors)
-```
-
-From this Structure plot, the topics clearly distinguish the B cells
-(dark blue), CD14+ monocytes (green) and CD34+ cells (purple).  The NK
-cells (light blue) and T cells (cells high proportions of yellow and
-orange) are harder to distinguish, and this is reflected in sharing of
-topics (mostly topic 6) among NK and T cells.
-
-Of course, without the cell labels, we cannot know that the topics
-correspond to these cell types without further downstream
-analyses---for this, we performed a GoM DE analysis (see
-[Part 1][vignette-part-1]).
-
-Identifying clusters from the topic proportions
------------------------------------------------
-
-In more complex data sets, some structure may not show up well without
-additional fine-tuning of the plot.
-
-One simple strategy that often works well is to first identify
-clusters in the topic proportions matrix, $L$. Here we use $k$-means
-to identify clusters, but other methods could be used.
-
-```{r clustering-1}
-set.seed(1)
-pca <- prcomp(fit$L)$x
-clusters <- kmeans(pca,centers = 6,iter.max = 100)$cluster
-summary(factor(clusters))
-```
-
-We chose 6 clusters, but to be clear the best number of clusters may
-not be the same as the number of topics.
-
-Now we create another Structure plot in which the cells are grouped
-according to the clusters:
-
-```{r structure-plot-by-cluster-1, fig.width=7.5, fig.height=1.75, results="hide", message=FALSE}
-structure_plot(fit,topics = 1:6,colors = topic_colors,gap = 25,
-               grouping = clusters) +
-  theme(axis.text.x = element_blank())
-```
-
-k-means somewhat arbitrarily split the T cells (cells with high
-proportions of topics 5 and 6) into clusters 1 and 3. Therefore, we
-merge these two clusters:
-
-```{r clustering-2}
-clusters[clusters == 3] <- 1
-clusters <- factor(clusters)
-```
-
-We have found that visual inspection of the clusters followed by
-manual refinement is often be needed to get the "right" clustering.
-
-Now we can label the clusters by these cell types in the plot.
-(Noting that this labeling is usually not possible until downstream
-analyses have been performed.)
-
-```{r structure-plot-by-cluster-2, fig.width=7.5, fig.height=1.75, results="hide", message=FALSE}
-levels(clusters) <- c("T","CD14+","B","CD34+","NK")
-structure_plot(fit,topics = 1:6,colors = topic_colors,gap = 25,
-               grouping = clusters)
-```
-
-It is also sometimes helpful to visualize the structure in PCA plots
-which show the projection of the cells onto PCs of the topic
-proportions:
-
-```{r pca-plot-1, fig.width=3.5, fig.height=2.5, message=FALSE, warning=FALSE}
-cluster_colors <- c("gold","forestgreen","dodgerblue","darkmagenta","skyblue")
-pca_plot(fit,fill = clusters) +
-  scale_fill_manual(values = cluster_colors)
-```
-
-When there are many overlapping points, plotting the *density* can
-sometimes show the clusters more clearly:
-
-```{r pca-plot-2, fig.width=3.75, fig.height=2.5, message=FALSE}
-pca_hexbin_plot(fit,bins = 24)
-```
-
-More on differential expression analysis
-----------------------------------------
-
-In [Part 1][vignette-part-1], we performed a DE analysis using the
-topic model.
-
-```{r diff-count-analysis}
-de <- pbmc_facs$de
-```
-
-When the volcano plot shows many overlapping differentially expressed
-genes, it can be helpful to explore the results interactively. The
-function `volcano_plotly` creates an *interactive volcano plot* that can
-be viewed in a Web browser. For example, here is an interactive
-volcano plot for the T cells topic:
-
-```{r volcano-plotly, warning=FALSE}
-genes <- pbmc_facs$genes
-p <- volcano_plotly(de,k = 6,file = "volcano_plot_t_cells.html",
-                    labels = genes$symbol,ymax = 100)
-```
-
-This call creates a new file `volcano_plot_t_cells.html`. The
-interactive volcano plot can also be viewed [here][volcano-plotly], or
-by calling `print(p)` in R.
-
-Session info
-------------
-
-This is the version of R and the packages that were used to generate
-these results.
-
-```{r session-info}
-sessionInfo()
-```
-
-[vignette-part-1]: https://stephenslab.github.io/fastTopics/articles/single_cell_rnaseq_basic.html
-[volcano-plotly]: https://stephenslab.github.io/fastTopics/articles/volcano_plot_t_cells.html
-[zheng-2017]: https://doi.org/10.1038/ncomms14049