This article shows how to visualize the distribution of the relative abundance of a specific bacterial taxon, starting from the results of a 16S rRNA microbiome analysis with QIIME 2. In the previous section, we compared the gut microbiota of the CD (Crohn's disease), UC (ulcerative colitis), and nonIBD (non-inflammatory bowel disease) groups. Here I will show how to draw that comparison as a box-and-whisker plot. By following this article, you will be able to create box plots like the following.
This time I will use Altair, which can draw many kinds of charts from a pandas DataFrame. Charts other than box plots are also introduced here.
To create a box plot, you need count data that summarizes the number of reads per bacterial taxon for each sample, plus the sample metadata. For details, refer to the previous section.
table.qza and taxonomy.qza are required to get the count data. For how to create each file, refer to here. This article uses phylum-level count data, so run the following commands, paying attention to --p-level 2.
Terminal (in Qiime2 virtual environment)
qiime taxa collapse --i-table table.qza --i-taxonomy taxonomy.qza --p-level 2 --o-collapsed-table L2_table.qza
qiime tools export --input-path L2_table.qza --output-path L2
biom convert -i L2/feature-table.biom -o L2/table.tsv --to-tsv
If you get a file like the following, the export succeeded.
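As a quick sanity check on the exported table, here is a minimal sketch of how the TSV is parsed later in the plotting script. The sample IDs and counts in `SAMPLE_TSV` are made up for illustration, and `load_counts` is a hypothetical helper name. Only the two-line header layout written by `biom convert` and the `header=1` option come from this article.

```python
import io

import pandas as pd

# Hypothetical stand-in for L2/table.tsv; sample IDs and counts are made up.
SAMPLE_TSV = (
    "# Constructed from biom file\n"
    "#OTU ID\t1001\t1002\n"
    "D_0__Bacteria;D_1__Bacteroidetes\t30.0\t10.0\n"
    "D_0__Bacteria;D_1__Firmicutes\t70.0\t90.0\n"
)


def load_counts(source):
    """Read a biom-converted TSV and return samples as rows, taxa as columns."""
    # header=1 skips the leading comment line and uses the '#OTU ID' row as the header.
    return pd.read_table(source, sep="\t", index_col=0, header=1).T


counts = load_counts(io.StringIO(SAMPLE_TSV))
print(counts.shape)
print(list(counts.columns))
```

To check a real export, pass the path to L2/table.tsv instead of the `StringIO` object.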
Create the following metadata in tsv format.
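For reference, the sketch below shows the minimum the plotting script expects from the metadata: sample IDs in the first column, plus `diagnosis` and `biopsy_location` columns. The sample IDs, the `sample-id` header, and the rows are hypothetical. Only the two column names are taken from the script in this article.

```python
import io

import pandas as pd

# Hypothetical metadata; only 'diagnosis' and 'biopsy_location' come from the plotting script.
METADATA_TSV = (
    "sample-id\tdiagnosis\tbiopsy_location\n"
    "1001\tCD\tIleum\n"
    "1002\tUC\tRectum\n"
    "1003\tnonIBD\tIleum\n"
)

md = pd.read_table(io.StringIO(METADATA_TSV), sep="\t", index_col=0, header=0)

# Report any column that the plotting script needs but the file lacks.
required = {"diagnosis", "biopsy_location"}
missing = required - set(md.columns)
print(sorted(missing))

# Numeric sample IDs are parsed as integers, which matters when joining later.
print(md.index.dtype)
```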
Running the following script produces the box plot.
alt_comp_plot.py
import os
import altair as alt
import pandas as pd
# Taxonomic level to use. Phylum is level 2.
l_select = 'L2'
# Get the current directory
cwd = os.getcwd()
# Load the count data
count_path = [l_select,'table.tsv']
count_file = os.path.join(cwd, *count_path)
# header=1: the first line is the "# Constructed from biom file" comment, so the second line holds the column names
count = pd.read_table(count_file, sep='\t', index_col=0, header=1).T
# Convert counts to relative abundance (each sample sums to 1)
comp = count.apply(lambda x: x/sum(x), axis=1)
# Load the metadata
md_path = ['metadata.tsv']
md_file = os.path.join(cwd, *md_path)
md = pd.read_table(md_file, sep='\t', index_col=0, header=0)
# Convert the row labels (sample IDs) to str. The IDs here are numeric, so pandas reads them as int.
comp.index = comp.index.astype(str)
md.index = md.index.astype(str)
# Combine the composition data and the metadata. (Rows are only matched when both indexes are str.)
df = pd.concat([comp,md], axis=1)
# This time, only the Ileum and Rectum samples are examined, because the other sites had few samples.
df = df[df['biopsy_location'].isin(['Ileum','Rectum'])]
# Build the box plot with Altair
boxplot = alt.Chart(df).mark_boxplot(
    size=100,
    ticks=alt.MarkConfig(width=30),
    median=alt.MarkConfig(color='black', size=100),
).encode(
    alt.X('diagnosis', sort=alt.Sort(['CD','UC','nonIBD']),
          axis=alt.Axis(labelFontSize=15, ticks=True, titleFontSize=18, title='Diagnosis')),
    alt.Y('D_0__Bacteria;D_1__Firmicutes',
          axis=alt.Axis(format='%', labelFontSize=15, ticks=True, titleFontSize=18,
                        grid=False, domain=True, title='Firmicutes'),
          scale=alt.Scale(domain=[0,0.02])),
    alt.Color('diagnosis'),
    alt.Column('biopsy_location', header=alt.Header(labelFontSize=15, titleFontSize=18),
               sort=alt.Sort(['Ileum','Rectum']), title='Biopsy'),
).properties(
    width=600,
    height=500,
)
# Display the figure
boxplot.show()
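Two steps in the script are easy to get wrong: the conversion from counts to composition, and the join with the metadata. The sketch below illustrates both on made-up data. The sample IDs and counts are hypothetical, and `DataFrame.div` is used here as an equivalent alternative to the `apply` call in the script, not as the script's own method.

```python
import pandas as pd

# Made-up counts; the index is str because the IDs were column headers in the TSV.
counts = pd.DataFrame(
    {"Firmicutes": [70, 90], "Bacteroidetes": [30, 10]},
    index=["1001", "1002"],
)

# Divide each row by its row total so that every sample sums to 1.
comp = counts.div(counts.sum(axis=1), axis=0)

# Made-up metadata; numeric sample IDs are read as int by default.
md = pd.DataFrame({"diagnosis": ["CD", "UC"]}, index=[1001, 1002])

# With a str index on one side and an int index on the other, no rows match.
mismatched = pd.concat([comp, md], axis=1)

# After converting the metadata index to str, the rows line up.
md.index = md.index.astype(str)
aligned = pd.concat([comp, md], axis=1)

print(mismatched.shape)
print(aligned.shape)
print(aligned)
```

If the combined table has more rows than either input and is full of missing values, an index type mismatch like this one is a likely cause.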
A brief introduction to the Altair methods used above.
.mark_boxplot(): Configures the box plot.
    size: Box width.
    ticks: Settings for the ticks at the ends of the whiskers.
    median: Settings for the median line.
.encode(): Settings that depend on the contents of the DataFrame.
    alt.X(): Specifies the column that determines the X-axis.
        alt.Sort(): Determines the order of the categories on the axis.
        alt.Axis(): Axis settings. With title you can show a label different from the DataFrame column name. The tick marks on the axis disappear with ticks=False.
    alt.Y(): Specifies the column that determines the Y-axis.
        alt.Axis(): Axis settings. You can display percentages with format='%'. The horizontal grid lines are removed with grid=False.
    alt.Color(): Specifies the column that determines the color.
    alt.Column(): Arranges the charts side by side, one per value of the column.
.properties(): Settings that do not depend on the contents of the DataFrame. Here, the size of the figure is specified.
You can save the figure in png or svg format from the "..." menu at the upper right of the chart.