[LINUX] Obtain OTU (microorganism) count data as a text file using QIIME2

When analyzing with the microbial analysis pipeline "QIIME2", the official tutorial shows how to obtain count data at the phylum and genus levels from fastq files, but it does not describe how to obtain OTU-level count data.

However, OTU-level analysis has become common in recent years and I think it is necessary, so I looked into how to do it.

[Software used (OS)] VirtualBox, QIIME2 2019.10 (VirtualBox image). For more information, see "Installing QIIME2".

(Since Docker is popular, usage may shift from VirtualBox to Docker in the future ...)

Get sample data

As a prerequisite, this assumes you have already created table.qza by following the official tutorial (https://docs.qiime2.org/2020.6/tutorials/moving-pictures/).

Otherwise, please download the sample data (table.qza) from the link below.

Export OTU data

Export the table to a file in .biom format. A file named "feature-table.biom" is created in the directory given by --output-path (here, the current directory).

QIIME2


qiime tools export --input-path table.qza --output-path ./

Convert to text file

Ordinary analysis (using a table of OTU names and counts) cannot be done on the biom file as it is, so convert it to a TSV file (a tab-separated text file).

QIIME2


biom convert -i feature-table.biom -o feature-table.tsv --to-tsv
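As a quick sanity check on the converted file, the sketch below counts the samples and OTUs in it. It assumes the usual layout written by `biom convert --to-tsv`: a header line starting with `#OTU ID`, with any other line starting with `#` being a comment. The function name `summarize_otu_table` is made up for this example.

```shell
# Sketch: summarize a TSV produced by `biom convert --to-tsv`.
# Assumes a "#OTU ID" header line; other lines starting with "#" are comments.
summarize_otu_table() {
  awk -F'\t' '
    /^#OTU ID/ { samples = NF - 1; next }   # header: first field is the ID column
    /^#/       { next }                     # skip comment lines
    NF > 1     { otus++ }                   # each remaining row is one OTU
    END        { printf "samples=%d otus=%d\n", samples, otus }
  ' "$1"
}

# Run only if the converted file is present in the current directory.
if [ -f feature-table.tsv ]; then
  summarize_otu_table feature-table.tsv
fi
```

If the counts do not match what you expect from the tutorial data, it is worth checking that the export and conversion steps used the intended table.qza.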

Referenced site (English)

You now have a text file that looks like this:

[Image: screenshot of the resulting feature-table.tsv]

The first column holds the OTU names (IDs), and the second line holds the sample names. The first line gets in the way when loading the file as a data frame, so it is fine to delete it manually.

Going forward, I will continue to improve the analysis method based on this.
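Instead of deleting the first line by hand, it can be removed on the command line. This is a minimal sketch, assuming the header line with the sample names starts with `#OTU ID`. The function name `strip_leading_comment` and the output file name `feature-table.clean.tsv` are hypothetical.

```shell
# Sketch: drop the first line unless it is already the "#OTU ID" header,
# so running it twice does not remove the sample names.
strip_leading_comment() {
  awk 'NR == 1 && !/^#OTU ID/ { next } { print }' "$1"
}

# Write the cleaned table to a new file and leave the original untouched.
if [ -f feature-table.tsv ]; then
  strip_leading_comment feature-table.tsv > feature-table.clean.tsv
fi
```

Writing to a separate file keeps the original conversion result available in case the cleaned table needs to be regenerated.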
