Get the MIME type from a file's contents. The MIME type can be inferred from the file extension, but I use it for purposes where a rewritten extension would cause trouble, so I obtain it from the contents instead.
I use Apache Tika for this.
Note, however, that unless you go through TikaInputStream, the result depends on the file name.
Install Apache Tika Core from Maven.
I used version 1.21 for verification.
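For reference, the dependency could be declared roughly as follows. This is only a sketch that assumes a Gradle build using the Kotlin DSL with Maven Central as the repository; the article itself only says the artifact comes from Maven, so adapt it to your own build tool.

```kotlin
// build.gradle.kts (assumed build setup, not taken from the original article)
repositories {
    mavenCentral()
}

dependencies {
    // Apache Tika Core, the version used for verification in this article
    implementation("org.apache.tika:tika-core:1.21")
}
```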
It's a rough sample, but it gets the MIME type of every file under src/main/resources.
As a side note, `org.apache.tika.metadata.Metadata` is imported under an alias because its name clashes with Kotlin's `Metadata` type.
Read the files in resources and output their MIME types
import java.io.File
import org.apache.tika.Tika
import org.apache.tika.io.TikaInputStream
import org.apache.tika.metadata.Metadata as TikaMetadata

fun main() {
    val resourcesDir = File(System.getProperty("user.dir"), "src/main/resources")
    val tika = Tika()
    // listFiles() returns null when the directory is missing, so fall back to an empty array
    resourcesDir.listFiles().orEmpty()
        .filter { it.isFile }
        .map { file ->
            // Use a fresh Metadata per file so nothing carries over between files
            val metadata = TikaMetadata()
            // use {} closes the stream once detection is done
            val mimeType = TikaInputStream.get(file.toURI(), metadata).use { stream ->
                tika.detect(stream, metadata)
            }
            // Lowercase the extension so the output sorts consistently
            file.extension.lowercase() + " -> " + mimeType
        }
        .sorted()
        // Print after sorting
        .forEach { println(it) }
}
This is the result of running it on the files and samples I had lying around. The types are identified almost uniquely. I also tried rewriting the extensions, and detection still worked quite well.
7z -> application/x-7z-compressed
avi -> video/x-msvideo
docx -> application/vnd.openxmlformats-officedocument.wordprocessingml.document
exe -> application/x-dosexec
flv -> video/x-flv
html -> text/html
jpg -> image/jpeg
jpg -> image/jpeg
m3u -> text/plain
mkv -> video/x-matroska
mkv -> video/x-matroska
mkv -> video/x-matroska
mkv -> video/x-matroska
mov -> video/quicktime
mov -> video/quicktime
mov -> video/quicktime
mov -> video/quicktime
mp3 -> audio/mpeg
mp4 -> video/mp4
mp4 -> video/mp4
mp4 -> video/mp4
mp4 -> video/mp4
mp4 -> video/mp4
mp4 -> video/mp4
mp4 -> video/mp4
mp4 -> video/mp4
mp4 -> video/mp4
mp4 -> video/x-m4v
mpg -> video/mpeg
mpg -> video/mpeg
mpg -> video/mpeg
msi -> application/x-ms-installer
pdf -> application/pdf
png -> image/png
pptx -> application/vnd.openxmlformats-officedocument.presentationml.presentation
svg -> image/svg+xml
ts -> application/octet-stream
vcmf -> application/octet-stream
vob -> video/mpeg
webm -> video/webm
webm -> video/webm
webm -> video/webm
webm -> video/webm
zip -> application/zip
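The extension-rewriting check mentioned above can be reproduced in miniature without any sample files. The following is a minimal sketch, not the code used for the results above: it feeds Tika the 8-byte PNG signature together with a deliberately misleading file name. The name `picture.txt` and the choice of PNG are assumptions made purely for illustration.

```kotlin
import org.apache.tika.Tika

// The 8-byte signature that every PNG file starts with
val pngSignature = byteArrayOf(0x89.toByte(), 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)

fun main() {
    val tika = Tika()
    // Hypothetical file name with a deliberately wrong extension
    val misleadingName = "picture.txt"

    // Name only: Tika has nothing but the extension to go on
    println("name only    -> " + tika.detect(misleadingName))
    // Content only: detection relies on the leading bytes
    println("content only -> " + tika.detect(pngSignature))
    // Content plus the file name as a hint
    println("content+name -> " + tika.detect(pngSignature, misleadingName))
}
```

Comparing the three lines of output shows how much the file name influences the result for a given input.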
I used Tika this time, but judging by what comes up in searches, the approaches using `URLConnection` and `mime-util` are the mainstream ones.
However, these have problems with detection accuracy and with ongoing maintenance, so I tried Tika this time.
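For comparison, here is a rough sketch of the `URLConnection` approach using only the JDK. The helper names `guessFromName` and `guessFromBytes` are made up for this example. As far as I know, the JDK recognizes only a small built-in set of signatures and extensions, so both helpers can return null for formats it does not know.

```kotlin
import java.io.ByteArrayInputStream
import java.net.URLConnection

// Guess from the file name only; null when the JDK does not know the extension
fun guessFromName(name: String): String? = URLConnection.guessContentTypeFromName(name)

// Guess from the leading bytes; null when no built-in signature matches
fun guessFromBytes(head: ByteArray): String? =
    ByteArrayInputStream(head).use { URLConnection.guessContentTypeFromStream(it) }

fun main() {
    // The 8-byte PNG signature, paired with a hypothetical misleading file name
    val pngSignature = byteArrayOf(0x89.toByte(), 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)
    println("by name:  " + guessFromName("picture.txt"))
    println("by bytes: " + guessFromBytes(pngSignature))
}
```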
- How to get ContentType from file header in Java | Hacknote
- Providing video compression samples