Delete various whitespace characters [Python]

Overview

This article summarizes how to delete various kinds of whitespace in Python, such as half-width (ASCII) spaces and full-width spaces.

Environment

macOS Catalina version 10.15.4
Python 3.8.0

Code

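Delete line breaks, tabs, spaces, and other whitespace all at once

Use str.split(). Called with no arguments, it splits on any run of whitespace, so joining the pieces with an empty string removes all of it.

# \u3000 is a full-width space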
text = "a\u3000 b\t\nc\r\n"
text = ''.join(text.split())
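
As a quick check, the sketch below wraps the split-and-join idiom in a helper and shows the result for the sample string. The function name `remove_all_whitespace` is hypothetical and not part of the original code. It relies on `str.split()` with no arguments treating any run of Unicode whitespace, including the full-width space, as a separator.

```python
def remove_all_whitespace(text: str) -> str:
    """Remove every whitespace character by splitting on whitespace and rejoining."""
    # split() with no arguments splits on runs of any Unicode whitespace
    return ''.join(text.split())


sample = "a\u3000 b\t\nc\r\n"
result = remove_all_whitespace(sample)
print(repr(result))  # 'abc'
```

Because the pieces are joined with an empty string, whitespace between words is removed as well, not only whitespace at the ends.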

Delete only line breaks (\r\n and \n) at once

Use str.splitlines(). Note that it also treats a few other characters, such as \r, \x0b, and \x0c, as line boundaries.

text = "a\u3000 b\t\nc\r\n"
text = ''.join(text.splitlines())
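
For illustration, here is what the `splitlines()` approach leaves behind on the sample string. The helper name `remove_line_breaks` is an assumption made for this sketch. The second call shows that `str.splitlines()` also treats a form feed (`\x0c`) as a line boundary, so it is removed as well.

```python
def remove_line_breaks(text: str) -> str:
    """Remove line terminators by splitting into lines and rejoining."""
    # splitlines() drops the line terminators; other whitespace is kept
    return ''.join(text.splitlines())


sample = "a\u3000 b\t\nc\r\n"
result = remove_line_breaks(sample)
print(repr(result))  # 'a\u3000 b\tc'

# A form feed also counts as a line boundary for splitlines()
form_feed_result = remove_line_breaks("x\x0cy")
print(repr(form_feed_result))  # 'xy'
```

The full-width space, the half-width space, and the tab all survive; only the line boundaries are removed.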

Delete specific whitespace characters (for example, full-width space, half-width space, tab) while keeping line breaks

Use str.translate(). Any character mapped to an empty string in the translation table is deleted.

text = "a\u3000 b\t\nc\r\n"
table = str.maketrans({
  '\u3000': '',
  ' ': '',
  '\t': ''
})
text = text.translate(table)
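
As a small variation, a character can also be mapped to `None` in the table, which `str.translate()` likewise treats as a deletion. The sketch below uses that form and prints the result for the sample string, so it is easy to see that the line breaks are kept.

```python
sample = "a\u3000 b\t\nc\r\n"

# Mapping a character to None deletes it, just like mapping it to ''
table = str.maketrans({'\u3000': None, ' ': None, '\t': None})

result = sample.translate(table)
print(repr(result))  # 'ab\nc\r\n'
```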

If there are many characters you want to delete, it is easier to build the argument of str.maketrans() with a dict comprehension.

text = "a\u3000 b\t\nc\r\nd\x0ce\x0bf"
table = str.maketrans({
  v: '' for v in '\u3000 \x0c\x0b\t'  # or ['\u3000', ' ', '\x0c', '\x0b', '\t']
})
text = text.translate(table)
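
Another option, not used in the original code, is the three-argument form of `str.maketrans()`, where the third argument lists the characters to delete. The following is a minimal sketch. The constant name `DELETE_CHARS` is hypothetical.

```python
# Characters to delete: full-width space, half-width space, form feed, vertical tab, tab
DELETE_CHARS = '\u3000 \x0c\x0b\t'  # hypothetical name chosen for this sketch

# With three arguments, every character in the third string is mapped to None
table = str.maketrans('', '', DELETE_CHARS)

sample = "a\u3000 b\t\nc\r\nd\x0ce\x0bf"
result = sample.translate(table)
print(repr(result))  # 'ab\nc\r\ndef'
```

This produces the same table as the comprehension, without writing out a dictionary.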

Supplement: Using regular expressions

Readers suggested in the comments how to do the same with regular expressions, so I summarize those approaches below. Thank you for the comments.

import re

# Delete line breaks, tabs, spaces, and other whitespace all at once
text = "a\u3000\n\n b\t\nc\r\nd\x0ce\x0b\rf\r\n"
text = re.sub(r"\s", "", text)

# Delete only line breaks (\r\n or \n) at once
text = "a\u3000\n\n b\t\nc\r\nd\x0ce\x0b\rf\r\n"
text = re.sub(r"[\r\n]", "", text)

# Delete specific whitespace characters (for example, full-width space, half-width space, tab) while keeping line breaks
text = "a\u3000\n\n b\t\nc\r\nd\x0ce\x0b\rf\r\n"
text = re.sub(r"[\u3000 \t]", "", text)
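
To make the effect of each pattern visible, the sketch below applies all three to the same sample string and prints the results. Precompiling the patterns with `re.compile()` is a choice made for this example, and the constant names are hypothetical.

```python
import re

# Hypothetical names for the three patterns used above
WHITESPACE = re.compile(r"\s")               # any Unicode whitespace
LINE_BREAKS = re.compile(r"[\r\n]")          # carriage return and line feed only
SPACES_AND_TABS = re.compile(r"[\u3000 \t]") # full-width space, space, tab

sample = "a\u3000\n\n b\t\nc\r\nd\x0ce\x0b\rf\r\n"

no_whitespace = WHITESPACE.sub("", sample)
no_line_breaks = LINE_BREAKS.sub("", sample)
no_spaces_tabs = SPACES_AND_TABS.sub("", sample)

print(repr(no_whitespace))   # 'abcdef'
print(repr(no_line_breaks))  # 'a\u3000 b\tcd\x0ce\x0bf'
print(repr(no_spaces_tabs))  # 'a\n\nb\nc\r\nd\x0ce\x0b\rf\r\n'
```

Unlike `str.splitlines()`, the `[\r\n]` pattern leaves `\x0c` and `\x0b` in place, since only the characters listed in the class are removed.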

