Example: from
He said “Hello World!”
extract
Hello World!
I put this together with reference to the following article:
Regular expression: An expression that matches only the contents of parentheses
re.search(r"(?<=\“).*?(?=\”)", sentence)
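As a minimal sketch of how the expression above can be used end to end: `re.search` returns a match object (or `None`), so the extracted text still has to be pulled out with `.group()`. The sample sentence and the variable name `extracted` are my own assumptions for illustration. The backslashes before the quotes are dropped here because these characters have no special meaning in a regular expression.

```python
import re

# Hypothetical sample sentence; the quotes are U+201C and U+201D.
sentence = "He said “Hello World!”"

# The lookbehind and lookahead keep the quotes themselves out of the match.
match = re.search(r"(?<=“).*?(?=”)", sentence)

# re.search returns None when nothing matches, so guard before calling group().
extracted = match.group() if match else None
print(extracted)
```

Because `.*?` is non-greedy, the match stops at the first closing quote, which matters if a sentence contains more than one quoted phrase.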
Initially I tried to convert everything to half-width double quotes using `jaconv`, Python's full-width/half-width conversion package, but that didn't work.
This is because `jaconv.normalize` converts the opening and closing full-width double quotes differently:
'”' => '"'
'“' => '``'
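If the goal is simply to turn both quotes into the half-width `"`, one alternative that does not rely on `jaconv` is a translation table from the standard library. This is only a sketch of that substitute approach; the names `QUOTE_TABLE` and `unify_double_quotes` are hypothetical.

```python
# Map both curly double quotes (U+201C and U+201D) to the ASCII quote U+0022.
QUOTE_TABLE = str.maketrans({"\u201c": '"', "\u201d": '"'})


def unify_double_quotes(text):
    """Return text with opening and closing curly double quotes replaced by '"'."""
    return text.translate(QUOTE_TABLE)


unified = unify_double_quotes("He said \u201cHello World!\u201d")
print(unified)
```

After this step the opening and closing quotes are the same character, so a lookbehind/lookahead pair can no longer tell them apart; a pattern such as `"(.*?)"` with a capture group would be the natural fit.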
Be careful: it is hard to tell at a glance whether a double quote is full-width or half-width, and which code point it actually is.
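When the characters look alike on screen, printing their code points and Unicode names removes the guesswork. The sketch below uses the standard `unicodedata` module; the helper name `describe` is hypothetical, and the list of characters is just the set of double quotes I would expect to run into.

```python
import unicodedata


def describe(ch):
    """Return the code point and official Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# ASCII quote, the two curly quotes, and the fullwidth quotation mark.
for ch in ['"', "\u201c", "\u201d", "\uff02"]:
    print(describe(ch))
```

Running this on the characters copied from your own text shows exactly which quote you are dealing with before you write the regular expression.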
Double-byte double quotes are a bad civilization!