[PYTHON] Why regular expressions (RE) fail to parse .tex source

nested commands

Example .tex source with nested commands.

\frac{1}{1+\frac{1}{1+\frac{1}{1+x}}}

In this case, an RE cannot find the other half of the third brace pair, because it has no way to count how deeply the braces are nested.

(non-)greedy RE

Let's try!

use greedy match

import re
a = r"\frac{1}{1+\frac{1}{1+\frac{1}{1+x}}}"
m = re.search(r"\\frac\{.*\}\{.*\}", a)
m.group()

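This matches the whole string, i.e. the braces of the first (outermost) \frac. The match only looks right, though: the first .* runs on to the last }{, so the two arguments are not split correctly.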

'\\frac{1}{1+\\frac{1}{1+\\frac{1}{1+x}}}'

use non-greedy match

a = r"\frac{1}{1+\frac{1}{1+\frac{1}{1+x}}}"
m = re.search(r"\\frac\{.*?\}\{.*?\}", a)
m.group()

This matches only a fragment, because the second .*? stops at the first } it finds:

'\\frac{1}{1+\\frac{1}'
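
To see what each attempt actually swallowed, it can help to capture the two arguments. The snippet below is only an illustrative sketch, not part of the original experiment: it uses the same two patterns, with parentheses added around each .* and .*? so that groups() shows what was taken as numerator and denominator.

```python
import re

a = r"\frac{1}{1+\frac{1}{1+\frac{1}{1+x}}}"

# Same patterns as above, with capture groups added for illustration.
greedy = re.search(r"\\frac\{(.*)\}\{(.*)\}", a)
lazy = re.search(r"\\frac\{(.*?)\}\{(.*?)\}", a)

print(greedy.groups())  # ('1}{1+\\frac{1}{1+\\frac{1', '1+x}}')
print(lazy.groups())    # ('1', '1+\\frac{1')
```

Neither split is the intended one, which would be 1 and 1+\frac{1}{1+\frac{1}{1+x}}.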

conclusion

Of course, you may be able to solve this problem with a more complicated RE. However, nobody likes writing intricate patterns. I think the only way to solve it is to parse the string one character at a time. Do you have any other solutions?
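
Here is a minimal sketch of the one-character-at-a-time idea. The function names find_matching_brace and frac_args are made up for this example. It assumes that both arguments of \frac are written in braces with nothing between them, and that the source contains no escaped braces (\{, \}) or comments. Real .tex source would need more care.

```python
def find_matching_brace(s, open_pos):
    """Return the index of the '}' that closes the '{' at s[open_pos]."""
    if s[open_pos] != "{":
        raise ValueError("expected '{' at open_pos")
    depth = 0
    for i in range(open_pos, len(s)):
        if s[i] == "{":
            depth += 1          # one level deeper
        elif s[i] == "}":
            depth -= 1          # one level back up
            if depth == 0:      # back at the starting level: this is the partner
                return i
    raise ValueError("unbalanced braces")


def frac_args(s, start=0):
    """Return (numerator, denominator) of the first \\frac at or after start."""
    cmd = "\\frac"
    pos = s.index(cmd, start) + len(cmd)   # position of the first '{'
    args = []
    for _ in range(2):                     # \frac takes two braced arguments
        close = find_matching_brace(s, pos)
        args.append(s[pos + 1:close])
        pos = close + 1
    return tuple(args)


a = r"\frac{1}{1+\frac{1}{1+\frac{1}{1+x}}}"
print(frac_args(a))     # ('1', '1+\\frac{1}{1+\\frac{1}{1+x}}')
print(frac_args(a, 1))  # ('1', '1+\\frac{1}{1+x}')
```

The depth counter is exactly the piece that the patterns above lack: it remembers how many braces are still open, so the scan stops at the right } no matter how deep the nesting goes.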